GPUs were originally designed for graphics rendering. In recent years they have evolved into general-purpose multi-core processors with high parallelism, multithreading, strong computing power, and very high memory bandwidth; the peak computing power of a mainstream GPU is now typically tens of times that of a CPU. This offers a new way to tackle computationally demanding problems. Molecular dynamics simulation requires a very large amount of computation, so using GPUs for it is a natural choice. Based on NVIDIA's GeForce GTX 295 GPU and the CUDA 2.3 development environment, this paper implements van der Waals force calculation, van der Waals potential energy calculation, and grid-based neighbor search. For the neighbor search, different implementation strategies are given for GPUs of different compute capability. Tests on a polyethylene polymer system of 360,000 particles show that the results for one time step agree with the corresponding results from GROMACS, a molecular dynamics package known for its high performance, running on an Intel Xeon E5405 workstation. Compared with a single CPU core, performance improves substantially: neighbor search is 17 times faster and van der Waals force calculation is 47 times faster. The boundary problem in neighbor search is also solved. Although this paper targets van der Waals force calculation, the strategy is general and may serve as a reference for researchers in other fields. The test results indicate that using GPUs to accelerate relatively large-scale computations is worthwhile.
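To make the two kernels named in the abstract concrete, the following is a minimal CPU-side sketch in Python of a grid-based (cell-list) neighbor search followed by a pairwise force and energy loop. It is not the paper's CUDA implementation. The function names, the cubic periodic box, the minimum-image wrapping used to handle particles near the box boundary, and the Lennard-Jones 12-6 form chosen for the van der Waals interaction are all assumptions made for illustration; the abstract does not specify any of them.

```python
import itertools


def minimum_image(d, box):
    """Wrap a coordinate difference to the nearest periodic image (assumed cubic box)."""
    return d - box * round(d / box)


def build_cells(positions, box, cutoff):
    """Bin particle indices into a uniform grid whose cells are at least `cutoff` wide."""
    n_cells = max(1, int(box // cutoff))
    cell_size = box / n_cells
    cells = {}
    for i, pos in enumerate(positions):
        key = tuple(int((c % box) / cell_size) % n_cells for c in pos)
        cells.setdefault(key, []).append(i)
    return cells, n_cells


def neighbor_pairs(positions, box, cutoff):
    """Yield each pair (i, j), i < j, closer than `cutoff`, checking only adjacent cells."""
    cells, n_cells = build_cells(positions, box, cutoff)
    cutoff2 = cutoff * cutoff
    for (cx, cy, cz), members in cells.items():
        # A set removes the duplicate cells that wrapping produces when n_cells < 3.
        nearby = {((cx + dx) % n_cells, (cy + dy) % n_cells, (cz + dz) % n_cells)
                  for dx, dy, dz in itertools.product((-1, 0, 1), repeat=3)}
        for key in nearby:
            for i in members:
                for j in cells.get(key, ()):
                    if i < j:
                        d2 = sum(minimum_image(positions[i][k] - positions[j][k], box) ** 2
                                 for k in range(3))
                        if d2 < cutoff2:
                            yield (i, j)


def lj_forces(positions, box, cutoff, epsilon=1.0, sigma=1.0):
    """Return per-particle forces and total energy for an assumed Lennard-Jones 12-6 potential."""
    forces = [[0.0, 0.0, 0.0] for _ in positions]
    energy = 0.0
    for i, j in neighbor_pairs(positions, box, cutoff):
        d = [minimum_image(positions[i][k] - positions[j][k], box) for k in range(3)]
        r2 = sum(c * c for c in d)
        sr6 = (sigma * sigma / r2) ** 3
        energy += 4.0 * epsilon * (sr6 * sr6 - sr6)
        scale = 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r2
        for k in range(3):
            forces[i][k] += scale * d[k]  # force on i points along r_i - r_j when repulsive
            forces[j][k] -= scale * d[k]  # Newton's third law
    return forces, energy
```

In this sketch each particle is compared only with particles in its own cell and the 26 surrounding cells, so the cost grows roughly linearly with particle count at fixed density instead of quadratically. The sketch assumes the cutoff is no larger than half the box length. On a GPU the loop over cells or particles would typically be distributed across threads, which this serial version does not attempt to show.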