Dual-level Parallelism for ab-initio Molecular Dynamics: Reaching Teraflop Performance with the CPMD Code

We show teraflop performance of the fully featured ab-initio molecular dynamics code CPMD on an IBM pSeries 690 cluster. A mixed approach combines coarse-grained, distributed-memory parallelism using the MPI library with fine-grained, shared-memory parallelism using OpenMP directives to map the algorithms optimally onto the available hardware. The top performance achieved is approx. 20% of peak performance, with an estimated parallel efficiency of approx. 45% on 1024 processors for a system of 1000 atoms. The main limiting factor of parallel efficiency was found to be the latency of the interconnect.
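To put the quoted percentages in context, the following minimal sketch works through the arithmetic. It assumes the standard definition of parallel efficiency, E = S / p (speedup divided by processor count), and it takes a round sustained rate of 1 TFLOP/s as an assumption, since the abstract says only "teraflop performance". The function names are hypothetical and are not part of CPMD.

```python
def implied_speedup(efficiency: float, processors: int) -> float:
    """Speedup S implied by parallel efficiency E = S / p."""
    return efficiency * processors


def implied_peak(sustained_flops: float, fraction_of_peak: float) -> float:
    """Aggregate peak rate implied by a sustained rate and its fraction of peak."""
    return sustained_flops / fraction_of_peak


# Figures from the abstract: approx. 45% efficiency on 1024 processors.
speedup = implied_speedup(0.45, 1024)

# Assumption: a sustained rate of exactly 1 TFLOP/s at approx. 20% of peak.
peak = implied_peak(1.0e12, 0.20)

print(f"implied speedup: {speedup:.0f}x on 1024 processors")
print(f"implied aggregate peak: {peak / 1e12:.1f} TFLOP/s")
```

Under these assumptions, the 1024 processors deliver the throughput of roughly 460 ideally scaling ones, and the aggregate machine peak would be on the order of 5 TFLOP/s. Both numbers are rough estimates derived from the approximate values in the abstract, not measurements.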

By: J. Hutter and A. Curioni

Published in: Parallel Computing, volume 31, no. 1, pages 1-17, 2005
