
RECFMM: Recursive Parallelization of the Adaptive Fast Multipole Method for Coulomb and Screened Coulomb Interactions

Published online by Cambridge University Press: 21 July 2016

Bo Zhang*
Affiliation:
Center for Research in Extreme Scale Technologies, Indiana University, Bloomington, IN, 47404, USA
Jingfang Huang*
Affiliation:
Department of Mathematics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
Nikos P. Pitsianis*
Affiliation:
Department of Electrical and Computer Engineering, Aristotle University of Thessaloniki, GR-54124, Greece; Department of Computer Science, Duke University, Durham, NC, 27708, USA
Xiaobai Sun*
Affiliation:
Department of Computer Science, Duke University, Durham, NC, 27708, USA
*Corresponding author. Email addresses: zhang416@indiana.edu (B. Zhang), huang@amath.unc.edu (J. Huang), nikos@cs.duke.edu (N. P. Pitsianis), xiaobai@cs.duke.edu (X. Sun)

Abstract

We present RECFMM, a program representation and implementation of a recursive scheme for parallelizing the adaptive fast multipole method (FMM) on shared-memory computers. It achieves remarkably high performance while maintaining mathematical clarity and flexibility. The parallelization scheme highlights the recursion that is intrinsic to the FMM but has not been well exploited. The program modules of RECFMM constitute a map between numerical computation components and advanced architecture mechanisms. The mathematical structure is preserved and exploited, neither obscured nor compromised, by the parallel rendition of the recursion scheme. RECFMM employs modern software systems, CILK in particular, which provides graph-theoretic optimal scheduling that adapts to the dynamics of parallel execution. RECFMM supports multiple algorithm variants that mark the major advances with low-frequency interaction kernels, and it includes the asymmetric version in which the source particle ensemble is not necessarily the same as the target particle ensemble. We demonstrate parallel performance with Coulomb and screened Coulomb interactions.
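To make the idea of a recursive, fork-join traversal of an adaptive tree concrete, here is a minimal sketch in Python. It is not RECFMM code. Everything in it is an assumption made for illustration: the names `Box`, `build` and `upward`, the two-dimensional quadtree, the use of the total charge as a stand-in for a multipole expansion, and plain threads with a `spawn_depth` cut-off in place of Cilk's spawn and sync.

```python
from dataclasses import dataclass, field
from threading import Thread
from typing import List, Tuple

Point = Tuple[float, float, float]  # (x, y, charge); hypothetical layout


@dataclass
class Box:
    """One node of an adaptive quadtree (illustrative, not RECFMM's data type)."""
    x0: float
    y0: float
    size: float
    points: List[Point]
    children: List["Box"] = field(default_factory=list)
    total_charge: float = 0.0  # order-zero "moment", a toy stand-in


def build(box: Box, leaf_capacity: int = 4, max_depth: int = 16, depth: int = 0) -> None:
    """Adaptive refinement: split only boxes holding more than leaf_capacity points."""
    if len(box.points) <= leaf_capacity or depth >= max_depth:
        return
    half = box.size / 2
    buckets = {}
    for p in box.points:
        key = (int(p[0] >= box.x0 + half), int(p[1] >= box.y0 + half))
        buckets.setdefault(key, []).append(p)
    # Empty quadrants get no child box, so the tree follows the particle density.
    for (ix, iy), pts in buckets.items():
        child = Box(box.x0 + ix * half, box.y0 + iy * half, half, pts)
        build(child, leaf_capacity, max_depth, depth + 1)
        box.children.append(child)


def upward(box: Box, spawn_depth: int = 2, depth: int = 0) -> None:
    """Recursive upward pass: leaves sum their charges, parents sum their children."""
    if not box.children:
        box.total_charge = sum(q for _, _, q in box.points)
        return
    if depth < spawn_depth:
        # Fork: one worker per child subtree (the role a spawn would play).
        workers = [Thread(target=upward, args=(c, spawn_depth, depth + 1))
                   for c in box.children]
        for w in workers:
            w.start()
        # Join: wait for all children before combining (the role of a sync).
        for w in workers:
            w.join()
    else:
        for c in box.children:
            upward(c, spawn_depth, depth + 1)
    box.total_charge = sum(c.total_charge for c in box.children)
```

Each call writes only to its own box, so the child subtrees can proceed independently and the parent combines their results after the join. In CPython the threads illustrate the control structure rather than deliver a speedup; a work-stealing runtime such as Cilk is what would make the same recursion efficient on a multicore machine.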

Type
Computational Software
Copyright
Copyright © Global-Science Press 2016 

