Cross-Platform Optimization and Benchmarking of the Lattice Boltzmann Method on Heterogeneous Architectures
| Published in: | 2025 IEEE 11th International Conference on High Performance and Smart Computing (HPSC) pp. 37 - 47 |
|---|---|
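| Main Authors: | Zhu, Guanghui; Lv, Xiaojing; Liu, Zhao; Liu, Tao; Zhang, Wusheng; Fan, Yujing; Yu, Hongkun; Gao, Zhanyun; Shang, Jiandong |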
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 09.05.2025 |
| Abstract | The Lattice Boltzmann Method (LBM) has gained attention for its ability to handle complex fluid dynamics simulations, making it suitable for large-scale industrial applications. However, maximizing the performance of LBM on advanced heterogeneous architectures remains a challenge. In this work, we introduce a comprehensive software framework specifically designed to support large-scale LBM simulations for industrial applications. Our framework integrates essential components, including a mesh generator, pre-processing and post-processing modules, an efficient LBM solver, and additional features like domain partitioning, parallel I/O, and visualization interfaces. This end-to-end solution aims to streamline large-scale LBM simulations and promote its application in industrial contexts. To achieve high performance and scalability, we propose several optimization techniques tailored to the new generation of heterogeneous supercomputing platforms, including the Sunway supercomputer and the Sugon supercomputer. Our approach includes a customized multi-level parallelization strategy, fusion of multiple kernels with different performance constraints, and code optimization techniques, to fully exploit the computational power of these many-core processors. We achieve 2.76 PFLOPS sustained performance encompassing 4.2 trillion lattice cells coupled with 81.4% memory bandwidth on the new Sunway supercomputer. Scaling from a baseline of 128 MPI processes and 8 DCUs to 8,192 processes and 512 DCUs, the strong scaling efficiency reached 74.2% on a Sugon supercomputer. Our results demonstrate the framework's scalability and performance, highlighting its potential for enabling efficient, large-scale LBM simulations in industrial applications. |
|---|---|
| Author | Liu, Zhao; Yu, Hongkun; Zhu, Guanghui; Shang, Jiandong; Fan, Yujing; Lv, Xiaojing; Liu, Tao; Zhang, Wusheng; Gao, Zhanyun |
| Author_xml | – sequence: 1 givenname: Guanghui surname: Zhu fullname: Zhu, Guanghui email: jn_zgh@126.com organization: School of Computer and Artificial Intelligence Zhengzhou University Michael Levitt Research Institute for Life Sciences and Digital Convergence Zhengzhou University of Technology National Supercomputing Center in Zhengzhou Zhengzhou University,Zhengzhou,China – sequence: 2 givenname: Xiaojing surname: Lv fullname: Lv, Xiaojing email: jing3704@126.com organization: National Supercomputing Center in Wuxi,China Ship Scientific Research Center,Wuxi,China – sequence: 3 givenname: Zhao surname: Liu fullname: Liu, Zhao email: liuz18@tsinghua.org.cn organization: Michael Levitt Research Institute for Life Sciences and Digital Convergence Zhengzhou University of Technology,National Supercomputing Center in Wuxi,Wuxi,China – sequence: 4 givenname: Tao surname: Liu fullname: Liu, Tao email: liut_nsccwx@163.com organization: National Supercomputing Center in Wuxi,Wuxi,China – sequence: 5 givenname: Wusheng surname: Zhang fullname: Zhang, Wusheng email: zws@tsinghua.edu.cn organization: Tsinghua University,Department of Computer Science and Technology,Beijing,China – sequence: 6 givenname: Yujing surname: Fan fullname: Fan, Yujing email: fanyujing0310@outlook.com organization: National Supercomputing Center in Wuxi,Wuxi,China – sequence: 7 givenname: Hongkun surname: Yu fullname: Yu, Hongkun email: yhk15@mails.tsinghua.edu.cn organization: Tsinghua University,Department of Computer Science and Technology,Beijing,China – sequence: 8 givenname: Zhanyun surname: Gao fullname: Gao, Zhanyun email: jh_g2y@gs.zzu.edu.cn organization: Zhengzhou University National Supercomputing Center in Zhengzhou Zhengzhou University,School of Computer and Artificial Intelligence,Zhengzhou,China – sequence: 9 givenname: Jiandong surname: Shang fullname: Shang, Jiandong email: sjd@zzu.edu.cn organization: Zhengzhou University National Supercomputing Center in Zhengzhou Zhengzhou University,School of Computer and Artificial Intelligence,Zhengzhou,China |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DBID | 6IE 6IL CBEJK RIE RIL |
| DOI | 10.1109/HPSC66065.2025.00029 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume; IEEE Xplore All Conference Proceedings; IEEE Electronic Library (IEL); IEEE Proceedings Order Plans (POP All) 1998-Present |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| EISBN | 9798331596637 |
| EndPage | 47 |
| ExternalDocumentID | 11038767 |
| Genre | orig-research |
| GroupedDBID | 6IE 6IL CBEJK RIE RIL |
| IEDL.DBID | RIE |
| ISICitedReferencesCount | 0 |
| IngestDate | Wed Jun 25 06:00:26 EDT 2025 |
| IsPeerReviewed | false |
| IsScholarly | false |
| Language | English |
| LinkModel | DirectLink |
| PageCount | 11 |
| ParticipantIDs | ieee_primary_11038767 |
| PublicationCentury | 2000 |
| PublicationDate | 2025-May-9 |
| PublicationDateYYYYMMDD | 2025-05-09 |
| PublicationDate_xml | – month: 05 year: 2025 text: 2025-May-9 day: 09 |
| PublicationDecade | 2020 |
| PublicationTitle | 2025 IEEE 11th International Conference on High Performance and Smart Computing (HPSC) |
| PublicationTitleAbbrev | HPSC |
| PublicationYear | 2025 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| Score | 1.9080205 |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 37 |
| SubjectTerms | Computational modeling; Heterogeneous (hybrid) systems; Kernel; Lattice Boltzmann Method; Lattice Boltzmann methods; Memory management; Next generation networking; Numerical Algorithms and Problems; Numerical models; Optimization; Program processors; Scalability; Supercomputers |
| Title | Cross-Platform Optimization and Benchmarking of the Lattice Boltzmann Method on Heterogeneous Architectures |
| URI | https://ieeexplore.ieee.org/document/11038767 |
| hasFullText | 1 |
| inHoldings | 1 |
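
The strong-scaling figures quoted in the abstract can be turned into an implied speedup with a short calculation. The sketch below assumes the common definition of strong-scaling efficiency (measured speedup divided by the increase in resources); the record does not state which definition the authors used, and the function names are hypothetical. Under that assumption, going from 128 to 8,192 MPI processes (and from 8 to 512 DCUs) is a 64-fold increase, so 74.2% efficiency would correspond to a speedup of roughly 47.5 over the baseline.

```python
def resource_ratio(base_units: int, scaled_units: int) -> float:
    """Factor by which the resource count grows from baseline to scaled run."""
    if base_units <= 0 or scaled_units <= 0:
        raise ValueError("resource counts must be positive")
    return scaled_units / base_units


def implied_speedup(base_units: int, scaled_units: int, efficiency: float) -> float:
    """Speedup implied by a strong-scaling efficiency.

    Assumes efficiency = speedup / resource_ratio, which is a common
    definition but is not confirmed by the catalog record.
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return efficiency * resource_ratio(base_units, scaled_units)


if __name__ == "__main__":
    # Figures taken from the abstract: 128 -> 8,192 MPI processes,
    # 8 -> 512 DCUs, 74.2% strong-scaling efficiency.
    mpi_ratio = resource_ratio(128, 8192)
    dcu_ratio = resource_ratio(8, 512)
    speedup = implied_speedup(128, 8192, 0.742)
    print(f"MPI process ratio: {mpi_ratio:.0f}x, DCU ratio: {dcu_ratio:.0f}x")
    print(f"Implied speedup over baseline: {speedup:.1f}x")
```

Both resource counts grow by the same factor, so the result does not depend on whether processes or DCUs are used as the unit.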