Parallel time integration using Batched BLAS (Basic Linear Algebra Subprograms) routines

Bibliographic Details
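Published in: Computer Physics Communications, Vol. 270, p. 108181
Main Authors: Herb, Konstantin; Welter, Pol
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.01.2022
Subjects: Batched BLAS; Exponential integrators; GPU programming; Magnus integrators; Parallel time integration; Schrödinger equation
ISSN: 0010-4655, 1879-2944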
Abstract We present an approach for integrating the time evolution of quantum systems. We leverage the computational power of graphics processing units (GPUs) to perform the integration of all time steps in parallel. The performance boost is especially prominent for small to medium-sized quantum systems. The devised algorithm can largely be implemented using the recently specified batched versions of the BLAS routines, and can therefore be easily ported to a variety of platforms. Our PARAllelized Matrix Exponentiation for Numerical Time evolution (PARAMENT) implementation runs on CUDA-enabled graphics processing units.
Program Title: PARAMENT
CPC Library link to program files: https://doi.org/10.17632/zy5v4xs89d.1
Developer's repository link: https://github.com/parament-integrator/parament
Licensing provisions: Apache 2.0
Programming language: C / CUDA / Python
Nature of problem: Time integration of the Schrödinger equation with a time-dependent Hamiltonian for quantum systems with a small Hilbert space but many time steps.
Solution method: A 4th order Magnus integrator, highly parallelized on a GPU, implemented using a small subset of BLAS functions for improved portability.
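The two-phase structure described in the abstract (independent per-step propagators, then a parallel combination of all steps) can be illustrated with a minimal sketch. This is not the PARAMENT code or its API. As simplifying assumptions, the Hamiltonian is treated as constant within each step (a first-order stand-in for the 4th order Magnus integrator), the matrix exponential uses a plain truncated Taylor series, and the batched BLAS calls are replaced by pure-Python list comprehensions whose elements are mutually independent. All function names are hypothetical.

```python
import math


def matmul(a, b):
    """Dense product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expm_taylor(a, terms=30):
    """exp(a) by a truncated Taylor series (adequate when the norm of a is small)."""
    n = len(a)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # After this update, term holds a**k / k!.
        term = [[x / k for x in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def step_propagators(hamiltonians, dt):
    """Phase 1: one propagator exp(-i * H_k * dt) per step, all independent."""
    return [expm_taylor([[-1j * dt * h for h in row] for row in ham])
            for ham in hamiltonians]


def reduce_propagators(props):
    """Phase 2: ordered product U_{N-1} ... U_1 U_0 by pairwise reduction.

    Assumes at least one propagator is supplied.
    """
    level = list(props)
    while len(level) > 1:
        # The products on one level do not depend on each other, which is
        # the property a single batched GEMM call would exploit on a GPU.
        paired = [matmul(level[i + 1], level[i])
                  for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            paired.append(level[-1])  # odd element carries over unchanged
        level = paired
    return level[0]


def propagate(hamiltonians, dt):
    """Total propagator for a sequence of per-step Hamiltonians (hbar = 1)."""
    return reduce_propagators(step_propagators(hamiltonians, dt))


if __name__ == "__main__":
    # Hypothetical example: a driven two-level system sampled at 64 steps.
    n_steps, dt = 64, 0.05
    hams = [[[0.5, 0.2 * math.cos(k * dt)],
             [0.2 * math.cos(k * dt), -0.5]] for k in range(n_steps)]
    print(propagate(hams, dt))
```

The reduction needs about log2(N) levels for N steps instead of N sequential products, and it keeps later steps on the left so that non-commuting propagators are multiplied in the correct time order.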
ArticleNumber 108181
Author Welter, Pol
Herb, Konstantin
Author_xml – sequence: 1
  givenname: Konstantin
  surname: Herb
  fullname: Herb, Konstantin
  email: science@rashbw.de
– sequence: 2
  givenname: Pol
  orcidid: 0000-0002-3666-7400
  surname: Welter
  fullname: Welter, Pol
CitedBy_id crossref_primary_10_1038_s41467_025_55956_1
crossref_primary_10_1016_j_future_2024_06_004
Cites_doi 10.1016/j.cpc.2018.02.019
10.1090/S0025-5718-1955-0071856-0
10.1023/A:1022311628317
10.1016/j.cpc.2012.11.019
10.1016/j.jmr.2004.11.004
10.1016/j.procs.2017.05.138
10.1002/cpa.3160070404
10.1016/j.parco.2010.01.006
10.1016/j.jpdc.2004.03.021
10.1016/j.jcp.2011.04.006
10.1016/j.physrep.2008.11.001
10.1137/S00361445024180
10.1088/1361-6633/aa5170
10.1063/1.448136
10.1145/322217.322232
ContentType Journal Article
Copyright 2021 The Authors
DOI 10.1016/j.cpc.2021.108181
Discipline Physics
EISSN 1879-2944
ExternalDocumentID 10_1016_j_cpc_2021_108181
S0010465521002939
ISICitedReferencesCount 3
ISSN 0010-4655
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Keywords GPU programming
Batched BLAS
Magnus integrators
Parallel time integration
Schrödinger equation
Exponential integrators
Language English
License This is an open access article under the CC BY license.
ORCID 0000-0002-3666-7400
OpenAccessLink https://dx.doi.org/10.1016/j.cpc.2021.108181
PublicationCentury 2000
PublicationDate January 2022
2022-01-00
PublicationDateYYYYMMDD 2022-01-01
PublicationDate_xml – month: 01
  year: 2022
  text: January 2022
PublicationDecade 2020
PublicationTitle Computer physics communications
PublicationYear 2022
Publisher Elsevier B.V
Publisher_xml – name: Elsevier B.V
StartPage 108181
SubjectTerms Batched BLAS
Exponential integrators
GPU programming
Magnus integrators
Parallel time integration
Schrödinger equation
Title Parallel time integration using Batched BLAS (Basic Linear Algebra Subprograms) routines
URI https://dx.doi.org/10.1016/j.cpc.2021.108181
Volume 270