SLEEF: A Portable Vectorized Library of C Standard Mathematical Functions

In this article, we present techniques used to implement our portable vectorized library of C standard mathematical functions written entirely in C language. In order to make the library portable while maintaining good performance, intrinsic functions of vector extensions are abstracted by inline fu...

Bibliographic details
Published in: IEEE transactions on parallel and distributed systems, vol. 31, no. 6, pp. 1316-1327
Main authors: Shibata, Naoki; Petrogalli, Francesco
Format: Journal Article
Language: English
Published: New York, IEEE, 2020-06-01
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:
ISSN: 1045-9219, 1558-2183
Online access: Full text
Abstract In this article, we present techniques used to implement our portable vectorized library of C standard mathematical functions, written entirely in the C language. To make the library portable while maintaining good performance, intrinsic functions of vector extensions are abstracted by inline functions or preprocessor macros. We implemented the functions so that they can use sub-features of vector extensions such as fused multiply-add, mask registers, and extraction of mantissa. To make computation with SIMD instructions efficient, the library uses only a small number of conditional branches, and all the computation paths are vectorized. We devised a variation of the Payne-Hanek argument reduction for trigonometric functions and a floating point remainder, both of which are suitable for vector computation. We compare the performance of our library with that of Intel SVML.
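The abstraction idea in the abstract can be illustrated with a minimal sketch in plain C. It is not taken from SLEEF: the type names (vdouble, vmask), the helper names (vload, vstore, vcast, vmla, vlt, vsel) and the function clamped_poly are hypothetical, and the four lanes are emulated with an array so that the code compiles on any target. On a real vector extension, each helper body would be replaced by a single intrinsic, and the kernel would stay unchanged. The kernel evaluates both sides of a condition on every lane and picks the result with a mask instead of branching.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 4-lane vector type standing in for a real SIMD register. */
#define VLEN 4
typedef struct { double v[VLEN]; } vdouble;
typedef struct { int m[VLEN]; } vmask;   /* one boolean per lane */

/* Target-specific layer: only these helpers would change per vector extension. */
static inline vdouble vload(const double *p) {
    vdouble r;
    for (int i = 0; i < VLEN; i++) r.v[i] = p[i];
    return r;
}
static inline void vstore(double *p, vdouble a) {
    for (int i = 0; i < VLEN; i++) p[i] = a.v[i];
}
static inline vdouble vcast(double s) {               /* broadcast a scalar */
    vdouble r;
    for (int i = 0; i < VLEN; i++) r.v[i] = s;
    return r;
}
/* a * b + c; a target with fused multiply-add would map this to one instruction. */
static inline vdouble vmla(vdouble a, vdouble b, vdouble c) {
    vdouble r;
    for (int i = 0; i < VLEN; i++) r.v[i] = a.v[i] * b.v[i] + c.v[i];
    return r;
}
static inline vmask vlt(vdouble a, vdouble b) {       /* per-lane a < b */
    vmask r;
    for (int i = 0; i < VLEN; i++) r.m[i] = a.v[i] < b.v[i];
    return r;
}
static inline vdouble vsel(vmask m, vdouble t, vdouble f) {  /* per-lane m ? t : f */
    vdouble r;
    for (int i = 0; i < VLEN; i++) r.v[i] = m.m[i] ? t.v[i] : f.v[i];
    return r;
}

/* Portable kernel: 1 + x + x^2/2 for x >= 0, and 1 for x < 0.
   Both results are computed for all lanes; the mask selects per lane. */
static inline vdouble clamped_poly(vdouble x) {
    vdouble one  = vcast(1.0);
    vdouble poly = vmla(vmla(vcast(0.5), x, one), x, one);  /* Horner form */
    return vsel(vlt(x, vcast(0.0)), one, poly);
}

/* Apply the kernel to an array whose length is a multiple of VLEN. */
void eval_array(const double *in, double *out, size_t n) {
    assert(n % VLEN == 0);
    for (size_t i = 0; i < n; i += VLEN)
        vstore(out + i, clamped_poly(vload(in + i)));
}
```

Because the kernel only calls the helper layer, retargeting this sketch would mean rewriting the helpers and nothing else, which is the kind of separation the abstract describes.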
Author Shibata, Naoki
Petrogalli, Francesco
Author_xml – sequence: 1
  givenname: Naoki
  orcidid: 0000-0002-9430-5555
  surname: Shibata
  fullname: Shibata, Naoki
  email: n-sibata@is.naist.jp
  organization: Graduate School of Information Science, Nara Institute of Science and Technology, Nara, Japan
– sequence: 2
  givenname: Francesco
  orcidid: 0000-0001-8375-3638
  surname: Petrogalli
  fullname: Petrogalli, Francesco
  email: Francesco.Petrogalli@arm.com
  organization: ARM 110, Cambridge, United Kingdom
CODEN ITDSEO
CitedBy_id crossref_primary_10_1137_22M1478847
crossref_primary_10_1051_0004_6361_202451214
crossref_primary_10_3389_fphys_2022_904648
crossref_primary_10_1007_s11390_021_1203_5
crossref_primary_10_1016_j_addma_2024_104380
crossref_primary_10_25209_2079_3316_2022_13_1_63_129
crossref_primary_10_1093_mnras_stab1032
crossref_primary_10_25209_2079_3316_2022_13_1_131_194
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020
DOI 10.1109/TPDS.2019.2960333
DatabaseName IEEE Xplore (IEEE)
IEEE Xplore Open Access Journals
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE Electronic Library (IEL)
CrossRef
Computer and Information Systems Abstracts
Electronics & Communications Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
DatabaseTitle CrossRef
Technology Research Database
Computer and Information Systems Abstracts – Academic
Electronics & Communications Abstracts
ProQuest Computer Science Collection
Computer and Information Systems Abstracts
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts Professional
DatabaseTitleList
Technology Research Database
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
Computer Science
EISSN 1558-2183
EndPage 1327
ExternalDocumentID 10_1109_TPDS_2019_2960333
8936472
Genre orig-research
IEDL.DBID RIE
ISICitedReferencesCount 18
ISSN 1045-9219
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 6
Language English
License https://creativecommons.org/licenses/by/4.0/legalcode
LinkModel DirectLink
ORCID 0000-0001-8375-3638
0000-0002-9430-5555
OpenAccessLink https://ieeexplore.ieee.org/document/8936472
PQID 2348113699
PQPubID 85437
PageCount 12
ParticipantIDs crossref_citationtrail_10_1109_TPDS_2019_2960333
ieee_primary_8936472
proquest_journals_2348113699
crossref_primary_10_1109_TPDS_2019_2960333
PublicationCentury 2000
PublicationDate 2020-06-01
PublicationDateYYYYMMDD 2020-06-01
PublicationDate_xml – month: 06
  year: 2020
  text: 2020-06-01
  day: 01
PublicationDecade 2020
PublicationPlace New York
PublicationPlace_xml – name: New York
PublicationTitle IEEE transactions on parallel and distributed systems
PublicationTitleAbbrev TPDS
PublicationYear 2020
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
SourceID proquest
crossref
ieee
SourceType Aggregation Database
Enrichment Source
Index Database
Publisher
StartPage 1316
SubjectTerms Computer architecture
elementary functions
Feature extraction
Floating point arithmetic
Language preprocessors
Libraries
Macros
Mathematical analysis
Mathematical functions
Open source software
Optimization
Parallel and vector implementations
Portability
Program processors
Registers
SIMD processors
Trigonometric functions
Title SLEEF: A Portable Vectorized Library of C Standard Mathematical Functions
URI https://ieeexplore.ieee.org/document/8936472
https://www.proquest.com/docview/2348113699
Volume 31