SLEEF: A Portable Vectorized Library of C Standard Mathematical Functions
In this article, we present techniques used to implement our portable vectorized library of C standard mathematical functions written entirely in C language. In order to make the library portable while maintaining good performance, intrinsic functions of vector extensions are abstracted by inline fu...
| Published in: | IEEE Transactions on Parallel and Distributed Systems, Volume 31, Issue 6, pp. 1316-1327 |
|---|---|
| Main authors: | Shibata, Naoki; Petrogalli, Francesco |
| Format: | Journal Article |
| Language: | English |
| Publication details: | New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 1 June 2020 |
| ISSN: | 1045-9219, 1558-2183 |
| Abstract | In this article, we present the techniques used to implement our portable vectorized library of C standard mathematical functions, which is written entirely in C. To make the library portable while maintaining good performance, the intrinsic functions of vector extensions are abstracted by inline functions or preprocessor macros. We implemented the functions so that they can use sub-features of vector extensions such as fused multiply-add, mask registers, and extraction of the mantissa. To make computation with SIMD instructions efficient, the library uses only a small number of conditional branches, and all computation paths are vectorized. We devised a variation of the Payne-Hanek argument reduction for trigonometric functions and a floating-point remainder, both of which are suitable for vector computation. We compare the performance of our library with that of Intel SVML. |
|---|---|
| Author | Shibata, Naoki; Petrogalli, Francesco |
| Author details | 1. Naoki Shibata (ORCID 0000-0002-9430-5555), n-sibata@is.naist.jp, Graduate School of Information Science, Nara Institute of Science and Technology, Nara, Japan. 2. Francesco Petrogalli (ORCID 0000-0001-8375-3638), Francesco.Petrogalli@arm.com, ARM 110, Cambridge, United Kingdom |
| CODEN | ITDSEO |
| ContentType | Journal Article |
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020 |
| DOI | 10.1109/TPDS.2019.2960333 |
| Discipline | Engineering; Computer Science |
| EISSN | 1558-2183 |
| EndPage | 1327 |
| Genre | orig-research |
| ISSN | 1045-9219 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 6 |
| Language | English |
| License | https://creativecommons.org/licenses/by/4.0/legalcode |
| ORCID | 0000-0001-8375-3638 0000-0002-9430-5555 |
| OpenAccessLink | https://ieeexplore.ieee.org/document/8936472 |
| PageCount | 12 |
| PublicationDate | 2020-06-01 |
| PublicationPlace | New York |
| PublicationTitle | IEEE transactions on parallel and distributed systems |
| PublicationTitleAbbrev | TPDS |
| PublicationYear | 2020 |
| Publisher | IEEE; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| StartPage | 1316 |
| SubjectTerms | Computer architecture; elementary functions; Feature extraction; Floating point arithmetic; Language preprocessors; Libraries; Macros; Mathematical analysis; Mathematical functions; Open source software; Optimization; Parallel and vector implementations; Portability; Program processors; Registers; SIMD processors; Trigonometric functions |
| Title | SLEEF: A Portable Vectorized Library of C Standard Mathematical Functions |
| URI | https://ieeexplore.ieee.org/document/8936472 https://www.proquest.com/docview/2348113699 |
| Volume | 31 |
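
The abstract describes two ideas that a short example can make concrete: hiding vector intrinsics behind inline functions, and replacing conditional branches with lane-wise selection so that every computation path is vectorized. The following is only a minimal sketch of those ideas under stated assumptions. All names (`vdouble`, `vmask`, `vsel`, `toy_kernel`, `toy_array`) are hypothetical and are not taken from the library, the SSE2 backend is just one possible target, and the kernel evaluates an arbitrary toy polynomial in |x| rather than any real mathematical function.

```c
#include <assert.h>

#if defined(__SSE2__)
/* Backend 1 (assumed target): two doubles per vector via SSE2 intrinsics. */
#include <emmintrin.h>
#define VLEN 2
typedef __m128d vdouble;
typedef __m128d vmask;
static inline vdouble vload(const double *p) { return _mm_loadu_pd(p); }
static inline void vstore(double *p, vdouble v) { _mm_storeu_pd(p, v); }
static inline vdouble vset1(double d) { return _mm_set1_pd(d); }
static inline vdouble vadd(vdouble a, vdouble b) { return _mm_add_pd(a, b); }
static inline vdouble vmul(vdouble a, vdouble b) { return _mm_mul_pd(a, b); }
static inline vmask vlt(vdouble a, vdouble b) { return _mm_cmplt_pd(a, b); }
/* Lane-wise select: (m & t) | (~m & f), with no branch. */
static inline vdouble vsel(vmask m, vdouble t, vdouble f) {
  return _mm_or_pd(_mm_and_pd(m, t), _mm_andnot_pd(m, f));
}
#else
/* Backend 2: scalar fallback exposing the same hypothetical interface. */
#define VLEN 1
typedef double vdouble;
typedef int vmask;
static inline vdouble vload(const double *p) { return *p; }
static inline void vstore(double *p, vdouble v) { *p = v; }
static inline vdouble vset1(double d) { return d; }
static inline vdouble vadd(vdouble a, vdouble b) { return a + b; }
static inline vdouble vmul(vdouble a, vdouble b) { return a * b; }
static inline vmask vlt(vdouble a, vdouble b) { return a < b; }
static inline vdouble vsel(vmask m, vdouble t, vdouble f) { return m ? t : f; }
#endif

/* Toy kernel written once against the abstraction layer.
   It computes 1 + a + 0.5*a*a with a = |x|. Both candidates for a
   (x and -x) are computed in every lane and chosen with vsel. */
static inline vdouble toy_kernel(vdouble x) {
  vdouble neg = vmul(x, vset1(-1.0));
  vdouble a = vsel(vlt(x, vset1(0.0)), neg, x);
  vdouble p = vadd(vmul(a, vset1(0.5)), vset1(1.0)); /* 0.5*a + 1 */
  return vadd(vmul(p, a), vset1(1.0));               /* (0.5*a + 1)*a + 1 */
}

/* Apply the kernel to n elements; n must be a multiple of VLEN. */
void toy_array(const double *in, double *out, int n) {
  assert(n % VLEN == 0);
  for (int i = 0; i < n; i += VLEN) {
    vstore(out + i, toy_kernel(vload(in + i)));
  }
}
```

Because `toy_kernel` only calls the wrapper functions, supporting another vector extension in this sketch would mean adding one more backend block, while the kernel itself stays unchanged.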