A 280mV-to-1.1V 256b reconfigurable SIMD vector permutation engine with 2-dimensional shuffle in 22nm CMOS

Energy-efficient SIMD permutation operations are key for maximizing high-performance microprocessor vector datapath utilization in multimedia, graphics, and signal processing workloads [1-3]. A wide SIMD vector permutation engine is required to achieve high-throughput data rearrangement operations o...

Bibliographic details
Published in: 2012 IEEE International Solid-State Circuits Conference, pp. 178-180
Main authors: Hsu, S., Agarwal, A., Anders, M., Mathew, S., Kaul, H., Sheikh, F., Krishnamurthy, R.
Format: Conference paper
Language: English
Published: IEEE, 2012-02-01
ISBN:1467303763, 9781467303767
ISSN:0193-6530
Abstract Energy-efficient SIMD permutation operations are key for maximizing high-performance microprocessor vector datapath utilization in multimedia, graphics, and signal processing workloads [1-3]. A wide SIMD vector permutation engine is required to achieve high-throughput data rearrangement operations on large data sets, with scaled supply voltages to deliver high energy efficiency. An ultra-low-voltage reconfigurable 4-way to 32-way SIMD vector permutation engine consisting of a 32-entry × 256b 3-read/1-write ported register file with a 256b byte-wise any-to-any permute crossbar for 2-dimensional shuffle is fabricated in 22nm CMOS. The register file integrates a vertical shuffle across multiple entries into read/write operations, and includes clockless static reads with shared P/N dual-ended transmission gate (DETG) writes, improving register file V_MIN by 250mV across PVT variations with a wide dynamic operating range of 280mV-1.1V. The permute crossbar implements an interleaved folded byte-wise multiplexer layout forming an any-to-any fully-connected tree to perform a horizontal shuffle with permute accumulate circuits, and includes vector flip-flops, stacked min-delay buffers, shared gates to average min-sized transistor variation, and ultra-low-voltage split-output (ULVS) level shifters improving logic V_MIN by 150mV, while enabling peak energy efficiency of 585GOPS/W measured at 260mV, 50°C. The permutation engine occupies a dense layout of 0.048mm² (Fig. 10.1.7) while achieving: (i) nominal register file performance of 1.8GHz, 106mW measured at 0.9V, 50°C; (ii) robust register file functionality measured down to 280mV (subthreshold) with peak energy efficiency of 154GOPS/W; (iii) scalable permute crossbar performance of 2.9GHz, 69mW measured at 1.1V, 50°C with deep sub-threshold operation at 240mV, 10MHz consuming 19μW; and (iv) a 64b 4×4 matrix transpose algorithm with 53% energy savings and 42% improved peak throughput of 263Gbps measured at 1.8GHz, 0.9V.
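Illustrative note (not part of the catalog record): the abstract describes a vertical shuffle across register file entries and a byte-wise any-to-any horizontal shuffle, and mentions a 64b 4×4 matrix transpose. The following is a minimal behavioral sketch in Python of how such a 2-dimensional shuffle could produce a transpose. All function names, the select-vector encoding, and the three-step sequencing (diagonal read, lane rotation, scattered write) are assumptions made for illustration; the abstract does not specify the algorithm the authors used.

```python
LANES = 32  # bytes in one 256b vector


def horizontal_shuffle(vec: bytes, sel: list[int]) -> bytes:
    """Any-to-any byte permute within one vector: out[i] = vec[sel[i]]."""
    return bytes(vec[s] for s in sel)


def vertical_shuffle(regfile: list[bytes], row_sel: list[int]) -> bytes:
    """Per-byte-lane entry select: out[i] = regfile[row_sel[i]][i]."""
    return bytes(regfile[row_sel[i]][i] for i in range(LANES))


def transpose_4x4(rows: list[bytes], elem: int = 8) -> list[bytes]:
    """Transpose a 4x4 matrix of 64b (8-byte) elements held in four
    256b entries, using only the two shuffle primitives above
    (hypothetical sequencing, assumed for this sketch)."""
    n = 4
    out = [bytearray(LANES) for _ in range(n)]
    for k in range(n):
        # Vertical read along a diagonal: element lane j comes from entry (j + k) % n.
        row_sel = [((b // elem) + k) % n for b in range(LANES)]
        v = vertical_shuffle(rows, row_sel)
        # Horizontal rotate: destination lane c takes source lane (c - k) % n,
        # so lane c now holds rows[c][(c - k) % n].
        sel = [(((b // elem) - k) % n) * elem + b % elem for b in range(LANES)]
        w = horizontal_shuffle(v, sel)
        # Vertical write: lane c is stored to output entry (c - k) % n.
        for c in range(n):
            r = (c - k) % n
            out[r][c * elem:(c + 1) * elem] = w[c * elem:(c + 1) * elem]
    return [bytes(o) for o in out]


# Example input: entry r holds bytes r*32 .. r*32+31.
rows = [bytes(range(r * LANES, (r + 1) * LANES)) for r in range(4)]
transposed = transpose_4x4(rows)
```

In this model, element (r, c) of `transposed` equals element (c, r) of `rows`. The sketch captures only the data movement; it says nothing about the circuit techniques, timing, or energy figures reported in the abstract.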
Author Sheikh, F.
Kaul, H.
Mathew, S.
Hsu, S.
Anders, M.
Agarwal, A.
Krishnamurthy, R.
Author_xml – sequence: 1
  givenname: S.
  surname: Hsu
  fullname: Hsu, S.
  organization: Intel, Hillsboro, OR, USA
– sequence: 2
  givenname: A.
  surname: Agarwal
  fullname: Agarwal, A.
  organization: Intel, Hillsboro, OR, USA
– sequence: 3
  givenname: M.
  surname: Anders
  fullname: Anders, M.
  organization: Intel, Hillsboro, OR, USA
– sequence: 4
  givenname: S.
  surname: Mathew
  fullname: Mathew, S.
  organization: Intel, Hillsboro, OR, USA
– sequence: 5
  givenname: H.
  surname: Kaul
  fullname: Kaul, H.
  organization: Intel, Hillsboro, OR, USA
– sequence: 6
  givenname: F.
  surname: Sheikh
  fullname: Sheikh, F.
  organization: Intel, Hillsboro, OR, USA
– sequence: 7
  givenname: R.
  surname: Krishnamurthy
  fullname: Krishnamurthy, R.
  organization: Intel, Hillsboro, OR, USA
ContentType Conference Proceeding
DBID 6IE
6IH
CBEJK
RIE
RIO
DOI 10.1109/ISSCC.2012.6176966
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan (POP) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP) 1998-present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
EISBN 9781467303774
1467303771
9781467303743
1467303755
1467303747
9781467303750
EndPage 180
ExternalDocumentID 6176966
Genre orig-research
GroupedDBID 29G
6IE
6IF
6IH
6IK
6IL
6IM
6IN
AAJGR
AAWTH
ABLEC
ACGFS
ADZIZ
ALMA_UNASSIGNED_HOLDINGS
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
CBEJK
CHZPO
IEGSK
IJVOP
IPLJI
M43
OCL
RIE
RIL
RIO
RNS
ID FETCH-LOGICAL-i90t-aa4a561665f846274c3b7cdeef3d808be00a293784e3f7e88aa97c2fdf2bf0783
IEDL.DBID RIE
ISBN 1467303763
9781467303767
ISSN 0193-6530
IngestDate Wed Aug 27 03:50:30 EDT 2025
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-i90t-aa4a561665f846274c3b7cdeef3d808be00a293784e3f7e88aa97c2fdf2bf0783
PageCount 3
ParticipantIDs ieee_primary_6176966
PublicationCentury 2000
PublicationDate 2012-Feb.
PublicationDateYYYYMMDD 2012-02-01
PublicationDate_xml – month: 02
  year: 2012
  text: 2012-Feb.
PublicationDecade 2010
PublicationTitle 2012 IEEE International Solid-State Circuits Conference
PublicationTitleAbbrev ISSCC
PublicationYear 2012
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssj0005968
ssj0000703945
Score 1.9944746
SourceID ieee
SourceType Publisher
StartPage 178
SubjectTerms Energy efficiency
Energy measurement
Engines
Frequency measurement
Registers
Vectors
Voltage measurement
Title A 280mV-to-1.1V 256b reconfigurable SIMD vector permutation engine with 2-dimensional shuffle in 22nm CMOS
URI https://ieeexplore.ieee.org/document/6176966
hasFullText 1
inHoldings 1
isFullTextHit
isPrint