Low-Latency, Low-Area, and Scalable Systolic-Like Modular Multipliers for GF(2^m) Based on Irreducible All-One Polynomials

Bibliographic details
Published in:	IEEE Transactions on Circuits and Systems I: Regular Papers, Vol. 64, No. 2, pp. 399-408
Main authors:	Meher, Pramod Kumar; Lou, Xin
Format:	Journal Article
Language:	English
Published:	New York: IEEE, 1 February 2017; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:	Algorithm design and analysis; Complexity theory; Computer architecture; Cycle time; Design; Elliptic curve cryptography; Elliptic curve cryptography (ECC); Energy conservation; error-control-coding; finite field multiplication; Modular structures; Multiplexing; Multiplication; Power consumption; Recursive algorithms; Registers; systolic array; Throughput; Tradeoff analysis; Very large scale integration; very large scale integration (VLSI)
ISSN:	1549-8328, 1558-0806

Description
Summary:	In this paper, an efficient recursive formulation is proposed for systolic implementation of canonical-basis finite field multiplication over GF(2^m) based on an irreducible all-one polynomial (AOP). A recursive algorithm for the multiplication is derived and used to design a regular, localized bit-level dependence graph (DG) for systolic computation. The regular bit-level DG is converted into a fine-grained DG by node-splitting and then mapped onto a parallel systolic architecture. Unlike most existing structures, it involves no global communication for modular reduction. The proposed bit-parallel systolic structure has the same cycle time as the best existing bit-parallel systolic structure but requires significantly fewer registers. The proposed bit-parallel design has a scalable latency of l + ⌈log_2 s⌉ + 1 cycles, which is considerably lower than that of existing systolic designs. Moreover, the proposed time-multiplexed structure is designed specifically for scalability of throughput and hardware complexity, to meet the area-time trade-off in resource-constrained applications while maintaining or reducing the overall latency. ASIC synthesis results show that the proposed bit-parallel structure offers nearly 30% area saving and nearly 38% power saving over the best existing AOP-based systolic finite field multiplier.
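As a software illustration of the operation the summary refers to, the sketch below multiplies two GF(2^m) elements in the polynomial (canonical) basis modulo an AOP. It is not the recursive formulation or the systolic architecture of the paper, which this record does not reproduce. It uses the standard identity (x + 1)(x^m + ... + x + 1) = x^(m+1) + 1, so x^(m+1) = 1 modulo the AOP and multiplying by x^i becomes a cyclic shift. The function names `aop_multiply` and `reference_multiply` are hypothetical, and the caller is assumed to pick m so that the AOP is irreducible (m + 1 prime with 2 a primitive root modulo m + 1, for example m = 2, 4, 10).

```python
def aop_multiply(a: int, b: int, m: int) -> int:
    """Multiply a and b in GF(2^m) modulo the degree-m all-one polynomial.

    Bit i of each integer is the coefficient of x^i; a and b must be < 2**m.
    Assumes the AOP of degree m is irreducible (caller's responsibility).
    """
    n = m + 1
    mask = (1 << n) - 1  # n ones: also the bit pattern of the AOP itself
    acc = 0
    for i in range(m):
        if (b >> i) & 1:
            # a * x^i modulo x^n + 1 is an n-bit cyclic left rotation by i.
            acc ^= ((a << i) | (a >> (n - i))) & mask
    # acc is reduced modulo x^n + 1; remove a degree-m term by adding the AOP.
    if (acc >> m) & 1:
        acc ^= mask
    return acc


def reference_multiply(a: int, b: int, m: int) -> int:
    """Schoolbook carry-less multiply followed by long division by the AOP."""
    poly = (1 << (m + 1)) - 1
    prod = 0
    for i in range(m):
        if (b >> i) & 1:
            prod ^= a << i
    # Clear coefficients from the highest possible degree down to degree m.
    for deg in range(2 * m - 2, m - 1, -1):
        if (prod >> deg) & 1:
            prod ^= poly << (deg - m)
    return prod
```

For m = 2 the AOP is x^2 + x + 1, so `aop_multiply(0b10, 0b10, 2)` computes x * x = x + 1. The rotation-based form needs no separate reduction pass over high-degree terms, which is the general reason AOP fields are attractive for regular hardware structures.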
DOI:	10.1109/TCSI.2016.2614309