COSIME: FeFET based Associative Memory for In-Memory Cosine Similarity Search

Bibliographic Details
Published in: 2022 IEEE/ACM International Conference On Computer Aided Design (ICCAD), pp. 1-9
Main Authors: Liu, Che-Kai; Chen, Haobang; Imani, Mohsen; Ni, Kai; Kazemi, Arman; Laguna, Ann Franchesca; Niemier, Michael; Hu, Xiaobo Sharon; Zhao, Liang; Zhuo, Cheng; Yin, Xunzhao
Format: Conference Proceedings
Language: English
Published: ACM, 29 October 2022
ISSN: 1558-2434
Online Access: Full text
Description
Summary: In a number of machine learning models, an input query is searched across the trained class vectors to find the closest feature class vector under the cosine similarity metric. However, computing cosine similarities between vectors on von Neumann machines involves a large number of multiplications, Euclidean normalizations, and division operations, incurring heavy hardware energy and latency overheads. Moreover, due to the memory wall present in the conventional architecture, frequent cosine similarity-based searches (CSSs) over the class vectors require extensive data movement, limiting the throughput and efficiency of the system. To overcome these challenges, this paper introduces COSIME, a general in-memory associative memory (AM) engine based on the ferroelectric FET (FeFET) device for efficient CSS. By leveraging the one-transistor AND gate function of FeFET devices, a current-based translinear analog circuit, and winner-take-all (WTA) circuitry, COSIME realizes parallel in-memory CSS across all entries in a memory block and outputs the word closest to the input query under the cosine similarity metric. Evaluation results at the array level suggest that the proposed COSIME design achieves 333× latency and 90.5× energy improvements and realizes better classification accuracy compared with an AM design implementing approximate CSS. The proposed in-memory computing fabric is evaluated on a hyperdimensional computing (HDC) problem, showing that COSIME achieves on average 47.1× speedup and 98.5× energy efficiency improvement compared with a GPU implementation.
DOI: 10.1145/3508352.3549412
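
For reference, the cosine similarity between a query q and a class vector c is q·c / (‖q‖ ‖c‖). The sketch below is not from the paper; it is a plain NumPy baseline with illustrative names that shows the per-search dot-product multiplications, Euclidean normalizations, and divisions the abstract attributes to von Neumann machines, plus the final argmax selection that corresponds to the role of COSIME's winner-take-all (WTA) circuitry.

```python
import numpy as np

def cosine_similarity_search(query, class_vectors):
    """Return the index of the class vector closest to `query`
    under the cosine similarity metric.

    query:         1-D array of length d
    class_vectors: 2-D array of shape (n_classes, d)
    """
    # Dot products: n_classes * d multiply-accumulate operations.
    dots = class_vectors @ query
    # Euclidean (L2) normalization terms for the query and each class vector.
    norms = np.linalg.norm(class_vectors, axis=1) * np.linalg.norm(query)
    # Divisions yield the cosine similarities; argmax picks the winner
    # (the selection step handled by WTA circuitry in COSIME).
    similarities = dots / norms
    return int(np.argmax(similarities))

# Example: 4 class vectors of dimension 8, one random query.
rng = np.random.default_rng(0)
classes = rng.normal(size=(4, 8))
query = rng.normal(size=8)
print(cosine_similarity_search(query, classes))
```

Every call scans all class vectors and moves them through the memory hierarchy, which is the data-movement bottleneck the in-memory AM design is meant to avoid.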