MIRACLE: Multimodal Information Retrieval via a Combined In-Memory Processing and Content Addressable Memory Approach


Bibliographic details
Published in: 2025 62nd ACM/IEEE Design Automation Conference (DAC), pp. 1-7
Main authors: Liu, Xuehui; Wang, Xueyan; Yu, Tianyang; Cheng, Chen; Ran, Shuo; Wu, Bi; Jia, Xiaotao; Liu, Weiqiang; Qu, Gang; Zhao, Weisheng
Format: Conference paper
Language: English
Publication details: IEEE, 22 June 2025
Description
Summary: The rapid advancement of information technology has brought multimodal information retrieval into the research spotlight. Neural networks, particularly Transformers, have emerged as the dominant solution for extracting multimodal feature vectors. While neural network acceleration has been extensively explored, the subsequent retrieval stage in multimodal scenarios remains under-optimized. Conventional retrieval approaches, such as cosine similarity sorting on von Neumann architectures, suffer from significant data migration and computational inefficiencies. Hashing methods enhance storage and computation efficiency but encounter challenges in energy-efficient implementation and in mitigating accuracy losses due to modal heterogeneity. This paper presents a hybrid architecture that integrates in-memory processing (PIM) and content-addressable memory (CAM) to address these challenges. Transformer-extracted features are processed via in-memory random hashing leveraging device-intrinsic properties, with CAM facilitating parallel search space reduction. A final cosine similarity reranking stage refines the results while balancing accuracy with energy efficiency. Experimental evaluations validate that the proposed method, when compared to the baseline traditional CPU-based cosine similarity retrieval, 1) achieves an almost identical level of accuracy, dramatically outperforming other pure CAM-based Hamming distance retrieval approaches; and 2) reduces latency by 9.45× and energy consumption by 30.20×.
DOI: 10.1109/DAC63849.2025.11132973
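
The three stages named in the summary (random hashing, Hamming-distance search space reduction, cosine similarity reranking) can be illustrated in software. The following is only a minimal sketch under stated assumptions, not the paper's hardware design: it assumes sign-random-projection hashing with pseudo-random Gaussian hyperplanes as a stand-in for the device-intrinsic randomness, and a sequential Hamming-distance sort as a stand-in for the parallel CAM search. All function names and parameters (`make_hyperplanes`, `hash_code`, `retrieve`, `n_candidates`, `top_k`) are hypothetical.

```python
import math
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplanes (assumed software stand-in
    for the device-level randomness mentioned in the summary)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(vec, planes):
    """Sign-random-projection hash: one bit per hyperplane, packed into an int."""
    code = 0
    for plane in planes:
        dot = sum(p * v for p, v in zip(plane, vec))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two integer hash codes."""
    return bin(a ^ b).count("1")


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def retrieve(query, database, planes, n_candidates, top_k):
    """Two-stage retrieval: Hamming shortlist, then cosine reranking."""
    q_code = hash_code(query, planes)
    # In a real system the database codes would be computed once and stored.
    codes = [hash_code(v, planes) for v in database]
    # Stage 1: keep the n_candidates entries closest in Hamming distance
    # (sequential here; a CAM would compare all entries in parallel).
    shortlist = sorted(range(len(database)),
                       key=lambda i: hamming(q_code, codes[i]))[:n_candidates]
    # Stage 2: rerank only the shortlist by exact cosine similarity.
    reranked = sorted(shortlist,
                      key=lambda i: cosine(query, database[i]),
                      reverse=True)
    return reranked[:top_k]
```

In this sketch the shortlist size `n_candidates` is the knob that trades accuracy for work: a larger shortlist makes the result closer to a full cosine similarity sort, while a smaller one leaves fewer vectors for the exact reranking step.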