Overview of Memory-Efficient Architectures for Deep Learning in Real-Time Systems

Detailed Bibliography
Published in: Engineering Proceedings, Volume 104, Issue 1, p. 77
Main Authors: Bilgin Demir, Ervin Domazet, Daniela Mechkaroska
Format: Journal Article
Language: English
Published: MDPI AG, 1 September 2025
ISSN:2673-4591
Description
Summary: With advancements in artificial intelligence (AI), deep learning (DL) has become crucial for real-time data analytics in areas like autonomous driving, healthcare, and predictive maintenance; however, its computational and memory demands often exceed the capabilities of low-end devices. This paper explores optimizing deep learning architectures for memory efficiency to enable real-time computation in low-power designs. Strategies include model compression, quantization, and efficient network designs. Techniques such as eliminating unnecessary parameters, sparse representations, and optimized data handling significantly enhance system performance. The design addresses cache utilization, memory hierarchies, and data movement, reducing latency and energy use. By comparing memory management methods, this study highlights dynamic pruning and adaptive compression as effective solutions for improving efficiency and performance. These findings guide the development of accurate, power-efficient deep learning systems for real-time applications, unlocking new possibilities for edge and embedded AI.
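The record does not include any code from the paper. As a hedged illustration of two of the techniques the abstract names — post-training quantization and magnitude-based pruning — here is a minimal NumPy sketch; the function names and the int8 symmetric scheme are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def quantize_int8(weights):
    """Uniform symmetric post-training quantization to int8.

    Returns the quantized tensor and the scale needed to dequantize,
    giving a 4x memory reduction versus float32 storage.
    """
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def magnitude_prune(weights, sparsity=0.5):
    """One-shot magnitude pruning: zero out the smallest-magnitude
    fraction of weights, producing a sparse representation."""
    threshold = np.quantile(np.abs(weights), sparsity)
    return np.where(np.abs(weights) < threshold, 0.0, weights)

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 4)).astype(np.float32)

q, scale = quantize_int8(w)        # int8 weights + one float scale
w_sparse = magnitude_prune(w, 0.5) # roughly half the weights set to zero
```

In practice such one-shot compression is usually followed by fine-tuning to recover accuracy; the abstract's "dynamic pruning" would reapply the mask during training rather than once.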
DOI:10.3390/engproc2025104077