Overview of Memory-Efficient Architectures for Deep Learning in Real-Time Systems
| Published in: | Engineering Proceedings, Vol. 104, No. 1, p. 77 |
|---|---|
| Main Authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Published: | MDPI AG, 01.09.2025 |
| Subjects: | |
| ISSN: | 2673-4591 |
| Summary: | With advancements in artificial intelligence (AI), deep learning (DL) has become crucial for real-time data analytics in areas like autonomous driving, healthcare, and predictive maintenance; however, its computational and memory demands often exceed the capabilities of low-end devices. This paper explores optimizing deep learning architectures for memory efficiency to enable real-time computation in low-power designs. Strategies include model compression, quantization, and efficient network designs. Techniques such as eliminating unnecessary parameters, sparse representations, and optimized data handling significantly enhance system performance. The design addresses cache utilization, memory hierarchies, and data movement, reducing latency and energy use. By comparing memory management methods, this study highlights dynamic pruning and adaptive compression as effective solutions for improving efficiency and performance. These findings guide the development of accurate, power-efficient deep learning systems for real-time applications, unlocking new possibilities for edge and embedded AI. |
|---|---|
| ISSN: | 2673-4591 |
| DOI: | 10.3390/engproc2025104077 |
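The summary names magnitude-style pruning and quantization as key memory-reduction techniques. As a minimal illustrative sketch (not taken from the paper itself), the snippet below shows the general idea behind both: zeroing the smallest-magnitude weights to induce sparsity, then mapping the remaining float weights to int8 with a symmetric linear scale. Function names and the 50% sparsity setting are this sketch's own assumptions, not the authors' method.

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.5):
    """Zero out the smallest-magnitude fraction of weights.

    sparsity in [0, 1] is the fraction of entries to set to zero;
    this is the simplest form of unstructured magnitude pruning.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    # k-th smallest magnitude serves as the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def quantize_int8(weights):
    """Symmetric linear quantization of float weights to int8.

    Returns (q, scale) such that q * scale approximates weights;
    storage drops from 32 bits to 8 bits per weight.
    """
    max_abs = float(np.abs(weights).max())
    scale = (max_abs / 127.0) if max_abs > 0 else 1.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

# Toy example: prune half the weights, then quantize the survivors.
w = np.array([[0.9, -0.05], [0.02, -1.2]])
pw = magnitude_prune(w, sparsity=0.5)   # zeros the two smallest-magnitude entries
q, s = quantize_int8(pw)                # int8 codes plus one float scale factor
```

In a real deployment these two steps are typically applied per-layer and followed by fine-tuning to recover accuracy; the sketch only shows the storage-side transformation.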