Real-Time Driver Monitoring: Implementing FPGA-Accelerated CNNs for Pose Detection

Detailed bibliography
Published in: IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 33, No. 7, pp. 1848-1857
Main authors: Kim, Minjoon; So, Jaehyuk
Medium: Journal Article
Language: English
Published: New York, The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.07.2025
ISSN: 1063-8210, 1557-9999
Description
Summary: As autonomous driving technology advances at an unprecedented pace, drivers are experiencing greater freedom within their vehicles, which accelerates the development of intelligent systems that support safe and more efficient driving. These systems provide interactive applications between the vehicle and the driver, built on driver behavior analysis (DBA). A key performance indicator is real-time driver monitoring quality, as it directly impacts both safety and convenience in vehicle operation. Real-time interaction generally requires an image processing speed above 30 frames/s and a latency below 100 ms; meeting these requirements in software, however, often demands expensive devices. This article therefore presents an algorithm and implementation results for immediate in-vehicle DBA through field-programmable gate array (FPGA)-based high-speed upper-body pose estimation. First, we define 11 key points related to the driver's pose and gaze and model a convolutional neural network (CNN) architecture that can detect them quickly. The proposed algorithm uses regeneration and retraining through layer reduction based on a residual-CNN model. We then present its implementation at the register transfer level (RTL) on a VCU118 FPGA, with simulation results of 34.7 frames/s and a latency of 75.3 ms. Lastly, we discuss the integration of a demo application and the construction of a vehicle testbed for experimenting with the driver-vehicle interaction (DVI) system. The developed FPGA platform processes camera input in real time, reliably delivering detected pose and gaze results at 30 frames/s over Ethernet, and we present results verifying its application to screen control and driver monitoring systems.
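The abstract does not give the exact network topology, but the described approach (a residual CNN reduced in depth, retrained, and used to detect 11 upper-body and gaze key points) can be illustrated with a minimal software sketch. The block counts, channel width, input resolution, and heatmap head below are assumptions for illustration only; the authors' actual design is implemented in RTL on the VCU118, not in PyTorch.

```python
# Minimal sketch (not the authors' network): a depth-reduced residual CNN
# that outputs one heatmap per key point. Sizes and layer counts are
# illustrative assumptions.
import torch
import torch.nn as nn

NUM_KEYPOINTS = 11  # driver pose + gaze key points, per the abstract


class ResidualBlock(nn.Module):
    """Basic residual block: two 3x3 convolutions with an identity skip."""

    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(x + self.body(x))


class ReducedPoseNet(nn.Module):
    """Depth-reduced residual backbone with an 11-channel heatmap head."""

    def __init__(self, width: int = 64, num_blocks: int = 4):
        super().__init__()
        self.stem = nn.Sequential(  # downsample early to cut compute
            nn.Conv2d(3, width, 7, stride=2, padding=3, bias=False),
            nn.BatchNorm2d(width),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(3, stride=2, padding=1),
        )
        # "Layer reduction": keep only a few residual blocks, then retrain.
        self.blocks = nn.Sequential(*[ResidualBlock(width) for _ in range(num_blocks)])
        self.head = nn.Conv2d(width, NUM_KEYPOINTS, kernel_size=1)

    def forward(self, x):
        return self.head(self.blocks(self.stem(x)))


if __name__ == "__main__":
    model = ReducedPoseNet()
    frame = torch.randn(1, 3, 256, 256)   # one camera frame (assumed size)
    heatmaps = model(frame)                # (1, 11, 64, 64) key-point heatmaps
    print(heatmaps.shape)
```

Taking the argmax of each output heatmap yields the per-key-point image coordinates. For context on the reported numbers: 34.7 frames/s corresponds to roughly 28.8 ms per frame, comfortably above the 30 frames/s target, and the 75.3 ms latency sits below the 100 ms bound stated in the abstract.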
DOI: 10.1109/TVLSI.2025.3554880