Real-Time Resource Allocation in Passive Optical Network for Energy-Efficient Inference at GPU-Based Network Edge

Detailed bibliography
Published in: IEEE Internet of Things Journal, Volume 9, Issue 18, pp. 17348-17358
Main authors: Nakayama, Yu; Onodera, Yukito; Nguyen, Anh Hoang Ngoc; Hara-Azumi, Yuko
Format: Journal Article
Language: English; Japanese
Publication details: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 15 September 2022
ISSN: 2327-4662
Description
Summary: In recent years, advances in deep learning (DL) technology have greatly improved artificial intelligence (AI)-related research and services. Among them, real-time object recognition using network cameras has become an important technology for various applications. A large number of network cameras are being deployed for real-time object detection using DL models at GPU-based edge servers. A significant issue for the wide deployment of this type of system is low-cost network deployment and low-latency data transmission. A promising option for efficiently accommodating numerous network cameras is the time- and wavelength-division multiplexed passive optical network (TWDM-PON), which has prevailed in optical access network systems. The key challenge in a GPU-based inference system connected via TWDM-PON is to optimally allocate upstream wavelengths and bandwidths so that real-time inference remains possible. To address this problem, this article proposes the concept of an inference system in which many cameras upload image data to a GPU-based edge server via TWDM-PON. A real-time resource allocation scheme for TWDM-PON is also proposed to guarantee low latency and time-synchronized data arrival at the edge. The wavelength and bandwidth allocation problem is formulated as a Boolean satisfiability (SAT) problem for fast computation, and the performance of the proposed method is verified by computer simulation. The proposed scheme increases the batch size of data arriving at the edge server while ensuring low-latency transmission; as a consequence, the larger batch size greatly improves the computational efficiency of the GPU-based inference server.
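
The abstract does not spell out the paper's actual SAT encoding, so the following toy sketch only illustrates the general idea of casting a wavelength-assignment problem as Boolean clauses: a variable x(i, w) is true when camera i transmits on upstream wavelength w, each camera must get exactly one wavelength, and a hypothetical per-wavelength capacity CAP stands in for the real bandwidth and latency constraints. The instance sizes, the capacity limit, and the use of the python-sat (pysat) package are assumptions made purely for this example.

# Hypothetical toy encoding of wavelength assignment as SAT (not the paper's formulation).
from itertools import combinations
from pysat.solvers import Glucose3

NUM_CAMERAS = 6      # assumed toy instance size
NUM_WAVELENGTHS = 2  # assumed number of TWDM-PON upstream wavelengths
CAP = 3              # assumed max cameras per wavelength (stand-in for bandwidth limits)

def var(i: int, w: int) -> int:
    # Map (camera i, wavelength w) to a positive DIMACS variable id.
    return i * NUM_WAVELENGTHS + w + 1

clauses = []
for i in range(NUM_CAMERAS):
    # Each camera is assigned at least one wavelength.
    clauses.append([var(i, w) for w in range(NUM_WAVELENGTHS)])
    # ...and at most one wavelength (pairwise exclusion).
    for w1, w2 in combinations(range(NUM_WAVELENGTHS), 2):
        clauses.append([-var(i, w1), -var(i, w2)])

for w in range(NUM_WAVELENGTHS):
    # Naive at-most-CAP constraint: no (CAP+1)-subset of cameras may all use wavelength w.
    for subset in combinations(range(NUM_CAMERAS), CAP + 1):
        clauses.append([-var(i, w) for i in subset])

with Glucose3(bootstrap_with=clauses) as solver:
    if solver.solve():
        model = set(solver.get_model())
        for i in range(NUM_CAMERAS):
            for w in range(NUM_WAVELENGTHS):
                if var(i, w) in model:
                    print(f"camera {i} -> wavelength {w}")
    else:
        print("no feasible allocation")

A real scheme would additionally encode per-wavelength bandwidth grants and arrival-time alignment, and would re-solve the instance each allocation cycle; the sketch above only shows the clause-building pattern.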
DOI: 10.1109/JIOT.2022.3155606