Multiagent Reinforcement Learning for Hyperparameter Optimization of Convolutional Neural Networks
| Published in: | IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 41, No. 4, pp. 1034-1047 |
|---|---|
| Main authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Published: | New York; IEEE; 1 April 2022; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Subjects: | |
| ISSN: | 0278-0070, 1937-4151 |
| Summary: | Nowadays, deep convolutional neural networks (DCNNs) play a significant role in many application domains, such as computer vision, medical imaging, and image processing. Nonetheless, designing a DCNN able to outperform the state of the art is a manual, challenging, and time-consuming task because the design space is extremely large, a consequence of the large number of layers and their corresponding hyperparameters. In this work, we address the challenge of performing hyperparameter optimization of DCNNs through a novel multiagent reinforcement learning (MARL)-based approach, eliminating the human effort. In particular, we adapt Q-learning and define learning agents per layer to split the design space into independent smaller design subspaces such that each agent fine-tunes the hyperparameters of the assigned layer with respect to a global reward. Moreover, we provide a novel formation of Q-tables along with a new update rule that facilitates agents' communication. Our MARL-based approach is data-driven and able to consider an arbitrary set of design objectives and constraints. We apply our MARL-based solution to different well-known DCNNs, including GoogLeNet, VGG, and U-Net, and various datasets for image classification and semantic segmentation. Our results show that, compared to the original CNNs, the MARL-based approach can reduce the model size, training time, and inference time by up to 83×, 52%, and 54%, respectively, without any degradation in accuracy. Moreover, our approach is very competitive with state-of-the-art neural architecture search methods in terms of the designed CNN accuracy and its number of parameters while significantly reducing the optimization cost. |
|---|---|
| DOI: | 10.1109/TCAD.2021.3077193 |
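
The summary describes one learning agent per layer, each tuning its own hyperparameter against a shared global reward. The sketch below illustrates only that general idea with plain, stateless tabular Q-learning; it does not reproduce the article's Q-table formation or its update rule, which this record does not specify. The search space `CHOICES`, the toy `global_reward` function, and the `LayerAgent` class are all hypothetical stand-ins, and a real setup would train and evaluate a network to compute the reward.

```python
import random

# Hypothetical per-layer search space (assumption): candidate filter counts.
CHOICES = [16, 32, 64, 128]


def global_reward(config):
    """Toy stand-in for training and evaluating a network (assumption).

    Pretends accuracy saturates at 64 filters per layer and penalizes size.
    """
    accuracy = sum(min(c, 64) for c in config) / (64 * len(config))
    size_penalty = sum(config) / (128 * len(config))
    return accuracy - 0.5 * size_penalty


class LayerAgent:
    """Stateless epsilon-greedy Q-learning agent owning one layer's choice."""

    def __init__(self, n_actions, alpha=0.1, epsilon=0.2, rng=None):
        self.q = [0.0] * n_actions  # one Q-value per candidate setting
        self.alpha = alpha          # learning rate
        self.epsilon = epsilon      # exploration probability
        self.rng = rng or random.Random()

    def best_action(self):
        return max(range(len(self.q)), key=self.q.__getitem__)

    def act(self):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q))
        return self.best_action()

    def update(self, action, reward):
        # Move the estimate for the chosen action toward the observed reward.
        self.q[action] += self.alpha * (reward - self.q[action])


def search(n_layers=3, episodes=2000, seed=0):
    rng = random.Random(seed)
    agents = [LayerAgent(len(CHOICES), rng=rng) for _ in range(n_layers)]
    for _ in range(episodes):
        actions = [agent.act() for agent in agents]
        reward = global_reward([CHOICES[a] for a in actions])
        # Every agent learns from the same global reward.
        for agent, action in zip(agents, actions):
            agent.update(action, reward)
    return [CHOICES[agent.best_action()] for agent in agents]


if __name__ == "__main__":
    print(search())
```

Because each agent searches only its own four options, the joint space of 4^3 configurations is explored as three small subspaces. On this toy objective the agents should typically settle on 64 filters per layer, though the outcome depends on the seed and the exploration settings.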