S3NAS: Fast Hardware-Aware Neural Architecture Search Methodology

Bibliographic Details
Published in: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 41, no. 11, pp. 4826-4836
Main Authors: Lee, Jaeseong, Rhim, Jungsub, Kang, Duseok, Ha, Soonhoi
Format: Journal Article
Language: English
Published: New York IEEE 01.11.2022
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 0278-0070, 1937-4151
Description
Summary: Automated neural architecture search (NAS) has recently emerged as the default technique for finding state-of-the-art (SOTA) convolutional neural network (CNN) architectures that achieve higher accuracy than manually designed ones for image classification. In this article, we present a fast hardware-aware NAS methodology, called S3NAS, that reflects the latest research results. It consists of three steps: 1) supernet design; 2) Single-Path NAS for fast architecture exploration; and 3) scaling and post-processing. In the first step, we design a supernet, a superset of candidate networks, with two features: stages may have different numbers of blocks, and blocks may contain parallel layers of different kernel sizes (MixConv). Next, we perform a differentiable search by extending the Single-Path NAS technique to support the MixConv layer and by adding a latency-aware loss term that reduces the hyperparameter search overhead. Finally, we use compound scaling to scale up the network as far as the latency constraint allows. In the post-processing step, we also add squeeze-and-excitation (SE) blocks and h-swish activation functions where they are beneficial. Experiments on four different hardware platforms demonstrate the effectiveness of the proposed methodology: it finds networks with a better latency-accuracy tradeoff than SOTA networks, and the network search completes within 4 h on TPUv3.
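The final scaling step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the coefficients 1.2, 1.1, and 1.15 are the EfficientNet-style depth, width, and resolution bases, and the latency model (latency proportional to FLOPs) and all function names are assumptions made for illustration only. Under those assumptions, the sketch bisects for the largest compound-scaling coefficient whose estimated latency still fits a given budget.

```python
# Assumed EfficientNet-style base coefficients (not taken from the article).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution


def scale_factors(phi):
    """Return (depth, width, resolution) multipliers for coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi


def estimated_latency_ms(base_latency_ms, phi):
    """Hypothetical latency model: latency grows in proportion to FLOPs,
    which scale as depth * width^2 * resolution^2."""
    d, w, r = scale_factors(phi)
    return base_latency_ms * d * w ** 2 * r ** 2


def max_phi_within_budget(base_latency_ms, budget_ms, hi=10.0, tol=1e-6):
    """Bisect for the largest phi in [0, hi] whose estimated latency fits the budget."""
    if estimated_latency_ms(base_latency_ms, 0.0) > budget_ms:
        raise ValueError("base network already exceeds the latency budget")
    if estimated_latency_ms(base_latency_ms, hi) <= budget_ms:
        return hi  # even the largest allowed scale fits
    lo = 0.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if estimated_latency_ms(base_latency_ms, mid) <= budget_ms:
            lo = mid  # mid fits, so try scaling further
        else:
            hi = mid  # mid is too slow, so back off
    return lo


# Example: a 10 ms base network scaled up under a 20 ms budget.
phi = max_phi_within_budget(base_latency_ms=10.0, budget_ms=20.0)
depth, width, resolution = scale_factors(phi)
```

In practice the FLOPs-proportional estimate would be replaced by a latency measurement or predictor for the target hardware; bisection still applies as long as latency increases monotonically with the scaling coefficient.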
DOI: 10.1109/TCAD.2021.3134843