S3NAS: Fast Hardware-Aware Neural Architecture Search Methodology

Bibliographic Details
Published in:	IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 41, no. 11, pp. 4826-4836
Main Authors:	Lee, Jaeseong, Rhim, Jungsub, Kang, Duseok, Ha, Soonhoi
Format:	Journal Article
Language:	English
Published:	New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.11.2022
Subjects:	Artificial neural networks; Compounds; Computer architecture; Constraint-aware AutoML; Convolution; Convolutional neural networks (CNNs); Hardware; Image classification; Kernel; Methodology; Network latency; Neural architecture search (NAS); Neural network design; Neural networks; Searching; Space exploration
ISSN:	0278-0070, 1937-4151

Description
Summary:	Automated neural architecture search (NAS) has recently emerged as the default technique for finding state-of-the-art (SOTA) convolutional neural network (CNN) architectures that achieve higher accuracy than manually designed architectures for image classification. In this article, we present a fast hardware-aware NAS methodology, called S3NAS, that incorporates recent research results. It consists of three steps: 1) supernet design; 2) Single-Path NAS for fast architecture exploration; and 3) scaling and post-processing. In the first step, we design a supernet, a superset of candidate networks, with two features: stages may have different numbers of blocks, and blocks may have parallel layers of different kernel sizes (MixConv). Next, we perform a differentiable search by extending the Single-Path NAS technique to support the MixConv layer and by adding a latency-aware loss term that reduces the hyperparameter search overhead. Finally, we use compound scaling to scale up the network as far as the latency constraint allows. In the post-processing step, we also add squeeze-and-excitation (SE) blocks and h-swish activation functions where they are beneficial. Experiments on four different hardware platforms demonstrate the effectiveness of the proposed methodology. It finds networks with a better latency-accuracy tradeoff than SOTA networks, and the network search can be completed within 4 h using TPUv3.
DOI:	10.1109/TCAD.2021.3134843
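
Illustrative sketch (not part of the catalog record): the Summary above says compound scaling is used to scale the network up maximally within a latency constraint. The Python sketch below shows one way such a step could look. It is not the authors' implementation. The base multipliers (1.2, 1.1, 1.15) are assumed EfficientNet-style values, and `toy_latency_ms` is a hypothetical stand-in for a real on-device latency measurement or predictor; all function names are invented for illustration.

```python
# Assumed EfficientNet-style base multipliers (depth, width, resolution).
# These values are an assumption for illustration, not taken from this record.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def scale_factors(phi):
    """Return (depth, width, resolution) multipliers for compound coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi


def toy_latency_ms(base_ms, phi):
    """Hypothetical latency model: grows like depth * width^2 * resolution^2.

    A real flow would measure or predict latency on the target hardware.
    """
    d, w, r = scale_factors(phi)
    return base_ms * d * w ** 2 * r ** 2


def max_phi_within_budget(base_ms, budget_ms, latency_fn=toy_latency_ms,
                          hi=10.0, tol=1e-4):
    """Bisect for the largest phi in [0, hi] whose estimated latency fits the budget.

    Assumes latency_fn is monotonically increasing in phi.
    """
    if latency_fn(base_ms, 0.0) > budget_ms:
        raise ValueError("base network already exceeds the latency budget")
    if latency_fn(base_ms, hi) <= budget_ms:
        return hi
    lo = 0.0
    # Invariant: latency(lo) <= budget < latency(hi).
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if latency_fn(base_ms, mid) <= budget_ms:
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    phi = max_phi_within_budget(base_ms=5.0, budget_ms=10.0)
    print("phi:", phi)
    print("multipliers:", scale_factors(phi))
    print("estimated latency (ms):", toy_latency_ms(5.0, phi))
```

Bisection is used here only because the toy latency model is monotonic in the scaling coefficient; with measured latencies that are noisy or non-monotonic, a grid of candidate coefficients checked one by one may be a safer choice.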