Towards understanding residual and dilated dense neural networks via convolutional sparse coding
| Published in: | National Science Review, Vol. 8, No. 3, p. nwaa159 |
|---|---|
| Main authors: | , |
| Format: | Journal Article |
| Language: | English |
| Published: | China: Oxford University Press, 1 March 2021 |
| Subjects: | |
| ISSN: | 2095-5138, 2053-714X |
| Summary: | Convolutional neural network (CNN) and its variants have led to many state-of-the-art results in various fields. However, a clear theoretical understanding of such networks is still lacking. Recently, a multilayer convolutional sparse coding (ML-CSC) model has been proposed and proved to equal such simply stacked networks (plain networks). Here, we consider the initialization, the dictionary design and the number of iterations to be factors in each layer that greatly affect the performance of the ML-CSC model. Inspired by these considerations, we propose two novel multilayer models: the residual convolutional sparse coding (Res-CSC) model and the mixed-scale dense convolutional sparse coding (MSD-CSC) model. They are closely related to the residual neural network (ResNet) and the mixed-scale (dilated) dense neural network (MSDNet), respectively. Mathematically, we derive the skip connection in the ResNet as a special case of a new forward propagation rule for the ML-CSC model. We also find a theoretical interpretation of dilated convolution and dense connection in the MSDNet by analyzing the MSD-CSC model, which gives a clear mathematical understanding of each. We implement the iterative soft thresholding algorithm and its fast version to solve the Res-CSC and MSD-CSC models. The unfolding operation can be employed for further improvement. Finally, extensive numerical experiments and comparison with competing methods demonstrate their effectiveness. From the view of convolutional sparse coding, we build mathematically equivalent forms of two advanced deep learning models including residual and dilated dense neural networks with skip connections. |
|---|---|
| DOI: | 10.1093/nsr/nwaa159 |
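
The summary mentions the iterative soft thresholding algorithm (ISTA) as the solver. As a rough illustration only, the sketch below applies plain ISTA to a generic single-layer sparse coding problem, minimising 0.5·‖y − D g‖² + λ‖g‖₁ over g. It is not the article's Res-CSC or MSD-CSC solver: the dense list-of-lists dictionary `D`, the helper names and the fixed step size are all assumptions made for this example, and no convolutional structure is modelled.

```python
# Illustrative sketch only: plain ISTA on a small dense dictionary.
# All names (soft_threshold, matvec, ista) are hypothetical helpers,
# not taken from the article.

def soft_threshold(v, tau):
    """Elementwise soft thresholding: sign(x) * max(|x| - tau, 0)."""
    return [(abs(x) - tau) * (1 if x > 0 else -1) if abs(x) > tau else 0.0
            for x in v]

def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def ista(D, y, lam, step, n_iter=100):
    """Approximately minimise 0.5*||y - D g||^2 + lam*||g||_1 over g.

    Assumption: the caller picks step <= 1/L, where L is the largest
    eigenvalue of D^T D; this sketch does not estimate L itself.
    """
    Dt = [list(col) for col in zip(*D)]   # transpose of D
    g = [0.0] * len(D[0])                 # zero initialisation
    for _ in range(n_iter):
        residual = [yi - pi for yi, pi in zip(y, matvec(D, g))]
        # Gradient step on the quadratic term, then shrinkage for the L1 term.
        moved = [gi + step * ci for gi, ci in zip(g, matvec(Dt, residual))]
        g = soft_threshold(moved, step * lam)
    return g

identity = [[1.0, 0.0], [0.0, 1.0]]
print(ista(identity, [3.0, 0.5], lam=1.0, step=1.0, n_iter=10))  # [2.0, 0.0]
```

With an identity dictionary the minimiser is simply the soft-thresholded signal, which makes the example easy to check by hand. The summary also names the initialization and the number of iterations as factors that affect performance; in this sketch they correspond to the starting value of `g` and to `n_iter`.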