LayerCAM: Exploring Hierarchical Class Activation Maps for Localization

Bibliographic Details
Published in:	IEEE Transactions on Image Processing, vol. 30, pp. 5875-5888
Main Authors:	Jiang, Peng-Tao; Zhang, Chang-Bin; Hou, Qibin; Cheng, Ming-Ming; Wei, Yunchao
Format:	Journal Article
Language:	English
Published:	New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 2021
Subjects:	class activation maps; Convolution; Feature maps; Image segmentation; Localization; Location awareness; Pixels; Reliability; Semantic segmentation; Semantics; Spatial resolution; Task analysis; Weakly-supervised object localization
ISSN:	1057-7149, 1941-0042
Online Access:	Full text

Description
Abstract:	Class activation maps are generated from the final convolutional layer of a CNN. They can highlight discriminative object regions for the class of interest. These discovered object regions have been widely used for weakly-supervised tasks. However, due to the small spatial resolution of the final convolutional layer, such class activation maps often locate only coarse regions of the target objects, limiting the performance of weakly-supervised tasks that need pixel-accurate object locations. Thus, we aim to generate more fine-grained object localization information from the class activation maps to locate the target objects more accurately. In this paper, by rethinking the relationships between the feature maps and their corresponding gradients, we propose a simple yet effective method, called LayerCAM. It can produce reliable class activation maps for different layers of a CNN. This property enables us to collect object localization information from coarse (rough spatial localization) to fine (precise fine-grained details) levels. We further integrate them into a high-quality class activation map, where the object-related pixels can be better highlighted. To evaluate the quality of the class activation maps produced by LayerCAM, we apply them to weakly-supervised object localization and semantic segmentation. Experiments demonstrate that the class activation maps generated by our method are more effective and reliable than those produced by existing attention methods. The code will be made publicly available.
DOI:	10.1109/TIP.2021.3089943
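
Illustrative note: the abstract describes deriving class activation maps for different CNN layers from the feature maps and their gradients, then integrating the coarse and fine maps. The record does not give the formulas, so the following is only a minimal NumPy sketch under stated assumptions: each spatial location of each channel is weighted by the positive part of its own gradient, the weighted channels are summed and rectified, and maps from several layers are brought to a common resolution and fused by an element-wise maximum. The function names (`layercam_map`, `normalize`, `upsample_nearest`, `fuse`), the nearest-neighbour upsampling, and the fusion rule are hypothetical choices for illustration, not the authors' released code.

```python
import numpy as np


def layercam_map(activations, gradients):
    """Per-layer map from feature maps and gradients, both of shape (K, H, W).

    Assumption: `gradients` holds d(class score)/d(activations) for one image.
    """
    activations = np.asarray(activations, dtype=float)
    gradients = np.asarray(gradients, dtype=float)
    # Assumed weighting: keep only positive gradients, one weight per location.
    weights = np.maximum(gradients, 0.0)
    # Weight each location, sum over the K channels, then rectify.
    return np.maximum((weights * activations).sum(axis=0), 0.0)


def normalize(cam, eps=1e-8):
    """Scale a map so its maximum is (approximately) 1."""
    return cam / (cam.max() + eps)


def upsample_nearest(cam, factor):
    """Nearest-neighbour upsampling by an integer factor (illustrative choice)."""
    return np.kron(cam, np.ones((factor, factor)))


def fuse(cams):
    """Fuse same-sized maps by element-wise maximum (assumed fusion rule)."""
    return np.maximum.reduce([np.asarray(c, dtype=float) for c in cams])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy stand-ins for a shallow (8x8) and a deep (4x4) layer.
    shallow = layercam_map(rng.normal(size=(3, 8, 8)), rng.normal(size=(3, 8, 8)))
    deep = layercam_map(rng.normal(size=(5, 4, 4)), rng.normal(size=(5, 4, 4)))
    fused = fuse([normalize(shallow), upsample_nearest(normalize(deep), 2)])
    print(fused.shape)
```

In practice the activations and gradients would come from forward and backward hooks in a deep-learning framework; random arrays stand in for them here so the sketch stays self-contained.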