Transformer-Driven Semantic Relation Inference for Multilabel Classification of High-Resolution Remote Sensing Images

Bibliographic Details
Published in: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, Vol. 15, pp. 1884-1901
Main Authors: Tan, Xiaowei, Xiao, Zhifeng, Zhu, Jianjun, Wan, Qiao, Wang, Kai, Li, Deren
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2022
ISSN: 1939-1404, 2151-1535
Summary: Because remote sensing scenes are complex, a single label rarely describes an image adequately, so multilabel classification is a more general and practical choice for high-resolution remote sensing (HRS) images. A central problem in multilabel classification is how to model the relations between categories. In recent years, some researchers have used recurrent neural networks (RNNs) or long short-term memory (LSTM) networks to exploit label relations. However, an RNN or LSTM models category dependence in a chain-propagation manner, so its performance may suffer when a specific category is inferred incorrectly. To address this, we propose a novel HRS image multilabel classification network, the transformer-driven semantic relation inference network. The network comprises two modules: a semantic sensitive module (SSM) and a semantic relation-building module (SRBM). The SSM locates semantic attentional regions in the features extracted by a deep convolutional neural network and generates a discriminative content-aware category representation (CACR). The SRBM performs label relation inference on the outputs of the SSM to predict the final results. The proposed method is characterized by its ability to extract category-relevant semantic attentional regions, generate a discriminative CACR, and reason about label relations in a natural and interpretable way. Experiments were performed on the public UCM multilabel and MLRSNet datasets. Quantitative and qualitative analyses on these multilabel benchmarks showed that the proposed method can effectively locate semantic regions and build relationships between categories with better robustness.
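The two-stage idea in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the per-category query vectors, and the single unprojected self-attention pass (standing in for a full transformer) are all assumptions made for illustration. The first function pools CNN feature positions into one vector per category, loosely mirroring the SSM and its CACR; the second lets every category attend to every other category at once, loosely mirroring the SRBM and avoiding the chain propagation of an RNN/LSTM.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def category_representations(features, category_queries):
    """SSM-like step (hypothetical): one attention-pooled vector per category.

    features: N spatial position vectors of dimension D (e.g. flattened CNN output).
    category_queries: C assumed per-category query vectors of dimension D.
    """
    cacr = []
    for q in category_queries:
        # Attention weights over spatial positions for this category.
        weights = softmax([dot(q, f) for f in features])
        pooled = [sum(w * f[d] for w, f in zip(weights, features))
                  for d in range(len(q))]
        cacr.append(pooled)
    return cacr

def relate_categories(cacr):
    """SRBM-like step (hypothetical): one scaled dot-product self-attention
    pass over the category vectors, with a residual connection."""
    scale = math.sqrt(len(cacr[0]))
    related = []
    for v in cacr:
        weights = softmax([dot(v, u) / scale for u in cacr])
        mixed = [sum(w * u[d] for w, u in zip(weights, cacr))
                 for d in range(len(v))]
        related.append([a + b for a, b in zip(v, mixed)])
    return related

def predict(features, category_queries, classifiers, threshold=0.5):
    """Score each category independently with a sigmoid (multilabel output)."""
    related = relate_categories(category_representations(features, category_queries))
    probs = [1.0 / (1.0 + math.exp(-dot(r, w)))
             for r, w in zip(related, classifiers)]
    return probs, [p >= threshold for p in probs]

# Toy input: 3 spatial positions, 2 categories, feature dimension 2.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
queries = [[2.0, 0.0], [0.0, 2.0]]
classifiers = [[1.0, -1.0], [-1.0, 1.0]]
probs, labels = predict(features, queries, classifiers)
```

Because each category receives its own sigmoid score, any number of labels can be active for one image, and an error in one category's score is not passed down a fixed chain to the categories that follow.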
DOI: 10.1109/JSTARS.2022.3145042