SemiBaCon: Semi-Supervised Balanced Contrastive Learning for Multimodal Remote Sensing Image Classification

Bibliographic Details
Title: SemiBaCon: Semi-Supervised Balanced Contrastive Learning for Multimodal Remote Sensing Image Classification
Authors: Yufei He, Bobo Xi, Guocheng Li, Tie Zheng, Yunsong Li, Changbin Xue, Ming Shen
Source: He, Y, Xi, B, Li, G, Zheng, T, Li, Y, Xue, C & Shen, M 2025, 'SemiBaCon : Semi-Supervised Balanced Contrastive Learning for Multi-Modal Remote Sensing Image Classification', IEEE Transactions on Geoscience and Remote Sensing, vol. 63, 5519014. https://doi.org/10.1109/TGRS.2025.3589487
Publisher Information: Institute of Electrical and Electronics Engineers (IEEE), 2025.
Publication Year: 2025
Keywords: Training data, Balanced sampling, Image classification, Superpixels, Aerospace electronics, Contrastive learning, Representation learning, Semantics, Laser radar, Multi-modal, Feature extraction, Training, Data mining, Semi-supervised
Description: The limited availability of annotated training data significantly constrains the classification accuracy of hyperspectral image (HSI) and LiDAR fusion approaches. Although contrastive learning has emerged as a potential solution, current implementations frequently neglect critical class imbalance issues during unlabeled sample selection. To address this issue, we introduce a novel semi-supervised balanced contrastive learning (SemiBaCon) framework for multi-modal remote sensing image classification. First, we propose a superpixel-based balanced sampling (SPBS) mechanism that fundamentally addresses class imbalance through intelligent pseudo-label generation. By segmenting the HSI data into homogeneous superpixels and implementing intra-region label propagation, the method ensures statistically balanced pseudo-label selection across categories, effectively overcoming the bias introduced by conventional random sampling strategies. Second, our architecture integrates a dual-stream encoder combining convolutional neural networks (CNNs) with Transformers, enabling hierarchical feature extraction from the spectral-spatial characteristics of HSI and the elevation patterns of LiDAR. This design facilitates the construction of multi-modal positive sample pairs, achieving enhanced representation learning through inter-modal consistency constraints. Third, we develop a pseudo-label guided contrastive learning (PLCL) paradigm that synergistically combines pseudo-label confidence with feature similarity metrics, effectively reducing intra-class variance and improving decision boundaries in the latent space. Comprehensive evaluations on three benchmark datasets demonstrate the framework's superior performance compared to state-of-the-art methods.
Publication Type: Article
File Description: application/pdf
ISSN: 1558-0644; 0196-2892
DOI: 10.1109/TGRS.2025.3589487
Access URLs: http://www.scopus.com/inward/record.url?scp=105011733161&partnerID=8YFLogxK
https://doi.org/10.1109/TGRS.2025.3589487
https://vbn.aau.dk/ws/files/787953676/Accepted_Author_Manuscript.pdf
https://vbn.aau.dk/da/publications/ef605206-aa69-4314-b35e-e49aa99f8a98
Rights: IEEE Copyright
Document Code: edsair.doi.dedup.....d1e5e9d0d52f8b41addc2b24ca239ccb
Database: OpenAIRE