ITER: Image-to-pixel Representation for Weakly Supervised HSI Classification
| Published in: | IEEE Transactions on Image Processing, Vol. 33, p. 1 |
|---|---|
| Main authors: | , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | United States: IEEE, 01.01.2024; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Subjects: | |
| ISSN: | 1057-7149, 1941-0042 |
| Summary: | In recent years, deep learning-based algorithms have shown superior performance in hyperspectral image (HSI) classification. However, these methods perform well only when a large number of refined pixel-level annotations are available. Due to atmospheric changes, sensor differences, and complex land cover distribution, pixel-level labeling of high-dimensional HSIs is extremely difficult, time-consuming, and laborious. To overcome this hurdle, an Image-To-pixEl Representation (ITER) approach is proposed in this paper. To the best of our knowledge, this is the first time that image-level annotation has been used to predict pixel-level classification maps for HSI. The proposed model proceeds from subject modeling to boundary refinement, corresponding to pseudo-label generation and pixel-level prediction, respectively. Concretely, in the pseudo-label generation stage, spectral/spatial activation, a spectral-spatial alignment loss, and geographic element enhancement are designed in sequence to locate the discriminative regions of each category, optimize the collaborative training of multi-domain class activation maps (CAMs), and refine the labels, respectively. For the pixel-level prediction stage, a high-frequency-aware self-attention within a high-enhanced transformer is put forward to achieve detailed feature representation. With this two-stage pipeline, ITER explores weakly supervised HSI classification with image-level tags, bridging the gap between image-level annotation and dense prediction. Extensive experiments on three benchmark datasets, with comparisons against state-of-the-art (SOTA) works, demonstrate the performance of the proposed approach. |
| DOI: | 10.1109/TIP.2023.3326699 |
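
The summary describes generating pixel-level pseudo-labels from image-level tags by way of class activation maps (CAMs). The following is a minimal sketch of that general idea only, not the ITER method itself: it uses a plain weighted-sum CAM and a fixed threshold, and omits the spectral/spatial activation, alignment loss, and geographic element enhancement named in the record. All function names, the `IGNORE` marker, and the `threshold` parameter are assumptions introduced for illustration.

```python
from typing import Dict, List, Sequence

Grid = List[List[float]]
IGNORE = -1  # hypothetical marker for pixels that receive no pseudo-label


def class_activation_map(features: Sequence[Grid], weights: Sequence[float]) -> Grid:
    """Weighted sum of K feature maps (each H x W) using one class's K weights."""
    height, width = len(features[0]), len(features[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, w in zip(features, weights):
        for i in range(height):
            for j in range(width):
                cam[i][j] += w * fmap[i][j]
    return cam


def normalize(cam: Grid) -> Grid:
    """Min-max scale a CAM to [0, 1]; a constant map becomes all zeros."""
    flat = [v for row in cam for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        return [[0.0] * len(row) for row in cam]
    return [[(v - lo) / span for v in row] for row in cam]


def pseudo_labels(
    features: Sequence[Grid],
    class_weights: Dict[int, Sequence[float]],
    image_tags: Sequence[int],
    threshold: float = 0.5,
) -> List[List[int]]:
    """Assign each pixel the tagged class with the strongest activation.

    Only classes listed in the image-level tags compete; pixels whose best
    normalized activation falls below `threshold` are marked IGNORE.
    """
    cams = {
        c: normalize(class_activation_map(features, class_weights[c]))
        for c in image_tags
    }
    height, width = len(features[0]), len(features[0][0])
    labels = [[IGNORE] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            best = max(image_tags, key=lambda c: cams[c][i][j])
            if cams[best][i][j] >= threshold:
                labels[i][j] = best
    return labels
```

In this sketch, restricting the competition to the tagged classes is what lets image-level annotation constrain the dense output; low-confidence pixels are left unlabeled so that a second-stage pixel-level model can resolve them.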