Self-supervised Variational Autoencoder for Unsupervised Object Counting from Very High-Resolution Satellite Imagery: Applications in Dwelling Extraction in FDP Settlement Areas

Bibliographic Details
Published in: IEEE Transactions on Geoscience and Remote Sensing, Vol. 62; p. 1
Main Authors: Gella, Getachew Workineh; Gangloff, Hugo; Wendt, Lorenz; Tiede, Dirk; Lang, Stefan
Format: Journal Article
Language: English
Published: New York IEEE 01.01.2024
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Institute of Electrical and Electronics Engineers
ISSN: 0196-2892, 1558-0644
Description
Summary: In supervised learning, deep learning models demand a large corpus of annotated data for object detection and classification tasks, which constrains their utility in humanitarian emergency response. To overcome this problem, we propose an unsupervised approach to dwelling counting from very high-resolution satellite imagery that combines a Variational Autoencoder (VAE) with anomaly detection. When applying VAEs to dwelling localization and counting in earth observation, we observed two critical limitations: (1) a trade-off between reconstruction quality and the quality of the latent code, where favouring good reconstruction of dwellings leads to weak anomaly score maps that fail to properly localize dwellings; and (2) limited spatiotemporal invariance of the learned latent code, so that a model trained on datasets from different geographies and times fails to properly localize dwellings. To address the first limitation, we introduced self-supervision by creating synthetic anomalies; to address the second, we introduced latent space conditioning. The approach is tested on nine very high-resolution images from six Forcibly Displaced People (FDP) settlement areas. Combining a VAE with anomaly detection yields AUC values ranging from 0.70 in complex settlements to 0.98 in relatively less complex settlement areas. Similarly, MAE values for dwelling counting range from 56.67 to 5.03. Joint training on the combined datasets with latent space conditioning and self-supervision gives better results than a classical VAE, with improved spatiotemporal transferability and crisper, stronger anomaly maps. The implementation code will be available at https://github.com/getch-geohum/SSL-VAE.
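For readers unfamiliar with the two metrics quoted in the summary, the following is a minimal sketch of how a pixel-wise AUC and a count MAE could be computed. It is not taken from the authors' repository. The squared-error scoring rule, the function names (anomaly_scores, roc_auc, mean_absolute_error) and the flat-list image representation are all assumptions made for illustration; the paper's actual anomaly score and counting procedure may differ.

```python
from typing import List, Sequence


def anomaly_scores(image: Sequence[float], reconstruction: Sequence[float]) -> List[float]:
    """Per-pixel anomaly score as squared reconstruction error.

    Assumption: pixels the autoencoder reconstructs poorly are treated as
    anomalous. The paper may use a different scoring rule.
    """
    if len(image) != len(reconstruction):
        raise ValueError("image and reconstruction must have the same length")
    return [(x - r) ** 2 for x, r in zip(image, reconstruction)]


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    Equals the probability that a randomly chosen positive pixel receives a
    higher score than a randomly chosen negative pixel; ties count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_absolute_error(predicted: Sequence[float], reference: Sequence[float]) -> float:
    """Mean absolute difference between predicted and reference dwelling counts."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


# Toy data (hypothetical): a four-pixel "image", its reconstruction,
# and a reference mask where 1 marks a dwelling pixel.
toy_image = [0.1, 0.9, 0.2, 0.8]
toy_reconstruction = [0.1, 0.3, 0.2, 0.7]
toy_mask = [0, 1, 0, 1]

toy_auc = roc_auc(anomaly_scores(toy_image, toy_reconstruction), toy_mask)
toy_mae = mean_absolute_error([10, 20, 30], [12, 18, 33])
```

The pairwise AUC is quadratic in the number of pixels, so it suits small illustrations only; a rank-based implementation would be the usual choice for full image tiles.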
DOI: 10.1109/TGRS.2023.3345179