Data augmentation for thermal infrared object detection with cascade pyramid generative adversarial network

Bibliographic Details
Published in: Applied Intelligence (Dordrecht, Netherlands), Vol. 52, no. 1, pp. 967-981
Main Authors: Dai, Xuerui; Yuan, Xue; Wei, Xueye
Format: Journal Article
Language: English
Published: New York, Springer US, 01.01.2022
Springer Nature B.V.
ISSN: 0924-669X, 1573-7497
Description
Summary: Object detection networks based on convolutional neural networks (CNNs) need large amounts of data to train effectively. Data augmentation techniques aim to generate more data, which can enhance the generalization ability and robustness of the detection network. In thermal infrared (TIR) images, objects are difficult to label because of heavy noise and low resolution, so data augmentation is highly recommended. However, traditional data augmentation strategies (such as image flipping and random color jittering) produce only limited training samples. To generate high-resolution images that follow the distribution of real samples, a generative adversarial network (GAN) is introduced. Image pyramids are fed into different branches, and the resulting cascade features are fused to improve the resolution gradually. To improve the discriminative capability of the discriminator, a feature matching loss is computed during training, and the generated images at different resolutions are discriminated in multiple stages. The proposed data augmentation algorithm is called the cascade pyramid generative adversarial network (CPGAN). On both the KAIST Multispectral data set and the OSU thermal-color data set, CPGAN greatly improves the detection accuracy of classical detection algorithms. In addition, detection speed is unaffected because CPGAN is used only in the training phase.
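As a rough illustration of two ideas named in the summary, the sketch below builds an image pyramid and computes a feature matching loss with NumPy. It is not the CPGAN implementation: the 2x2 average-pooling downsampler, the L1 form of the loss, and the names downsample2x, image_pyramid, and feature_matching_loss are assumptions made for illustration, and random arrays stand in for TIR images and discriminator features.

```python
import numpy as np


def downsample2x(img):
    """Halve height and width by averaging each 2x2 block (H and W must be even)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def image_pyramid(img, levels=3):
    """Return [full, 1/2, 1/4, ...] resolution copies of a grayscale image."""
    pyramid = [img.astype(np.float64)]
    for _ in range(levels - 1):
        pyramid.append(downsample2x(pyramid[-1]))
    return pyramid


def feature_matching_loss(real_feats, fake_feats):
    """Mean L1 distance between paired feature maps, averaged over layers.

    Assumption: an L1 form is used here; the summary does not state
    which distance the paper uses.
    """
    per_layer = [np.mean(np.abs(r - f)) for r, f in zip(real_feats, fake_feats)]
    return float(np.mean(per_layer))


# Stand-in data: a random 64x64 array plays the role of a TIR image.
rng = np.random.default_rng(0)
tir_image = rng.random((64, 64))

pyramid = image_pyramid(tir_image, levels=3)
print([level.shape for level in pyramid])  # [(64, 64), (32, 32), (16, 16)]

# Stand-in discriminator features for a real and a generated image.
real_feats = [rng.random((8, 16, 16)), rng.random((16, 8, 8))]
fake_feats = [rng.random((8, 16, 16)), rng.random((16, 8, 8))]
print(feature_matching_loss(real_feats, fake_feats))
```

In a multi-branch generator of the kind the summary describes, each pyramid level would feed a separate branch; here the levels are only constructed, not consumed.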
DOI: 10.1007/s10489-021-02445-9