Object detection based on RGC mask R-CNN

Bibliographic details
Published in:	IET Image Processing, volume 14, issue 8, pp. 1502-1508
Main authors:	Wu, Minghu, Yue, Hanhui, Wang, Juan, Huang, Yongxi, Liu, Min, Jiang, Yuhan, Ke, Cong, Zeng, Cheng
Format:	Journal Article
Language:	English
Publication details:	The Institution of Engineering and Technology, 19 June 2020
Subject:	bounding box head; computer vision; detection performance; feature extraction; feature pyramid network neck; high-quality samples; image classification; image coding; image representation; improved Mask R-CNN-based method; learning (artificial intelligence); Mask Region-Convolution Neural Network based methods; neural nets; object detection; Research Article; ResNet Group Cascade Mask R-CNN; RGC mask R-CNN
ISSN:	1751-9659, 1751-9667

Description
Summary:	Object detection is a crucial topic in computer vision. Mask Region-based Convolutional Neural Network (Mask R-CNN) methods, in which a large intersection over union (IoU) threshold is chosen to obtain high-quality samples, have often been employed for object detection. However, the detection performance of such methods deteriorates when the number of samples is reduced. To address this, the authors propose an improved Mask R-CNN-based method: the ResNet Group Cascade (RGC) Mask R-CNN. First, they compared ResNet backbones with different numbers of layers and found that ResNeXt-101-64x4d is superior to the other backbone networks. Second, during training, the performance of Mask R-CNN suffered from a small batch size, which led to inaccurately calculated mean and variance; thus, group normalisation was added to the backbone, the feature pyramid network neck and the bounding box head of the network. Finally, the higher the IoU threshold used in Mask R-CNN, the easier it is to obtain high-quality samples; however, blindly selecting a high threshold leads to sample reduction and overfitting. Thus, a cascade network configuration with three IoU thresholds was used during model training. The model was trained and tested on the COCO and PASCAL VOC07 datasets. The proposed algorithm demonstrated superior performance compared with Mask R-CNN.
DOI:	10.1049/iet-ipr.2019.0057
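
Illustrative note: the summary states that a higher IoU threshold yields higher-quality samples but fewer of them. The sketch below is not the authors' implementation; it only shows, on made-up boxes, how the number of positive samples shrinks as the threshold rises. The threshold values (0.5, 0.6, 0.7) are an assumption, since the record does not list the three values used, and the names `iou`, `count_positives`, `GT` and `PROPOSALS` are hypothetical. A real cascade also refines the boxes between stages, which this sketch omits.

```python
from typing import Dict, Sequence, Tuple

# A box is (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_positives(
    proposals: Sequence[Box], gt: Box, thresholds: Sequence[float]
) -> Dict[float, int]:
    """Count proposals whose IoU with the ground-truth box meets each threshold."""
    return {t: sum(1 for p in proposals if iou(p, gt) >= t) for t in thresholds}


# Made-up ground-truth box and proposals shifted progressively to the right.
GT: Box = (0.0, 0.0, 10.0, 10.0)
PROPOSALS = [
    (0.0, 0.0, 10.0, 10.0),  # perfect overlap
    (1.0, 0.0, 11.0, 10.0),  # shifted by 1
    (2.0, 0.0, 12.0, 10.0),  # shifted by 2
    (3.0, 0.0, 13.0, 10.0),  # shifted by 3
    (5.0, 0.0, 15.0, 10.0),  # shifted by 5
]

# Assumed stage thresholds; the record only says that three were used.
STAGE_THRESHOLDS = (0.5, 0.6, 0.7)

counts = count_positives(PROPOSALS, GT, STAGE_THRESHOLDS)
for threshold, n in counts.items():
    print(f"IoU >= {threshold}: {n} positive samples")
```

With these toy boxes, each stricter threshold keeps fewer proposals as positives, which is the sample-reduction effect the summary describes.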