Multi-modal RGB–Depth–Thermal Human Body Segmentation

Detailed Bibliography
Published in: International Journal of Computer Vision, Volume 118, Issue 2, pp. 217–239
Main authors: Palmero, Cristina; Clapés, Albert; Bahnsen, Chris; Møgelmose, Andreas; Moeslund, Thomas B.; Escalera, Sergio
Medium: Journal Article
Language: English
Published: New York: Springer US, 01.06.2016
ISSN: 0920-5691, 1573-1405
Description
Summary: This work addresses the problem of human body segmentation from multi-modal visual cues as a first stage of automatic human behavior analysis. We propose a novel RGB–depth–thermal dataset along with a multi-modal segmentation baseline. The different modalities are registered using a calibration device and a registration algorithm. Our baseline extracts regions of interest using background subtraction, partitions the foreground regions into cells, computes a set of image features on those cells using several state-of-the-art feature extraction methods, and models the distribution of the descriptors per cell using probabilistic models. A supervised learning algorithm then fuses the output likelihoods over cells in a stacked feature vector representation. The baseline, using Gaussian mixture models for the probabilistic modeling and a Random Forest for the stacked learning, is superior to other state-of-the-art methods, obtaining an overlap above 75% on the novel dataset when compared to the manually annotated ground truth of human segmentations.
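
The summary describes a two-stage pipeline: per-cell descriptors from each modality are scored by probabilistic models (Gaussian mixtures), and the resulting likelihoods are stacked into one feature vector per cell and fused by a supervised Random Forest. Below is a minimal sketch of that stacking idea, not the authors' implementation: it uses scikit-learn with synthetic toy data, and all variable names, data shapes, and hyperparameters are illustrative assumptions.

```python
# Sketch of likelihood stacking with GMMs and a Random Forest.
# Toy data stands in for real per-cell RGB/depth/thermal descriptors.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical data: n cells, one d-dim descriptor per modality.
n_cells, d = 500, 16
descriptors = {
    "rgb":     rng.normal(size=(n_cells, d)),
    "depth":   rng.normal(size=(n_cells, d)),
    "thermal": rng.normal(size=(n_cells, d)),
}
labels = rng.integers(0, 2, size=n_cells)  # 1 = human, 0 = background

# Stage 1: fit one GMM per modality on the "human" cells, then score
# every cell, giving a per-cell log-likelihood under each modality.
likelihoods = []
for name, X in descriptors.items():
    gmm = GaussianMixture(n_components=3, random_state=0)
    gmm.fit(X[labels == 1])                   # model the positive class
    likelihoods.append(gmm.score_samples(X))  # log-likelihood per cell

# Stage 2: stack the per-modality likelihoods into one feature vector
# per cell and fuse them with a supervised Random Forest.
stacked = np.column_stack(likelihoods)  # shape: (n_cells, n_modalities)
fusion = RandomForestClassifier(n_estimators=100, random_state=0)
fusion.fit(stacked, labels)
pred = fusion.predict(stacked)
print("training accuracy:", (pred == labels).mean())
```

In the paper's setting, the descriptors would be computed on foreground cells obtained by background subtraction, and the stacked representation could also include likelihoods from class-specific models per modality rather than a single positive-class model as assumed here.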
DOI: 10.1007/s11263-016-0901-x