Predicting postural risk level with computer vision and machine learning on multiple sources of images

Bibliographic Details
Published in: Engineering Applications of Artificial Intelligence, Vol. 143, p. 109981
Main Author: Doong, Shing Hwang
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.03.2025
ISSN: 0952-1976
Description
Summary: Postural risk level assesses the degree of potential harm to the human body from the bad postures often encountered in workplaces. If accurate 3-dimensional coordinates of body joints can be obtained from images, the postural risk level can be calculated analytically. Unfortunately, even with today's very successful deep learning technologies, human pose estimation algorithms still cannot provide sufficiently accurate coordinates for this task. Machine learning (ML) may therefore be needed to improve risk estimation from the outputs of pose estimators. In this study, we apply ML to train a risk-level predictor based on joint angles calculated from the outputs of pose estimators, and we validate the work on publicly available image datasets. Two state-of-the-art pose estimators with pre-trained models were used as-is. Multiple sources of images were combined in different ways to tackle the covariate shift and concept drift issues in ML. A weakly supervised algorithm increased balanced accuracy and F1 score (the harmonic mean of precision and recall) substantially over the baseline approach on two public image datasets. A two-stage multi-source domain adaptation algorithm also improved these measures, though not as much as the weak-supervision approach. The baseline approach calculated risk levels analytically from the outputs of the pose estimators. Balanced accuracy was 0.676/0.735/0.745 for baseline/two-stage/weak-supervision with one pose estimator and 0.749/0.745/0.761 with the other; the F1 score was 0.692/0.726/0.736 for baseline/two-stage/weak-supervision with one pose estimator and 0.745/0.732/0.748 with the other. Applications of the proposed procedure to private images are also explained.
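
Illustrative note: the summary describes a pipeline in which joint angles are computed from the 3-D keypoints returned by a pose estimator and then used as features for an ML risk-level classifier, evaluated with balanced accuracy and the F1 score. The following is a minimal Python sketch of that general idea, not the paper's implementation; the keypoint names, the three example angles, and the use of a random-forest classifier are illustrative assumptions.

# Minimal sketch (assumptions noted above): joint angles from estimated
# 3-D keypoints, then a simple risk-level classifier trained on them.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import balanced_accuracy_score, f1_score

def joint_angle(a, b, c):
    """Angle at joint b (degrees) formed by 3-D points a-b-c."""
    v1, v2 = a - b, c - b
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-9)
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

def angle_features(keypoints):
    """keypoints: dict of joint name -> np.array([x, y, z]) from a pose estimator.
    The joint names and the choice of angles here are hypothetical examples."""
    return np.array([
        joint_angle(keypoints["shoulder"], keypoints["elbow"], keypoints["wrist"]),
        joint_angle(keypoints["hip"], keypoints["knee"], keypoints["ankle"]),
        joint_angle(keypoints["neck"], keypoints["hip"], keypoints["knee"]),
    ])

# X: angle features per image, y: risk levels (e.g. from an ergonomic scoring sheet)
# X_train, y_train, X_test, y_test = ...
# clf = RandomForestClassifier().fit(X_train, y_train)
# pred = clf.predict(X_test)
# print(balanced_accuracy_score(y_test, pred),
#       f1_score(y_test, pred, average="macro"))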
DOI: 10.1016/j.engappai.2024.109981