Encoding human activities using multimodal wearable sensory data

Bibliographic Details
Published in: Expert Systems with Applications, Vol. 261, Article 125564
Authors: Khan, Muhammad Hassan; Shafiq, Hadia; Farid, Muhammad Shahid; Grzegorzek, Marcin
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.02.2025
ISSN: 0957-4174
Online access: Full text
Description
Abstract: Human Activity Recognition (HAR) is the task of automatically analyzing and recognizing human body gestures or actions. HAR using time-series multi-modal sensory data is a challenging and important task in machine learning and feature engineering due to its growing demand in numerous real-world applications such as healthcare, sports, and surveillance. Numerous everyday wearable devices, e.g., smartphones, smartwatches, and smart glasses, can be used to collect and analyze human activities on an unprecedented scale. This paper presents a generic framework to recognize different human activities from the continuous time-series multimodal sensory data of these smart gadgets. The proposed framework follows the Bag-of-Features pipeline, which consists of four steps: (i) data acquisition and pre-processing, (ii) codebook computation, (iii) feature encoding, and (iv) classification. Each step plays a significant role in generating an appropriate feature representation of the raw sensory data for efficient activity recognition. In the first step, we employ a simple overlapped-window sampling approach to segment the continuous time-series sensory data and make it suitable for activity recognition. Second, we build a codebook using the k-means clustering algorithm to group similar sub-sequences; the center of each group, known as a codeword, is assumed to represent a specific movement in the activity sequence. The third step, feature encoding, transforms the raw sensory data of an activity sequence into a high-level representation for classification. Specifically, we present three reconstruction-based encoding techniques: Sparse Coding, Local Coordinate Coding, and Locality-constrained Linear Coding. The segmented activity sub-sequences are transformed into high-level representations using these techniques and the previously computed codebook. Finally, the encoded features are classified with a simple Random Forest classifier. The proposed HAR framework, with each of the three encoding techniques, is evaluated on three large benchmark datasets, UniMiB-SHAR, MHEALTH, and WISDM, and the results are compared with the most recent state-of-the-art techniques. It outperforms the existing techniques, achieving recognition scores of 98.39%, 99.5%, and 99.4% on the UniMiB-SHAR, MHEALTH, and WISDM datasets, respectively. The excellent recognition results and the computational analysis confirm the effectiveness and efficiency of the proposed framework.

Highlights:
• A generic framework to analyze human activities using multimodal sensory data.
• A comprehensive and systematic assessment of existing approaches.
• An extensive evaluation of different reconstruction-based encoding methods.
• Experimental evaluation of the encoding methods on three large benchmark datasets.
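
To make the described pipeline concrete, the following is a minimal Python sketch of its first two steps: overlapped-window segmentation and k-means codebook computation. It illustrates the general Bag-of-Features approach rather than the authors' implementation; the window length, overlap ratio, and codebook size are assumed values, and the helper names (`sliding_windows`, `build_codebook`) are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

def sliding_windows(signal, win_len=128, overlap=0.5):
    """Segment a (T, n_channels) time series into overlapped windows.

    win_len and overlap are assumed values, not taken from the paper.
    """
    step = max(1, int(win_len * (1.0 - overlap)))
    starts = range(0, signal.shape[0] - win_len + 1, step)
    return np.stack([signal[s:s + win_len] for s in starts])

def build_codebook(windows, n_codewords=256, seed=0):
    """Cluster flattened sub-sequences with k-means; the cluster centers
    are the codewords, each assumed to represent a basic movement."""
    descriptors = windows.reshape(len(windows), -1)
    km = KMeans(n_clusters=n_codewords, n_init=10, random_state=seed)
    km.fit(descriptors)
    return km.cluster_centers_   # shape: (n_codewords, win_len * n_channels)
```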
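Of the three reconstruction-based encodings named in the abstract, Locality-constrained Linear Coding (LLC) admits a closed-form approximation, sketched below in the standard k-nearest-codeword formulation. The `knn` and `beta` hyperparameters are assumptions, not values from the paper.

```python
def llc_encode(x, codebook, knn=5, beta=1e-4):
    """Encode one descriptor x (1-D array) against the codebook using the
    analytical approximation of Locality-constrained Linear Coding."""
    # Select the k nearest codewords (locality constraint).
    dists = np.linalg.norm(codebook - x, axis=1)
    idx = np.argsort(dists)[:knn]
    # Solve the small constrained least-squares system in closed form.
    z = codebook[idx] - x                             # shift codewords to x's origin
    C = z @ z.T                                       # local covariance
    C += np.eye(knn) * (beta * np.trace(C) + 1e-12)   # regularize for stability
    w = np.linalg.solve(C, np.ones(knn))
    w /= w.sum()                                      # enforce sum-to-one constraint
    code = np.zeros(codebook.shape[0])
    code[idx] = w
    return code
```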
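Finally, the per-window codes can be pooled into one feature vector per activity sequence and classified with a Random Forest, matching the last step of the framework. Max pooling is assumed here since the abstract does not specify the pooling strategy, and the usage outline in the comments is likewise hypothetical.

```python
from sklearn.ensemble import RandomForestClassifier

def encode_sequence(windows, codebook, knn=5):
    """Max-pool the LLC codes of all windows of one activity sequence."""
    codes = np.stack([llc_encode(w.ravel(), codebook, knn) for w in windows])
    return codes.max(axis=0)

# Hypothetical usage: train_seqs / test_seqs are lists of (T, n_channels)
# arrays and train_labels / test_labels their activity labels.
#   codebook = build_codebook(
#       np.concatenate([sliding_windows(s) for s in train_seqs]))
#   X_train = np.stack(
#       [encode_sequence(sliding_windows(s), codebook) for s in train_seqs])
#   clf = RandomForestClassifier(n_estimators=200, random_state=0)
#   clf.fit(X_train, train_labels)
```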
DOI: 10.1016/j.eswa.2024.125564