Multiple Moving Objects Using Deep Learning for Trajectory Extraction and Clustering

Bibliographic Details
Published in:	2023 Intelligent Methods, Systems, and Applications (IMSA) pp. 62 - 67
Main Authors:	Elmasry, Yomna, Elzeky, Magda, Atia, Ayman
Format:	Conference Proceeding
Language:	English
Published:	IEEE 15.07.2023
Subjects:	Behavioral sciences; Cell tracking challenge; Clustering algorithms; Deep learning; Feature extraction; Multiple object tracking (MOT); Multiple Objects Tracking; Object detection; Object tracking; Trajectory; Video surveillance; YOLO

Description
Summary:	Multiple Object Tracking (MOT) is a key challenge in computer vision: the task of tracking multiple objects across the frames of an input video. It has applications in fields such as medical research and video surveillance [1], where it can be used to test drug effects and to identify potentially dangerous behavior, respectively. This paper evaluates the tracking ability of the You Only Look Once (YOLO) model in two case studies and compares the performance of the k-means and DBSCAN (density-based) clustering algorithms in each. We extract the trajectories of the subjects and analyze their behavior based on their common movement characteristics. The first case study uses the Cell Tracking Challenge dataset, and the second uses the Multiple Subject Tracking (MOT) dataset. YOLO uses a fully convolutional neural network (CNN) to process images efficiently. Its input passes through the network only once, which makes it fast, unlike two-stage models such as RCNN, which determine regions of interest and detect objects in separate stages. To understand the behavior of the moving objects, we clustered the tracked objects based on their common behavior. The clustering algorithms were evaluated using the silhouette score and the Davies-Bouldin Index (DBI). In the first case, k-means achieved a silhouette score of 0.590236 with four clusters and a DBI of 0.538835 with nine clusters. In the second case, k-means achieved a silhouette score of 0.658 and a DBI of 0.458, both with two clusters. DBSCAN obtained a silhouette score of -0.179 in the first case and -0.208 in the second. These results indicate that k-means outperformed DBSCAN in these case studies.
DOI:	10.1109/IMSA58542.2023.10217507
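
Note on the evaluation metrics: the summary compares clusterings by silhouette score (higher is better, range -1 to 1) and Davies-Bouldin Index (lower is better). The following is a minimal sketch of both metrics in plain Python, written from their standard definitions. It is not the authors' code: the function names, the toy 2-D points, and the two labelings are hypothetical and stand in for trajectory features and for the output of a clustering algorithm such as k-means or DBSCAN.

```python
from math import dist
from statistics import mean


def _groups(points, labels):
    """Group points by their cluster label."""
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    return groups


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (needs >= 2 clusters)."""
    groups = _groups(points, labels)
    scores = []
    for point, label in zip(points, labels):
        own = groups[label]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster
        b = min(
            mean(dist(point, q) for q in other)
            for other_label, other in groups.items()
            if other_label != label
        )
        scores.append((b - a) / max(a, b))
    return mean(scores)


def davies_bouldin_index(points, labels):
    """Mean, over clusters, of the worst scatter-to-separation ratio."""
    groups = _groups(points, labels)
    centroids = {
        label: tuple(mean(axis) for axis in zip(*pts))
        for label, pts in groups.items()
    }
    scatter = {
        label: mean(dist(p, centroids[label]) for p in pts)
        for label, pts in groups.items()
    }
    worst = [
        max(
            (scatter[i] + scatter[j]) / dist(centroids[i], centroids[j])
            for j in groups
            if j != i
        )
        for i in groups
    ]
    return mean(worst)


# Hypothetical toy data: two compact, well-separated groups of points.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
good_labels = [0, 0, 0, 0, 1, 1, 1, 1]    # matches the two groups
mixed_labels = [0, 1, 0, 1, 0, 1, 0, 1]   # splits both groups arbitrarily

for name, labels in [("good", good_labels), ("mixed", mixed_labels)]:
    print(name,
          "silhouette:", round(silhouette_score(points, labels), 3),
          "DBI:", round(davies_bouldin_index(points, labels), 3))
```

On this toy data the labeling that matches the two groups scores a higher silhouette and a lower DBI than the arbitrary one, which is the direction of comparison used in the summary. A negative silhouette score, like the DBSCAN values reported above, means that points are on average closer to another cluster than to their own.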