Detecting the risk of bullying victimization among adolescents: A large-scale machine learning approach

Bibliographic Details
Published in: Computers in Human Behavior, Vol. 147, p. 107817
Main Authors: Yan, Wei; Yuan, Yidan; Yang, Menghao; Zhang, Peng; Peng, Kaiping
Format: Journal Article
Language: English
Published: Elsevier Ltd 01.10.2023
ISSN: 0747-5632, 1873-7692
Description
Summary: There is increasing interest in using machine learning methods to identify risk factors for problematic behaviors. The current study tested and compared six machine learning algorithms (Logistic Regression, Naive Bayes, Decision Tree, Random Forest, K-Nearest Neighbors (KNN), and Light Gradient Boosting Machine (LightGBM)) to detect risk factors for both traditional bullying victimization and cyberbullying victimization among Chinese adolescents. The Random Forest and LightGBM algorithms obtained similar accuracy and precision, and outperformed the other four algorithms. We then combined the feature importances of the LightGBM and Random Forest algorithms to evaluate the predictive power of 40 potentially relevant personal, educational, social, and psychological factors in predicting bullying victimization, achieving better accuracy and higher performance. These results showed that the combined model can distinguish high-risk from low-risk adolescents for both types of bullying victimization based on a few easy-to-find variables. By comparing the relative importance of each factor, the current study also found that mental illness, physical illness, and unhealthy living environments had the highest values in predicting bullying victimization. Thus, the recommended model has great application value in preventing bullying victimization among Chinese adolescents.
•A large sample (over 410,000 students) participated.
•A broad set of risk factors (40 variables) from four domains was analyzed.
•The most significant risk factors were identified.
•Both traditional and cyber bullying victimization were included.
•Two algorithms were recommended for predicting bullying victimization.
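The summary does not say how the feature importances of the two recommended algorithms were combined. As a minimal sketch of one plausible approach (an assumption, not the authors' documented method), each model's importances can be rescaled to sum to 1 and then averaged, which puts scores on different scales on a common footing. That matters in practice because LightGBM reports split counts by default, while scikit-learn's Random Forest reports normalized impurity-based scores. The function names, feature names, and every number below are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def normalize(importances: Dict[str, float]) -> Dict[str, float]:
    """Rescale non-negative importance scores so they sum to 1."""
    total = sum(importances.values())
    if total <= 0:
        raise ValueError("importances must have a positive sum")
    return {name: value / total for name, value in importances.items()}


def combine_importances(
    first: Dict[str, float], second: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Average two models' normalized importances and rank features.

    Assumption: equal weighting of the two models. The source does not
    state the weighting it used.
    """
    if set(first) != set(second):
        raise ValueError("both models must score the same features")
    norm_first, norm_second = normalize(first), normalize(second)
    combined = {
        name: (norm_first[name] + norm_second[name]) / 2 for name in norm_first
    }
    # Highest combined importance first.
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores, NOT results from the study.
# A Random Forest typically yields fractions; LightGBM yields split counts.
rf_importance = {
    "mental_illness": 0.40,
    "physical_illness": 0.35,
    "living_environment": 0.25,
}
lgbm_importance = {
    "mental_illness": 120,
    "physical_illness": 50,
    "living_environment": 30,
}

ranking = combine_importances(rf_importance, lgbm_importance)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

With fitted models, the two input dictionaries would typically be built from each estimator's `feature_importances_` attribute paired with the column names. Rank averaging or weighting each model by its validation accuracy are equally reasonable alternatives to the equal-weight mean used here.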
DOI: 10.1016/j.chb.2023.107817