Data-Driven Insights: Boosting Algorithms to Uncover Electricity Theft Patterns in AMI

Bibliographic details
Published in:	IEEE Transactions on Instrumentation and Measurement, Vol. 74, pp. 1-12
Main authors:	Khan, Inam Ullah; Ali, Arshid; Taylor, C. James; Ma, Xiandong
Format:	Journal Article
Language:	English
Published:	New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2025
Subjects:	Accuracy; Advanced metering infrastructure; Advanced metering infrastructure (AMI); Algorithms; Boosting; boosting algorithms; class balancing; Classification algorithms; Computational modeling; Costs; Decision trees; Electricity; Electricity consumption; electricity theft detection (ETD); feature engineering; Feature extraction; Machine learning; Meters; Optimization; Outliers (statistics); Performance measurement; Resampling; smart grid; Smart grids; Smart meters; Supervised learning; Theft
ISSN:	0018-9456, 1557-9662

Description
Summary:	This study introduces a supervised machine learning method for electricity theft detection based on a customized histogram gradient boosting (HGB) algorithm. Preprocessing, comprising imputation, normalization, outlier management, and resampling, prepares the time-series data for analysis. The synthetic minority oversampling technique-edited nearest neighbor (SMOTE-ENN) algorithm corrects class imbalance ahead of the feature optimization stage, in which key features are selected and extracted. The HGB algorithm, enhanced through Bayesian optimization, is central to the training process and yields a model that classifies electricity consumption patterns as genuine or fraudulent. The robustness of the model is evaluated against other recognized boosting methods, such as adaptive boosting (ADB), gradient boosting decision tree (GBDT), and LightGBM, alongside various ensemble and traditional machine learning models. Validated with accuracy, F1-score, and area under the curve (AUC), the proposed model achieves 93% accuracy, 95% F1-score, and 98% AUC, outperforming the comparison group under similar dataset and hyperparameter conditions. These results indicate the model's potential as an accurate tool for combating electricity theft within an advanced metering infrastructure (AMI).
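
The three validation metrics named in the summary can be illustrated with a minimal, standard-library Python sketch. This is not the authors' evaluation code: the function names and the toy labels and scores below are hypothetical, chosen only to show how accuracy, F1-score, and AUC are computed for a binary genuine/fraudulent classifier.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float], positive: int = 1) -> float:
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg).
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not from the paper): 1 = fraudulent, 0 = genuine.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]  # assumed model scores
y_pred = [int(s >= 0.5) for s in scores]            # assumed 0.5 decision threshold

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

On this toy data the three calls return 0.75, 0.75, and 0.9375. Accuracy and F1-score depend on the chosen decision threshold, whereas AUC is computed from the raw scores and is threshold-independent, which is one reason all three are commonly reported together for imbalanced problems such as theft detection.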
DOI:	10.1109/TIM.2025.3557097