MQTTEEB-D: A high-fidelity benchmark for real-time MQTT anomaly detection using machine learning techniques
| Published in: | Ad Hoc Networks, Vol. 181, p. 104062 |
|---|---|
| Main authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier B.V., 1 February 2026 |
| Subjects: | |
| ISSN: | 1570-8705 |
| Online access: | Get full text |
| Summary: | Message Queuing Telemetry Transport (MQTT) is essential for resource-constrained Internet of Things (IoT) environments; however, its widespread adoption has introduced significant security vulnerabilities. Although machine learning (ML) offers a promising solution for anomaly detection, existing models are often hindered by unrealistic data, severe class imbalances, and high computational costs. To address these limitations, we present a comprehensive ML framework for MQTT anomaly detection benchmarked on MQTTEEB-D, a high-fidelity dataset from a physical IoT testbed. Our framework evaluates a diverse suite of algorithms, including tree ensembles and boosting methods, on both original imbalanced and balanced data. We assessed performance using standard metrics, imbalance-stable metrics such as the Matthews Correlation Coefficient (MCC), and a Performance–Efficiency Score (PES) to quantify the trade-off between predictive power and computational cost. Our results establish a new state of the art, with the top models achieving over 98.8% accuracy and F1-score. These models also yielded dramatic efficiency gains, including a 43-fold reduction in training time and a 299-fold speedup in inference latency over previous benchmarks. Critically, we found that a model’s resilience to class imbalance is more vital for real-world deployment than its peak performance on artificially balanced data. Simpler tree-based models remained robust under imbalanced conditions, where more complex algorithms failed. These findings provide a new benchmark and reorient model selection towards efficient, reliable, and deployable IoT security systems. |
|---|---|
| DOI: | 10.1016/j.adhoc.2025.104062 |
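
The summary singles out the Matthews Correlation Coefficient (MCC) as an imbalance-stable metric. As a minimal illustration of why that matters, the sketch below compares accuracy and MCC on a hypothetical 95/5 label split. The toy labels, variable names, and helper functions are assumptions made for this example only; they are not taken from MQTTEEB-D or from the paper's code, and the Performance–Efficiency Score is not shown because the record does not define it.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 = attack, 0 = normal."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(y_true, y_pred):
    """Matthews Correlation Coefficient; defined as 0.0 when a marginal is empty."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical toy traffic: 95 normal messages followed by 5 attack messages.
y_true = [0] * 95 + [1] * 5

# A degenerate detector that labels everything as normal.
always_normal = [0] * 100

# A detector with 2 false alarms that catches 4 of the 5 attacks.
useful_detector = [0] * 93 + [1] * 2 + [1] * 4 + [0] * 1

acc_naive = accuracy(y_true, always_normal)
mcc_naive = mcc(y_true, always_normal)
acc_useful = accuracy(y_true, useful_detector)
mcc_useful = mcc(y_true, useful_detector)

print(f"always-normal: accuracy={acc_naive:.2f}, MCC={mcc_naive:.2f}")
print(f"useful:        accuracy={acc_useful:.2f}, MCC={mcc_useful:.2f}")
```

On this toy split the two detectors differ by only two points of accuracy, while MCC separates them clearly, because the always-normal detector never produces a positive prediction.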