Detecting Integer Overflow Errors in Java Source Code via Machine Learning


Bibliographic details
Published in: Proceedings - International Conference on Tools with Artificial Intelligence, TAI, pp. 724-728
Main authors: Luo, Yu; Xu, Weifeng; Xu, Dianxiang
Format: Conference paper
Language: English
Published: IEEE, 01.11.2021
ISSN: 2375-0197
Description
Summary: Integer overflow is a common cause of software failure and security vulnerability. Existing approaches to detecting integer overflow errors rely on traditional static code analysis and dynamic testing. This paper presents a novel machine learning-based approach that predicts integer overflow errors by treating source code as text. It uses text classifiers to determine whether each method in a given Java program contains an integer overflow error. Because training data is essential, we have constructed a comprehensive dataset that accounts for (a) integer overflow errors of all integer types and operations in Java (i.e., positive samples); (b) various programming techniques for preventing integer overflow errors (i.e., negative samples); and (c) malicious scenarios that may mislead text classifiers (i.e., adversarial samples). We have trained three classifiers, BERT, fastText, and NBSVM, which represent different text embedding techniques. BERT, as a representative deep-learning transformer, achieved the highest performance scores and remained robust even when tested with the adversarial samples.
DOI:10.1109/ICTAI52525.2021.00115
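
To make the summary's distinction between positive and negative samples concrete, the following is a minimal Java sketch of what such methods might look like. The class and method names (`OverflowSamples`, `unsafeSum`, `checkedSum`, `checkedProduct`) are hypothetical and chosen for illustration; they are not taken from the paper's dataset, and the paper may define its samples differently. Only standard Java behavior is assumed: `int` arithmetic wraps silently, and `Math.addExact` throws `ArithmeticException` on overflow.

```java
// Illustrative only: hypothetical methods, not drawn from the paper's dataset.
class OverflowSamples {

    // Positive-style sample: int addition wraps around silently on overflow.
    static int unsafeSum(int a, int b) {
        return a + b;
    }

    // Negative-style sample: Math.addExact throws ArithmeticException
    // instead of returning a wrapped result.
    static int checkedSum(int a, int b) {
        return Math.addExact(a, b);
    }

    // Negative-style sample: widen to long before multiplying,
    // then verify that the result still fits in an int.
    static int checkedProduct(int a, int b) {
        long wide = (long) a * (long) b;
        if (wide > Integer.MAX_VALUE || wide < Integer.MIN_VALUE) {
            throw new ArithmeticException("int overflow in product");
        }
        return (int) wide;
    }

    public static void main(String[] args) {
        // The unchecked sum wraps from Integer.MAX_VALUE to Integer.MIN_VALUE.
        System.out.println(unsafeSum(Integer.MAX_VALUE, 1));
        try {
            checkedSum(Integer.MAX_VALUE, 1);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected: " + e.getMessage());
        }
    }
}
```

A method-level classifier of the kind the summary describes would receive the text of each method and label it; here `unsafeSum` would be expected to fall in the overflow class and the two checked variants in the safe class.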