Unsupervised Machine Learning for Effective Code Smell Detection: A Novel Method.

Bibliographic details
Title: Unsupervised Machine Learning for Effective Code Smell Detection: A Novel Method.
Authors: Gupta, Ruchin; Kumar, Narendra; Kumar, Sunil; Seth, Jitendra Kumar
Source: Journal of Communications Software & Systems; Dec 2024, Vol. 20, Issue 4, pp. 307-316, 10 pp.
Subjects: MACHINE learning, SOURCE code, COMPUTER programming education, TRAINING needs, ENGINEERING
Abstract: The quality of source code is negatively impacted by code smells. Since the term "code smell" originated, numerous attempts have been made to comprehend it by identifying it using various techniques, including metric-based, heuristic-based, optimization-based, and machine learning (ML)-based approaches. Among these, supervised machine learning (SML) has shown effectiveness in detecting code smells. However, SML techniques have significant limitations, including the dependency on expensive and high-quality labeled data, the need for representative training datasets, and the risk of introducing biases in labeled examples that lead to skewed predictions. To overcome these challenges, this study introduces a method that leverages unsupervised machine learning (UnML) along with feature engineering. Unlike SML, UnML does not require labeled data and minimizes potential biases. The proposed method was evaluated using four datasets containing different types of code smells and was compared with a previous study that used SML techniques. The results indicate that the UnML-based method is effective, achieving outcomes closely aligned with those from the SML approach. This method is especially beneficial in situations where labeled data is scarce or unavailable, and it can be used to identify new code smells, generate labeled data for SML, and detect multiple code smells simultaneously within a codebase. [ABSTRACT FROM AUTHOR]
Copyright of Journal of Communications Software & Systems is the property of Croatian Communications & Information Society and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Complementary Index