A systematic review of unsupervised learning techniques for software defect prediction
| Published in: | Information and software technology Vol. 122; p. 106287 |
|---|---|
| Main Authors: | Li, Ning; Shepperd, Martin; Guo, Yuchen |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier B.V, 01.06.2020 |
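| Subjects: | Machine learning; Meta-analysis; Software defect prediction; Systematic review; Unsupervised learning |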
| ISSN: | 0950-5849, 1873-6025 |
| Abstract | Unsupervised machine learners have been increasingly applied to software defect prediction. The approach may be valuable for software practitioners because it reduces the need for labeled training data.
This review investigates the use and performance of unsupervised learning techniques in software defect prediction.
We conducted a systematic literature review that identified 49 studies, published between January 2000 and March 2018, that satisfied our inclusion criteria and together contained 2456 individual experimental results. To compare prediction performance across these studies consistently, we (re-)computed the confusion matrices and used the Matthews Correlation Coefficient (MCC) as our main performance measure.
Our meta-analysis shows that unsupervised models are comparable with supervised models for both within-project and cross-project prediction. Among the 14 families of unsupervised models, Fuzzy C-Means (FCM) and Fuzzy SOMs (FSOMs) perform best. In addition, where we were able to check, we found that almost 11% (262/2456) of published results (contained in 16 papers) were internally inconsistent, and a further 33% (823/2456) provided insufficient detail for us to check.
Although many factors affect the performance of a classifier (e.g., dataset characteristics), broadly speaking, unsupervised classifiers do not seem to perform worse than the supervised classifiers in our review. However, we note a worrying prevalence of (i) demonstrably erroneous experimental results, (ii) undemanding benchmarks, and (iii) incomplete reporting. We therefore encourage researchers to be comprehensive in their reporting. |
|---|---|
| ArticleNumber | 106287 |
| Author | Li, Ning; Shepperd, Martin; Guo, Yuchen |
| Author_xml | – sequence: 1 givenname: Ning orcidid: 0000-0001-7394-0640 surname: Li fullname: Li, Ning organization: School of Computer Science, Northwestern Polytechnical University, Xi’an, 710072, China – sequence: 2 givenname: Martin orcidid: 0000-0003-1874-6145 surname: Shepperd fullname: Shepperd, Martin email: martin.shepperd@brunel.ac.uk organization: Brunel University London, Uxbridge, UB8 3PH, United Kingdom – sequence: 3 givenname: Yuchen orcidid: 0000-0003-2756-9216 surname: Guo fullname: Guo, Yuchen organization: Department of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, 710049, China |
| Cites_doi | 10.1007/s10994-009-5119-5 10.1016/j.patrec.2005.10.010 10.5860/crln.76.3.9277 10.1007/s10515-013-0129-8 10.1109/32.815326 10.1109/TSE.2011.103 10.1109/TSE.2012.70 10.1109/TSE.2014.2322358 10.1016/j.eswa.2008.10.027 10.1109/TSE.2018.2836442 10.1093/bioinformatics/16.5.412 10.1146/annurev.pu.17.050196.000245 10.1038/s41562-016-0021 |
| ContentType | Journal Article |
| Copyright | 2020 Elsevier B.V. |
| DOI | 10.1016/j.infsof.2020.106287 |
| DatabaseName | CrossRef |
| DatabaseTitle | CrossRef |
| Discipline | Business |
| EISSN | 1873-6025 |
| ExternalDocumentID | 10_1016_j_infsof_2020_106287 S0950584920300379 |
| ISICitedReferencesCount | 133 |
| ISSN | 0950-5849 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Keywords | Systematic review; Software defect prediction; Unsupervised learning; Machine learning; Meta-analysis |
| Language | English |
| ORCID | 0000-0003-1874-6145 0000-0001-7394-0640 0000-0003-2756-9216 |
| PublicationCentury | 2000 |
| PublicationDate | June 2020 2020-06-00 |
| PublicationDateYYYYMMDD | 2020-06-01 |
| PublicationDate_xml | – month: 06 year: 2020 text: June 2020 |
| PublicationDecade | 2020 |
| PublicationTitle | Information and software technology |
| PublicationYear | 2020 |
| Publisher | Elsevier B.V |
| Publisher_xml | – name: Elsevier B.V |
| SecondaryResourceType | review_article |
| SourceID | crossref elsevier |
| SourceType | Enrichment Source Index Database Publisher |
| StartPage | 106287 |
| SubjectTerms | Machine learning; Meta-analysis; Software defect prediction; Systematic review; Unsupervised learning |
| Title | A systematic review of unsupervised learning techniques for software defect prediction |
| URI | https://dx.doi.org/10.1016/j.infsof.2020.106287 |
| Volume | 122 |
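The abstract states that the review (re-)computed confusion matrices and used the Matthews Correlation Coefficient (MCC) as its main performance measure. As a minimal sketch of that measure, not the authors' own code, the standard MCC formula can be evaluated directly from the four confusion-matrix counts. The function name `mcc`, the example counts, and the choice to return 0.0 when the denominator is zero are assumptions made here for illustration.

```python
import math


def mcc(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews Correlation Coefficient from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives
    and false negatives for a binary (defective / not defective) classifier.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumption: when any marginal total is zero the coefficient is
    # undefined; this sketch returns 0.0 in that case.
    if denominator == 0:
        return 0.0
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical counts, not taken from any study in the review.
    print(mcc(tp=40, fp=10, tn=130, fn=20))
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction), and it uses all four cells of the confusion matrix, which makes it suitable for comparing results across studies with different class balances.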