A systematic review of unsupervised learning techniques for software defect prediction

Bibliographic Details
Published in:Information and software technology Vol. 122; p. 106287
Main Authors: Li, Ning, Shepperd, Martin, Guo, Yuchen
Format: Journal Article
Language:English
Published: Elsevier B.V 01.06.2020
ISSN:0950-5849, 1873-6025
Abstract Unsupervised machine learners have been increasingly applied to software defect prediction. The approach may be valuable for software practitioners because it reduces the need for labeled training data. This study investigates the use and performance of unsupervised learning techniques in software defect prediction. We conducted a systematic literature review that identified 49 studies, published between January 2000 and March 2018, which satisfied our inclusion criteria and together contain 2456 individual experimental results. To compare prediction performance across these studies in a consistent way, we (re-)computed the confusion matrices and employed the Matthews Correlation Coefficient (MCC) as our main performance measure. Our meta-analysis shows that unsupervised models are comparable with supervised models for both within-project and cross-project prediction. Among the 14 families of unsupervised models, Fuzzy C-Means (FCM) and Fuzzy SOMs (FSOMs) perform best. In addition, where we were able to check, we found that almost 11% (262/2456) of published results (contained in 16 papers) were internally inconsistent, and a further 33% (823/2456) provided insufficient detail for us to check. Although many factors affect the performance of a classifier (e.g., dataset characteristics), broadly speaking, unsupervised classifiers do not seem to perform worse than the supervised classifiers in our review. However, we note a worrying prevalence of (i) demonstrably erroneous experimental results, (ii) undemanding benchmarks and (iii) incomplete reporting. We therefore encourage researchers to be comprehensive in their reporting.
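The abstract names the Matthews Correlation Coefficient, computed from confusion matrices, as the main performance measure. As a minimal sketch of that calculation, the Python below applies the standard MCC formula to four confusion-matrix counts. It is not the authors' code: the function name `mcc`, the example counts, and the choice to return 0.0 when the denominator is zero are assumptions made for illustration. The last two lines simply recompute the proportions quoted in the abstract.

```python
import math


def mcc(tp: int, fp: int, fn: int, tn: int) -> float:
    """Matthews Correlation Coefficient from confusion-matrix counts.

    Standard definition:
        (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: a degenerate matrix (an empty row or column)
        # is scored as 0.0, i.e. no better than chance.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts, chosen only to illustrate the call.
example = mcc(tp=30, fp=10, fn=20, tn=140)

# Proportions quoted in the abstract, recomputed from the stated counts.
inconsistent_share = 262 / 2456  # results reported as internally inconsistent
unchecked_share = 823 / 2456     # results with insufficient detail to check
```

MCC ranges from -1 to 1 and uses all four cells of the confusion matrix, which is one reason it suits comparisons across studies whose datasets differ in defect rate.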
ArticleNumber 106287
Author Li, Ning
Shepperd, Martin
Guo, Yuchen
Author_xml – sequence: 1
  givenname: Ning
  orcidid: 0000-0001-7394-0640
  surname: Li
  fullname: Li, Ning
  organization: School of Computer Science, Northwestern Polytechnical University, Xi’an, 710072, China
– sequence: 2
  givenname: Martin
  orcidid: 0000-0003-1874-6145
  surname: Shepperd
  fullname: Shepperd, Martin
  email: martin.shepperd@brunel.ac.uk
  organization: Brunel University London, Uxbridge, UB8 3PH, United Kingdom
– sequence: 3
  givenname: Yuchen
  orcidid: 0000-0003-2756-9216
  surname: Guo
  fullname: Guo, Yuchen
  organization: Department of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, 710049, China
ContentType Journal Article
Copyright 2020 Elsevier B.V.
Copyright_xml – notice: 2020 Elsevier B.V.
DOI 10.1016/j.infsof.2020.106287
Discipline Business
EISSN 1873-6025
ISICitedReferencesCount 133
ISSN 0950-5849
IsPeerReviewed true
IsScholarly true
Keywords Systematic review
Software defect prediction
Unsupervised learning
Machine learning
Meta-analysis
Language English
ORCID 0000-0003-1874-6145
0000-0001-7394-0640
0000-0003-2756-9216
PublicationCentury 2000
PublicationDate June 2020
PublicationTitle Information and software technology
PublicationYear 2020
Publisher Elsevier B.V
Publisher_xml – name: Elsevier B.V