Boosting Just-in-Time Defect Prediction with Specific Features of C/C++ Programming Languages in Code Changes

Bibliographic Details
Published in: Proceedings (IEEE/ACM International Conference on Mining Software Repositories. Online), pp. 472-484
Main authors: Ni, Chao; Xu, Xiaodan; Yang, Kaiwen; Lo, David
Format: Conference proceeding
Language: English
Published: IEEE, 1 May 2023
ISSN: 2574-3864
Online access: Full text
Abstract: Just-in-time (JIT) defect prediction classifies code changes as defect-inducing or clean, and many approaches have been proposed based on several programming-language-independent change-level features. However, programming languages differ in their characteristics, and these differences may affect the quality of software projects. Meanwhile, the C programming language, one of the most popular languages, is widely used in IT companies to develop foundational applications (e.g., operating systems, databases, and compilers), yet the effect of its change-level characteristics on project quality has not been fully investigated. Additionally, whether open-source C projects share the same important features as commercial projects has received little study. To address these limitations, in this paper we investigate the impact of programming-language-specific features on the state-of-the-art JIT defect identification approach in an industrial setting. We collect and label the 10 most-starred C projects on GitHub (329,021 commits) and 8 C projects at an ICT company (12,983 commits). We also propose nine C-specific change-level features and investigate both the open-source C projects on GitHub and the C projects at the ICT company from three aspects: (1) the effectiveness of C-specific change-level features in improving the identification of defect-inducing changes, (2) the difference in feature importance for identifying defect-inducing changes between open-source C projects and commercial C projects, and (3) the effectiveness of combining language-independent features and C-specific features in a real-life setting at the ICT company.
Authors:
1. Chao Ni (chaoni@zju.edu.cn), Zhejiang University, China
2. Xiaodan Xu (xiaodanxu@zju.edu.cn), Zhejiang University, China
3. Kaiwen Yang (kwyang@zju.edu.cn), Zhejiang University, China
4. David Lo (davidlo@smu.edu.sg), Singapore Management University, Singapore
CODEN IEEPAD
DOI 10.1109/MSR59073.2023.00072
EISBN 9798350311846
EISSN 2574-3864
Funding: National Natural Science Foundation of China (funder ID: 10.13039/501100001809)
PublicationTitleAbbrev MSR
Subject terms: C/C++ programming language; Just-in-Time; Supervised Methods
URI https://ieeexplore.ieee.org/document/10174108