Boosting Just-in-Time Defect Prediction with Specific Features of C/C++ Programming Languages in Code Changes
| Published in: | Proceedings (IEEE/ACM International Conference on Mining Software Repositories. Online) pp. 472 - 484 |
|---|---|
| Main Authors: | Ni, Chao; Xu, Xiaodan; Yang, Kaiwen; Lo, David |
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 01.05.2023 |
| Subjects: | C/C++ programming language; Just-in-Time; Supervised Methods |
| ISSN: | 2574-3864 |
| Abstract | Just-in-time (JIT) defect prediction can identify changes as defect-inducing ones or clean ones and many approaches are proposed based on several programming language-independent change-level features. However, different programming languages have different characteristics and consequently may affect the quality of software projects. Meanwhile, the C programming language, one of the most popular ones, is widely used to develop foundation applications (i.e., operating system, database, compiler, etc.) in IT companies and its change-level characteristics on project quality have not been fully investigated. Additionally, whether open-source C projects have similar important features to commercial projects has not been studied much. To address the aforementioned limitations, in this paper, we investigate the impacts of programming language-specific features on the state-of-the-art JIT defect identification approach in an industrial setting. We collect and label the top-10 most starred C projects (i.e., 329,021 commits) on GitHub and 8 C projects in an ICT company (i.e., 12,983 commits). We also propose nine C-specific change-level features and focus our investigations on both open-source C projects on GitHub and C projects at the ICT company considering three aspects: (1) The effectiveness of C-specific change-level features in improving the performance of identification of defect-inducing changes, (2) The importance of features in the identification of defect-inducing changes between open-source C projects and commercial C projects, and (3) The effectiveness of combining language-independent features and C-specific features in a real-life setting at the ICT company. |
|---|---|
| Author | Ni, Chao; Lo, David; Yang, Kaiwen; Xu, Xiaodan |
| Author_xml | – sequence: 1 givenname: Chao surname: Ni fullname: Ni, Chao email: chaoni@zju.edu.cn organization: Zhejiang University,China – sequence: 2 givenname: Xiaodan surname: Xu fullname: Xu, Xiaodan email: xiaodanxu@zju.edu.cn organization: Zhejiang University,China – sequence: 3 givenname: Kaiwen surname: Yang fullname: Yang, Kaiwen email: kwyang@zju.edu.cn organization: Zhejiang University,China – sequence: 4 givenname: David surname: Lo fullname: Lo, David email: davidlo@smu.edu.sg organization: Singapore Management University,Singapore |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DBID | 6IE 6IL CBEJK RIE RIL |
| DOI | 10.1109/MSR59073.2023.00072 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Xplore POP ALL; IEEE Xplore All Conference Proceedings; IEEE Electronic Library (IEL); IEEE Proceedings Order Plans (POP All) 1998-Present |
| DatabaseTitleList | |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| EISBN | 9798350311846 |
| EISSN | 2574-3864 |
| EndPage | 484 |
| ExternalDocumentID | 10174108 |
| Genre | orig-research |
| GrantInformation_xml | – fundername: National Natural Science Foundation of China funderid: 10.13039/501100001809 |
| GroupedDBID | 6IE 6IL 6IN AAWTH ABLEC ADZIZ ALMA_UNASSIGNED_HOLDINGS BEFXN BFFAM BGNUA BKEBE BPEOZ CBEJK CHZPO IEGSK OCL RIE RIL |
| ID | FETCH-LOGICAL-i119t-5fe7a604b8a3936a84f266b2e591fa944ecc3dd9a0ecd2c82ca48a9aafc37acf3 |
| IEDL.DBID | RIE |
| IngestDate | Wed Aug 27 02:22:07 EDT 2025 |
| IsPeerReviewed | false |
| IsScholarly | false |
| Language | English |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-i119t-5fe7a604b8a3936a84f266b2e591fa944ecc3dd9a0ecd2c82ca48a9aafc37acf3 |
| PageCount | 13 |
| ParticipantIDs | ieee_primary_10174108 |
| PublicationCentury | 2000 |
| PublicationDate | 2023-May |
| PublicationDateYYYYMMDD | 2023-05-01 |
| PublicationDate_xml | – month: 05 year: 2023 text: 2023-May |
| PublicationDecade | 2020 |
| PublicationTitle | Proceedings (IEEE/ACM International Conference on Mining Software Repositories. Online) |
| PublicationTitleAbbrev | MSR |
| PublicationYear | 2023 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| SSID | ssj0003211714 |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 472 |
| SubjectTerms | C/C++ programming language; Just-in-Time; Supervised Methods |
| Title | Boosting Just-in-Time Defect Prediction with Specific Features of C/C++ Programming Languages in Code Changes |
| URI | https://ieeexplore.ieee.org/document/10174108 |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
| linkProvider | IEEE |