The Python Software Quality Dataset
With Python's ascension as a dominant programming language, particularly in the fields of artificial intelligence and data science, the need for comprehensive datasets focusing on software quality within Python projects has become increasingly noticeable. This study introduces a detailed datas...
| Published in: | Proceedings (EUROMICRO Conference on Software Engineering and Advanced Applications. Online), pp. 395-398 |
|---|---|
| Main Authors: | Moldovan, Vasilica-Andreea; Berciu, Liviu-Marian; Patcas, Rares-Danut |
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 28.08.2024 |
| Subjects: | |
| ISSN: | 2376-9521 |
| Abstract | With Python's ascension as a dominant programming language, particularly in the fields of artificial intelligence and data science, the need for comprehensive datasets focusing on software quality within Python projects has become increasingly noticeable. This study introduces a detailed dataset designed to address this gap, enriching academic resources in software engineering. The dataset encompasses a wide array of software quality metrics on up to 80 projects, including 51,765,853 SonarQube issues, 268,506 SonarQube code quality metrics, 11,915 software refactoring records, and 155,127 pairs of bug-inducing and bug-fixing commits, along with 863,931 GitHub issue tracker entries. This extensive collection serves as a versatile tool for various research activities, enabling analysis of the relationships between technical debt and software refactorings, correlations between refactoring processes and bug resolution, and their overall impact on software maintainability and reliability. By offering a comprehensive and multifaceted dataset, this study significantly contributes to understanding and improving software quality in Python projects. |
|---|---|
| Author | Moldovan, Vasilica-Andreea; Berciu, Liviu-Marian; Patcas, Rares-Danut |
| Author_xml | – sequence: 1; givenname: Vasilica-Andreea; surname: Moldovan; fullname: Moldovan, Vasilica-Andreea; email: vasilica.moldovan@ubbcluj.ro; organization: Babes-Bolyai University, Faculty of Mathematics and Informatics, Cluj-Napoca, Romania – sequence: 2; givenname: Liviu-Marian; surname: Berciu; fullname: Berciu, Liviu-Marian; email: liviu.berciu@ubbcluj.ro; organization: Babes-Bolyai University, Faculty of Mathematics and Informatics, Cluj-Napoca, Romania – sequence: 3; givenname: Rares-Danut; surname: Patcas; fullname: Patcas, Rares-Danut; email: rares.patcas@ubbcluj.ro; organization: Babes-Bolyai University, Faculty of Mathematics and Informatics, Cluj-Napoca, Romania |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DBID | 6IE 6IL CBEJK RIE RIL |
| DOI | 10.1109/SEAA64295.2024.00066 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume; IEEE Xplore All Conference Proceedings; IEEE Electronic Library (IEL); IEEE Proceedings Order Plans (POP All) 1998-Present |
| DatabaseTitleList | |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Computer Science |
| EISBN | 9798350380262 |
| EISSN | 2376-9521 |
| EndPage | 398 |
| ExternalDocumentID | 10803427 |
| Genre | orig-research |
| GroupedDBID | 6IE 6IL 6IN ABLEC ADZIZ ALMA_UNASSIGNED_HOLDINGS BEFXN BFFAM BGNUA BKEBE BPEOZ CBEJK CHZPO IEGSK OCL RIE RIL |
| ID | FETCH-LOGICAL-i106t-dfedb1ce61b299d022a46290c70b3f61d56228f65e26e69aedd18a03686ed3563 |
| IEDL.DBID | RIE |
| ISICitedReferencesCount | 0 |
| IngestDate | Wed Aug 27 02:24:08 EDT 2025 |
| IsPeerReviewed | false |
| IsScholarly | false |
| Language | English |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-i106t-dfedb1ce61b299d022a46290c70b3f61d56228f65e26e69aedd18a03686ed3563 |
| PageCount | 4 |
| ParticipantIDs | ieee_primary_10803427 |
| PublicationCentury | 2000 |
| PublicationDate | 2024-Aug.-28 |
| PublicationDateYYYYMMDD | 2024-08-28 |
| PublicationDate_xml | – month: 08 year: 2024 text: 2024-Aug.-28 day: 28 |
| PublicationDecade | 2020 |
| PublicationTitle | Proceedings (EUROMICRO Conference on Software Engineering and Advanced Applications. Online) |
| PublicationTitleAbbrev | SEAA |
| PublicationYear | 2024 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| SSID | ssj0003320738 |
| Score | 1.8812252 |
| Snippet | With Python's ascension as a dominant programming language, particularly in the fields of artificial intelligence and data science, the need for comprehensive... |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 395 |
| SubjectTerms | Computer bugs; Correlation; Data science; Focusing; Github Mining; Measurement; Python; Python Dataset; Refactoring; Software development management; Software engineering; Software Metrics; Software quality; Software reliability; SonarQube Mining; SZZ |
| Title | The Python Software Quality Dataset |
| URI | https://ieeexplore.ieee.org/document/10803427 |
| linkProvider | IEEE |