PYEVOLVE: Automating Frequent Code Changes in Python ML Systems
| Published in: | Proceedings / International Conference on Software Engineering, pp. 995-1007 |
|---|---|
| Main authors: | Dilhara, Malinda; Dig, Danny; Ketkar, Ameya |
| Format: | Conference paper |
| Language: | English |
| Publication details: | IEEE, 2023-05-01 |
| ISSN: | 1558-1225 |
| Abstract | Because of the naturalness of software and the rapid evolution of Machine Learning (ML) techniques, frequently repeated code change patterns (CPATs) occur often. They range from simple API migrations to changes involving several complex control structures such as for loops. While manually performing CPATs is tedious, the current state-of-the-art techniques for inferring transformation rules are not advanced enough to handle unseen variants of complex CPATs, resulting in a low recall rate. In this paper we present a novel, automated workflow that mines CPATs, infers the transformation rules, and then transplants them automatically to new target sites. We designed, implemented, evaluated and released this in a tool, PYEVOLVE. At its core is a novel data-flow, control-flow aware transformation rule inference engine. Our technique allows us to advance the state-of-the-art for transformation-by-example tools; without it, 70% of the code changes that PYEVOLVE transforms would not be possible to automate. Our thorough empirical evaluation of over 40,000 transformations shows 97% precision and 94% recall. By accepting 90% of CPATs generated by PYEVOLVE in famous open-source projects, developers confirmed its changes are useful. |
|---|---|
| Author | Dilhara, Malinda Dig, Danny Ketkar, Ameya |
| Author_xml | 1. Dilhara, Malinda (malinda.malwala@colorado.edu), University of Colorado Boulder, USA; 2. Dig, Danny (danny.dig@colorado.edu), University of Colorado Boulder, JetBrains Research, USA; 3. Ketkar, Ameya (ketkara@uber.com), Uber Technologies Inc., USA |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DOI | 10.1109/ICSE48619.2023.00091 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Proceedings Order Plan (POP) 1998-present by volume; IEEE Xplore All Conference Proceedings; IEEE Xplore Open Access Journals; IEEE Electronic Library (IEL); IEEE Proceedings Order Plans (POP) 1998-present |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Computer Science |
| EISBN | 9781665457019, 1665457015 |
| EISSN | 1558-1225 |
| EndPage | 1007 |
| ExternalDocumentID | 10172702 |
| Genre | orig-research |
| GrantInformation_xml | Funder: NSF; grant IDs: CNS-1941898, CNS-2213763; funder ID: 10.13039/100000001 |
| ISICitedReferencesCount | 7 |
| IngestDate | Wed Aug 27 02:09:24 EDT 2025 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| LinkModel | DirectLink |
| OpenAccessLink | https://ieeexplore.ieee.org/document/10172702 |
| PageCount | 13 |
| ParticipantIDs | ieee_primary_10172702 |
| PublicationCentury | 2000 |
| PublicationDate | 2023-May |
| PublicationDateYYYYMMDD | 2023-05-01 |
| PublicationDate_xml | – month: 05 year: 2023 text: 2023-May |
| PublicationDecade | 2020 |
| PublicationTitle | Proceedings / International Conference on Software Engineering |
| PublicationTitleAbbrev | ICSE |
| PublicationYear | 2023 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 995 |
| SubjectTerms | Codes; Engines; Machine learning; Organ transplantation; Program synthesis; Program transformation; Programming by example; Python; Repetitive code changes; Software; Software engineering; Transformation by Example; Transforms |
| Title | PYEVOLVE: Automating Frequent Code Changes in Python ML Systems |
| URI | https://ieeexplore.ieee.org/document/10172702 |
| hasFullText | 1 |
| inHoldings | 1 |
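
The abstract describes CPATs that range from simple API migrations to changes involving control structures such as for loops, and a workflow that transplants inferred rules to new target sites. As a rough illustration only, the sketch below applies one hand-written rule with Python's standard `ast` module: an accumulate-in-a-loop sum is rewritten to a call to the built-in `sum()`. The rule, the names `rewrite_block` and `transplant`, and the matching logic are assumptions made for this example; they are not PYEVOLVE's inference engine or API, and the rule here is written by hand rather than inferred from mined changes.

```python
import ast


def _is_loop_sum(init, loop):
    """Match `acc = 0` followed by `for v in xs: acc += v` (hypothetical pattern)."""
    if not (isinstance(init, ast.Assign) and len(init.targets) == 1
            and isinstance(init.targets[0], ast.Name)
            and isinstance(init.value, ast.Constant)
            and type(init.value.value) in (int, float)
            and init.value.value == 0):
        return False
    if not (isinstance(loop, ast.For) and isinstance(loop.target, ast.Name)
            and not loop.orelse and len(loop.body) == 1):
        return False
    step = loop.body[0]
    acc, var = init.targets[0].id, loop.target.id
    return (isinstance(step, ast.AugAssign) and isinstance(step.op, ast.Add)
            and isinstance(step.target, ast.Name) and step.target.id == acc
            and isinstance(step.value, ast.Name) and step.value.id == var
            and acc != var)


def rewrite_block(stmts):
    """Replace each matched (init, loop) pair in a statement list with `acc = sum(xs)`."""
    out, i = [], 0
    while i < len(stmts):
        if i + 1 < len(stmts) and _is_loop_sum(stmts[i], stmts[i + 1]):
            init, loop = stmts[i], stmts[i + 1]
            call = ast.Call(func=ast.Name(id="sum", ctx=ast.Load()),
                            args=[loop.iter], keywords=[])
            new = ast.Assign(targets=[ast.Name(id=init.targets[0].id, ctx=ast.Store())],
                             value=call)
            out.append(ast.copy_location(new, init))
            i += 2
        else:
            out.append(stmts[i])
            i += 1
    return out


def transplant(source):
    """Apply the rule to every statement block in `source` (requires Python 3.9+)."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        body = getattr(node, "body", None)
        if isinstance(body, list):  # skips Lambda/IfExp, whose body is an expression
            node.body = rewrite_block(body)
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


old_src = """
def mean(xs):
    total = 0
    for x in xs:
        total += x
    return total / len(xs)
"""
new_src = transplant(old_src)
print(new_src)
```

This purely syntactic match ignores data flow: for example, the original loop leaves its loop variable bound after the loop, while the rewritten code does not. Checking that such differences are safe at a given target site is the kind of concern that the data-flow and control-flow awareness mentioned in the abstract is meant to address.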