Knowledge-Based Environment Dependency Inference for Python Programs


Bibliographic Details
Published in: 2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE), pp. 1245-1256
Main authors: Ye, Hongjie; Chen, Wei; Dou, Wensheng; Wu, Guoquan; Wei, Jun
Format: Conference proceeding
Language: English
Published: ACM, 2022-05-01
ISSN: 1558-1225
Abstract: Besides third-party packages, the Python interpreter and system libraries are also critical dependencies of a Python program. In our empirical study, 34% of programs are compatible only with specific Python interpreter versions, and 24% of programs require specific system libraries. However, existing techniques mainly focus on inferring third-party package dependencies. As a result, they can miss other necessary dependencies and violate version constraints, leading to program build failures and runtime errors. This paper proposes a knowledge-based technique named PyEGo, which automatically infers compatible versions of third-party packages, the Python interpreter, and system libraries for Python programs. We first construct the dependency knowledge graph PyKG, which captures the relations and constraints among third-party packages, the Python interpreter, and system libraries. Then, by querying PyKG with extracted program features, PyEGo constructs a program-related sub-graph containing dependency candidates of the three types. Finally, it outputs the latest compatible dependency versions by solving the constraints in the sub-graph. We evaluate PyEGo on 2,891 single-file Python gists, 100 open-source Python projects, and 4,836 Jupyter notebooks. The experimental results show that PyEGo achieves better accuracy, 0.2x to 3.5x higher than the state-of-the-art approaches.
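The pipeline described in the abstract (extract program features, look them up in a knowledge base, then pick the latest mutually compatible versions) can be illustrated with a minimal sketch. This is not PyEGo's implementation: the dictionary `TOY_KG` stands in for PyKG, the package names `alpha-lib` and `beta-lib`, the system libraries `libfoo` and `libbar`, and all version data are invented for illustration, and a simple newest-first search replaces whatever constraint solving the paper actually uses.

```python
import ast

# Hypothetical stand-in for a dependency knowledge graph. Every name and
# version below is invented; none of it describes a real package.
TOY_KG = {
    # importable module name -> package that provides it
    "modules": {"alpha": "alpha-lib", "beta": "beta-lib"},
    # package -> version -> supported Python versions and required system libs
    "packages": {
        "alpha-lib": {
            "1.0.0": {"python": {"2.7", "3.6"}, "syslibs": set()},
            "2.0.0": {"python": {"3.8", "3.9"}, "syslibs": set()},
        },
        "beta-lib": {
            "0.9.0": {"python": {"3.6", "3.7"}, "syslibs": {"libfoo"}},
            "1.1.0": {"python": {"3.7", "3.8"}, "syslibs": {"libfoo", "libbar"}},
        },
    },
}


def version_key(version):
    """Turn '3.10' into (3, 10) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def extract_imports(source):
    """Collect top-level module names from absolute imports in the source."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules.add(node.module.split(".")[0])
    return modules


def infer_environment(source, kg=TOY_KG):
    """Return the newest Python version for which every needed package has a
    compatible release, plus those releases and their system libraries.
    Returns None if no known package is imported or no version fits."""
    packages = {kg["modules"][m] for m in extract_imports(source) if m in kg["modules"]}
    candidates = set()
    for pkg in packages:
        for info in kg["packages"][pkg].values():
            candidates |= info["python"]
    # Try interpreter versions newest first; accept the first that satisfies all.
    for py in sorted(candidates, key=version_key, reverse=True):
        chosen = {}
        for pkg in sorted(packages):
            fits = [v for v, info in kg["packages"][pkg].items() if py in info["python"]]
            if not fits:
                break  # this interpreter version rules out a required package
            chosen[pkg] = max(fits, key=version_key)
        else:
            syslibs = set().union(
                *(kg["packages"][p][v]["syslibs"] for p, v in chosen.items())
            )
            return {"python": py, "packages": chosen, "syslibs": sorted(syslibs)}
    return None


if __name__ == "__main__":
    program = "import alpha\nfrom beta.sub import thing\nimport os\n"
    print(infer_environment(program))
```

In this toy data the newest interpreter version supported by `alpha-lib` is ruled out because no `beta-lib` release supports it, so the search falls back to the next candidate. That is the kind of cross-dependency version constraint the abstract says package-only inference can violate.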
Authors:
1. Ye, Hongjie (yehongjie19@otcaix.iscas.ac.cn)
2. Chen, Wei (wchen@otcaix.iscas.ac.cn)
3. Dou, Wensheng (wsdou@otcaix.iscas.ac.cn)
4. Wu, Guoquan (gqwu@otcaix.iscas.ac.cn)
5. Wei, Jun (wj@otcaix.iscas.ac.cn)
Affiliation (all authors): Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences; State Key Lab of Computer Sciences, Beijing, China
DOI 10.1145/3510003.3510127
Discipline Computer Science
EISBN 9781450392211
1450392210
EISSN 1558-1225
Funding:
- Youth Innovation Promotion Association at Chinese Academy of Sciences, grant 2018142 (funder ID 10.13039/501100004739)
- Frontier Science Project of Chinese Academy of Sciences, grant QYZDJ-SSW-JSC036 (funder ID 10.13039/501100018527)
- National Key R&D Program of China, grant 2017YFA0700603 (funder ID 10.13039/501100012166)
- National Natural Science Foundation of China, grants 61732019, U20A6003, 62072444 (funder ID 10.13039/501100001809)
Subject terms: Data mining; environment dependency inference; Feature extraction; Knowledge based systems; knowledge graph; Libraries; Open source software; Python; Runtime; version constraint
URI https://ieeexplore.ieee.org/document/9793962