Learning from, Understanding, and Supporting DevOps Artifacts for Docker
With the growing use of DevOps tools and frameworks, there is an increased need for tools and techniques that support more than code. The current state-of-the-art in static developer assistance for tools like Docker is limited to shallow syntactic validation. We identify three core challenges in the...
| Published in: | 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE), pp. 38-49 |
|---|---|
| Main authors: | Henkel, Jordan; Bird, Christian; Lahiri, Shuvendu K.; Reps, Thomas |
| Format: | Conference paper |
| Language: | English |
| Published: | ACM, 27 June 2020 |
| Subjects: | DevOps; Docker; Mining; Static Checking |
| ISSN: | 1558-1225 |
| Abstract | With the growing use of DevOps tools and frameworks, there is an increased need for tools and techniques that support more than code. The current state-of-the-art in static developer assistance for tools like Docker is limited to shallow syntactic validation. We identify three core challenges in the realm of learning from, understanding, and supporting developers writing DevOps artifacts: (i) nested languages in DevOps artifacts, (ii) rule mining, and (iii) the lack of semantic rule-based analysis. To address these challenges we introduce a toolset, binnacle, that enabled us to ingest 900,000 GitHub repositories. Focusing on Docker, we extracted approximately 178,000 unique Dockerfiles, and also identified a Gold Set of Dockerfiles written by Docker experts. We addressed challenge (i) by reducing the number of effectively uninterpretable nodes in our ASTs by over 80% via a technique we call phased parsing. To address challenge (ii), we introduced a novel rule-mining technique capable of recovering two-thirds of the rules in a benchmark we curated. Through this automated mining, we were able to recover 16 new rules that were not found during manual rule collection. To address challenge (iii), we manually collected a set of rules for Dockerfiles from commits to the files in the Gold Set. These rules encapsulate best practices, avoid docker build failures, and improve image size and build latency. We created an analyzer that used these rules, and found that, on average, Dockerfiles on GitHub violated the rules five times more frequently than the Dockerfiles in our Gold Set. We also found that industrial Dockerfiles fared no better than those sourced from GitHub. The learned rules and analyzer in binnacle can be used to aid developers in the IDE when creating Dockerfiles, and in a post-hoc fashion to identify issues in, and to improve, existing Dockerfiles. |
|---|---|
| Author | Henkel, Jordan; Bird, Christian; Lahiri, Shuvendu K.; Reps, Thomas |
| Author_xml | 1. Henkel, Jordan (jjhenkel@cs.wisc.edu), University of Wisconsin-Madison, USA; 2. Bird, Christian (Christian.Bird@microsoft.com), Microsoft Research, USA; 3. Lahiri, Shuvendu K. (Shuvendu.Lahiri@microsoft.com), Microsoft Research, USA; 4. Reps, Thomas (reps@cs.wisc.edu), University of Wisconsin-Madison, USA |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DOI | 10.1145/3377811.3380406 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Proceedings Order Plan (POP) 1998-present by volume; IEEE Xplore All Conference Proceedings; IEEE Electronic Library (IEL); IEEE Proceedings Order Plans (POP) 1998-present |
| Database_xml | 1. IEEE Xplore Digital Library (IEL); dbid: RIE; url: https://ieeexplore.ieee.org/; source type: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Computer Science |
| EISBN | 1450371213; 9781450371216 |
| EISSN | 1558-1225 |
| EndPage | 49 |
| ExternalDocumentID | 9284129 |
| Genre | orig-research |
| GrantInformation_xml | Funder: ONR; grant IDs: N00014-17-1-2889, N00014-19-1-2318; funder ID: 10.13039/100000006 |
| ISICitedReferencesCount | 40 |
| IsDoiOpenAccess | false |
| IsOpenAccess | true |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| LinkModel | DirectLink |
| OpenAccessLink | https://dl.acm.org/doi/pdf/10.1145/3377811.3380406 |
| PageCount | 12 |
| ParticipantIDs | ieee_primary_9284129 |
| PublicationCentury | 2000 |
| PublicationDate | 2020-06-27 |
| PublicationDateYYYYMMDD | 2020-06-27 |
| PublicationDate_xml | – month: 06 year: 2020 text: 2020-06-27 day: 27 |
| PublicationDecade | 2020 |
| PublicationTitle | 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE) |
| PublicationTitleAbbrev | ICSE |
| PublicationYear | 2020 |
| Publisher | ACM |
| Publisher_xml | – name: ACM |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 38 |
| SubjectTerms | DevOps; Docker; Mining; Static Checking |
| Title | Learning from, Understanding, and Supporting DevOps Artifacts for Docker |
| URI | https://ieeexplore.ieee.org/document/9284129 |
| hasFullText | 1 |
| inHoldings | 1 |
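
The abstract describes two ideas that a short example may make more concrete: phased parsing (first parse the Dockerfile itself, then parse the shell commands nested inside it) and rule-based checking. The Python sketch below is a minimal illustration under stated assumptions, not binnacle's implementation. The function names, the rule names, and the two rules themselves are hypothetical and chosen for illustration; the sketch also assumes that shell operators such as `&&` are separated by whitespace and that flags are written individually (for example `-y`, not `-qy`).

```python
import shlex

# Hypothetical rules: (rule name, command prefix that triggers the rule,
# set of flags of which at least one must be present).
RULES = [
    ("apt-get-install-use-y", ["apt-get", "install"],
     {"-y", "--yes", "--assume-yes"}),
    ("apt-get-install-no-recommends", ["apt-get", "install"],
     {"--no-install-recommends"}),
]

SHELL_OPERATORS = {"&&", "||", ";", "|"}


def parse_dockerfile(text):
    """Phase 1: split a Dockerfile into (INSTRUCTION, raw argument) pairs."""
    instructions = []
    pending = ""  # accumulates lines joined by a trailing backslash
    for line in text.splitlines():
        stripped = line.strip()
        if not pending and (not stripped or stripped.startswith("#")):
            continue  # skip blank lines and comments between instructions
        if stripped.endswith("\\"):
            pending += stripped[:-1] + " "
            continue
        full = pending + stripped
        pending = ""
        keyword, _, argument = full.partition(" ")
        instructions.append((keyword.upper(), argument.strip()))
    return instructions


def parse_shell(argument):
    """Phase 2: split a RUN argument into commands (lists of tokens)."""
    commands, current = [], []
    for token in shlex.split(argument):
        if token in SHELL_OPERATORS:
            if current:
                commands.append(current)
            current = []
        else:
            current.append(token)
    if current:
        commands.append(current)
    return commands


def check(text):
    """Return (rule name, offending command) pairs for a Dockerfile."""
    violations = []
    for keyword, argument in parse_dockerfile(text):
        if keyword != "RUN":
            continue
        for command in parse_shell(argument):
            words = [t for t in command if not t.startswith("-")]
            flags = {t for t in command if t.startswith("-")}
            for name, trigger, required in RULES:
                if words[:len(trigger)] == trigger and not (flags & required):
                    violations.append((name, " ".join(command)))
    return violations
```

Calling `check` on a Dockerfile whose `RUN` line contains `apt-get install curl` would report both hypothetical rules as violated, while the same line with `-y --no-install-recommends` would pass. Without the second phase the `RUN` argument would remain an opaque string, which is the kind of effectively uninterpretable node the abstract refers to.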