Cats Are Not Fish: Deep Learning Testing Calls for Out-Of-Distribution Awareness
As Deep Learning (DL) is continuously adopted in many industrial applications, its quality and reliability start to raise concerns. Similar to the traditional software development process, testing the DL software to uncover its defects at an early stage is an effective way to reduce risks after depl...
| Published in: | 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 1041-1052 |
|---|---|
| Main authors: | Berend, David; Xie, Xiaofei; Ma, Lei; Zhou, Lingjun; Liu, Yang; Xu, Chi; Zhao, Jianjun |
| Format: | Conference paper |
| Language: | English |
| Published: | ACM, 1 September 2020 |
| Subjects: | |
| ISSN: | 2643-1572 |
| Online access: | Get full text |
| Abstract | As Deep Learning (DL) is continuously adopted in many industrial applications, its quality and reliability start to raise concerns. Similar to the traditional software development process, testing the DL software to uncover its defects at an early stage is an effective way to reduce risks after deployment. According to the fundamental assumption of deep learning, the DL software does not provide statistical guarantee and has limited capability in handling data that falls outside of its learned distribution, i.e., out-of-distribution (OOD) data. Although recent progress has been made in designing novel testing techniques for DL software, which can detect thousands of errors, the current state-of-the-art DL testing techniques usually do not take the distribution of generated test data into consideration. It is therefore hard to judge whether the "identified errors" are indeed meaningful errors to the DL application (i.e., due to quality issues of the model) or outliers that cannot be handled by the current model (i.e., due to the lack of training data). To fill this gap, we take the first step and conduct a large scale empirical study, with a total of 451 experiment configurations, 42 deep neural networks (DNNs) and 1.2 million test data instances, to investigate and characterize the impact of OOD-awareness on DL testing. We further analyze the consequences when DL systems go into production by evaluating the effectiveness of adversarial retraining with distribution-aware errors. The results confirm that introducing data distribution awareness in both testing and enhancement phases outperforms distribution unaware retraining by up to 21.5%. |
|---|---|
| Author | Berend, David; Xu, Chi; Zhou, Lingjun; Ma, Lei; Liu, Yang; Xie, Xiaofei; Zhao, Jianjun |
| Author_xml | 1. Berend, David (Nanyang Technological University, Singapore); 2. Xie, Xiaofei (Kyushu University, Japan; email: xfxie@ntu.edu.sg); 3. Ma, Lei (Tianjin University, China); 4. Zhou, Lingjun (Nanyang Technological University, Zhejiang Sci-Tech University, China); 5. Liu, Yang (Singapore Institute of Manufacturing Technology, AStar); 6. Xu, Chi (Nanyang Technological University, Singapore); 7. Zhao, Jianjun (Nanyang Technological University, Singapore) |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DOI | 10.1145/3324884.3416609 |
| Discipline | Computer Science |
| EISBN | 9781450367684 1450367682 |
| EISSN | 2643-1572 |
| EndPage | 1052 |
| ExternalDocumentID | 9286113 |
| Genre | orig-research |
| GrantInformation_xml | 1. National Research Foundation, Prime Ministers Office, Singapore under its National Cybersecurity R&D Program (grant ID: NRF2018 NCR-NCR005-0001; funder ID: 10.13039/100000964); 2. Singapore National Research Foundation (grant IDs: NSOE003-0001, NRFI06-2020-0022; funder ID: 10.13039/100000964); 3. JSPS KAKENHI (grant IDs: 20H04168, 19K24348, 19H04086; funder ID: 10.13039/501100001691) |
| ISICitedReferencesCount | 60 |
| IsDoiOpenAccess | false |
| IsOpenAccess | true |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| PageCount | 12 |
| PublicationCentury | 2000 |
| PublicationDate | 2020-Sept. |
| PublicationDateYYYYMMDD | 2020-09-01 |
| PublicationDate_xml | – month: 09 year: 2020 text: 2020-Sept. |
| PublicationDecade | 2020 |
| PublicationTitle | 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE) |
| PublicationTitleAbbrev | ASE |
| PublicationYear | 2020 |
| Publisher | ACM |
| Publisher_xml | – name: ACM |
| StartPage | 1041 |
| SubjectTerms | Data models; Deep learning; Deep learning testing; out of distribution; quality assurance; Software; Software engineering; Software reliability; Testing; Training data |
| Title | Cats Are Not Fish: Deep Learning Testing Calls for Out-Of-Distribution Awareness |
| URI | https://ieeexplore.ieee.org/document/9286113 |