Cats Are Not Fish: Deep Learning Testing Calls for Out-Of-Distribution Awareness

Bibliographic details
Published in: 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE), pp. 1041-1052
Main authors: Berend, David; Xie, Xiaofei; Ma, Lei; Zhou, Lingjun; Liu, Yang; Xu, Chi; Zhao, Jianjun
Format: Conference proceeding
Language: English
Published: ACM, 2020-09-01
ISSN: 2643-1572
Abstract As Deep Learning (DL) is continuously adopted in many industrial applications, its quality and reliability start to raise concerns. Similar to the traditional software development process, testing the DL software to uncover its defects at an early stage is an effective way to reduce risks after deployment. According to the fundamental assumption of deep learning, the DL software does not provide statistical guarantee and has limited capability in handling data that falls outside of its learned distribution, i.e., out-of-distribution (OOD) data. Although recent progress has been made in designing novel testing techniques for DL software, which can detect thousands of errors, the current state-of-the-art DL testing techniques usually do not take the distribution of generated test data into consideration. It is therefore hard to judge whether the "identified errors" are indeed meaningful errors to the DL application (i.e., due to quality issues of the model) or outliers that cannot be handled by the current model (i.e., due to the lack of training data). To fill this gap, we take the first step and conduct a large scale empirical study, with a total of 451 experiment configurations, 42 deep neural networks (DNNs) and 1.2 million test data instances, to investigate and characterize the impact of OOD-awareness on DL testing. We further analyze the consequences when DL systems go into production by evaluating the effectiveness of adversarial retraining with distribution-aware errors. The results confirm that introducing data distribution awareness in both testing and enhancement phases outperforms distribution unaware retraining by up to 21.5%.
Author Berend, David
Xu, Chi
Zhou, Lingjun
Ma, Lei
Liu, Yang
Xie, Xiaofei
Zhao, Jianjun
Author_xml – sequence: 1
  givenname: David
  surname: Berend
  fullname: Berend, David
  organization: Nanyang Technological University,Singapore
– sequence: 2
  givenname: Xiaofei
  surname: Xie
  fullname: Xie, Xiaofei
  email: xfxie@ntu.edu.sg
  organization: Kyushu University,Japan
– sequence: 3
  givenname: Lei
  surname: Ma
  fullname: Ma, Lei
  organization: Tianjin University,China
– sequence: 4
  givenname: Lingjun
  surname: Zhou
  fullname: Zhou, Lingjun
  organization: Nanyang Technological University, Zhejiang Sci-Tech University,China
– sequence: 5
  givenname: Yang
  surname: Liu
  fullname: Liu, Yang
  organization: Singapore Institute of Manufacturing Technology,AStar
– sequence: 6
  givenname: Chi
  surname: Xu
  fullname: Xu, Chi
  organization: Nanyang Technological University,Singapore
– sequence: 7
  givenname: Jianjun
  surname: Zhao
  fullname: Zhao, Jianjun
  organization: Nanyang Technological University,Singapore
CODEN IEEPAD
ContentType Conference Proceeding
DOI 10.1145/3324884.3416609
Discipline Computer Science
EISBN 9781450367684
1450367682
EISSN 2643-1572
EndPage 1052
ExternalDocumentID 9286113
Genre orig-research
GrantInformation_xml – fundername: National Research Foundation, Prime Minister's Office, Singapore under its National Cybersecurity R&D Program
  grantid: NRF2018 NCR-NCR005-0001
  funderid: 10.13039/100000964
– fundername: Singapore National Research Foundation
  grantid: NSOE003-0001,NRFI06-2020-0022
  funderid: 10.13039/100000964
– fundername: JSPS KAKENHI
  grantid: 20H04168,19K24348,19H04086
  funderid: 10.13039/501100001691
ISICitedReferencesCount 60
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed false
IsScholarly true
Language English
PageCount 12
PublicationCentury 2000
PublicationDate 2020-Sept.
PublicationDateYYYYMMDD 2020-09-01
PublicationDate_xml – month: 09
  year: 2020
  text: 2020-Sept.
PublicationDecade 2020
PublicationTitle 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)
PublicationTitleAbbrev ASE
PublicationYear 2020
Publisher ACM
Publisher_xml – name: ACM
SourceID ieee
SourceType Publisher
StartPage 1041
SubjectTerms Data models
Deep learning
Deep learning testing
out of distribution
quality assurance
Software
Software engineering
Software reliability
Testing
Training data
Title Cats Are Not Fish: Deep Learning Testing Calls for Out-Of-Distribution Awareness
URI https://ieeexplore.ieee.org/document/9286113