A Method and Experiment to evaluate Deep Neural Networks as Test Oracles for Scientific Software

Testing scientific software is challenging because such systems usually exhibit non-deterministic behaviour and, in addition, generate non-trivial outputs such as images. Artificial intelligence (AI) is now a reality that is also helping in the development of the software testing activit...

Bibliographic details
Published in: Proceedings of the 3rd ACM/IEEE International Conference on Automation of Software Test, pp. 40-51
Main author: de Santiago Junior, Valdivino Alexandre
Format: Conference paper
Language: English
Published: ACM, 2022-05-01
Abstract Testing scientific software is challenging because such systems usually exhibit non-deterministic behaviour and, in addition, generate non-trivial outputs such as images. Artificial intelligence (AI) is now a reality and is also supporting software testing activities. In this article, we evaluate seven deep neural networks (DNNs), specifically deep convolutional neural networks (CNNs) with up to 161 layers, in the role of test oracle procedures for testing scientific models. First, we propose a method, TOrC, which starts by generating training, validation, and test image datasets via combinatorial interaction testing applied to the original codes and second-order mutants. TOrC also includes classical steps such as transfer learning, a technique recommended for DNNs. Then, we assess the performance of the oracles (CNNs). The main conclusions of this research are: i) a greater number of layers does not necessarily mean that a CNN will perform better; ii) transfer learning is a valuable technique, but we may eventually need extended solutions to obtain better performance; iii) data-centric AI is an interesting path to follow; and iv) there is no clear correlation between the software bugs in the scientific models and the errors (image misclassifications) made by the CNNs. CCS CONCEPTS • Software and its engineering → Software testing and debugging; • Computing methodologies → Neural networks; Supervised learning by classification; Computer vision.
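Illustrative note (not part of the catalogued record): the dataset-generation step named in the abstract, combinatorial interaction testing applied to the original codes and their mutants, might be sketched roughly as below. This is a minimal sketch under stated assumptions, not the TOrC implementation. The parameter names, the variant names, the greedy pairwise strategy, and the rule that labels every mutant run as "fail" are all hypothetical; the abstract specifies none of them.

```python
from itertools import combinations, product


def pairwise_suite(params):
    """Greedy 2-way covering suite (an assumed strategy, not the paper's).

    Every pair of values for every two parameters appears in at least one
    returned test case. Exhaustive candidate search, so small inputs only.
    """
    names = list(params)
    # All ((param_a, value_a), (param_b, value_b)) pairs still to be covered.
    uncovered = {
        ((a, va), (b, vb))
        for a, b in combinations(names, 2)
        for va in params[a]
        for vb in params[b]
    }
    candidates = [dict(zip(names, vals)) for vals in product(*params.values())]

    def covered_by(case):
        return {
            pair for pair in uncovered
            if all(case[name] == value for name, value in pair)
        }

    suite = []
    while uncovered:
        # Pick the candidate that covers the most still-uncovered pairs.
        best = max(candidates, key=lambda case: len(covered_by(case)))
        uncovered -= covered_by(best)
        suite.append(best)
    return suite


def plan_runs(suite, variants):
    """Cross each test case with each program variant.

    Hypothetical labelling: runs of "original" are expected to "pass",
    runs of any other variant (a mutant) are expected to "fail".
    """
    return [
        {"variant": v, "inputs": case,
         "label": "pass" if v == "original" else "fail"}
        for v in variants
        for case in suite
    ]


# Hypothetical input parameters of a scientific model.
PARAMS = {"grid": [32, 64], "solver": ["euler", "rk4"], "seed": [0, 1, 2]}

if __name__ == "__main__":
    suite = pairwise_suite(PARAMS)
    for run in plan_runs(suite, ["original", "mutant_1"]):
        print(run)
```

In a pipeline of this kind, each planned run would execute one program variant and save its output image under the run's label, and the labelled images would then be split into training, validation, and test sets for the CNN oracles.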
Author de Santiago Junior, Valdivino Alexandre
Author_xml – sequence: 1
  givenname: Valdivino Alexandre
  surname: de Santiago Junior
  fullname: de Santiago Junior, Valdivino Alexandre
  email: valdivino.santiago@inpe.br
  organization: Instituto Nacional de Pesquisas Espaciais (INPE),Coordenação de Pesquisa Aplicada e Desenvolvimento Tecnológico (COPDT),São José dos Campos,São Paulo,Brazil
CODEN IEEPAD
ContentType Conference Proceeding
DBID 6IE
6IL
CBEJK
RIE
RIL
DOI 10.1145/3524481.3527232
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Xplore POP ALL
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
EISBN 9781450392860
1450392865
EndPage 51
ExternalDocumentID 9796455
Genre orig-research
GroupedDBID 6IE
6IL
ACM
ALMA_UNASSIGNED_HOLDINGS
APO
CBEJK
GUFHI
LHSKQ
RIE
RIL
ID FETCH-LOGICAL-a247t-ec9adba16b0ef232b2544b9249343a75f8a00e37e5a570cedb6b08e990da042d3
IEDL.DBID RIE
ISICitedReferencesCount 2
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000850254300005&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
IngestDate Wed Aug 27 02:51:38 EDT 2025
IsPeerReviewed false
IsScholarly false
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-a247t-ec9adba16b0ef232b2544b9249343a75f8a00e37e5a570cedb6b08e990da042d3
PageCount 12
ParticipantIDs ieee_primary_9796455
PublicationCentury 2000
PublicationDate 2022-May
PublicationDateYYYYMMDD 2022-05-01
PublicationDate_xml – month: 05
  year: 2022
  text: 2022-May
PublicationDecade 2020
PublicationTitle Proceedings of the 3rd ACM/IEEE International Conference on Automation of Software Test
PublicationTitleAbbrev AST
PublicationYear 2022
Publisher ACM
Publisher_xml – name: ACM
SSID ssj0002871125
Score 1.8227533
SourceID ieee
SourceType Publisher
StartPage 40
SubjectTerms Computational modeling
Computer bugs
Correlation
Data-Centric Artificial Intelligence
Deep Convolutional Neural Networks
Deep learning
Explainable Artificial Intelligence
Software testing
Test Oracles
Training
Transfer learning
Title A Method and Experiment to evaluate Deep Neural Networks as Test Oracles for Scientific Software
URI https://ieeexplore.ieee.org/document/9796455
WOSCitedRecordID wos000850254300005&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
link http://cvtisr.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwlV09T8MwELXaioEJUIv41g2MpM2X43hEUMQApRIFdSt2fJaQUFqlKfx9zm4UGFiYbEWRrDg6v7P93jvGLo2VmHOKNFr6M3fNaAOVyiTIeJFJq02svePN64OYTPL5XE477KrVwiCiJ5_h0HX9Xb5ZFht3VDaSTjfJeZd1hRBbrVZ7nuIyfwLrxr0nSvmIUgvae0RDakXsyov8Kp_i0eNu73_j7rPBjwwPpi3AHLAOln32dg2PvuozqNLAuDXoh3oJjXU3wi3iCpzvhvqgxhO916DWMCMMgKdKOSocULoKPrQ9XQieaUH-UhUO2MvdeHZzHzRlEgIVp6IOsJDKaBVlOkRLH6qd65h2-6okTZTgNldhiIlArrgICzSa3syRYMgoClmTHLJeuSzxiEGa5bHUOkOpLGVWJpe2SBQWWkeI2ubHrO9mZ7HaOmEsmok5-fvxKduNnVjA0wPPWK-uNnjOdorP-n1dXfjf9w12jJ15
linkProvider IEEE
linkToHtml http://cvtisr.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwlV3PS8MwFA5zCnpS2cTf5uDRbv2VNjmKbkzc5sApu82X5gUE6UbX6b9vkpXqwYunhFIITXn5XpLv-x4h10oL5MxEmln6E3vNqD2IReQlLEuEliqUzvHmdZiOx3w2E5MGuam1MIjoyGfYsV13l68W2doelXWF1U0ytkW2WRyHwUatVZ-o2NzfwHXl3xPErGuSC7P7CDqmTUNbYORXARWHH_39_418QNo_Qjw6qSHmkDQwb5G3WzpydZ8p5Ir2aot-Wi5oZd6N9B5xSa3zBnyYxlG9VxRWdGpQgD4VYMlw1CSs1AW3IwzRZ7Mkf0GBbfLS703vBl5VKMGDME5LDzMBSkKQSB-1-VBpfcek3VlFcQQp0xx8H6MUGbDUz1BJ8yZHA0QKTNCq6Ig080WOx4TGCQ-FlAkK0Ca3UlzoLALMpAwQpeYnpGVnZ77ceGHMq4k5_fvxFdkdTEfD-fBh_HhG9kIrHXBkwXPSLIs1XpCd7LN8XxWX7ld-A9TXoMA
openUrl ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=proceeding&rft.title=Proceedings+of+the+3rd+ACM%2FIEEE+International+Conference+on+Automation+of+Software+Test&rft.atitle=A+Method+and+Experiment+to+evaluate+Deep+Neural+Networks+as+Test+Oracles+for+Scientific+Software&rft.au=de+Santiago+Junior%2C+Valdivino+Alexandre&rft.date=2022-05-01&rft.pub=ACM&rft.spage=40&rft.epage=51&rft_id=info:doi/10.1145%2F3524481.3527232&rft.externalDocID=9796455