BinBench: a benchmark for x64 portable operating system interface binary function representations

Bibliographic Details
Published in:PeerJ. Computer science Vol. 9; p. e1286
Main Authors: Console, Francesca, D’Aquanno, Giuseppe, Di Luna, Giuseppe Antonio, Querzoni, Leonardo
Format: Journal Article
Language:English
Published: United States PeerJ. Ltd 01.06.2023
PeerJ Inc
ISSN:2376-5992
Abstract In this article we propose the first multi-task benchmark for evaluating the performance of machine learning models that work on low-level assembly functions. While the use of multi-task benchmarks is standard in the natural language processing (NLP) field, such practice is unknown in the field of assembly language processing. However, in recent years there has been a strong push toward using deep neural network architectures borrowed from NLP to solve problems on assembly code. The first advantage of a standard benchmark is that it makes different works comparable without the effort of reproducing third-party solutions. The second advantage is that it allows testing the generality of a machine learning model on several tasks. For these reasons, we propose BinBench, a benchmark for binary function models. The benchmark includes various binary analysis tasks, as well as a dataset of binary functions on which the tasks should be solved. The dataset is publicly available and has been evaluated using baseline models.
ArticleNumber e1286
Audience Academic
Author Di Luna, Giuseppe Antonio
Console, Francesca
Querzoni, Leonardo
D’Aquanno, Giuseppe
ContentType Journal Article
Copyright 2023 Console et al.
COPYRIGHT 2023 PeerJ. Ltd.
DOI 10.7717/peerj-cs.1286
Discipline Computer Science
EISSN 2376-5992
Genre Journal Article
GeographicLocations New York
GrantInformation_xml – fundername: TIM S.p.A. through the PhD Scholarship
– fundername: SERICS
  grantid: PE00000014
– fundername: European Union—Next Generation EU
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Keywords Compiler provenance
Binary functions
Assembly language
Neural networks
Dataset
Benchmark
Binary functions representation
Binary similarity
Language English
License https://creativecommons.org/licenses/by/4.0
2023 Console et al.
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
OpenAccessLink https://doaj.org/article/a6a9f7b9c4a64f13a4d71b43bad05f15
PMID 37346713
PublicationDate 2023-06-01
PublicationPlace United States
PublicationPlace_xml – name: United States
– name: San Diego, USA
PublicationTitle PeerJ. Computer science
PublicationTitleAlternate PeerJ Comput Sci
PublicationYear 2023
Publisher PeerJ. Ltd
PeerJ Inc
SubjectTerms Analysis
Assembly language
Benchmark
Binary functions
Binary functions representation
Computational linguistics
Data Mining and Machine Learning
Dataset
Language processing
Machine learning
Natural language interfaces
Neural Networks
Operating systems
Security and Privacy
Title BinBench: a benchmark for x64 portable operating system interface binary function representations
URI https://www.ncbi.nlm.nih.gov/pubmed/37346713
https://www.proquest.com/docview/2828755463
https://pubmed.ncbi.nlm.nih.gov/PMC10280411
https://doaj.org/article/a6a9f7b9c4a64f13a4d71b43bad05f15
Volume 9