LibvDiff: Library Version Difference Guided OSS Version Identification in Binaries

Bibliographic details
Published in: Proceedings / International Conference on Software Engineering, pp. 791-802
Main authors: Dong, Chaopeng; Li, Siyuan; Yang, Shouguo; Xiao, Yang; Wang, Yongpan; Li, Hong; Li, Zhi; Sun, Limin
Format: Conference paper
Language: English
Publication details: ACM, 14 April 2024
ISSN: 1558-1225
Abstract: Open-source software (OSS) has been extensively employed to expedite software development, inevitably exposing downstream software to potential vulnerabilities. Precisely identifying the version of OSS not only facilitates the detection of vulnerabilities associated with it but also enables timely alerts upon the release of 1-day vulnerabilities. However, current methods for identifying OSS versions rely heavily on version strings or constant features, which may not be present in compiled OSS binaries or may not be representative when only function code changes are made. As a result, these methods are often imprecise in identifying the version of the OSS binaries in use. To this end, we propose LibvDiff, a novel approach for identifying open-source software versions. It detects subtle differences through precise symbol information and function-level code changes using binary code similarity detection. LibvDiff introduces a candidate version filter based on a novel version coordinate system, which improves efficiency by quantifying gaps between versions and rapidly identifying potential versions. To speed up the code similarity detection process, LibvDiff proposes a function call-based anchor path filter to minimize the number of functions compared in the target binary. We evaluate LibvDiff through comprehensive experiments under various compilation settings on two datasets (one with version strings, the other without), which demonstrate that our approach achieves 94.5% and 78.7% precision on the two datasets, outperforming state-of-the-art works (both academic methods and industry tools) by an average of 54.2% and 160.3%, respectively. By identifying and analyzing OSS binaries in real-world firmware images, we make several interesting findings, such as that developers differ significantly in how they update different OSS, and that different vendors may use identical OSS binaries.
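Illustrative sketch (not taken from the paper): the abstract does not specify how the version coordinate system is built, so the following Python is only one way to picture a symbol-based candidate filter. It assumes each known version is described by a set of exported symbol names, places versions on a single axis by accumulating the symbol gap between consecutive releases, and keeps the versions whose coordinate is close to the target's. The function names `gap`, `build_coordinates`, and `candidate_versions`, the `tolerance` parameter, and the toy symbol sets are all hypothetical.

```python
from typing import Dict, List, Set


def gap(a: Set[str], b: Set[str]) -> int:
    """Number of symbols present in exactly one of the two sets."""
    return len(a ^ b)


def build_coordinates(ordered: List[str],
                      symbols: Dict[str, Set[str]]) -> Dict[str, int]:
    """Place versions on one axis (a simplifying assumption).

    The first version is the origin; each later version sits at the
    cumulative symbol gap from its predecessor.
    """
    coords = {ordered[0]: 0}
    for prev, cur in zip(ordered, ordered[1:]):
        coords[cur] = coords[prev] + gap(symbols[prev], symbols[cur])
    return coords


def candidate_versions(target: Set[str],
                       ordered: List[str],
                       symbols: Dict[str, Set[str]],
                       tolerance: int = 0) -> List[str]:
    """Keep versions whose coordinate lies within `tolerance` of the target.

    The target's position is approximated by its symbol gap to the origin
    version, which only matches the axis when symbols mostly accumulate.
    """
    coords = build_coordinates(ordered, symbols)
    position = gap(symbols[ordered[0]], target)
    return [v for v in ordered if abs(coords[v] - position) <= tolerance]


if __name__ == "__main__":
    # Toy data: 1.1 and 1.2 differ only in function bodies, not in symbols.
    symbols = {
        "1.0": {"init", "read"},
        "1.1": {"init", "read", "write"},
        "1.2": {"init", "read", "write"},
        "1.3": {"init", "read", "write", "flush", "close"},
    }
    order = ["1.0", "1.1", "1.2", "1.3"]
    target = {"init", "read", "write"}
    print(candidate_versions(target, order, symbols))  # ['1.1', '1.2']
```

In this toy run the filter cannot separate 1.1 from 1.2 because their symbol sets are identical. That is the situation the abstract describes, where only function code changes are made, and it is where function-level binary code similarity detection would have to decide between the remaining candidates.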
Author details (all affiliated with: Institute of Information Engineering, Chinese Academy of Sciences, School of Cyber Security):
  1. Dong, Chaopeng, dongchaopeng@iie.ac.cn
  2. Li, Siyuan, lisiyuan@iie.ac.cn
  3. Yang, Shouguo, yangshouguo@iie.ac.cn
  4. Xiao, Yang, xiaoyang@iie.ac.cn
  5. Wang, Yongpan, wangyongpan@iie.ac.cn
  6. Li, Hong, lihong@iie.ac.cn
  7. Li, Zhi, lizhi@iie.ac.cn
  8. Sun, Limin, sunlimin@iie.ac.cn
DOI: 10.1145/3597503.3623336
EISBN: 9798400702174
EISSN: 1558-1225
CODEN: IEEPAD
Publication title abbreviation: ICSE
Content type: Conference Proceeding
Genre: Original research
Discipline: Computer Science
Page count: 12
Open access: Yes
Source: IEEE Electronic Library (IEL), https://ieeexplore.ieee.org/
Full text: https://ieeexplore.ieee.org/document/10548834
Subject terms: Binary codes; Firmware analysis; Information filters; Libraries; Open source software; Security; Semantics; Symbols; Version identification; Vulnerability detection