LibvDiff: Library Version Difference Guided OSS Version Identification in Binaries

Open-source software (OSS) has been extensively employed to expedite software development, inevitably exposing downstream software to the peril of potential vulnerabilities. Precisely identifying the version of OSS not only facilitates the detection of vulnerabilities associated with it but also ena...

Celý popis

Uloženo v:
Podrobná bibliografie
Vydáno v:Proceedings / International Conference on Software Engineering s. 791 - 802
Hlavní autoři: Dong, Chaopeng, Li, Siyuan, Yang, Shouguo, Xiao, Yang, Wang, Yongpan, Li, Hong, Li, Zhi, Sun, Limin
Médium: Konferenční příspěvek
Jazyk:angličtina
Vydáno: ACM 14.04.2024
Témata:
ISSN:1558-1225
On-line přístup:Získat plný text
Tagy: Přidat tag
Žádné tagy, Buďte první, kdo vytvoří štítek k tomuto záznamu!
Popis
Shrnutí:Open-source software (OSS) has been extensively employed to expedite software development, inevitably exposing downstream software to the peril of potential vulnerabilities. Precisely identifying the version of OSS not only facilitates the detection of vulnerabilities associated with it but also enables timely alerts upon the release of 1-day vulnerabilities. However, current methods for identifying OSS versions rely heavily on version strings or constant features, which may not be present in compiled OSS binaries or may not be representative when only function code changes are made. As a result, these methods are often imprecise in identifying the version of OSS binaries being used. To this end, we propose Libvdiff, a novel approach for identifying open-source software versions. It detects subtle differences through precise symbol information and function-level code changes using binary code similarity detection. Libvdiff introduces a candidate version filter based on a novel version coordinate system to improve efficiency by quantifying gaps between versions and rapidly identifying potential versions. To speed up the code similarity detection process, Libvdiff proposes a function call-based anchor path filter to minimize the number of functions compared in the target binary. We evaluate the performance of Libvdiff through comprehensive experiments under various compilation settings and two datasets (one with version strings, and the other without version strings), which demonstrate that our approach achieves 94.5% and 78.7% precision in two datasets, outperforming state-of-the-art works (including both academic methods and industry tools) by an average of 54.2% and 160.3%, respectively. By identifying and analyzing OSS binaries in real-world firmware images, we make several interesting findings, such as developers having significant differences in their updates to different OSS, and different vendors may also utilize identical OSS binaries.
ISSN:1558-1225
DOI:10.1145/3597503.3623336