Learning Types for Binaries

Bibliographic Details
Title: Learning Types for Binaries
Authors: Xu, Zhiwu; Wen, Cheng; Qin, Shengchao
Source: Xu, Z, Wen, C & Qin, S 2017, 'Learning Types for Binaries', Lecture Notes in Computer Science, pp. -. https://doi.org/10.1007/978-3-319-68690-5_26
Publication Year: 2017
Description: Type inference for binary code is a challenging problem, due partly to the fact that much type-related information is lost during compilation from high-level source code. Most existing research on binary code type inference resorts to program analysis techniques, which can be too conservative to infer types with high accuracy or too heavyweight to be viable in practice. In this paper, we propose a new approach to learning types for recovered variables from their related representative instructions. Our idea is motivated by “duck typing”, where the type of a variable is determined by its features and properties. Our approach first learns a classifier from existing binaries with debug information and then uses this classifier to predict types for new, unseen binaries. We have implemented our approach in a tool called BITY and used it to conduct experiments on the well-known benchmark coreutils (v8.4). The results show that our tool is more precise than the commercial tool Hex-Rays, in terms of both correct types and compatible types.
Document Type: article in journal/newspaper
File Description: application/pdf
Language: English
Relation: info:eu-repo/semantics/altIdentifier/pissn/0302-9743; info:eu-repo/semantics/altIdentifier/eissn/1611-3349
DOI: 10.1007/978-3-319-68690-5_26
Availability: https://research.tees.ac.uk/en/publications/c980e969-e754-4e31-aa49-cbba165cab1f
https://doi.org/10.1007/978-3-319-68690-5_26
https://research.tees.ac.uk/ws/files/5962684/621527.pdf
https://www.scopus.com/pages/publications/85032471121
https://hdl.handle.net/10149/621527
Rights: info:eu-repo/semantics/openAccess ; http://creativecommons.org/licenses/by-nc-nd/4.0/
Accession Number: edsbas.AA3B89AB
Database: BASE