Disassembly as Weighted Interval Scheduling with Learned Weights
| Published in: | Proceedings - IEEE Symposium on Security and Privacy pp. 3033 - 3050 |
|---|---|
| Main Authors: | , , , , , |
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 12.05.2025 |
| ISSN: | 2375-1207 |
| Summary: | Disassembly is the first step of a variety of binary analysis and transformation techniques, such as reverse engineering, or binary rewriting. Recent disassembly approaches consist of three phases: an exploration phase, that overapproximates the binary's code; an analysis phase, that assigns weights to candidate instructions or basic blocks; and a conflict resolution phase, that downselects the final set of instructions. We present a disassembly algorithm that generalizes this pattern for a wide range of architectures, namely x86, x64, arm32, and aarch64. Our algorithm presents a novel conflict resolution method that reduces disassembly to weighted interval scheduling. Additionally, we present a weight assignment algorithm that allows us to learn optimal weights for the various disassembly heuristics in the analysis phase. Learned weights outperform manually tuned weights in most cases while reducing the number of necessary heuristics by 40% (by setting their weights to zero). Our implementation, built on top of Ddisasm, outperforms state-of-the-art disassemblers in several metrics and achieves the largest proportion of perfectly disassembled binaries by a wide margin in all evaluated datasets. |
|---|---|
| DOI: | 10.1109/SP61157.2025.00192 |
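
The summary states that conflict resolution is reduced to weighted interval scheduling. As a rough illustration of that general problem only, and not of the paper's or Ddisasm's actual implementation, the sketch below runs the textbook dynamic program over candidate instructions. It assumes each candidate is a half-open byte range `[start, end)` with a weight that has already been assigned. The tuple layout, the function name `select_non_overlapping`, and the sample weights are all hypothetical.

```python
from bisect import bisect_right
from typing import List, Tuple

# Hypothetical candidate layout: (start_offset, end_offset, weight),
# covering the half-open byte range [start_offset, end_offset).
Candidate = Tuple[int, int, float]


def select_non_overlapping(candidates: List[Candidate]) -> Tuple[float, List[Candidate]]:
    """Textbook weighted interval scheduling: choose a maximum-weight
    subset of candidates whose byte ranges do not overlap."""
    items = sorted(candidates, key=lambda c: c[1])  # sort by end offset
    ends = [c[1] for c in items]
    n = len(items)

    best = [0.0] * (n + 1)  # best[k]: optimum using only the first k items
    prev = [0] * n          # prev[i]: how many earlier items end at or before items[i] starts

    for i, (start, _end, weight) in enumerate(items):
        prev[i] = bisect_right(ends, start, 0, i)
        best[i + 1] = max(best[i], best[prev[i]] + weight)

    # Walk back through the table to recover one optimal selection.
    chosen: List[Candidate] = []
    i = n
    while i > 0:
        _start, _end, weight = items[i - 1]
        if best[prev[i - 1]] + weight > best[i - 1]:
            chosen.append(items[i - 1])
            i = prev[i - 1]
        else:
            i -= 1
    chosen.reverse()
    return best[n], chosen


# Made-up overlapping decodings of the same six bytes.
example = [(0, 3, 2.0), (1, 4, 5.0), (3, 5, 2.5), (4, 6, 1.0)]
total, kept = select_non_overlapping(example)
```

Sorting by end offset and binary-searching for the last compatible predecessor keeps the whole selection at O(n log n). How the weights are produced, whether manually tuned or learned, is a separate step that this sketch does not model.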