Language interoperability to automate code analysis
| Title: | Language interoperability to automate code analysis |
|---|---|
| Patent Number: | 11,599,345 |
| Date Issued: | March 07, 2023 |
| Appl. No: | 17/518971 |
| Application Filed: | November 04, 2021 |
| Abstract: | Language interoperability between source code programs not compatible with an interprocedural static code analyzer is achieved through language-independent representations of the programs. The source code programs are transformed into respective intermediate language instructions from which a language-independent control flow graph and a language-independent type environment is created. A program compatible with the interprocedural static code analyzer is generated from the language-independent control flow graph and the language-independent type environment in order to utilize the interprocedural static code analyzer to detect memory safety faults. |
| Inventors: | MICROSOFT TECHNOLOGY LICENSING, LLC. (Redmond, WA, US) |
| Assignees: | MICROSOFT TECHNOLOGY LICENSING, LLC. (Redmond, WA, US) |
| Claim: | 1. A system comprising: one or more processors coupled to a memory; and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions that perform actions to: monitor a source code repository for an event associated with a first program that necessitates code review; and upon occurrence of the event: generate a language-independent control flow graph of the first program and a language-independent type environment of the first program; convert the language-independent control flow graph of the first program into a second program, wherein the second program includes analysis language instructions of an interprocedural static code analyzer; and invoke the interprocedural static code analyzer on the second program to detect a memory safety fault or bug in the first program. |
| Claim: | 2. The system of claim 1, wherein the one or more programs include further instructions that perform actions to: extract binary files of the first program; decompile the binary files of the first program into intermediate language code; and extract a procedural code flow for the first program and a type environment for the first program from the intermediate language code of each of the binary files of the first program, wherein the procedural code flow for the first program includes the analysis language instructions of the interprocedural static code analyzer. |
| Claim: | 3. The system of claim 2, wherein the one or more programs include further instructions that perform actions to: create the procedural code flow for the first program using a control flow graph schema; and extract the type environment for the first program using a type schema. |
| Claim: | 4. The system of claim 2, wherein the one or more programs include further instructions that perform actions to: generate language-independent code for the first program from the procedural code flow for the first program and the type environment for the first program. |
| Claim: | 5. The system of claim 4, wherein the one or more programs include further instructions that perform actions to: decode the language-independent code for the first program into the second program having the analysis language instructions of the interprocedural static code analyzer; and decode the type environment for the first program into types of the analysis language instructions of the interprocedural static code analyzer. |
| Claim: | 6. The system of claim 4, wherein the language-independent code for the first program is based on a JavaScript Object Notation (JSON) format. |
| Claim: | 7. The system of claim 2, wherein the intermediate language code is based on a Common Intermediate Language (CIL). |
| Claim: | 8. A computer-implemented method comprising: extracting binary files of a first program from a version-controlled source code repository; converting the binary files of the first program into intermediate language instructions; analyzing the intermediate language instructions to generate a language-independent control flow graph of the first program and a language-independent type environment of the first program; transforming the language-independent control flow graph of the first program into a second program, wherein the second program includes instructions of an analysis language of an interprocedural static code analyzer; and applying the interprocedural static code analyzer to the second program to identify a memory safety fault or bug in the first program. |
| Claim: | 9. The computer-implemented method of claim 8, wherein extracting binary files of a first program from a version-controlled source code repository, further comprises: detecting, at the version-controlled source code repository, an event triggering code analysis of the first program; and upon detection of the event triggering code analysis of the first program, obtaining the binary files of the first program from the version-controlled source code repository. |
| Claim: | 10. The computer-implemented method of claim 9, wherein the event triggering code analysis of the first program includes a commit of the first program to the version-controlled source code repository. |
| Claim: | 11. The computer-implemented method of claim 8, further comprising: mapping results from application of the interprocedural static code analyzer to source code of the first program. |
| Claim: | 12. The computer-implemented method of claim 8, wherein the first program is written in a programming language supported by the .NET framework and the interprocedural static code analyzer is written in a programming language not supported by the .NET framework. |
| Claim: | 13. The computer-implemented method of claim 8, wherein the intermediate language instructions are Common Intermediate Language (CIL) instructions. |
| Claim: | 14. The computer-implemented method of claim 8, wherein the language-independent control flow graph of the first program and the language-independent type environment of the first program are based on a JavaScript Object Notation (JSON) format. |
| Claim: | 15. A computer-implemented method comprising: detecting an event triggering a code review of a first program in a version-controlled source code repository; and performing the code review of the first program by: converting the first program into language-independent code; decoding the language-independent code into a second program having instructions of an interprocedural static code analyzer; invoking the interprocedural static code analyzer on the second program to detect source code bugs in the second program; mapping the detected source code bugs back to the first program; and outputting the detected source code bugs in the first program. |
| Claim: | 16. The computer-implemented method of claim 15, wherein converting the first program into the language-independent code, further comprises: converting the first program into intermediate language code; extracting a control flow graph of the first program and a type environment of the first program from the intermediate language code; generating a language-independent control flow graph from the extracted control flow graph of the first program; and generating a language-independent type environment from the extracted type environment of the first program. |
| Claim: | 17. The computer-implemented method of claim 16, further comprising: prior to converting the first program into the intermediate language code, obtaining binary files of the first program. |
| Claim: | 18. The computer-implemented method of claim 16, further comprising: serializing data of the language-independent control flow graph and data of the language-independent type environment into byte strings; and deserializing the byte strings of the language-independent control flow graph into the second program having ordered sequences of intermediate analysis instructions. |
| Claim: | 19. The computer-implemented method of claim 18, further comprising: deserializing the byte strings of the language-independent type environment into a data structure for use by the interprocedural static code analyzer. |
| Claim: | 20. The computer-implemented method of claim 15, wherein the first program is written in a programming language supported by the .NET framework and the interprocedural static code analyzer is written in a programming language not supported by the .NET framework. |
| Patent References Cited: | 6823507 November 2004 Srinivasan et al. 7900193 March 2011 Kolawa et al. 10303469 May 2019 Neatherway et al. 11392844 July 2022 Rao 20070234297 October 2007 Zorn et al. 20080072214 March 2008 Peyton 20090119648 May 2009 Chess et al. 20190179727 June 2019 Bouissou 20200125478 April 2020 Iyer et al. 20200394028 December 2020 Byrne |
| Other References: | “Automate your Workflow from Idea to Production”, Retrieved from: https://web.archive.org/web/20200509205330/https://github.com/features/actions, May 9, 2020, 11 Pages. cited by applicant “How Static Analysis Works—GrammaTech Code Sonar”, Retrieved From: https://www.verifysoft.com/en_grammatech_how_static_analysis_works.html, Sep. 28, 2017, 2 Pages. cited by applicant “Mono.Cecil”, Retrieved from: http://web.archive.org/web/20200429125717/https://www.mono-project.com/docs/tools+libraries/libraries/Mono.Cecil/, Apr. 29, 2020, 2 Pages. cited by applicant “Notice of Allowance Issued in U.S. Appl. No. 15/931,234”, dated Jul. 16, 2021, 16 Pages. cited by applicant Berdine, et al., “Smallfoot: Modular Automatic Assertion Checking with Separation Logic”, In Proceedings of International Symposium on Formal Methods for Components and Objects, Nov. 1, 2005, 23 Pages. cited by applicant Calcagno, et al., “Infer: An Automatic Program Verifier for Memory Safety of C Programs”, In Proceedings of NASA Formal Methods Symposium, Apr. 18, 2011, pp. 459-465. cited by applicant Jambon, et al., “Welcome to ATD's Documentation!”, Retrieved from: https://web.archive.org/web/20181205001706/https://atd.readthedocs.io/en/latest/, Dec. 5, 2018, 3 Pages. cited by applicant Ma, et al., “SymWalker: Symbolic Execution in Routines of Binary Code”, In Proceedings of the Tenth International Conference on Computational Intelligence and Security, Nov. 15, 2014, pp. 694-698. cited by applicant O'Hearn, Peter, “Separation Logic”, In Journal of Communications of the ACM, vol. 62, Issue 2, Feb. 2019, pp. 86-95. cited by applicant “International Search Report and Written Opinion Issued in PCT Application No. PCT/US21/025839”, dated Jul. 16, 2021, 15 Pages. cited by applicant Tonder, et al., “Static Automated Program Repair for Heap Properties”, In Proceedings of ACM/IEEE 40th International Conference on Software Engineering, May 27, 2018, pp. 151-162. cited by applicant Villard, Jules, “Infer: A Static Analyzer for Catching Bugs before you Ship”, In FOSDEM, Feb. 4, 2017, 41 Pages. cited by applicant |
| Primary Examiner: | Chen, Qing |
| Accession Number: | edspgr.11599345 |
| Database: | USPTO Patent Grants |
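
The flow recited in claims 16, 18, and 19 (build a language-independent control flow graph and type environment, serialize them into byte strings, then deserialize them into ordered instruction sequences and a type data structure for the analyzer) can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the JSON field names (`nodes`, `edges`, `instrs`, `types`), the instruction strings, and the functions `serialize` and `deserialize` are assumptions made for illustration. They are not taken from the patent and do not reflect the schema of any real analyzer.

```python
import json
from collections import deque

# Hypothetical language-independent representation of one procedure.
# Field names and instruction strings are illustrative assumptions only.
cfg = {
    "procedure": "Example.Read",
    "entry": 0,
    "nodes": {
        "0": {"instrs": ["x = alloc Stream"]},
        "1": {"instrs": ["call Stream.Read(x)"]},
        "2": {"instrs": ["call Stream.Dispose(x)"]},
        "3": {"instrs": ["return"]},
    },
    "edges": [[0, 1], [1, 2], [2, 3]],
}
type_env = {"types": {"Stream": {"fields": [], "supers": ["Object"]}}}


def serialize(cfg, type_env):
    """Serialize both structures into one UTF-8 encoded JSON byte string."""
    return json.dumps({"cfg": cfg, "tenv": type_env}, sort_keys=True).encode("utf-8")


def deserialize(payload):
    """Rebuild an ordered instruction sequence and a type table from bytes.

    Nodes are visited breadth-first from the entry node, so the emitted
    instructions follow the control flow edges.
    """
    data = json.loads(payload.decode("utf-8"))
    graph = data["cfg"]

    # Adjacency map: node id -> list of successor node ids.
    successors = {}
    for src, dst in graph["edges"]:
        successors.setdefault(src, []).append(dst)

    ordered, seen, queue = [], set(), deque([graph["entry"]])
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        # JSON object keys are strings, so node ids are looked up as text.
        ordered.extend(graph["nodes"][str(node)]["instrs"])
        queue.extend(successors.get(node, []))
    return ordered, data["tenv"]["types"]


payload = serialize(cfg, type_env)
program, types = deserialize(payload)
```

In this sketch `program` plays the role of the "second program" handed to the analyzer and `types` plays the role of its type data structure. A real pipeline would emit the analyzer's own instruction objects rather than plain strings.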