Compilation with caching of code analysis result

Bibliographic Details
Title: Compilation with caching of code analysis result
Patent Number: 11,941,383
Publication Date: March 26, 2024
Appl. No: 17/654059
Application Filed: March 08, 2022
Abstract: Techniques to speed up code compilation may include caching code analysis results such that the analysis of subsequent code having a similar structure can be omitted. For example, a loop-nest construct in the code can be parsed, and an execution statement in the loop-nest construct can be analyzed by a compiler to generate an analysis result indicating a set of execution conditions for the execution statement. A lookup key can be generated from the control statements bounding the execution statement, and the analysis result can be stored with the lookup key in a cache entry of the cache. The execution statement is then modified according to the analysis result for optimization. Instead of having to analyze a subsequent execution statement bounded by the same control statements, the analysis result of the subsequent execution statement can be retrieved from the cache and be used to modify the subsequent execution statement.
Inventors: Amazon Technologies, Inc. (Seattle, WA, US)
Assignees: Amazon Technologies, Inc. (Seattle, WA, US)
Claim: 1. A computer-implemented method for compiling a neural network model, the method comprising: obtaining a description of the neural network model; generating an intermediate representation of the neural network model; parsing the intermediate representation to identify loop-nest constructs; for each of the loop-nest constructs: flushing a code analysis result cache; and for each memory access statement in the loop-nest construct: generating a lookup key from control statements bounding the memory access statement; determining whether the lookup key is stored in an entry of the code analysis result cache; if the lookup key results in a cache miss: performing an affine analysis on the memory access statement to generate an affine analysis result for the memory access statement, the affine analysis result indicating a set of execution conditions for the memory access statement; storing the affine analysis result with the lookup key in the code analysis result cache; and modifying the memory access statement according to the affine analysis result; and if the lookup key results in a cache hit: retrieving a cached affine analysis result associated with the lookup key from the code analysis result cache; and modifying the memory access statement according to the cached affine analysis result; optimizing the intermediate representation of the neural network model based on the modified memory access statement; and compiling the optimized intermediate representation of the neural network model into machine executable code.
Claim: 2. The computer-implemented method of claim 1, wherein the set of execution conditions represents an iteration space of the memory access statement.
Claim: 3. The computer-implemented method of claim 1, wherein the lookup key is generated by applying a hash function to the control statements bounding the memory access statement.
Claim: 4. The computer-implemented method of claim 1, wherein the code analysis result cache uses a least recently used policy to evict a least recently used entry first.
Claim: 5. A computer-implemented method comprising: parsing a loop-nest construct; analyzing an execution statement in the loop-nest construct to generate an analysis result indicating a set of execution conditions for the execution statement, the set of execution conditions including an iteration space of the execution statement and one or more algebraic inequalities to describe a predicate condition; generating a first lookup key from control statements bounding the execution statement; storing the analysis result with the first lookup key in a first cache entry of a cache; modifying the execution statement according to the analysis result, wherein modifying the execution statement includes performing a predicate simplification; and compiling the loop-nest construct with the modified execution statement into machine executable code.
Claim: 6. The computer-implemented method of claim 5, further comprising: generating a second lookup key for a subsequent execution statement from control statements bounding the subsequent execution statement, the second lookup key being equal to the first lookup key; retrieving the analysis result from the first cache entry of the cache corresponding to the first lookup key; and modifying the subsequent execution statement according to the analysis result.
Claim: 7. The computer-implemented method of claim 5, further comprising: generating a second lookup key from control statements bounding a subsequent execution statement; determining that the second lookup key does not have a corresponding cache entry in the cache; analyzing the subsequent execution statement to generate a second analysis result indicating a second set of execution conditions for the subsequent execution statement; storing the second analysis result with the second lookup key in a second cache entry of the cache; and modifying the subsequent execution statement according to the second analysis result.
Claim: 8. The computer-implemented method of claim 5, wherein the cache uses a least recently used cache policy.
Claim: 9. The computer-implemented method of claim 5, wherein the execution statement is a memory access statement.
Claim: 10. The computer-implemented method of claim 5, wherein the loop-nest construct is an affine loop-nest.
Claim: 11. The computer-implemented method of claim 5, wherein the execution statement is a data transformation operation.
Claim: 12. The computer-implemented method of claim 5, wherein the predicate condition is a function of a loop variable of the loop-nest construct.
Claim: 13. A non-transitory computer readable medium having stored therein instructions that, when executed by one or more processors, cause the one or more processors to execute a compiler, the compiler performing operations including: parsing a loop-nest construct; analyzing an execution statement in the loop-nest construct to generate an analysis result indicating a set of execution conditions for the execution statement, the set of execution conditions including an iteration space of the execution statement and one or more algebraic inequalities to describe a predicate condition; generating a first lookup key from control statements bounding the execution statement; storing the analysis result with the first lookup key in a first cache entry of a cache; modifying the execution statement according to the analysis result, wherein modifying the execution statement includes performing a predicate simplification; and compiling the loop-nest construct with the modified execution statement into machine executable code.
Claim: 14. The non-transitory computer readable medium of claim 13, wherein the operations further include: generating a second lookup key for a subsequent execution statement from control statements bounding the subsequent execution statement, the second lookup key being equal to the first lookup key; retrieving the analysis result from the first cache entry of the cache corresponding to the first lookup key; and modifying the subsequent execution statement according to the analysis result.
Claim: 15. The non-transitory computer readable medium of claim 13, wherein the operations further include: generating a second lookup key from control statements bounding a subsequent execution statement; determining that the second lookup key does not have a corresponding cache entry in the cache; analyzing the subsequent execution statement to generate a second analysis result indicating a second set of execution conditions for the subsequent execution statement; storing the second analysis result with the second lookup key in a second cache entry of the cache; and modifying the subsequent execution statement according to the second analysis result.
Claim: 16. The non-transitory computer readable medium of claim 13, wherein the cache uses a least recently used cache policy.
Claim: 17. The non-transitory computer readable medium of claim 13, wherein the execution statement is a memory access statement.
Claim: 18. The non-transitory computer readable medium of claim 13, wherein the loop-nest construct is an affine loop-nest.
Claim: 19. The non-transitory computer readable medium of claim 13, wherein the execution statement is a data transformation operation.
Claim: 20. The non-transitory computer readable medium of claim 13, wherein the predicate condition is a function of a loop variable of the loop-nest construct.
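Illustrative sketch (not part of the patent text): the hit/miss flow recited in the claims could look roughly like the Python below. It only illustrates the general idea and is not the patented implementation. The names `Statement`, `AnalysisCache`, `lookup_key`, `analyze`, and `process_loop_nest` are hypothetical. SHA-256 is an arbitrary choice of hash function, the control statements are modeled as plain strings, and `analyze` is a stand-in for a real affine analysis.

```python
import hashlib
from collections import OrderedDict
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """Hypothetical model of an execution statement inside a loop nest."""
    controls: tuple  # bounding control statements, outermost first
    body: str        # the statement itself, e.g. a memory access


class AnalysisCache:
    """Small least-recently-used cache mapping lookup keys to analysis results."""

    def __init__(self, capacity=128):
        self.capacity = capacity
        self._entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def flush(self):
        self._entries.clear()

    def get(self, key):
        # Returns None on a miss; this sketch assumes results are never None.
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._entries[key]
        self.misses += 1
        return None

    def put(self, key, result):
        self._entries[key] = result
        self._entries.move_to_end(key)
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict least recently used


def lookup_key(controls):
    """Hash the bounding control statements (assumed choice: SHA-256)."""
    joined = "\n".join(controls)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def analyze(stmt):
    """Placeholder for the expensive analysis; a real compiler would derive
    the iteration space and predicate inequalities here."""
    return {"execution_conditions": stmt.controls}


def process_loop_nest(statements, cache, analyze_fn=analyze):
    """Pair each statement with an analysis result, analyzing only on a miss."""
    cache.flush()  # the cache is flushed once per loop nest, as in claim 1
    paired = []
    for stmt in statements:
        key = lookup_key(stmt.controls)
        result = cache.get(key)
        if result is None:            # cache miss: analyze and store
            result = analyze_fn(stmt)
            cache.put(key, result)
        paired.append((stmt, result))  # a later pass would modify stmt
    return paired


nest = [
    Statement(("for i in 0..64", "for j in 0..64", "if i < 32"), "load A[i][j]"),
    Statement(("for i in 0..64", "for j in 0..64", "if i < 32"), "store B[i][j]"),
    Statement(("for i in 0..64",), "load C[i]"),
]
cache = AnalysisCache(capacity=8)
paired = process_loop_nest(nest, cache)
```

In this toy run the second statement is bounded by the same control statements as the first, so its lookup key matches and the stored result is reused. The third statement has different bounds and triggers a fresh analysis.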
Patent References Cited: 20160110171 April 2016 Bikshandi
20190042224 February 2019 Caballero De Gea
Other References: Y. Xing, J. Weng, Y. Wang, L. Sui, Y. Shan and Y. Wang, “An In-depth Comparison of Compilers for Deep Neural Networks on Hardware,” 2019 IEEE International Conference on Embedded Software and Systems (ICESS), Las Vegas, NV, USA, 2019, pp. 1-8, doi: 10.1109/ICESS.2019.8782480. (Year: 2019). cited by examiner
Primary Examiner: Bui, Hanh Thi-Minh
Attorney, Agent or Firm: Weaver Austin Villeneuve & Sampson LLP
Document Code: edspgr.11941383
Database: USPTO Patent Grants