Run-time parallelization of loops in computer programs by access patterns
| Title: | Run-time parallelization of loops in computer programs by access patterns |
|---|---|
| Patent Number: | 7,171,544 |
| Publication Date: | January 30, 2007 |
| Appl. No: | 10/736343 |
| Application Filed: | December 15, 2003 |
| Abstract: | Parallelization of loops is performed for loops having indirect loop index variables and embedded conditional statements in the loop body. Loops having any finite number of array variables in the loop body, and any finite number of indirect loop index variables can be parallelized. There are two particular limitations of the described techniques: (i) that there are no cross-iteration dependencies in the loop other than through the indirect loop index variables; and (ii) that the loop index variables (either direct or indirect) are not redefined in the loop body. |
| Inventors: | Bera, Rajendra K. (Bangalore, IN) |
| Assignees: | International Business Machines Corporation (Armonk, NY, US) |
| Claim: | 1. A method for executing, by a processor of a computer system, a set of program instructions for a loop, wherein the method comprises the steps of: associating a unique proxy value with each indirect loop index variable of the loop, wherein each unique proxy value is a different prime number; calculating, for each iteration of the loop, an indirectly indexed access pattern based upon the unique values; determining whether cross-iteration dependencies exist between any two iterations of the loop based upon the indirectly indexed access patterns of the two iterations; scheduling the program instructions of the loop across iterations into waves based on the cross-iteration dependencies found; and executing the waves. |
| Claim: | 2. The method of claim 1, wherein the indirectly indexed access pattern for each iteration is calculated by forming the product of the unique proxy values associated with each of the indirect loop index variables on a decision path of the loop for that iteration. |
| Claim: | 3. The method of claim 2, wherein the determining of cross-iteration dependencies is accomplished by finding the greatest common divisor between two indirectly indexed access patterns, wherein a greatest common divisor of 1 indicates that no dependencies exist between the two respective iterations. |
| Claim: | 4. The method of claim 2, wherein an indirect indexed access pattern is calculated for each possible decision path of the loop. |
| Claim: | 5. The method of claim 1, the indirectly indexed access patterns for the respective iterations of the loop have pattern values, and wherein for each iteration the pattern values of the indirectly indexed access pattern do not exceed three in number regardless of how many statements are in the loop. |
| Claim: | 6. A computer program product for executing, by a processor of a computer system, a set of program instructions for a loop, the computer program product comprising computer software stored on a tangible, computer-readable storage medium for performing the steps of: associating a unique proxy value with each indirect loop index variable of the loop, wherein each unique proxy value is a different prime number; calculating, for each iteration of the loop, a plurality of indirectly indexed access patterns based upon the unique values; determining whether cross-iteration dependencies exist between any two iterations of the loop based upon the indirectly indexed access patterns of the two iterations; and scheduling the program instructions of the loop across iterations into waves based on the cross-iteration dependencies found, wherein the waves are executed responsive to the scheduling. |
| Claim: | 7. The computer program product of claim 6, wherein the indirectly indexed access patterns for each iteration are calculated by forming the product of the unique proxy values associated with each of the indirect loop index variables on a decision path of the loop for that iteration. |
| Claim: | 8. The computer program product of claim 7, wherein the determining of cross-iteration dependencies is accomplished by finding the greatest common divisor between two indirectly indexed access patterns, wherein a greatest common divisor of 1 indicates that no dependencies exist between the two respective iterations. |
| Claim: | 9. The computer program product of claim 7, wherein an indirect indexed access pattern is calculated for each possible decision path of the loop. |
| Claim: | 10. The computer program product of claim 6, the indirectly indexed access patterns for the respective iterations of the loop have pattern values, and wherein for each iteration the pattern values of the indirectly indexed access pattern do not exceed three in number regardless of how many statements are in the loop. |
| Claim: | 11. A computer system having program instructions stored on a computer-readable medium for executing, by a processor of the computer system, a set of the program instructions for a loop, wherein the executing comprises performing the steps of: associating a unique proxy value with each indirect loop index variable of the loop, wherein each unique proxy value is a different prime number; calculating, for each iteration of the loop, a plurality of indirectly indexed access patterns based upon the unique values; determining whether cross-iteration dependencies exist between any two iterations of the loop based upon the indirectly indexed access patterns of the two iterations; and scheduling the program instructions of the loop across iterations into waves based on the cross-iteration dependencies found, wherein the waves are executed responsive to the scheduling. |
| Claim: | 12. The computer program product of claim 11, wherein the indirectly indexed access patterns for each iteration are calculated by forming the product of the unique proxy values associated with each of the indirect loop index variables on a decision path of the loop for that iteration. |
| Claim: | 13. The computer program product of claim 12, wherein the determining of cross-iteration dependencies is accomplished by finding the greatest common divisor between two indirectly indexed access patterns, wherein a greatest common divisor of 1 indicates that no dependencies exist between the two respective iterations. |
| Claim: | 14. The computer program product of claim 12, wherein an indirect indexed access pattern is calculated for each possible decision path of the loop. |
| Claim: | 15. The computer program product of claim 11, the indirectly indexed access patterns for the respective iterations of the loop have pattern values, and wherein for each iteration the pattern values of the indirectly indexed access pattern do not exceed three in number regardless of how many statements are in the loop. |
| Current U.S. Class: | 712/216 |
| Patent References Cited: | 5842022 November 1998 Nakahira et al. |
| Other References: | Huang, Tsung-Chuan and Po-Hsueh Hsu. “A Practical run-time technique for exploiting loop-level parallelism.” Journal of Systems and Software, 54, 2000. p. 259-271 (cited by examiner); Huang, Tsung-Chuan and Cheng-Ming Yang. “Non-linear array data dependence test.” Journal of Systems and Software, 57. Elsevier: 2001. p. 145-154 (cited by examiner); Hwang, J.J. et al. “A New Access Control Method Using Prime Factorisation.” The Computer Journal, vol. 35, No. 1, 1992. p. 16-20 (cited by examiner); Tanenbaum, Andrew S. Structured Computer Organization. Second Ed. Prentice: 1984. p. 11-12 & 10 (cited by examiner); Blume, William and Rudolf Eigenmann. “The Range Test: A Dependence Test for Symbolic, Non-linear Expressions.” IEEE: 1994. p. 528-537 (cited by examiner); Koblitz, Neal. Algebraic Aspects of Cryptography. Springer: 1999. p. 28 (cited by examiner); U.S. Appl. No. 09/597,478, filed Jun. 20, 2000, International Business Machines Corporation (cited by other) |
| Assistant Examiner: | Fiegle, Ryan |
| Primary Examiner: | Chan, Eddie |
| Attorney, Agent or Firm: | England, Anthony V. S. |
| Accession Number: | edspgr.07171544 |
| Database: | USPTO Patent Grants |
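
The steps recited in claims 1 to 3 (prime proxy values, access patterns formed as products, a greatest-common-divisor test, and scheduling into waves) can be illustrated with a minimal Python sketch. It is not the patented implementation. It rests on several assumptions that the record above does not confirm: that a distinct prime is assigned to each distinct value the indirect index variables take at run time, that each iteration has a single decision path, and that an iteration is placed one wave after the latest earlier iteration it conflicts with. The names `primes`, `access_patterns`, `schedule_waves`, and the `accesses` input format are hypothetical.

```python
from itertools import count
from math import gcd, prod


def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for a small sketch)."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n


def access_patterns(accesses):
    """Return one product of primes per iteration.

    accesses[i] lists the array positions that iteration i reaches through
    its indirect index variables (assumed input format, not from the patent).
    """
    gen = primes()
    proxy = {}  # index value -> its unique prime proxy (assumption: per value)
    patterns = []
    for touched in accesses:
        for value in touched:
            if value not in proxy:
                proxy[value] = next(gen)
        # Use a set so a position touched twice contributes its prime once.
        patterns.append(prod(proxy[v] for v in set(touched)))
    return patterns


def schedule_waves(patterns):
    """Group iteration numbers into waves that may run concurrently.

    Two iterations conflict when the GCD of their patterns is not 1.
    Each iteration goes one wave after the latest earlier conflicting one
    (assumed scheduling policy).
    """
    wave_of = []
    for i, s_i in enumerate(patterns):
        deps = [wave_of[j] for j in range(i) if gcd(s_i, patterns[j]) != 1]
        wave_of.append(max(deps, default=-1) + 1)
    waves = [[] for _ in range(max(wave_of, default=-1) + 1)]
    for i, w in enumerate(wave_of):
        waves[w].append(i)
    return waves


# Five iterations, each touching two positions through indirect indices.
accesses = [[0, 3], [1, 4], [3, 5], [1, 2], [5, 0]]
patterns = access_patterns(accesses)
print(patterns)                  # [6, 35, 33, 65, 22]
print(schedule_waves(patterns))  # [[0, 1], [2, 3], [4]]
```

In this example iterations 0 and 1 share no positions (GCD of 6 and 35 is 1) and land in the same wave, while iteration 4 shares position 5 with iteration 2 (GCD of 22 and 33 is 11) and must wait for it. Because the products grow quickly with the number of distinct positions, the sketch relies on Python's arbitrary-precision integers.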