An Instruction Cache Architecture for Parallel Execution of Java Threads

Detailed bibliography
Title:	An Instruction Cache Architecture for Parallel Execution of Java Threads
Authors:	Wanming Chu, Yamin Li
Contributors:	The Pennsylvania State University CiteSeerX Archives
Source:	http://cis.k.hosei.ac.jp/~yamin/papers/PDCAT03-cache.pdf
Collection:	CiteSeerX
Subjects:	Cache, Java virtual machine, Java processor, instruction level parallelism, thread level parallelism, multithreading, performance evaluation, trace-driven simulation
Description:	Designing a Java processor that supports horizontal multithreading has become more attractive as network computing gains importance. Unlike traditional superscalar processors, which issue multiple instructions from a single instruction stream to exploit instruction-level parallelism (ILP), horizontal multithreading Java processors issue multiple instructions (bytecodes) from multiple threads in parallel to exploit not only ILP but also thread-level parallelism (TLP). Such processors have multiple dispatch slots and require the instruction fetch unit to supply instructions at a much higher bandwidth than superscalar processors do. Using a traditional superscalar cache architecture in a horizontal multithreading Java processor results in a high cache miss ratio caused by interference among the threads. This paper investigates a multibank instruction cache architecture for horizontal multithreading Java processors that meets the requirement for high instruction fetch bandwidth. To evaluate both the cache performance and the overall performance of the horizontal multithreading Java processor, we developed a trace-driven simulator. The simulator consists of a trace generator, which generates Java bytecode execution traces, and an architectural simulator, which reads the traces and evaluates the performance of the instruction cache and the overall performance of the Java processor. Our simulation results show that the performance improvements come from the low cache miss ratio and the high instruction fetch bandwidth of the proposed cache architecture. The IPC (instructions per cycle) is about 19 when the number of slots and the number of banks are both 8, about 5 times that of a one-bank cache.
Document type:	text
File description:	application/pdf
Language:	English
Relation:	http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.78.6909
Availability:	http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.78.6909 http://cis.k.hosei.ac.jp/~yamin/papers/PDCAT03-cache.pdf
Rights:	Metadata may be used without restrictions as long as the oai identifier remains attached to it.
Accession number:	edsbas.541DB3E4
Database:	BASE
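
The abstract in this record describes a trace-driven evaluation of a multibank instruction cache in which interference among threads raises the miss ratio of a single shared cache. The following is a minimal sketch of that idea, not the authors' simulator. Everything specific in it is an assumption made for illustration: the direct-mapped organisation, the line size, the rule that thread t is steered to bank t % num_banks, the round-robin replay order, and the synthetic loop traces (the hypothetical helper loop_trace) that stand in for real Java bytecode traces.

```python
from dataclasses import dataclass, field


@dataclass
class Bank:
    """One direct-mapped cache bank (the organisation is an assumption)."""
    num_lines: int
    line_size: int
    tags: dict = field(default_factory=dict)  # line index -> (thread, block)
    hits: int = 0
    misses: int = 0

    def access(self, thread_id: int, address: int) -> bool:
        """Look up one instruction address; return True on a hit."""
        block = address // self.line_size
        index = block % self.num_lines
        key = (thread_id, block)
        if self.tags.get(index) == key:
            self.hits += 1
            return True
        self.tags[index] = key  # miss: the new line evicts the old one
        self.misses += 1
        return False


def simulate(traces, num_banks, total_lines=64, line_size=16):
    """Replay per-thread address traces round-robin; return the miss ratio.

    Assumptions: thread t uses bank t % num_banks, and the total capacity
    (total_lines) is split evenly between the banks.
    """
    banks = [Bank(total_lines // num_banks, line_size) for _ in range(num_banks)]
    longest = max(len(trace) for trace in traces)
    for step in range(longest):
        for thread_id, trace in enumerate(traces):
            if step < len(trace):
                banks[thread_id % num_banks].access(thread_id, trace[step])
    hits = sum(bank.hits for bank in banks)
    misses = sum(bank.misses for bank in banks)
    return misses / (hits + misses)


def loop_trace(base, length, repeats):
    """Hypothetical stand-in for a bytecode trace: one loop run several times."""
    return [base + offset for offset in range(length)] * repeats


if __name__ == "__main__":
    # Four threads whose loops deliberately alias in a shared 64-line cache.
    traces = [loop_trace(t * 1024, 256, 4) for t in range(4)]
    for num_banks in (1, 4):
        print(num_banks, "bank(s): miss ratio", simulate(traces, num_banks))
```

The traces are constructed so that all four threads compete for the same cache lines when one bank is shared, which is the worst case for interference. Real traces would show a smaller gap, and this sketch says nothing about fetch bandwidth or IPC.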

