Modular matrix computations on multi-linear VLSI arrays

This paper describes the synthesis of modularly extensible matrix algorithms on multi-linear VLSI arrays with application to rectangular matrix multiplication. Multi-linear arrays differ from two-dimensional (2D) arrays in that one of the dimensions in the arrays is independent of the size of the pr...

Celý popis

Uložené v:
Podrobná bibliografia
Vydané v:Parallel computing Ročník 21; číslo 8; s. 1269 - 1284
Hlavný autor: Barada, H.R.
Médium: Journal Article
Jazyk:English
Vydavateľské údaje: Amsterdam Elsevier B.V 1995
Elsevier
Predmet:
ISSN:0167-8191, 1872-7336
On-line prístup:Získať plný text
Tagy: Pridať tag
Žiadne tagy, Buďte prvý, kto otaguje tento záznam!
Popis
Shrnutí:This paper describes the synthesis of modularly extensible matrix algorithms on multi-linear VLSI arrays with application to rectangular matrix multiplication. Multi-linear arrays differ from two-dimensional (2D) arrays in that one of the dimensions in the arrays is independent of the size of the problem and depends only on the I/O bandwidth of the host computer. They feature bounded clock skew (unlike 2D arrays) and higher throughput and communication bandwidth than linear arrays. A significant feature of the algorithms discussed in this paper is that they exhibit an interdependence between the number of processing elements (PEs) per row in the multi-linear array, the storage inside each PE, and the size of the problem. This interrelationship permits changes in one of these measures to be accommodated by altering one or more of the other measures. One particular benefit of this feature is that it enables the design of modularly extensible algorithms which can accommodate changes in the problem size by only changing the number of PEs per row in the array. Therefore, for a given host I/O bandwidth, larger problem sizes can be handled by cascading smaller multi-linear arrays, all of which have the same number of rows and are comprised of PEs with a fixed amount of local storage. A second feature of our algorithms is that their performance is also a function of the number of rows of PEs in the array which is in turn a function of the I/O bandwidth of the host. A third feature of our design is that the algorithms are controlled by tag bits inserted with the data and no control memory is required inside the PEs.
ISSN:0167-8191
1872-7336
DOI:10.1016/0167-8191(95)00020-O