Parallel computer system, parallel computing method, and program storage medium

Bibliographic Details
Title:	Parallel computer system, parallel computing method, and program storage medium
Patent Number:	10,013,393
Publication Date:	July 03, 2018
Appl. No:	15/137238
Application Filed:	April 25, 2016
Abstract:	A parallel computer system including a plurality of processors configured to perform LU factorization in parallel, the system is configured to cause each of the plurality of processors to execute processing including: generating a first panel by integrating a plurality of row panels among panels of a matrix to be subjected to the LU-factorization, the plurality of row panels being processed by the processor, generating a second panel by integrating a plurality of column panels among the panels of the matrix, the plurality of column panels being processed by the processor, and computing a matrix product of the first panel and the second panel. In parallel with the computation of the matrix product, each processor is configured to receive or transmit a column panel to be used for computation of a subsequent matrix product from or to another processor among the plurality of processors.
Inventors:	FUJITSU LIMITED (Kawasaki-shi, Kanagawa, JP)
Assignees:	FUJITSU LIMITED (Kawasaki, JP)
Claim:	1. A non-transitory computer-readable storage medium that stores a program causing a first processor among a plurality of processors configured to perform LU factorization in parallel, to execute processing comprising: generating a first panel by integrating a plurality of row panels among panels of a local array of a matrix to be subjected to the LU-factorization, the plurality of row panels being processed by the first processor; generating a second panel by integrating a plurality of column panels among the panels of the local array, the plurality of column panels being processed by the first processor; and computing a matrix product of the first panel and the second panel; wherein the matrix is composed of a plurality of blocks which are distributed to the plurality of processors, and the blocks distributed to each of the plurality of processors form the local array.
Claim:	2. The storage medium according to claim 1 , wherein in the computing the matrix product, communication processing is executed, in parallel with the computation of the matrix product, to receive or transmit a column panel to be used for computation of a subsequent matrix product from or to another processor among the plurality of processors.
Claim:	3. The storage medium according to claim 2 , wherein in the computing the matrix product, the computation of the matrix product and the communication processing are performed in batches.
Claim:	4. The storage medium according to claim 1 , wherein the program further causes the first processor to execute processing of computing the matrix product using a head block of a column panel with the smallest column number among the plurality of column panels and a row panel with the smallest row number among the plurality of row panels if lengths in a column direction of the plurality of column panels are different.
Claim:	5. The storage medium according to claim 1 , wherein the program further causes an exchange of rows to be executed for a column panel with the smallest column number among the plurality of column panels.
Claim:	6. A parallel computer system comprising a plurality of processors configured to perform LU factorization in parallel, the parallel computer system being configured to cause each of the plurality of processors to execute processing comprising: generating a first panel by integrating a plurality of row panels among panels of a local array of a matrix to be subjected to the LU-factorization, the plurality of row panels being processed by the processor, generating a second panel by integrating a plurality of column panels among the panels of the local array, the plurality of column panels being processed by the processor, and computing a matrix product of the first panel and the second panel; wherein the matrix is composed of a plurality of blocks which are distributed to the plurality of processors, and the blocks distributed to each of the plurality of processors form the local array.
Claim:	7. A parallel computing method of causing each of a plurality of processors configured to perform LU factorization in parallel, to execute processing comprising: generating a first panel by integrating a plurality of row panels among panels of a local array of a matrix to be subjected to the LU-factorization, the plurality of row panels being processed by the processor, generating a second panel by integrating a plurality of column panels among the panels of the local array, the plurality of column panels being processed by the processor, and computing a matrix product of the first panel and the second panel; wherein the matrix is composed of a plurality of blocks which are distributed to the plurality of processors, and the blocks distributed to each of the plurality of processors form the local array.
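Illustrative note (not part of the patent record): the claims describe integrating several column panels and several row panels and then computing a single matrix product. The sketch below shows only the algebraic identity that makes this possible. It assumes the column panels are joined side by side and the row panels are stacked, so that one product of the integrated panels equals the sum of the per-panel products. The helper names (`matmul`, `hstack`, `vstack`, `add`) and the sample data are hypothetical. The block-cyclic distribution, pivoting, and communication overlap described in the claims are not modeled.

```python
def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "inner dimensions must match"
    cols = len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(len(a))]


def hstack(panels):
    """Integrate column panels by placing them side by side."""
    return [sum((p[i] for p in panels), []) for i in range(len(panels[0]))]


def vstack(panels):
    """Integrate row panels by stacking them on top of each other."""
    return [list(row) for p in panels for row in p]


def add(a, b):
    """Element-wise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Hypothetical sample data: two 3x2 column panels and two 2x3 row panels.
column_panels = [
    [[1, 2], [3, 4], [5, 6]],
    [[7, 8], [9, 10], [11, 12]],
]
row_panels = [
    [[1, 0, 2], [0, 1, 3]],
    [[2, 1, 0], [1, 2, 1]],
]

# One small product per panel pair, accumulated afterwards.
separate = add(matmul(column_panels[0], row_panels[0]),
               matmul(column_panels[1], row_panels[1]))

# One larger product of the integrated panels.
integrated = matmul(hstack(column_panels), vstack(row_panels))

assert separate == integrated
```

Integer entries are used so that the two results compare exactly. With floating-point data the results could differ by rounding, and a tolerance-based comparison would be needed.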
Patent References Cited:	2004/0193841, September 2004, Nakanishi; 2006/0064452, March 2006, Nakanishi; 2009/0300091, December 2009, Brokenshire; 2009/0319592, December 2009, Nakanishi; 2000-339295, December 2000; 2006-85619, March 2006; 2008-176738, July 2008; 2008/136045, November 2008
Other References:	“HPL—A Portable Implementation of the High-Performance Linpack Benchmark for Distributed-Memory Computers”, Innovative Computing Laboratory, 1 pp., Sep. 27, 2000, http://www.netlib.org/benchmark/hpl_oldest/. cited by applicant
Primary Examiner:	Ngo, Chuong D
Attorney, Agent or Firm:	Staas & Halsey LLP
Document Code:	edspgr.10013393
Database:	USPTO Patent Grants
