Search results - (("Matrix transpose algorithm") OR ("Matrix transport algorithm"))
1
Padding Free Bank Conflict Resolution for CUDA-Based Matrix Transpose Algorithm
ISSN: 2211-7938, 2211-7946
Published: Dordrecht Springer Netherlands, 2014-01-01
Published in The International journal of networked and distributed computing (Online) (2014-01-01)
“… In this paper, two matrix transpose algorithms are proposed to alleviate the aforementioned issues of ensuring coalesced access and conflict free bank access …”
Journal Article
2
Communication efficient adaptive matrix transpose algorithm for FFT on symmetric multiprocessors
ISBN: 0780388089, 9780780388086
ISSN: 0094-2898
Published: IEEE, 2005
Published in Proceedings of the Thirty-Seventh Southeastern Symposium on System Theory, 2005. SSST '05 (2005)
“… In this paper, we propose an efficient algorithm (the adaptive matrix-transpose algorithm) for transposing matrices, which is based on all-to …”
Conference Proceeding
3
Restructuring and implementations of 2D matrix transpose algorithm using SSE4 vector instructions
Published: IEEE, 2015-10-01
Published in 2015 International Conference on Applied Research in Computer Science and Engineering (ICAR) (2015-10-01)
“… Current general-purpose processors are augmented with vector instructions that can process many elements of matrices and vectors in parallel. Transposing a …”
Conference Proceeding
4
Padding free bank conflict resolution for CUDA-based matrix transpose algorithm
Published: IEEE, 2014-06-01
Published in 15th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD) (2014-06-01)
“… In this paper, two matrix transpose algorithms are proposed to alleviate the aforementioned issues of ensuring coalesced access and conflict free bank access …”
Conference Proceeding
5
Parallel matrix transpose algorithms on distributed memory concurrent computers
ISSN: 0167-8191, 1872-7336
Published: Amsterdam Elsevier B.V, 1995-09-01
Published in Parallel computing (1995-09-01)
“… This paper describes parallel matrix transpose algorithms on distributed memory concurrent processors …”
Journal Article
6
Matrix transpose on meshes with buses
ISSN: 0743-7315, 1096-0848
Published: Elsevier Inc, 2016-10-01
Published in Journal of parallel and distributed computing (2016-10-01)
“… 0.45n for the number of steps required by any matrix transpose algorithm on an n×n mesh with buses. Next we present an algorithm which solves this problem in less …”
Journal Article
7
A 280 mV-to-1.1 V 256b Reconfigurable SIMD Vector Permutation Engine With 2-Dimensional Shuffle in 22 nm Tri-Gate CMOS
ISSN: 0018-9200, 1558-173X
Published: New York, NY IEEE, 2013-01-01
Published in IEEE journal of solid-state circuits (2013-01-01)
“… An ultra-low voltage reconfigurable 4-way to 32-way SIMD vector permutation engine is fabricated in 22 nm tri-gate bulk CMOS, consisting of a 32-entry × 256b …”
Journal Article, Conference Proceeding
8
Linear-time matrix transpose algorithms using vector register file with diagonal registers
ISBN: 0769509908, 9780769509907
ISSN: 1530-2075
Published: IEEE, 2001
Published in Proceedings 15th International Parallel and Distributed Processing Symposium. IPDPS 2001 (2001)
“… Matrix transpose operation (MT) is used frequently in many multimedia and high performance applications. Therefore, using a faster MT operation results in a …”
Conference Proceeding
9
Parallel matrix transpose algorithms on distributed memory concurrent computers
ISBN: 0818649801, 9780818649806
Published: IEEE Comput. Soc. Press, 1993
Published in Proceedings of the Scalable Parallel Libraries Conference, October 6-8, 1993, Mississippi State, Mississippi (1993)
“… This paper describes parallel matrix transpose algorithms on distributed memory concurrent processors …”
Conference Proceeding
10
A 280mV-to-1.1V 256b reconfigurable SIMD vector permutation engine with 2-dimensional shuffle in 22nm CMOS
ISBN: 1467303763, 9781467303767
ISSN: 0193-6530
Published: IEEE, 2012-02-01
Published in 2012 IEEE International Solid-State Circuits Conference (2012-02-01)
“… Energy-efficient SIMD permutation operations are key for maximizing high-performance microprocessor vector datapath utilization in multimedia, graphics, and …”
Conference Proceeding
11
A Sparse Matrix Fast Transpose Algorithm Based on Pseudo-Address
Published: IEEE, 2019-12-01
Published in 2019 International Conference on Intelligent Computing, Automation and Systems (ICICAS) (2019-12-01)
“… sparse matrix, mainly study the fast transpose algorithm of sparse matrix, and propose a new matrix transpose algorithm on it for the first time-the sparse matrix fast transpose algorithm of pseudo …”
Conference Proceeding
12
A simplified design strategy for mapping image processing algorithms on a SIMD torus
ISSN: 0304-3975, 1879-2294
Published: Elsevier B.V, 1995-04-03
Published in Theoretical computer science (1995-04-03)
“… A method is proposed to effectively realize large number of arbitrary, one-to-one, personalized, and concurrent communication between the PEs, by suitably repeating the matrix transpose algorithm …”
Journal Article
13
An O(n) Time-Complexity Matrix Transpose on Torus Array Processor
ISBN: 1457717964, 9781457717963
Published: IEEE, 2011-11-01
Published in 2011 Second International Conference on Networking and Computing (2011-11-01)
“… and an efficient matrix transpose algorithm can speed up many applications. In this paper, we propose a new algorithm for n x n matrix transposition on array processors connected in torus network …”
Conference Proceeding
14
A parallel cosmological hydrodynamics code
ISBN: 0897918541, 9780897918541
Published: Washington, DC, USA IEEE Computer Society, 1996-11-17
Published in Proceedings of the 1996 ACM/IEEE conference on Supercomputing (1996-11-17)
“… A new, flexible matrix transpose algorithm is used to interchange distributed and local dimensions of the mesh. Timing results from runs on an IBM SP2 supercomputer are given …”
Conference Proceeding
15
A Parallel Cosmological Hydrodynamics Code
ISBN: 0897918541, 9780897918541
Published: IEEE, 1996
Published in Supercomputing '96 conference proceedings: the International Conference on High Performance Computing and Communications: November 17-22, 1996, Pittsburgh, PA (1996)
“… , combining a mesh based Eulerian hydrodynamics code and a Particle-Mesh N-body code. A new, flexible matrix transpose algorithm is used to interchange distributed and local dimensions of the mesh …”
Conference Proceeding
16
Random Address Permute-Shift Technique for the Shared Memory on GPUs
ISSN: 0190-3918
Published: IEEE, 2014-09-01
Published in Proceedings of the International Conference on Parallel Processing (2014-09-01)
“… The Discrete Memory Machine (DMM) is a theoretical parallel computing model that captures the essence of memory access to the shared memory of a streaming …”
Conference Proceeding
17
The Random Address Shift to Reduce the Memory Access Congestion on the Discrete Memory Machine
ISSN: 2379-1888
Published: IEEE, 2013-12-01
Published in International Symposium on Computing and Networking (Online) (2013-12-01)
“… The Discrete Memory Machine (DMM) is a theoretical parallel computing model that captures the essence of memory access of the streaming multiprocessor on …”
Conference Proceeding

