Search Results - single‐instruction‐multiple‐threads pattern
1
Fully GPU-based electromagnetic transient simulation considering large-scale control systems for system-level studies
ISSN: 1751-8687, 1751-8695
Published: The Institution of Engineering and Technology 03.08.2017
Published in IET generation, transmission & distribution (03.08.2017)
“…As more generators and loads are integrated by power electronic converters with complicated controls, electromagnetic transients (EMTs) simulation becomes an…”
Journal Article -
2
A Quantitative Method to Data Reuse Patterns of SIMT Applications
ISSN: 1556-6056, 1556-6064
Published: New York IEEE 01.07.2016
Published in IEEE computer architecture letters (01.07.2016)
“… The emerging Single Instruction Multiple Threads (SIMT) processor adopts a programming model that is fundamentally disparate from conventional scalar processors…”
Journal Article -
3
Massively parallel differential evolution—pattern search optimization with graphics hardware acceleration: an investigation on bound constrained optimization problems
ISSN: 0925-5001, 1573-2916
Published: Boston Springer US 01.07.2011
Published in Journal of global optimization (01.07.2011)
“… In this paper, the classical DE was adapted in the data-parallel CPU-GPU heterogeneous computing platform featuring Single Instruction-Multiple Thread (SIMT) execution…”
Journal Article -
4
Cluster-based approach for improving graphics processing unit performance by inter streaming multiprocessors locality
ISSN: 1751-8601, 1751-861X, 2095-882X, 2589-0514
Published: Beijing The Institution of Engineering and Technology 01.09.2015
Published in Chronic diseases and translational medicine (01.09.2015)
“… As GPUs employ multithreading to hide latency, there is a small private data cache in each single instruction multiple thread (SIMT) core…”
Journal Article -
5
Parallel ant colony for nonlinear function optimization with graphics hardware acceleration
ISBN: 9781424427932, 1424427932
ISSN: 1062-922X
Published: IEEE 01.10.2009
Published in 2009 IEEE International Conference on Systems, Man and Cybernetics (01.10.2009)
“… `single instruction - multiple thread' (SIMT). The global optimal search of the ACO is enhanced by the classical local pattern search (PS) method…”
Conference Proceeding -
6
Contention-Aware Selective Caching to Mitigate Intra-Warp Contention on GPUs
Published: IEEE 01.07.2017
Published in 2017 16th International Symposium on Parallel and Distributed Computing (ISPDC) (01.07.2017)
“… However, the behavior and effect of the cache on GPUs are different from those on conventional processors due to the Single Instruction Multiple Thread (SIMT…”
Conference Proceeding -
7
GPU accelerated multilevel fast physical optics algorithm for radiation from non-planar apertures
Published: IEEE 01.11.2015
Published in 2015 IEEE International Conference on Microwaves, Communications, Antennas and Electronic Systems (COMCAS) (01.11.2015)
“…Acceleration of the multilevel physical optics (MLPO) algorithm using the single instruction multiple threads (SIMT…”
Conference Proceeding -
8
Nonlinear optimization with a massively parallel Evolution Strategy–Pattern Search algorithm on graphics hardware
ISSN: 1568-4946, 1872-9681
Published: Elsevier B.V 01.03.2011
Published in Applied soft computing (01.03.2011)
“… ‘Single Instruction Multiple Thread’ (SIMT). Evolution Strategy is a population-based evolutionary algorithm for solving complex optimization problems…”
Journal Article -
9
Accelerated parametric chamfer alignment using a parallel, pipelined GPU realization
ISSN: 1861-8200, 1861-8219
Published: Berlin/Heidelberg Springer Berlin Heidelberg 01.10.2019
Published in Journal of real-time image processing (01.10.2019)
“…Parametric chamfer alignment (PChA) is commonly employed for aligning an observed set of points with a corresponding set of reference points. PChA estimates…”
Journal Article -
10
gSoFa: Scalable Sparse Symbolic LU Factorization on GPUs
ISSN: 1045-9219, 1558-2183
Published: New York IEEE 01.04.2022
Published in IEEE transactions on parallel and distributed systems (01.04.2022)
“…Decomposing a matrix A…”
Journal Article -
11
cuTensor-Tubal: Efficient Primitives for Tubal-Rank Tensor Learning Operations on GPUs
ISSN: 1045-9219, 1558-2183
Published: New York IEEE 01.03.2020
Published in IEEE transactions on parallel and distributed systems (01.03.2020)
“…Tensors are the cornerstone data structures in high-performance computing, big data analysis and machine learning. However, tensor computations are…”
Journal Article -
12
Hardware Accelerators for Real-Time Face Recognition: A Survey
ISSN: 2169-3536
Published: Piscataway IEEE 2022
Published in IEEE access (2022)
“…Real-time face recognition has been of great interest in the last decade due to its wide and varied critical applications which include biometrics, security in…”
Journal Article -
13
Using SIMD and SIMT vectorization to evaluate sparse chemical kinetic Jacobian matrices and thermochemical source terms
ISSN: 0010-2180, 1556-2921
Published: New York Elsevier Inc 01.12.2018
Published in Combustion and flame (01.12.2018)
“…) and single-instruction, multiple thread (SIMT) paradigms. These are implemented in pyJac, an open-source, reproducible code generation…”
Journal Article -
14
Simultaneous branch and warp interweaving for sustained GPU performance
ISBN: 9781467304757, 1467304751
ISSN: 1063-6897
Published: IEEE 01.06.2012
Published in 2012 39th Annual International Symposium on Computer Architecture (ISCA) (01.06.2012)
“…Instruction Multiple-Thread (SIMT) micro-architectures implemented in Graphics Processing Units (GPUs) run fine-grained threads in lockstep by grouping them…”
Conference Proceeding -
15
Acceleration of Bilateral Filtering Algorithm for Manycore and Multicore Architectures
ISBN: 9781467325080, 1467325082
ISSN: 0190-3918
Published: IEEE 01.09.2012
Published in 2012 41st International Conference on Parallel Processing (01.09.2012)
“… patterns as per the computations to exploit special purpose instructions. We also propose optimizations pertinent to Nvidia's Compute Unified Device…”
Conference Proceeding -
16
Using SIMD and SIMT vectorization to evaluate sparse chemical kinetic Jacobian matrices and thermochemical source terms
ISSN: 2331-8422
Published: Ithaca Cornell University Library, arXiv.org 04.09.2018
Published in arXiv.org (04.09.2018)
“…) and single-instruction, multiple thread (SIMT) paradigms. These are implemented in pyJac, an open-source, reproducible code generation…”
Paper -
17
An efficient STT-RAM-based register file in GPU architectures
ISSN: 2153-6961
Published: IEEE 01.01.2015
Published in The 20th Asia and South Pacific Design Automation Conference (01.01.2015)
“…Modern GPGPUs employ a large register file (RF) to efficiently process heavily parallel threads in single instruction multiple thread (SIMT) fashion…”
Conference Proceeding -
18
Multigrid on GPU: Tackling Power Grid Analysis on parallel SIMT platforms
ISBN: 142442819X, 9781424428199
ISSN: 1092-3152
Published: IEEE 01.11.2008
Published in 2008 IEEE/ACM International Conference on Computer-Aided Design (01.11.2008)
“… For the first time, we show how to exploit recent massively parallel single-instruction multiple-thread (SIMT…”
Conference Proceeding -
19
Fast 1-itemset frequency count using CUDA
ISSN: 2159-3450
Published: IEEE 01.11.2016
Published in TENCON ... IEEE Region Ten Conference (01.11.2016)
“… Thus there is a need to speed-up this process. One of the techniques to speed-up the process is using the Single Instruction Multiple Thread (SIMT) architecture…”
Conference Proceeding -
20
GSoFa: Scalable Sparse Symbolic LU Factorization on GPUs
ISSN: 2331-8422
Published: Ithaca Cornell University Library, arXiv.org 09.05.2021
Published in arXiv.org (09.05.2021)
“…Decomposing matrix A into a lower matrix L and an upper matrix U, which is also known as LU decomposition, is an essential operation in numerical linear…”
Paper

