Software acceleration of multi-user MIMO uplink detection on GPU

Detailed bibliography
Title: Software acceleration of multi-user MIMO uplink detection on GPU
Authors: Nada, Ali; Ali, Hazem Ismail; Liu, Liang; Alkabani, Yousra
Contributors: Lund University, Profile areas and other strong research environments, Strategic research areas (SRA), ELLIIT: the Linköping-Lund initiative on IT and mobile communication (Originator); Lund University, Faculty of Engineering, LTH, LTH Profile areas, LTH Profile Area: AI and Digitalization (Originator); Lund University, Faculty of Engineering, LTH, Competence centers, LTH, NEXTG2COM – a Vinnova Competence Centre in Advanced Digitalisation (Originator)
Source: Parallel Computing. 125
Subjects: Engineering and Technology, Electrical Engineering, Electronic Engineering, Information Engineering, Telecommunications
Description: This paper explores GPU-accelerated block-wise QR and Cholesky decompositions for massive multiple-input multiple-output (MIMO) uplink detection. Three algorithms are evaluated: zero-forcing (ZF) with block Cholesky decomposition, ZF with block QR decomposition (QRD), and minimum mean square error (MMSE) with block Cholesky decomposition. Only the last of these had been explored previously, and that study used standard Cholesky decomposition. Our approach achieves an 11% improvement over the previous GPU-accelerated MMSE study. Through performance analysis, we observe a trade-off between precision and execution time. Reducing precision from FP64 to FP32 improves execution time but increases bit error rate (BER); for ZF-based QRD, execution time drops from 2.04 μs to 1.24 μs for a 128 × 8 MIMO size. The study also highlights that larger MIMO sizes, particularly 2048 × 32, require GPUs to fully utilize their computational and memory capabilities, especially under FP64 precision. In contrast, smaller matrices are compute-bound. Our results recommend GPUs for larger MIMO sizes, as they offer the parallelism and memory resources necessary to efficiently handle the computational demands of next-generation networks. This work paves the way for scalable, GPU-based massive MIMO uplink detection systems.
Access URL: https://doi.org/10.1016/j.parco.2025.103150
Database: SwePub
ISSN: 0167-8191
DOI: 10.1016/j.parco.2025.103150
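
Illustrative sketch: the three detectors named in the abstract each reduce to a small linear solve. The pure-Python code below is a minimal CPU sketch of that idea and is not the authors' implementation. It uses standard (unblocked) Cholesky and Gram-Schmidt QR rather than the block-wise GPU variants the paper studies, and it assumes the textbook forms x = (H^H H)^-1 H^H y for ZF and x = (H^H H + σ² I)^-1 H^H y for MMSE with unit symbol energy. All function names and the `noise_var` parameter are hypothetical.

```python
# Illustrative only: unblocked, CPU, pure-Python versions of the detectors
# named in the abstract. Assumed textbook forms, not the paper's GPU kernels.

def conj_transpose(A):
    return [[A[i][j].conjugate() for i in range(len(A))] for j in range(len(A[0]))]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def cholesky(G):
    """Lower-triangular L with G = L L^H, for Hermitian positive-definite G."""
    n = len(G)
    L = [[0j] * n for _ in range(n)]
    for j in range(n):
        d = G[j][j].real - sum(abs(L[j][k]) ** 2 for k in range(j))
        L[j][j] = complex(d ** 0.5)
        for i in range(j + 1, n):
            s = sum(L[i][k] * L[j][k].conjugate() for k in range(j))
            L[i][j] = (G[i][j] - s) / L[j][j]
    return L

def forward_sub(L, b):
    """Solve L x = b for lower-triangular L."""
    x = []
    for i in range(len(L)):
        x.append((b[i] - sum(L[i][k] * x[k] for k in range(i))) / L[i][i])
    return x

def back_sub(U, b):
    """Solve U x = b for upper-triangular U."""
    n = len(U)
    x = [0j] * n
    for i in reversed(range(n)):
        x[i] = (b[i] - sum(U[i][k] * x[k] for k in range(i + 1, n))) / U[i][i]
    return x

def detect_cholesky(H, y, noise_var=0.0):
    """ZF (noise_var == 0) or MMSE (noise_var > 0) via Cholesky of the Gram matrix."""
    Hh = conj_transpose(H)
    G = matmul(Hh, H)
    for i in range(len(G)):
        G[i][i] += noise_var          # MMSE regularization (assumes unit symbol energy)
    L = cholesky(G)
    z = forward_sub(L, matvec(Hh, y))
    return back_sub(conj_transpose(L), z)

def detect_zf_qr(H, y):
    """ZF via thin QR (modified Gram-Schmidt): H = Q R, then solve R x = Q^H y."""
    m, n = len(H), len(H[0])
    Q = []                            # orthonormal columns, stored as lists
    R = [[0j] * n for _ in range(n)]
    for j in range(n):
        v = [H[i][j] for i in range(m)]
        for k in range(j):
            R[k][j] = sum(q.conjugate() * x for q, x in zip(Q[k], v))
            v = [x - R[k][j] * q for x, q in zip(v, Q[k])]
        norm = sum(abs(x) ** 2 for x in v) ** 0.5
        R[j][j] = complex(norm)
        Q.append([x / norm for x in v])
    qhy = [sum(q.conjugate() * yi for q, yi in zip(Q[k], y)) for k in range(n)]
    return back_sub(R, qhy)

# Toy 4-antenna, 2-user example with a noiseless received vector.
H = [[1 + 1j, 2 - 1j], [0.5j, 1 + 0j], [2 + 0j, -1 + 1j], [1 - 1j, 0.5 + 0.5j]]
x_true = [1 + 1j, -1 + 1j]
y = matvec(H, x_true)
x_zf_chol = detect_cholesky(H, y)
x_zf_qr = detect_zf_qr(H, y)
x_mmse = detect_cholesky(H, y, noise_var=1e-6)
```

In this noiseless toy case both ZF variants recover the transmitted symbols up to rounding, while the MMSE estimate is slightly shrunk by the regularization term. The sizes quoted in the abstract (128 × 8 up to 2048 × 32) are where batching and block-wise GPU kernels become relevant; this sketch does not attempt to model that.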