Evaluating pre-trained Large Language Models on zero-shot prompts for parallelization of source code

Bibliographic details
Published in: The Journal of Systems and Software, Volume 230, Article 112543
Main authors: Yadav, Devansh; Mondal, Shouvick
Format: Journal Article
Language: English
Published: Elsevier Inc., 01.12.2025
ISSN: 0164-1212
Description
Summary: Large Language Models (LLMs) have become prominent in the software development life cycle, yet the generation of performant source code, particularly through automatic parallelization, remains underexplored. This study compares 23 pre-trained LLMs against the Intel C Compiler (icc), a state-of-the-art auto-parallelization tool, to evaluate their effectiveness in transforming sequential C source code into parallelized versions. Using 30 kernels from the PolyBench C benchmarks, we generated 667 parallelized code versions to assess LLMs' zero-shot parallelization capabilities. Our experiments reveal that LLMs can outperform icc in non-functional aspects such as speedup, surpassing icc's performance in 26.66% of cases. The best LLM-generated code achieved a 7.5× speedup compared to icc's 1.08×. However, only 90 of the 667 generated versions (13.5%) were error-free and functionally correct, underscoring significant reliability challenges. After filtering out versions with compilation errors or data races through detailed memory and threading analysis, notable performance gains were observed. Challenges include increased cache miss rates and branch misses at higher thread counts, indicating that simply adding threads does not ensure better performance. Optimizing memory access, managing thread interactions, and validating code correctness are critical for LLM-generated parallel code. Our findings demonstrate that, even without fine-tuning or advanced prompting techniques, pre-trained LLMs can compete with decades-old non-LLM compiler technology in zero-shot sequential-to-parallel code translation. This highlights their potential in automating code parallelization while emphasizing the need to address reliability and performance optimization challenges.

Highlights:
• We study source code parallelization tasks performed by pre-trained LLMs.
• We examine data race and memory issues in LLM-aided parallelization.
• We show LLMs perform competitively with compilers in zero-shot settings.
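To make the studied task concrete, below is a minimal sketch of the kind of sequential-to-parallel transformation the study evaluates. The kernel is a hypothetical loop nest in the style of the PolyBench C benchmarks, not one taken from the paper, and the abstract does not name the threading model or the prompts used, so the OpenMP pragma and the function names here are assumptions for illustration only.

    #include <omp.h>

    #define N 1024

    /* Hypothetical sequential kernel in the style of a PolyBench loop nest:
     * a dense matrix-vector product y = A * x. Not taken from the paper. */
    void kernel_seq(double A[N][N], double x[N], double y[N]) {
        for (int i = 0; i < N; i++) {
            y[i] = 0.0;
            for (int j = 0; j < N; j++)
                y[i] += A[i][j] * x[j];
        }
    }

    /* One plausible parallelization that an LLM prompted zero-shot (or icc's
     * auto-parallelizer) might emit: the outer-loop iterations are independent,
     * so they can be distributed across threads. Whether such a version compiles,
     * is free of data races, and actually runs faster than icc's output is the
     * kind of question the study measures. */
    void kernel_par(double A[N][N], double x[N], double y[N]) {
        #pragma omp parallel for schedule(static)
        for (int i = 0; i < N; i++) {
            double sum = 0.0;  /* accumulate locally; one store to y[i] per iteration */
            for (int j = 0; j < N; j++)
                sum += A[i][j] * x[j];
            y[i] = sum;
        }
    }

Comparing such variants against icc's auto-parallelized output (enabled with its -parallel option) over the 30 PolyBench kernels, while checking compilation, data-race freedom, and functional correctness, reflects at a high level the evaluation the abstract describes; the reported 7.5× best-case speedup and 13.5% correctness rate refer to generated variants of this kind.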
DOI: 10.1016/j.jss.2025.112543