CQLLM: A Framework for Generating CodeQL Security Vulnerability Detection Code Based on Large Language Model.
| Title: | CQLLM: A Framework for Generating CodeQL Security Vulnerability Detection Code Based on Large Language Model. |
|---|---|
| Authors: | Wang, Le; Chen, Chan; Zhu, Junyi; Zhan, Rufeng; Han, Weihong |
| Source: | Applied Sciences (2076-3417); Jan2026, Vol. 16 Issue 1, p517, 17p |
| Subject terms: | COMPUTER security vulnerabilities; LANGUAGE models; COMPUTER software correctness; SOFTWARE frameworks; PENETRATION testing (Computer security); CODE generators |
| Abstract: | With the increasing complexity of software systems, the number of security vulnerabilities contained within software has risen accordingly. The shift-left security concept aims to detect and fix vulnerabilities during the software development cycle. While CodeQL stands as the premier static code analysis tool currently available, its high barrier to entry makes it difficult to meet the implementation requirements of shift-left security initiatives. Large language models (LLMs) offer potential assistance in QL code development, but the inherent complexity of code generation often leads to persistent issues such as syntactic inaccuracies and references to non-existent modules, which constrains their practical applicability in this domain. To address these challenges, this paper proposes CQLLM (CodeQL-enhanced Large Language Model), a novel framework that uses an LLM to automate the generation of CodeQL security vulnerability detection code. The framework is designed to improve both the efficiency and the accuracy of automated QL code generation, advancing static code analysis toward a more efficient and intelligent paradigm for vulnerability detection. First, retrieval-augmented generation (RAG) is used to search a vector database for dependency libraries and code snippets that are highly similar to the user's input, thereby constraining the model's generation process and preventing the import of invalid modules. Then, the user input and the knowledge chunks retrieved by RAG are fed into a fine-tuned LLM, which performs reasoning and generates QL code. By integrating external knowledge bases with the large model, the framework enhances the correctness and completeness of the generated code. Experimental results show that CQLLM significantly improves the executability of the generated QL code, with the execution success rate rising from 0.31% to 72.48%, outperforming the original model by a large margin. CQLLM also enhances the effectiveness of the generated results, achieving a CWE (Common Weakness Enumeration) coverage rate of 57.4% in vulnerability detection tasks, demonstrating its practical applicability in real-world vulnerability detection. [ABSTRACT FROM AUTHOR] |
| Copyright of Applied Sciences (2076-3417) is the property of MDPI and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.) | |
| Database: | Complementary Index |
| DOI: | 10.3390/app16010517 |
| Language: | English |
| Publication type: | Academic Journal |
| Accession number: | 190819838 |
| Permalink: | https://erproxy.cvtisr.sk/sfx/access?url=https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edb&AN=190819838 |
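
The two-step flow described in the abstract (retrieve similar library imports and snippets, then pass them to the model together with the user input) can be illustrated with a minimal sketch. This is not the authors' implementation: the in-memory `KNOWLEDGE_BASE`, the bag-of-words `embed` function, the prompt wording, and all function names are assumptions made for illustration, standing in for the vector database, embedding model, and fine-tuned LLM that the record mentions but does not specify.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge base of (description, QL import snippet) pairs.
# A real system would store embeddings of many such chunks in a vector database.
KNOWLEDGE_BASE = [
    ("python sql injection taint tracking from remote input to database query",
     "import python\nimport semmle.python.dataflow.new.TaintTracking"),
    ("java path traversal taint tracking from user input to file access",
     "import java\nimport semmle.code.java.dataflow.TaintTracking"),
    ("javascript cross-site scripting from request data to html response",
     "import javascript"),
]


def embed(text):
    """Toy embedding: lowercase bag-of-words counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query, k=1):
    """Return the k knowledge chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(KNOWLEDGE_BASE,
                    key=lambda item: cosine(q, embed(item[0])),
                    reverse=True)
    return ranked[:k]


def build_prompt(query, k=1):
    """Combine retrieved chunks with the user input into one model prompt."""
    context = "\n\n".join(snippet for _, snippet in retrieve(query, k))
    return (
        "Use only the modules shown in the retrieved context.\n"
        f"### Retrieved context\n{context}\n"
        f"### Task\n{query}\n"
    )


prompt = build_prompt("Write a CodeQL query that detects SQL injection in Python")
print(prompt)
```

Restricting the prompt to retrieved imports is what the abstract credits with preventing references to non-existent modules; the generation step itself (the fine-tuned LLM call) is deliberately left out of the sketch.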