Scalable Thread-Safety Analysis of Java Classes with CodeQL

Saved in:
Bibliographic Details
Title: Scalable Thread-Safety Analysis of Java Classes with CodeQL
Authors: Jåtten, Bjørnar Haugstad, Jørgensen, Simon Boye, Petersen, Rasmus, Pardo, Raúl
Publication Year: 2025
Collection: ArXiv.org (Cornell University Library)
Subject Terms: Software Engineering
Description: In object-oriented languages software developers rely on thread-safe classes to implement concurrent applications. However, determining whether a class is thread-safe is a challenging task. This paper presents a highly scalable method to analyze thread-safety in Java classes. We provide a definition of thread-safety for Java classes founded on the correctness principle of the Java memory model, data race freedom. We devise a set of properties for Java classes that are proven to ensure thread-safety. We encode these properties in the static analysis tool CodeQL to automatically analyze Java source code. We perform an evaluation on the top 1000 GitHub repositories. The evaluation comprises 3632865 Java classes; with 1992 classes annotated as @ThreadSafe from 71 repositories. These repositories include highly popular software such as Apache Flink (24.6k stars), Facebook Fresco (17.1k stars), PrestoDB (16.2k starts), and gRPC (11.6k starts). Our queries detected thousands of thread-safety errors. The running time of our queries is below 2 minutes for repositories up to 200k lines of code, 20k methods, 6000 fields, and 1200 classes. We have submitted a selection of detected concurrency errors as PRs, and developers positively reacted to these PRs. We have submitted our CodeQL queries to the main CodeQL repository, and they are currently in the process of becoming available as part of GitHub actions. The results demonstrate the applicability and scalability of our method to analyze thread-safety in real-world code bases.
Document Type: text
Language: unknown
Relation: http://arxiv.org/abs/2509.02022
Availability: http://arxiv.org/abs/2509.02022
Accession Number: edsbas.7E7AC643
Database: BASE
Description
Abstract:In object-oriented languages software developers rely on thread-safe classes to implement concurrent applications. However, determining whether a class is thread-safe is a challenging task. This paper presents a highly scalable method to analyze thread-safety in Java classes. We provide a definition of thread-safety for Java classes founded on the correctness principle of the Java memory model, data race freedom. We devise a set of properties for Java classes that are proven to ensure thread-safety. We encode these properties in the static analysis tool CodeQL to automatically analyze Java source code. We perform an evaluation on the top 1000 GitHub repositories. The evaluation comprises 3632865 Java classes; with 1992 classes annotated as @ThreadSafe from 71 repositories. These repositories include highly popular software such as Apache Flink (24.6k stars), Facebook Fresco (17.1k stars), PrestoDB (16.2k starts), and gRPC (11.6k starts). Our queries detected thousands of thread-safety errors. The running time of our queries is below 2 minutes for repositories up to 200k lines of code, 20k methods, 6000 fields, and 1200 classes. We have submitted a selection of detected concurrency errors as PRs, and developers positively reacted to these PRs. We have submitted our CodeQL queries to the main CodeQL repository, and they are currently in the process of becoming available as part of GitHub actions. The results demonstrate the applicability and scalability of our method to analyze thread-safety in real-world code bases.