An Open Data Service for Supporting Research in Machine Learning on Tokamak Data

Bibliographic Details
Published in: IEEE Transactions on Plasma Science, Vol. 53, No. 9, pp. 2440-2449
Main authors: Jackson, Samuel; Khan, Saiful; Cummings, Nathan; Hodson, James; de Witt, Shaun; Pamela, Stanislas; Akers, Rob; Thiyagalingam, Jeyan
Format: Journal Article
Language: English
Published: IEEE, 1 September 2025
ISSN: 0093-3813, 1939-9375
Description
Summary: The increasing complexity and volume of plasma fusion experimental data, coupled with the growing adoption of machine learning in fusion research, necessitate advanced and efficient data management solutions. We propose an open data service for fusion experiments operated by the UKAEA, designed to address the evolving needs of machine-learning-driven fusion research. Our system provides a framework to organize MAST, MAST Upgrade (MAST-U), and Joint European Torus (JET) experimental data in accordance with findability, accessibility, interoperability, and reuse (FAIR) principles, using distributed object storage for scalability and a relational database for efficient metadata indexing. In addition, it offers simplified abstractions through an application programming interface (API), facilitating seamless data access and integration with data analysis and machine learning workflows. Performance evaluation of metrics such as data load time and throughput, across varying numbers of parallel workers, demonstrates the data pipeline's optimization for efficient machine learning application development. Our solution significantly enhances support for data-driven research and machine learning applications in fusion by laying the groundwork for open, FAIR-compliant fusion data, which enables cross-machine analysis, prompts international collaboration, and potentially accelerates advancements in fusion energy research.
DOI: 10.1109/TPS.2025.3583419