An Open Data Service for Supporting Research in Machine Learning on Tokamak Data


Bibliographic details
Published in: IEEE Transactions on Plasma Science, Vol. 53, No. 9, pp. 2440-2449
Main authors: Jackson, Samuel; Khan, Saiful; Cummings, Nathan; Hodson, James; de Witt, Shaun; Pamela, Stanislas; Akers, Rob; Thiyagalingam, Jeyan
Format: Journal Article
Language: English
Publication details: IEEE, 1 September 2025
ISSN:0093-3813, 1939-9375
Description
Summary: The increasing complexity and volume of plasma fusion experimental data, coupled with the growing adoption of machine learning in fusion research, necessitate advanced and efficient data management solutions. We propose an open data service for fusion experiments operated by the UKAEA, designed to address the evolving needs of machine-learning-driven fusion research. Our system provides a framework to organize MAST, MAST Upgrade (MAST-U), and Joint European Torus (JET) experimental data in accordance with findability, accessibility, interoperability, and reuse (FAIR) principles, using distributed object storage for scalability and a relational database for efficient metadata indexing. In addition, it offers simplified abstractions through an application programming interface (API), facilitating seamless data access and integration with data analysis and machine learning workflows. Performance evaluation of metrics such as data load time and throughput, across varying numbers of parallel workers, demonstrates the data pipeline's optimization for efficient machine learning application development. Our solution significantly enhances support for data-driven research and machine learning applications in fusion by laying the groundwork for open, FAIR-compliant fusion data, which enables cross-machine analysis, prompts international collaboration, and potentially accelerates advancements in fusion energy research.
DOI:10.1109/TPS.2025.3583419