Ad Hoc File Systems for High-Performance Computing

Bibliographic details
Published in: Journal of computer science and technology, Vol. 35, No. 1, pp. 4-26
Main authors: Brinkmann, André; Mohror, Kathryn; Yu, Weikuan; Carns, Philip; Cortes, Toni; Klasky, Scott A.; Miranda, Alberto; Pfreundt, Franz-Josef; Ross, Robert B.; Vef, Marc-André
Format: Journal Article
Language: English
Published: New York: Springer US, 01.01.2020
Springer
Springer Nature B.V
Springer Nature
Author affiliations:
Zentrum für Datenverarbeitung, Johannes Gutenberg University Mainz, Mainz 55128, Germany
Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, CA 94550, U.S.A.
Department of Computer Science, Florida State University, Tallahassee, FL 32306, U.S.A.
Mathematics and Computer Science Division, Argonne National Laboratory, Lemont, IL 60439, U.S.A.
Department of Computer Architecture, Universitat Politecnica de Catalunya, Barcelona 08034, Spain
Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, U.S.A.
Computer Science Department, Barcelona Supercomputing Center, Barcelona 08034, Spain
Fraunhofer Institute for Industrial Mathematics ITWM, Fraunhofer-Platz 1, Kaiserslautern 67663, Germany
ISSN: 1000-9000, 1860-4749
Description
Summary: Storage backends of parallel compute clusters are still based mostly on magnetic disks, while newer and faster storage technologies such as flash-based SSDs or non-volatile random access memory (NVRAM) are deployed within compute nodes. Including these new storage technologies into scientific workflows is unfortunately today a mostly manual task, and most scientists therefore do not take advantage of the faster storage media. One approach to systematically include node-local SSDs or NVRAMs into scientific workflows is to deploy ad hoc file systems over a set of compute nodes, which serve as temporary storage systems for single applications or longer-running campaigns. This paper presents results from the Dagstuhl Seminar 17202 “Challenges and Opportunities of User-Level File Systems for HPC” and discusses application scenarios as well as design strategies for ad hoc file systems using node-local storage media. The discussion includes open research questions, such as how to couple ad hoc file systems with the batch scheduling environment and how to schedule stage-in and stage-out processes of data between the storage backend and the ad hoc file systems. Also presented are strategies to build ad hoc file systems by using reusable components for networking and how to improve storage device compatibility. Various interfaces and semantics are presented, for example those used by the three ad hoc file systems BeeOND, GekkoFS, and BurstFS. Their presentation covers a range from file systems running in production to cutting-edge research focusing on reaching the performance limits of the underlying devices.
Bibliography:
Spanish Ministry of Science and Innovation (MICINN)
European Union (EU)
AC02-06CH11357; 1561041; 1564647; 1744336; 1763547; 1822737; 2014-SGR-1051; TIN2015-65316; 671591; AC52-07NA27344
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR)
German Research Foundation (DFG)
LLNL-JRNL-779789
National Science Foundation (NSF)
USDOE National Nuclear Security Administration (NNSA)
DOI: 10.1007/s11390-020-9801-1