LPCC: Hierarchical Persistent Client Caching for Lustre

Bibliographic details
Published in:	SC19: International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1-14
Main authors:	Qian, Yingjin; Li, Xi; Ihara, Shuichi; Dilger, Andreas; Thomaz, Carlos; Wang, Shilong; Cheng, Wen; Li, Chunyan; Zeng, Lingfang; Wang, Fang; Feng, Dan; Süß, Tim; Brinkmann, André
Format:	Conference paper
Language:	English
Published:	ACM, 17 November 2019
Subjects:	File systems; hierarchical storage management; High performance computing; intelligent prefetching; Layout; Lustre; Media; persistent caching; Prefetching; Throughput
ISSN:	2167-4337

Description
Summary:	Most high-performance computing (HPC) clusters use a global parallel file system to enable high data throughput. The parallel file system is typically centralized, and its storage media are physically separated from the compute cluster. Compute nodes, as clients of the parallel file system, are often additionally equipped with SSDs. These node-internal storage media are rarely well integrated into the I/O and compute workflows. How to make full and flexible use of these storage media is therefore a valuable research question. In this paper, we propose a hierarchical Persistent Client Caching (LPCC) mechanism for the Lustre file system. LPCC provides two modes: RW-PCC builds a read-write cache on the local SSD of a single client; RO-PCC distributes a read-only cache over the SSDs of multiple clients. LPCC integrates with the Lustre HSM solution and the Lustre layout lock mechanism to provide consistent persistent caching services for I/O applications running on client nodes, while maintaining a global unified namespace for the entire Lustre file system. The evaluation results presented in this paper show LPCC's advantages for various workloads, with speed-ups that are even linear in the number of clients for several real-world scenarios.
DOI:	10.1145/3295500.3356139