From Failure to Insight: Analyzing Disk Breakdowns in Large-Scale HPC Environments

Disk failure data provides valuable insights for preventing failures, enhancing storage robustness, guiding system design and deployment, and ensuring reliable operations at data centers. This paper introduces two disk failure datasets collected from large-scale HPC production environments over the...

Bibliographic Details
Published in: SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 484-495
Main Authors: George, Anjus; Wang, Meng; Hanley, Jesse; Ransom, Garrett Wilson; Bent, John; Zimmer, Christopher
Format: Conference Proceeding
Language: English
Published: IEEE, 17 November 2024