Thursday, September 24 • 11:35am - 12:25pm
RAIDShield: Characterizing, Monitoring, and Pro-actively Protecting Against Disk Failures

Modern storage systems orchestrate a group of disks to achieve their performance and reliability goals. Even though such systems are designed to withstand the failure of individual disks, failure of multiple disks poses a unique set of challenges. We empirically investigate disk failure data from a large number of production systems, specifically focusing on the impact of disk failures on RAID storage systems. Our data covers about one million SATA disks from 6 disk models for periods up to 5 years. We show how observed disk failures weaken the protection provided by RAID. The count of reallocated sectors correlates strongly with impending failures.

Learning Objectives

Empirical investigation of hard disk failures in production systems
Proactive protection of individual disk drives
Proactive protection of RAID storage systems
Deployment results of proactive protection in production systems


Ao Ma

Principal Engineer, EMC
Ao Ma is a principal engineer in the Advanced Development Group of the EMC Core Technology Division, part of the CTO office, where he works on innovations in file system and storage technologies. He has published papers in top storage systems conferences and journals. Ao also has 1 patent granted and 6 patents filed, and was awarded the Excellence@EMC Platinum Award (2014) and the Excellence@EMC Gold Award (2013) for his contributions to the company.

Thursday September 24, 2015 11:35am - 12:25pm
Cypress Room

