Why not RAID-0? It’s about Time and Snowflakes



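Before panicking – disk failures are rare. Google’s 2007 paper, Failure Trends in a Large Disk Drive Population, reported that in their datacenters, 1.7% of disks failed in the first year of their life, while three-year-old disks were failing at a rate of 8.6%. About 9% isn’t a good number. (That is, disks more than three years old have a failure probability of about 9%.)

With 8 such disks in use together, and assuming they fail independently, the probability that at least one of them fails is 1-(1-0.086)^8 = 0.513, which is quite high. This is still not the main problem, though, because JBOD (Just a Box of Disks) runs into the same issue.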
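The arithmetic above can be checked with a small sketch. The function name `prob_any_failure` is made up for illustration, and the calculation assumes disk failures are independent, which real disks in one chassis may not be.

```python
def prob_any_failure(failure_rate: float, num_disks: int) -> float:
    """Probability that at least one of num_disks fails,
    assuming each disk fails independently with the same rate."""
    # P(no disk fails) = (1 - rate)^n, so P(at least one fails) = 1 - that.
    return 1.0 - (1.0 - failure_rate) ** num_disks


# 8.6% is the rate for three-year-old disks quoted from the Google paper.
p = prob_any_failure(0.086, 8)
print(f"{p:.3f}")  # prints 0.513
```

With 10 disks instead of 8, the same function gives a somewhat higher figure, so the risk grows quickly with the number of disks in the box.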

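The main problem is that once a single disk fails, the data on all of the disks has to be re-replicated. RAID-0 stores data in stripes, so each disk holds small chunks (for example 64KB), whereas HDFS uses 64MB blocks. A single HDFS block on a 10-disk RAID-0 array is therefore spread across all 10 disks. If one disk fails, every HDFS block on the array is damaged, and all of them have to be re-replicated.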
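A minimal sketch of why one failed disk damages every block, under a simplified layout: it assumes block bytes map directly onto array offsets and that stripes are assigned round-robin, ignoring the local filesystem in between. The name `disks_touched` and the constants are illustrative, using the 64KB, 64MB, and 10-disk figures from the text.

```python
STRIPE_SIZE = 64 * 1024            # 64KB stripe unit (figure from the text)
HDFS_BLOCK_SIZE = 64 * 1024 ** 2   # 64MB HDFS block (figure from the text)
NUM_DISKS = 10                     # disks in the RAID-0 array


def disks_touched(offset, size=HDFS_BLOCK_SIZE,
                  stripe_size=STRIPE_SIZE, num_disks=NUM_DISKS):
    """Return the set of disk indices holding any part of the byte range
    [offset, offset + size), with stripes assigned round-robin."""
    first_stripe = offset // stripe_size
    last_stripe = (offset + size - 1) // stripe_size
    if last_stripe - first_stripe + 1 >= num_disks:
        # The range covers at least one full round, so every disk is hit.
        return set(range(num_disks))
    return {s % num_disks for s in range(first_stripe, last_stripe + 1)}


stripes_per_block = HDFS_BLOCK_SIZE // STRIPE_SIZE
print(stripes_per_block)          # prints 1024
print(len(disks_touched(0)))      # prints 10
```

One 64MB block is cut into 1024 stripes, so with 10 disks every disk holds roughly a hundred pieces of each block. Losing any one disk leaves a hole in every block on the array, whereas under JBOD only the blocks stored on the failed disk are lost.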

Every Disk is a Unique Snowflake