Everything You Know About Disks Is Wrong
http://storagemojo.com/?p=383
"The Google engineers just published a paper on Failure Trends in a Large Disk Drive Population. Based on a study of 100,000 disk drives over 5 years they find some interesting stuff. To quote from the abstract: 'Our analysis identifies several parameters from the drive's self monitoring facility (SMART) that correlate highly with failures. Despite this high correlation, we conclude that models based on SMART parameters alone are unlikely to be useful for predicting individual drive failures. Surprisingly, we found that temperature and activity levels were much less correlated with drive failures than previously reported.'"
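Google engineers Eduardo Pinheiro, Wolf-Dietrich Weber, and Luiz André Barroso (edpin, wolf, luiz) studied how 100,000 disk drives fared over 5 years and published a paper on the results; the paper is attached.
Their conclusions: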
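For any single drive, SMART alone is a poor predictor: several SMART parameters correlate highly with failures across the whole population, yet models built on SMART parameters alone are unlikely to be useful for predicting whether an individual drive will fail.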
Even more surprising:
Drive temperature and activity levels turned out to be much less correlated with drive failures than previously reported.
A report from another research group reached similar conclusions:
http://www.usenix.org/events/fas ... der_html/index.html
Taken together, the two reports show:
* Expensive 'enterprise' drives don't have notably better reliability than their 'consumer' counterparts (consider this conclusion in the context of my past recommendation of Western Digital 10,000 RPM Raptor SATA HDDs as a credible alternative to other manufacturers' much more costly SAS drives)
* S.M.A.R.T. error reporting only encompasses a fraction of all experienced HDD failure mechanisms, and, specifically to this writeup's theme,
* RAID 1 and 5 are less robust than might appear to be the case at first glance...particularly when (as in my case...ahem) all of the drives in the RAID array come from the same manufacturer, and especially when they come from the same manufacturing lot. If one drive fails, the likelihood that a second drive will fail shortly thereafter is uncomfortably...likely.
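* Expensive enterprise drives have not shown notably better reliability than consumer drives. Seen in that light, the Western Digital 10,000 RPM Raptor SATA drive I once recommended is a credible alternative to far more expensive enterprise SAS (Serial-Attached SCSI) drives.
* S.M.A.R.T. error reporting covers only a fraction of the ways drives actually fail, which matters especially for the case described here.
* RAID 1 and RAID 5 are more fragile than they first appear, especially when every drive in the array comes from the same manufacturer and, worse, from the same production lot. Once one drive fails, there is an uncomfortable chance that a second will fail shortly afterwards.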
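The RAID point can be made concrete with a little arithmetic. The sketch below is illustrative only and is not taken from either report: it assumes a constant failure rate, and the function name, the 3% annual failure rate, the 24-hour rebuild window, and the `hazard_multiplier` standing in for same-lot correlation are all hypothetical.

```python
import math

HOURS_PER_YEAR = 8760  # 24 * 365


def p_second_failure(n_remaining, afr, rebuild_hours, hazard_multiplier=1.0):
    """Chance that at least one surviving drive fails during the rebuild window.

    Assumes each surviving drive fails at a constant rate derived from `afr`
    (annual failure rate as a fraction). `hazard_multiplier` is a hypothetical
    knob: 1.0 means failures are independent, larger values mimic drives that
    become more likely to fail right after a sibling has failed.
    """
    rate_per_hour = -math.log(1.0 - afr) / HOURS_PER_YEAR
    rate = rate_per_hour * hazard_multiplier
    p_one_survives = math.exp(-rate * rebuild_hours)
    return 1.0 - p_one_survives ** n_remaining


# Hypothetical 5-drive RAID 5 set: one drive has failed, 4 remain.
independent = p_second_failure(4, afr=0.03, rebuild_hours=24)
correlated = p_second_failure(4, afr=0.03, rebuild_hours=24, hazard_multiplier=10)
print(f"independent: {independent:.4%}, correlated (x10): {correlated:.4%}")
```

With independent failures the chance looks tiny, which is where the comfort in RAID 5 comes from. Raising the multiplier shows how quickly that comfort erodes if surviving drives really are more likely to fail soon after the first one.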
If you know a little about disks, you probably believe all of the following:
* Costly FC and SCSI drives are more reliable than cheap SATA drives.
* RAID 5 is safe because the odds of two drives failing in the same RAID set are so low.
* After infant mortality, drives are highly reliable until they reach the end of their useful life.
* Vendor MTBFs are a useful yardstick for comparing drives.
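The disk knowledge we take for granted:
* Expensive Fibre Channel or SCSI drives are more reliable than cheap SATA drives.
* RAID 5 is safe, because the odds of two drives in the same RAID set failing together are so low.
* Once past the early high-failure period, drives are highly reliable until they reach the end of their useful life.
* The vendor's MTBF (mean time between failures) is a useful yardstick for comparing drive reliability.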
In reality, every one of these beliefs is wrong!
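One way to sanity-check the MTBF belief is to translate a datasheet MTBF into the annual failure rate it implies. This is a minimal sketch, assuming a constant failure rate (exponential lifetimes); the 1,000,000-hour figure is a hypothetical datasheet value, not a number from either report.

```python
import math

HOURS_PER_YEAR = 8760  # 24 * 365


def afr_from_mtbf(mtbf_hours):
    """Annual failure rate implied by an MTBF, assuming a constant failure rate."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


# Hypothetical datasheet MTBF of one million hours.
implied_afr = afr_from_mtbf(1_000_000)
print(f"implied annual failure rate: {implied_afr:.2%}")
```

The resulting percentage can then be set against the replacement rates the two reports observed in the field.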