The Meaning of Mean Time Between Failures (MTBF)

One of my guys emailed me this. I wish I knew where it came from so I could give credit. It goes a long way towards explaining why we’ve seen so many problems with supposedly good hardware.

You’ve probably seen hardware manufacturers talk about the Mean Time Between Failures (MTBF) for their equipment. For example, disk drive manufacturers often claim MTBFs of several hundred thousand hours. This statistic sounds great (and it is, compared with hardware MTBFs from 10 or 15 years ago), but if you have more than a handful of servers or disks, you’ll quickly find that the quoted figure says little about how often you will actually be replacing hardware.

In fact, it helps to know how disk manufacturers calculate the MTBF in the first place—they take a batch of drives (several hundred to a few thousand) and test them under fixed environmental conditions. When the first drive in the batch fails, they use the test run time to calculate the MTBF. Let’s say that there are 1250 disks in the batch, and they run for 28 days before the first one fails. The MTBF can thus be cited as 28 * 24 * 1250 = 840,000 hours. Remember, this is the average time between failures; it’s not a guarantee.
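The batch calculation above can be sketched in a few lines of Python. This is only an illustration of the arithmetic described in the text; the function name and parameters are made up for the example and do not come from any manufacturer's procedure.

```python
def estimated_mtbf_hours(drive_count: int, days_until_first_failure: float) -> float:
    """Total drive-hours accumulated by the batch before the first failure.

    Hypothetical helper: it mirrors the calculation described in the text,
    not any vendor's actual test methodology.
    """
    hours_run = days_until_first_failure * 24
    return drive_count * hours_run


# The example from the text: 1250 drives, first failure after 28 days.
batch_mtbf = estimated_mtbf_hours(1250, 28)
print(f"{batch_mtbf:,.0f} hours")  # 840,000 hours
```

Note that no single drive ran anywhere near 840,000 hours. The figure is a count of accumulated drive-hours across the whole batch.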

What does this mean for your availability? Say that you have ten servers. Each server has a dozen disks, each with an MTBF of 100,000 hours. Simply put, that means that you can expect any individual disk to last, on average, for 100,000 hours, or about 11 years. However, the MTBF of any one of your servers is not the sum of the MTBFs of its components. It is the failure rates that add up, so the combined MTBF shrinks as you add components: twelve disks, each with a 100,000-hour MTBF, means that on average you can expect one drive to fail every 100,000/12 = 8333 hours, or slightly more than once per year. Across all ten servers (120 disks), that becomes one drive failure every 100,000/120 = 833 hours, or roughly once every five weeks. As you increase the number of servers and disks (and as you factor in other electromechanical components, such as fans and power supplies), you can see that adding servers and disks actually increases the odds of a failure.
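The per-server and fleet-wide numbers can be worked out with a short Python sketch. It assumes identical, independent disks with a constant failure rate, which the text implies but does not state, and the function names are hypothetical.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def combined_mtbf_hours(component_mtbf_hours: float, component_count: int) -> float:
    """MTBF of a group of identical components.

    Assumption: failures are independent and occur at a constant rate,
    so the failure rates add and the MTBF is divided by the count.
    """
    return component_mtbf_hours / component_count


def expected_failures_per_year(mtbf_hours: float) -> float:
    """Average number of failures per year for a given MTBF."""
    return HOURS_PER_YEAR / mtbf_hours


per_server = combined_mtbf_hours(100_000, 12)      # one server, 12 disks
per_fleet = combined_mtbf_hours(100_000, 12 * 10)  # ten servers, 120 disks

print(f"server: {per_server:,.0f} h, {expected_failures_per_year(per_server):.2f} failures/year")
# server: 8,333 h, 1.05 failures/year
print(f"fleet:  {per_fleet:,.0f} h, {expected_failures_per_year(per_fleet):.2f} failures/year")
# fleet:  833 h, 10.51 failures/year
```

Under these assumptions, ten servers with a dozen disks each would see a disk failure about ten times a year, even though each disk is rated at 100,000 hours.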
