
4 TB USB Restore Drives Are Here: Yay!


Backblaze is increasing the maximum size of our USB Hard Drive restores by moving to 4 TB external drives. That means when you order a USB Hard Drive restore, you can now restore up to 3.5 terabytes of data. That's roughly a 35% increase over the previous 3 TB drives, which topped out at 2.6 terabytes of data.

The price is $189, the same as before. That price includes a USB hard drive with your data, FedEx next-day shipping once your data is ready, and, as always, you get to keep the drive. There's no extra charge for the next-day shipping, and if next-day shipping is not available, we'll use the fastest means available to us via FedEx. To date we've shipped restore drives to Backblaze customers in all 50 states and many countries around the globe, including Canada, Spain, Germany, Japan, China, France, Australia, Italy, Belgium, New Zealand, Great Britain, Trinidad and Tobago, Bermuda, Israel, Qatar, and more. In short, you'll get your data fast and you get to keep the drive – easy.

A few things to know

  • As noted, the 4 TB drive maxes out at about 3.5 TB of data. If you have more data than that, you will need to order additional USB Hard Drive restores. Each drive you order is $189.
  • If you have multiple computers that you need to restore from, you will need to order one drive for each separate restore.
  • If you purchase a USB Hard Drive restore, we will ship you a drive large enough to accommodate your data. For example, if you are restoring 2.1 TB of data we could ship you a 3 TB drive. The price will be $189 no matter what size drive we send.

The different ways to restore data with Backblaze

  • Web Browser – Free. You select the files/folders you want to restore and download them using your web browser. This is good for small amounts of data, typically 1 GB or less, as the web browser itself is prone to timeouts and errors.
  • Backblaze Downloader – Free. You select the files/folders you want to restore using your web browser, then download and run the Backblaze Downloader, which streams and checkpoints the data download, similar to how apps like iTunes and Netflix download data. Be aware that larger amounts of data will consume lots of network bandwidth and will take time.
  • USB Flash Drive – For $99, you select up to 110 GB of files/folders you want to restore and get your files on a 128 GB USB flash drive. We send it to you next-day express (within the US), so you get your data fast and you get to keep the drive.
  • USB External Hard Drive – For $189, you select up to 3.5 TB (3,500,000 MB) of files/folders you want to restore and get your files on an external USB hard drive large enough to hold your data, up to a 4 TB drive. Once prepared, we send it to you next-day express (within the US) and you get to keep the drive.
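The checkpointing the Backblaze Downloader does is essentially the standard resumable-download technique: ask the server to skip the bytes you already have via an HTTP Range request. Here's a minimal sketch of that idea in Python; the URL and file names are hypothetical, and this is not the Downloader's actual code.

```python
import os
import urllib.request

def resume_headers(dest_path):
    """Build request headers asking the server to skip the bytes
    we already have on disk (an HTTP Range request)."""
    if os.path.exists(dest_path):
        return {"Range": "bytes=%d-" % os.path.getsize(dest_path)}
    return {}

def download_with_resume(url, dest_path, chunk_size=1 << 20):
    """Append to dest_path, restarting from the last checkpoint.
    If the transfer dies, re-running picks up where it left off."""
    req = urllib.request.Request(url, headers=resume_headers(dest_path))
    with urllib.request.urlopen(req) as resp, open(dest_path, "ab") as out:
        while True:
            chunk = resp.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)  # each written chunk is an implicit checkpoint

# The header logic can be demonstrated without any network traffic:
with open("partial.bin", "wb") as f:
    f.write(b"x" * 1024)  # pretend 1 KB was already downloaded
print(resume_headers("partial.bin"))  # {'Range': 'bytes=1024-'}
```

This only works when the server honors Range requests (it responds with status 206 Partial Content); otherwise the client must discard the partial file and start over.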

 



Hard Drive Temperature – Does It Matter?



How much does operating temperature affect the failure rates of disk drives? Not much.

The unlimited online backup service provided by Backblaze requires a lot of storage. In fact, we recently passed the 100-petabyte mark in our data center. This means we use disk drives. A lot of disk drives.

The Backblaze Storage Pod is designed to provide good airflow over the disk drives, so they don’t get too hot. Still, different locations inside a pod, and different locations within a data center will have different temperatures, and we wondered whether that was a problem for the drives.

What Other People Say

Google and Microsoft have both done studies on disk drive temperature in their data centers. Google found that temperature was not a good predictor of failure, while Microsoft and the University of Virginia found that there was a significant correlation.

Disk drive manufacturers tell Backblaze that in general, it’s a good idea to keep disks cooler so they will last longer.

All Drives: No Correlation

After looking at data on over 34,000 drives, I found that overall there is no correlation between temperature and failure rate.

To check correlations, I used the point-biserial correlation coefficient on drive average temperatures and whether drives failed or not. The result ranges from -1 to 1: 0 means no correlation, 1 means hotter drives always fail, and -1 means hotter drives never fail.

Correlation of Temperature and Failure: 0.0
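For readers who want to reproduce this kind of check: the point-biserial coefficient is just the Pearson correlation between a binary variable (failed or not) and a continuous one (average temperature). A minimal sketch, using made-up numbers rather than the actual Backblaze data:

```python
import math

def point_biserial(temps, failed):
    """Point-biserial correlation between average drive temperature
    (continuous) and failure (0 = survived, 1 = failed)."""
    n = len(temps)
    hot = [t for t, f in zip(temps, failed) if f]       # temps of failed drives
    cold = [t for t, f in zip(temps, failed) if not f]  # temps of healthy drives
    p = len(hot) / n   # fraction failed
    q = 1 - p          # fraction healthy
    mean = sum(temps) / n
    s = math.sqrt(sum((t - mean) ** 2 for t in temps) / n)  # population std dev
    m1 = sum(hot) / len(hot)    # mean temp of failed drives
    m0 = sum(cold) / len(cold)  # mean temp of healthy drives
    return (m1 - m0) / s * math.sqrt(p * q)

# No relationship: failed drives run at the same average temperature
print(point_biserial([20, 30, 20, 30], [0, 0, 1, 1]))  # 0.0
# Strong relationship: the failed drives are the hot ones
print(point_biserial([20, 21, 30, 31], [0, 0, 1, 1]))  # ~0.995
```

A result near 0.0, as we measured, means temperature tells you essentially nothing about whether a drive will fail.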

Disk Drive Temperature Range

It turns out that different drive models run at different temperatures, and this can throw off the stats when looking at the entire population. If, at a given ambient air temperature, drive model A runs warmer than drive model B, and drive A also fails more often, it will look like temperature and failure are correlated when they aren't.

This table shows the average temperature, in degrees Celsius, of different drive models:

Model                                               Avg. Temp (°C)
Seagate Barracuda LP (ST31500541AS)                 21.92
Seagate Desktop HDD.15 (ST4000DM000)                22.10
Seagate Barracuda Green (ST1500DL003)               22.86
Western Digital Red (WDC WD30EFRX)                  23.05
Seagate Barracuda LP (ST32000542AS)                 23.27
Western Digital Caviar Green (WDC WD30EZRX)         23.46
Seagate Barracuda 7200.14 (ST3000DM001)             24.71
Western Digital Caviar Green (WDC WD10EACS)         25.23
Seagate Barracuda XT (ST33000651AS)                 25.40
Hitachi Deskstar 5K4000 (Hitachi HDS5C4040ALE630)   25.42
Seagate Barracuda 7200.11 (ST31500341AS)            25.73
Toshiba DT01ACA Series (TOSHIBA DT01ACA300)         25.82
Hitachi Deskstar 5K3000 (Hitachi HDS5C3030ALA630)   26.46
Hitachi Deskstar 7K3000 (Hitachi HDS723030ALA640)   26.75
HGST Deskstar 7K4000 (HGST HDS724040ALE640)         27.22
Hitachi Deskstar 7K2000 (Hitachi HDS722020ALA330)   27.39
HGST Megascale 4000 (HGST HMS5C4040ALE640)          27.84
Western Digital Caviar Green (WDC WD10EADS)         27.93
Seagate Barracuda XT (ST4000DX000)                  30.54




Storage Pod 4.0: Direct Wire Drives – Faster, Simpler and Less Expensive


For the first time since the original Storage Pod, Backblaze is announcing a completely redesigned approach with the introduction of the first “direct wire” Storage Pod. This new Storage Pod performs four times faster, is simpler to assemble, and delivers our lowest cost per gigabyte of data storage yet. And, once again, it’s open source.

The Original Storage Pod

In order to provide unlimited online backup for $5 per month when Backblaze first started in 2007, we needed to figure out the least expensive way to “attach a hard drive to the Internet.” This meant cost-efficiently attaching as many drives as possible to a single motherboard. We tried USB hubs, daisy-chaining Firewire, and various other approaches. In the end, we found one that worked: port multiplier backplanes.

We used these port multipliers to design a system with 9 five-drive NAS backplanes that connected via 3 SATA cards to the motherboard. The result was incredibly dense storage, currently 180TB in a 4U rack, built from off-the-shelf commodity consumer parts. This design has served us through Storage Pod versions 1.0, 2.0, and 3.0, and today stores 100 petabytes of customer data in the Backblaze cloud.

However, the port multiplier backplanes had three key issues:

  1. They were one of the least reliable hardware components,
  2. When they had an issue, they affected 5 drives at once, and
  3. They were not completely a commodity part, which made them somewhat difficult to buy (especially for someone building a single Storage Pod on their own).

The New Storage Pod Design



100 Petabytes of Cloud Data


Wow. Backblaze is now storing 100 petabytes of customer data in our cloud storage.

100 petabytes is a hard number to wrap our heads around, so…

How much data is 100 petabytes?

100 petabytes
100,000 terabytes
100,000,000 gigabytes
100,000,000,000 megabytes

Here are a few comparisons to help contextualize what 100 petabytes means:
* 1/4th as much data as Facebook stores today for its 1+ billion users.
* 11,415 years of HD video watched 24×7 could be stored.
* $51,600,000 spent annually to store this much data on Amazon S3.
* 33 billion songs stored, or all of the songs iTunes has 1270 times over.

All of this data is stored on Backblaze’s custom-built and open-sourced Storage Pods, filled with approximately 30,000 hard drives (many of which were “farmed,” and whose reliability we analyzed in “which hard drive you should buy”), and all to provide unlimited online backup.

What’s also crazy is that in Jan 2011, Backblaze had just 10 petabytes:
* It took 2.5 years to get from 0 to 10 petabytes.
* It took 3.5 years to get from 10 petabytes to 100 petabytes.

Wondering where these comparisons came from?
* June 2013, Facebook announced it stored 250 petabytes of data and was adding 15 petabytes per month. 10 months have passed, so Facebook should be storing:
=> 250 petabytes + (15 petabytes/month * 10 months) = 400 petabytes.
* HD video takes up about 1 GB per hour:
=> 100 petabytes is 100,000,000 GB, or 100,000,000 hours = 11,415 years.
* In Northern California, Amazon S3 is priced at $0.094/GB/month to start:
=> 100,000,000 GB * $0.094/GB/month * 12 months = $112,800,000/year.
However, as you store more data, S3 gets cheaper and other regions cost less. Picking the lowest cost region and the lowest cost tier of pricing, we get:
=> 100,000,000 GB * $0.043/GB/month * 12 = $51,600,000/year.
* An average song takes up 3 megabytes, resulting in 33 billion songs fitting into 100 petabytes. iTunes has 26 million songs available, or:
=> 1/1270th of the number that can fit in Backblaze’s current cloud storage.
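All of these back-of-the-envelope figures are easy to sanity-check in a few lines (Python shown here for convenience):

```python
GB = 100_000_000  # 100 petabytes expressed in gigabytes

# Facebook: 250 PB in June 2013, growing 15 PB/month for 10 months
facebook_pb = 250 + 15 * 10
print(facebook_pb)            # 400 -> we store about 1/4 as much

# HD video at ~1 GB per hour, converted to years of 24x7 viewing
print(int(GB / (24 * 365)))   # 11415 years

# Amazon S3 at the cheapest tier and region, $0.043/GB/month
s3_per_year = GB * 0.043 * 12
print(s3_per_year)            # ~51.6 million dollars per year

# Songs at ~3 MB each; the iTunes catalog is ~26 million songs
songs = GB * 1000 / 3
print(round(songs / 26_000_000))  # ~1282; the post rounds 33 billion / 26 million to 1270
```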



Our Secret Data Center


Sacramento DC - Wall of Pods

We have a secret. A 500 Petabyte secret. Back in August 2012, Backblaze posted our need for a new data center. At the time we had about 40 Petabytes of storage, and our caged facility in Oakland was bursting at the seams. After our post, proposals rolled in from around the US – Utah, Texas, California, and even Iowa, to name a few. After weeks of phone calls and meetings with several providers, we selected a winner and got busy. Electrical power had to be run, cabinets had to be built, and networks had to be wired, all before we could install Backblaze Storage Pods.

    Empty Cabinets Waiting for Backblaze

    Operations Gets to Work Installing Backblaze Storage Pods

The new data center is located just outside of Sacramento, California. For those of you who worry about such things, the data center sits in one of the most geologically stable locations in California, outside of earthquake fault zones and flood plains. The location is rated “Very Low Risk” for tornadoes, and to the best of our knowledge the data center has never experienced a plague of locusts. The facility also has SAS 70 Type II and ISO 9001 certifications and is PCI-DSS compliant.

Besides the lack of locusts, what attracted us to the Sacramento location? It was inexpensive from an operational point of view and practical from a staffing point of view. You see, we like to keep our hands on everything, and Sacramento was close enough that we could do just that. Yes, our operations staff logged a few I-80 miles between Oakland and Sacramento to make sure that everything went as smoothly as possible, but I think you’ll agree it’s been worth it. In the next couple of days we’ll post a position description for a data center technician at our new data center; stay tuned.

    2/21/2013 – The First 5 Gbps Network Segment Comes Online in Sacramento

The Sacramento data center started accepting customer data way back in February 2013, and by September 2013 all new customer accounts were being serviced there. You probably didn’t notice. We expect to be able to store about half an Exabyte (500 Petabytes) of customer data at the new facility. That should last us a little while, but just in case, the operations folks are already scouting locations in Lake Tahoe for the next data center. After all, it’s just a short ride east from Sacramento.

 



What Hard Drive Should I Buy?



My last two blog posts were about expected drive lifetimes and drive reliability. These posts were an outgrowth of the careful work that we’ve done at Backblaze to find the most cost-effective disk drives. Running a truly unlimited online backup service for only $5 per month means our cloud storage needs to be very efficient and we need to quickly figure out which drives work.

Because Backblaze has a history of openness, many readers expected more details in my previous posts. They asked what drive models work best and which last the longest. Given our experience with over 25,000 drives, they asked which ones are good enough that we would buy them again. In this post, I’ll answer those questions.

Drive Population

At the end of 2013, we had 27,134 consumer-grade drives spinning in Backblaze Storage Pods. The breakdown by brand looks like this:

Hard Drives by Manufacturer Used by Backblaze
Brand             Number of Drives   Terabytes   Average Age in Years
Seagate           12,765             39,576      1.4
Hitachi           12,956             36,078      2.0
Western Digital    2,838              2,581      2.5
Toshiba               58                174      0.7
Samsung               18                 18      3.7

As you can see, they are mostly Seagate and Hitachi drives, with a good number of Western Digital thrown in. We don’t have enough Toshiba or Samsung drives for good statistical results.

Why do we have the drives we have? Basically, we buy the least expensive drives that will work. When a new drive comes on the market that looks like it would work, and the price is good, we test a pod full and see how they perform. The new drives go through initial setup tests, a stress test, and then a couple weeks in production. (A couple of weeks is enough to fill the pod with data.) If things still look good, that drive goes on the buy list. When the price is right, we buy it.

We are willing to spend a little bit more on drives that are reliable, because it costs money to replace a drive. We are not willing to spend a lot more, though.

Excluded Drives

Some drives just don’t work in the Backblaze environment. We have not included them in this study. It wouldn’t be fair to call a drive “bad” if it’s just not suited for the environment it’s put into.

We have some of these drives running in storage pods, but we are in the process of replacing them because they aren’t reliable enough. When one drive goes bad, it takes a lot of work to get the RAID back online if the whole RAID is made up of unreliable drives. It’s just not worth the trouble.

The drives that just don’t work in our environment are Western Digital Green 3TB drives and Seagate LP (low power) 2TB drives. Both of these drives start accumulating errors as soon as they are put into production. We think this is related to vibration. The drives do somewhat better in the new low-vibration Backblaze Storage Pod, but still not well enough.

These drives are designed to be energy-efficient, and spin down aggressively when not in use. In the Backblaze environment, they spin down frequently, and then spin right back up. We think that this causes a lot of wear on the drive.

Failure Rates

We measure drive reliability by looking at the annual failure rate, which is the average number of failures you can expect running one drive for a year. A failure is when we have to replace a drive in a pod.

[Chart: hard drive annual failure rates by model]

This table has some details that don’t show up in the chart above, including the number of drives of each model that we have and how old they are:

Number of Hard Drives by Model at Backblaze
Model                                           Size    Number of Drives   Avg. Age (Years)   Annual Failure Rate
Seagate Desktop HDD.15 (ST4000DM000)            4.0TB   5199               0.3                  3.8%
Hitachi GST Deskstar 7K2000 (HDS722020ALA330)   2.0TB   4716               2.9                  1.1%
Hitachi GST Deskstar 5K3000 (HDS5C3030ALA630)   3.0TB   4592               1.7                  0.9%
Seagate Barracuda (ST3000DM001)                 3.0TB   4252               1.4                  9.8%
Hitachi Deskstar 5K4000 (HDS5C4040ALE630)       4.0TB   2587               0.8                  1.5%
Seagate Barracuda LP (ST31500541AS)             1.5TB   1929               3.8                  9.9%
Hitachi Deskstar 7K3000 (HDS723030ALA640)       3.0TB   1027               2.1                  0.9%
Seagate Barracuda 7200 (ST31500341AS)           1.5TB    539               3.8                 25.4%
Western Digital Green (WD10EADS)                1.0TB    474               4.4                  3.6%
Western Digital Red (WD30EFRX)                  3.0TB    346               0.5                  3.2%
Seagate Barracuda XT (ST33000651AS)             3.0TB    293               2.0                  7.3%
Seagate Barracuda LP (ST32000542AS)             2.0TB    288               2.0                  7.2%
Seagate Barracuda XT (ST4000DX000)              4.0TB    179               0.7                  n/a
Western Digital Green (WD10EACS)                1.0TB     84               5.0                  n/a
Seagate Barracuda Green (ST1500DL003)           1.5TB     51               0.8                120.0%

The following sections focus on different aspects of these results.

1.5TB Seagate Drives

The Backblaze team has been happy with Seagate Barracuda LP 1.5TB drives. We’ve been running them for a long time – their average age is pushing 4 years. Their overall failure rate isn’t great, but it’s not terrible either.

The non-LP 7200 RPM drives have been consistently unreliable. Their failure rate is high, especially as they’re getting older.

1.5 TB Seagate Drives Used by Backblaze
Model                                   Size    Number of Drives   Avg. Age (Years)   Annual Failure Rate
Seagate Barracuda LP (ST31500541AS)     1.5TB   1929               3.8                  9.9%
Seagate Barracuda 7200 (ST31500341AS)   1.5TB    539               3.8                 25.4%
Seagate Barracuda Green (ST1500DL003)   1.5TB     51               0.8                120.0%

The Seagate Barracuda Green 1.5TB drive, though, has not been doing well. We got them from Seagate as warranty replacements for the older drives, and these new drives are dropping like flies. Their average age shows 0.8 years, but since these are warranty replacements, we believe that they are refurbished drives that were returned by other customers and erased, so they already had some usage when we got them.

Bigger Seagate Drives

The bigger Seagate drives have continued the tradition of the 1.5TB drives: they’re solid workhorses, but there is constant attrition as they wear out.

2.0 to 4.0 TB Seagate Drives Used by Backblaze
Model                                  Size    Number of Drives   Avg. Age (Years)   Annual Failure Rate
Seagate Desktop HDD.15 (ST4000DM000)   4.0TB   5199               0.3                3.8%
Seagate Barracuda (ST3000DM001)        3.0TB   4252               1.4                9.8%
Seagate Barracuda XT (ST33000651AS)    3.0TB    293               2.0                7.3%
Seagate Barracuda LP (ST32000542AS)    2.0TB    288               2.0                7.2%
Seagate Barracuda XT (ST4000DX000)     4.0TB    179               0.7                n/a

The good pricing on Seagate drives, along with their consistent (but not great) reliability, is why we have a lot of them.

Hitachi Drives

If the price were right, we would be buying nothing but Hitachi drives. They have been rock solid, and have had a remarkably low failure rate.

Hitachi Drives Used by Backblaze
Model                                           Size    Number of Drives   Avg. Age (Years)   Annual Failure Rate
Hitachi GST Deskstar 7K2000 (HDS722020ALA330)   2.0TB   4716               2.9                1.1%
Hitachi GST Deskstar 5K3000 (HDS5C3030ALA630)   3.0TB   4592               1.7                0.9%
Hitachi Deskstar 5K4000 (HDS5C4040ALE630)       4.0TB   2587               0.8                1.5%
Hitachi Deskstar 7K3000 (HDS723030ALA640)       3.0TB   1027               2.1                0.9%

Western Digital Drives

Back at the beginning of Backblaze, we bought Western Digital 1.0TB drives, and that was a really good choice. Even after over 4 years of use, the ones we still have are going strong.

We wish we had more of the Western Digital Red 3TB drives (WD30EFRX). They’ve also been really good, but they came after we already had a bunch of the Seagate 3TB drives, and when they came out their price was higher.

Western Digital Drives Used by Backblaze
Model                              Size    Number of Drives   Avg. Age (Years)   Annual Failure Rate
Western Digital Green (WD10EADS)   1.0TB   474                4.4                3.6%
Western Digital Red (WD30EFRX)     3.0TB   346                0.5                3.2%
Western Digital Green (WD10EACS)   1.0TB    84                5.0                n/a

What About Drives That Don’t Fail Completely?

Another issue when running a big data center is how much personal attention each drive needs. When a drive has a problem, but doesn’t fail completely, it still creates work. Sometimes automated recovery can fix this, but sometimes a RAID array needs that personal touch to get it running again.

Each storage pod runs a number of RAID arrays. Each array stores data reliably by spreading data across many drives. If one drive fails, the data can still be obtained from the others. Sometimes, a drive may “pop out” of a RAID array but still seem good, so after checking that its data is intact and it’s working, it gets put back in the RAID to continue operation. Other times a drive may stop responding completely and look like it’s gone, but it can be reset and continue running.

Measuring the time spent in a “trouble” state like this is a measure of how much work a drive creates. Once again, Hitachi wins. Hitachi drives get “four nines” of untroubled operation time, while the other brands just get “two nines”.

Untroubled Operation of Drives by Manufacturer Used at Backblaze
Brand             Active    Trouble   Number of Drives
Seagate           99.72%    0.28%     12,459
Western Digital   99.83%    0.17%        933
Hitachi           99.99%    0.01%     12,956

Drive Lifetime by Brand

The chart below shows the cumulative survival rate for each brand. Month by month, how many of the drives are still alive?

[Chart: 36-month drive survival rate by brand]

Hitachi does really well. There is an initial die-off of Western Digital drives, and then they are nice and stable. The Seagate drives start strong, but die off at a consistently higher rate, with a burst of deaths near the 20-month mark.

Having said that, you’ll notice that even after three years, the vast majority of the drives are still operating.
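A cumulative survival curve like this can be built with a simple Kaplan-Meier-style product: for each month, multiply by the fraction of still-at-risk drives that survived it. A toy sketch (the drive records below are invented, not Backblaze data):

```python
def survival_curve(drives, months):
    """drives: list of (observed_months, failed) pairs, one per drive.
    Returns the cumulative survival probability for months 1..months."""
    curve = []
    s = 1.0
    for m in range(1, months + 1):
        # drives still under observation at month m
        at_risk = sum(1 for obs, _ in drives if obs >= m)
        # drives that failed in exactly month m
        deaths = sum(1 for obs, failed in drives if failed and obs == m)
        if at_risk:
            s *= 1 - deaths / at_risk
        curve.append(s)
    return curve

# Four drives watched for up to 3 months; one failed in month 2
drives = [(3, False), (3, False), (2, True), (3, False)]
print(survival_curve(drives, 3))  # [1.0, 0.75, 0.75]
```

Handling drives of different ages this way (rather than a simple fraction-alive count) matters because pods were deployed at different times, so younger drives simply haven’t had the chance to fail yet.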

What Drives Is Backblaze Buying Now?

We are focusing on 4TB drives for new pods. For these, our current favorite is the Seagate Desktop HDD.15 (ST4000DM000). We’ll have to keep an eye on them, though. Historically, Seagate drives have performed well at first, and then had higher failure rates later.

Our other favorite is the Western Digital 3TB Red (WD30EFRX).

We still have to buy smaller drives as replacements for older pods where drives fail. The drives we absolutely won’t buy are Western Digital 3TB Green drives and Seagate 2TB LP drives.

A year and a half ago, Western Digital acquired the Hitachi disk drive business. Will Hitachi drives continue their excellent performance? Will Western Digital bring some of the Hitachi reliability into their consumer-grade drives?


Correction: Hitachi’s 2.5″ hard drive business went to Western Digital, while the 3.5″ hard drive business went to Toshiba.

At Backblaze, we will continue to monitor and share the performance of a wide variety of disk drive models. What has your experience been?



Enterprise Drives: Fact or Fiction?



Last month I dug into drive failure rates based on the 25,000+ consumer drives we have and found that consumer drives actually performed quite well. Over 100,000 people read that blog post and one of the most common questions asked was:

“Ok, so the consumer drives don’t fail that often. But aren’t enterprise drives so much more reliable that they would be worth the extra cost?”

Well, I decided to try to find out.

In the Beginning
As many of you know, when Backblaze first started the unlimited online backup service, our founders bootstrapped the company without funding. In this environment one of our first and most critical design decisions was to build our backup software on the premise of data redundancy. That design decision allowed us to use consumer drives instead of enterprise drives in our early Storage Pods as we used the software, not the hardware, to manage redundancy. Given that enterprise drives were often twice the cost of consumer drives, the choice of consumer drives was also a relief for our founders’ thin wallets.

There were warnings back then that using consumer drives would be dangerous, with people saying:

    “Consumer drives won’t survive in the hostile environment of the data center.”
    “Backblaze Storage Pods allow too much vibration – consumer drives won’t survive.”
    “Consumer drives will drop dead in a year. Or two years. Or …”

As we have seen, consumer drives didn’t die in droves, but what about enterprise ones?

Failure Rates
In my post last month on disk drive life expectancy, I went over what an annual failure rate means. It’s the average number of failures you can expect when you run one disk drive for a year. The computation is simple:

Annual Failure Rate = (Number of Drives that Failed / Number of Drive-Years)

Drive-years are a measure of how many drives have been running for how long. This computation is also simple:

Drive-Years = (Number of Drives x Number of Years)

For example, one drive for one year is one drive-year. Twelve drives for one month is also one drive-year.
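Here is that computation in code form, using the failure counts reported later in this post (a minimal sketch; the real per-drive bookkeeping tracks serial numbers and individual running days):

```python
def drive_years(drive_months):
    """drive_months: list of the number of months each drive was running."""
    return sum(drive_months) / 12

def annual_failure_rate(failures, years):
    """Average number of failures expected from running one drive for a year."""
    return failures / years

# Twelve drives for one month is one drive-year
print(drive_years([1] * 12))  # 1.0

# Consumer drives in Storage Pods: 613 failures over 14,719 drive-years
print(round(annual_failure_rate(613, 14719) * 100, 1))  # 4.2 (percent)

# Enterprise drives: 17 failures over 368 drive-years
print(round(annual_failure_rate(17, 368) * 100, 1))     # 4.6 (percent)
```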

Backblaze Storage Pods: Consumer-Class Drives
We have detailed day-by-day data about the drives in the Backblaze Storage Pods since mid-April of 2013. With 25,000 drives ranging in age from brand-new to over 4 years old, that’s enough to slice the data in different ways and still get accurate failure rates. Next month, I’ll be going into some of those details, but for the comparison with enterprise drives, we’ll just look at the overall failure rates.

We have data that tracks every drive by serial number, which days it was running, and if/when it was replaced because it failed. We have logged:

  • 14,719 drive-years on the consumer-grade drives in our Storage Pods.
  • 613 drives that failed and were replaced.

Commercially Available Servers: Enterprise-Class Drives
We store customer data on Backblaze Storage Pods which are purpose-built to store data very densely and cost-efficiently. However, we use commercially available servers for our central servers that store transactional data such as sales records and administrative activities. These servers provide the flexibility and throughput needed for such tasks. These commercially available servers come from Dell and from EMC.

All of these systems were delivered to us with enterprise-class hard drives. These drives were touted as solid long-lasting drives with extended warranties.

The specific systems we have are:

  • Six shelves of enterprise-class drives in Dell PowerVault storage systems.
  • One EMC storage system with 124 enterprise drives that we just brought up this summer. One of the drives has already failed and been replaced.
  • We have also been running one Backblaze Storage Pod full of enterprise drives storing users’ backed-up files as an experiment to see how they do. So far, their failure rate has been statistically consistent with the drives in the commercial storage systems.

In the two years since we started using these enterprise-grade storage systems, they have logged:

  • 368 drive-years on the enterprise-grade drives.
  • 17 drives that failed and were replaced.

Enterprise vs. Consumer Drives
At first glance, it seems the enterprise drives don’t have that many failures. While true, the failure rate of enterprise drives is actually higher than that of the consumer drives!

                         Enterprise Drives   Consumer Drives
Drive-Years of Service         368               14,719
Number of Failures              17                  613
Annual Failure Rate            4.6%                 4.2%

It turns out that the consumer drive failure rate does go up after three years, but all of the first three years are pretty good. We have no data on enterprise drives older than two years, so we don’t know if they will also show an increase in failure rate. It could be that the vaunted reliability of enterprise drives kicks in after two years, but since we haven’t seen any of that reliability in the first two years, I’m skeptical.

You might object to these numbers because the usage of the drives is different. The enterprise drives are used heavily. The consumer drives are in continual use storing users’ updated files and they are up and running all the time, but the usage is lighter. On the other hand, the enterprise drives we have are coddled in well-ventilated, low-vibration enclosures, while the consumer drives are in Backblaze Storage Pods, which do have a fair amount of vibration. In fact, the most recent design change to the pod was to reduce vibration.

Overall, I argue that the enterprise drives we have are treated as well as the consumer drives. And the enterprise drives are failing more.

So, Are Enterprise Drives Worth The Cost?
From a pure reliability perspective, the data we have says the answer is clear: No.

Enterprise drives do have one advantage: longer warranties. That’s a benefit only if the higher price you pay for the longer warranty is less than what you expect to spend on replacing the drive.

This leads to an obvious conclusion: If you’re OK with buying the replacements yourself after the warranty is up, then buy the cheaper consumer drives.



Farming hard drives: 2 years and $1M later


We are two years removed from the horrific flooding in Thailand that caused the Thailand Drive Crisis, a worldwide shortage of hard disk drives. Prices for hard drives spiked and remained stubbornly high, only returning to pre-crisis levels in the last couple of months. The market and competitive forces that had predictably driven the cost per gigabyte of storage down over the past 30 years took a vacation. Over those two years, Backblaze faced an additional $1 million in data storage costs if we stuck with the status quo. We didn’t. Instead, here’s what we did.

The End of a 30-Year Trend

In the last 30 years, the cost of a gigabyte of storage has decreased from over $1 million in 1981 to less than $0.05 in 2011, as documented by the work of Matthew Komorowski. Moreover, the cost per gigabyte declined in an amazingly predictable fashion over that time.

Beginning in October 2011, those 30 years of history went out the window.

Since starting our service, Backblaze “worked the curve” to keep our costs low as we committed to providing unlimited online backup for just $5/month. We started building our Storage Pods with 1TB drives, moving to 1.5TB, then 2TB, and then 3TB drives as the cost per gigabyte we paid for hard drives dropped from $0.11 to a little over $0.04.

All that changed in October 2011 with the onset of the Thailand Drive Crisis, as seen in the chart below, which depicts the cost per gigabyte that Backblaze paid for the hard drives we purchased.

Cost per GB for Hard Drives

In September 2011, our cost per gigabyte was $0.044. That low-water mark would not be achieved again until September 2013. In that two-year period, our cost ran as high as $0.064 per gigabyte. While $0.02 per gigabyte doesn’t seem like a lot, Backblaze added about 50 Petabytes of storage during that period. When you do the math, a $0.02 increase per gigabyte translates to a $1M increase in storage costs, but that’s not the whole story…

Reality Bytes

When we looked at the effect of the drive crisis on Backblaze, we considered three scenarios:

  • Status Quo – The cost to Backblaze if we ignored the drive crisis and continued doing the same thing we were doing, for example, buying the same hard drives from the same manufacturer regardless of price.
  • Reality – Our cost per gigabyte based on our actual purchases.
  • Historical – The expected cost if nothing had happened and the cost per gigabyte had continued downward at or near the historical rate.

The chart below visualizes these three scenarios over the time period of the drive crisis.

Hard Drive Cost Comparison

Prior to the drive crisis, Backblaze almost exclusively purchased Hitachi 3TB internal drives (Model: 0S03230). In September 2011 we could get these Hitachi hard drives for about $130 each ($0.044 per gigabyte). In November 2011 our cost for the same drive was $249 each ($0.083 per gigabyte), 88% higher, and they were really hard to find. Availability and prices varied over the next several months until the model of the Hitachi drives we were buying disappeared entirely, with no suitable substitute from Hitachi.

Beginning in October 2012, Backblaze transitioned from 3TB drives to 4TB drives in our Storage Pods. As in the past, moving to a higher-density drive leads to a temporary increase in the cost per gigabyte. The switch to 4TB drives appears as a jump in cost per gigabyte from October through December 2012.

The “Reality” line of the graph shows our actual purchases. When Hitachi drives got too expensive, we switched to Seagate and Western Digital. When internal drives got too expensive, we turned to external drives. When hard drive availability became an issue, we turned to Drive Farming.

Dollars and Cents

Using the data from the chart above, we can determine the costs for the “Status Quo”, “Reality” and “Historical” trend lines from October 2011 through September 2013:

  Status Quo – $2.92M
  Reality – $2.31M
  Historical – $1.78M

If we had done nothing (Status Quo) these last two years, we would have spent $1.14M more on hard drives than their expected (Historical) cost. In reality, the actions we took in the wake of the Drive Crisis “saved” Backblaze about $610,000. These actions included:

  • Switching hard drive manufacturers from Hitachi to Seagate and Western Digital. The assumption at the time was that all drives are basically the same; as we’ve since learned, that’s not really the case.
  • Buying and “shucking” external drives instead of higher-priced internal drives.
  • Having our friends, family, and customers become Backblaze Drive Farmers.
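The dollar figures above hang together arithmetically; here is a quick check using the rounded values from the chart:

```python
# Two-year hard drive spend, October 2011 - September 2013, in millions of dollars
status_quo = 2.92  # kept buying the same drives at crisis prices
reality    = 2.31  # what we actually spent
historical = 1.78  # expected spend had the pre-crisis trend continued

print(round(status_quo - historical, 2))  # 1.14 -> the cost of doing nothing
print(round(status_quo - reality, 2))     # 0.61 -> what farming/shucking "saved"

# And the $1M headline number: ~50 PB added at ~$0.02/GB extra
extra_cost = 50_000_000 * 0.02  # 50 PB expressed in GB, times $0.02/GB
print(extra_cost)               # ~1 million dollars
```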

What Else Could We Have Done?

We had other paths we could have taken when the drive crisis hit:

  • Raise Prices – This was the response of most companies that use hard drives in bulk.
  • Throttle Upload Speeds – Less data going to the data center means less storage.
  • Freeze or Limit Signups – A lottery for new customers, or maybe no new customers at all until things got better.

In the end we did “none of the above”, instead choosing to shoulder the responsibility ourselves by drive farming, “shucking” external drives, and changing drive manufacturers so we could keep our commitment to $5/month unlimited online backup.

Is It Over Yet?

As we noted earlier, our price per gigabyte is finally about the same as it was before the Drive Crisis. You can see from the chart below that the trend line for the 4TB drives we are now using aligns with the drives purchased outside of the Drive Crisis period (October 2011 – September 2013). Since a majority of the drives we purchased during the Drive Crisis were 3TB drives, that trend line clearly shows the detrimental effect the Drive Crisis had over that two-year period.

Trends over time of Hard Drive purchases

Closing Questions

When the Drive Crisis started, industry pundits estimated that the hard drive market would take anywhere from 3 months to 1 year to recover. No one guessed two years. Was the delay simply an issue of rebuilding and/or relocating the manufacturing and assembly facilities? Did the fact that the two industry leaders, Seagate and Western Digital, had to integrate large acquisitions slow down the recovery and subsequent innovation? And what about the dramatic shift toward tablets and away from desktops and laptops: has that changed the hard drive market and the declining cost-per-gigabyte trend line forever?

Whatever lies ahead, we’ll adapt. “SSD Farming” anyone? Just kidding, for now…
