
Get well Amit!


Several years ago a guy sent us an email saying he liked our service and was planning on recommending Backblaze on his newsletter, Photojojo. We’d never heard of him or his newsletter but we said, “Great!”

Shortly thereafter he wrote “Backblaze: Backup Software You’ll Actually Use” and almost 10,000 people showed up to try the service!

Amit was not only a fan and a great boost for us, he’s also just a great guy. Unfortunately, two weeks ago Amit was diagnosed with acute leukemia and is now in for a long slog of chemo and more.

He’s asking for South Asians to do a free bone marrow check by mail. (Just run a cotton swab on your cheek.)

Apparently, Amit also likes “colorful photos, pizza, crafty projects, and macaroni,” and, of course, photos. Send him some cheery thoughts by emailing ([current year] at amit gupta dot com).

And Super Amit, from a few of us at Backblaze: Get well!



$94 trillion petabyte


IBM is celebrating the 55th anniversary of the first hard drive. At the time, this was a breakthrough that would change the path of technology.

However, to put into perspective how far we have all come:

The hard drive IBM shipped in 1956:
* Stored 5 megabytes (MB)
* Cost $11,000 per megabyte
* Was 60 inches long x 68 inches high x 29 inches deep
* Weighed about 1 ton

At that 1956 price per megabyte, converted to today’s dollars:

A $179 16 GB iPod Nano:
* Stores about 3,200 times as much data
* Would cost: $1,429,176,320
* Requires 8 semi-truck shipping containers to hold the data

A petabyte of storage would:
* Cost: $93,662,499,307,520
* Require a building the size of 10,814 football fields to hold the drives
* Require 472 of the world’s largest data centers to hold the drives

Compare that to being able to get a petabyte of storage for $117,000 and store it in a single rack. Of course, IBM researchers from the 1950s are clearly some of the giants on whose shoulders we stand.

Note: $1 in 1956 is $7.93 in 2010 adjusted for inflation.
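
For anyone who wants to check the arithmetic, here is a small sketch that reproduces the two dollar figures above. It assumes binary units (1 GB = 1,024 MB, 1 PB = 1,024³ MB), which is the reading that matches the quoted totals; the function and constant names are made up for illustration.

```python
# Figures from the post: $11,000 per MB in 1956, and $1 in 1956 = $7.93 in 2010.
# Binary units are an assumption; they happen to reproduce the quoted totals.
COST_PER_MB_1956 = 11_000        # dollars per megabyte in 1956
INFLATION_HUNDREDTHS = 793       # 7.93, kept as an integer to avoid float rounding
MB_PER_GB = 1024
MB_PER_PB = 1024 ** 3


def cost_in_2010_dollars(megabytes: int) -> int:
    """Price of `megabytes` of 1956-era storage, expressed in 2010 dollars."""
    return megabytes * COST_PER_MB_1956 * INFLATION_HUNDREDTHS // 100


ipod_cost = cost_in_2010_dollars(16 * MB_PER_GB)
petabyte_cost = cost_in_2010_dollars(MB_PER_PB)

print(f"16 GB iPod Nano: ${ipod_cost:,}")
print(f"1 petabyte:      ${petabyte_cost:,}")
```

Note that the "3,200 times" figure uses round decimal units (16,000 MB / 5 MB); with binary units the ratio is closer to 3,277.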



Backblaze on EFactor network


A few months ago Backblaze joined EFactor, a worldwide network of entrepreneurs. Many entrepreneurs are busily creating content, writing code, preparing business plans, etc. – all while working from a coffee shop or traveling back-and-forth from meetings. At Backblaze, we wanted to offer all of these entrepreneurs our online backup service in the hope that they do not have one of those “Noooooo!!” moments when they realize the marketing plan they have spent many late nights working on just vanished with the laptop that was stolen out of the taxi cab.

In addition to being available as a resource on the EFactor site, Backblaze was invited to demo last week at the BCSSA Entrepreneurial Pitch Competition and interviewed on EFactor Radio. We hope to help many other great entrepreneurs through our relationship with EFactor.

If you are a current (or budding) entrepreneur, consider joining EFactor. As a bit of extra incentive, they are currently approaching the million-member mark and are awarding a 3-day trip to NYC or Amsterdam for the 1,000,000th member.



Server spring cleaning



When we started Backblaze almost four years ago, we signed a one-year lease on a 1/2 cabinet in a data center. At the time that seemed like a fairly significant commitment. Over the years we grew very quickly, adding cabinet after cabinet.

To keep up, we signed up for a large space with plenty of room for expansion and have been growing into it. However, we never moved out of our original space, which still holds a few cabinets. We have now decided to make the move – a bit of spring cleaning in the winter.

What this means is that you may notice some outages. Our plan is:
* Some rolling maintenance for a couple of days, during which a small percentage of users will not always be backing up. The Backblaze Control Panel will say “Computer Offline” because it is not able to connect to our servers. In the future, we will update the Backblaze Control Panel so that it differentiates between offline and maintenance.
* A maintenance window of about 2 hours on Thursday afternoon and one of about 4 hours that will likely take place this weekend, during which parts of our website will not be available. Computers that are already backing up will continue to back up until they finish or lose their Internet connection. However, the ability to sign in, purchase, add new computers, or request files for restore will be unavailable.

Your backed up files will be safe and your computer will automatically resume backing up as soon as the service is restored without any interaction from you.

If you would like to stay up to date with the maintenance, we will post additional notifications on our Backblaze Twitter account.

Thank you for being patient with us during this time.



Scheduled maintenance: Lessons learned


Sorry and Thank You
Last week we had a maintenance window that was scheduled for 12 hours. Instead, our core services were offline for a day and a half with some backups being throttled for an additional day.

I am very sorry for any inconvenience that this caused. Rest assured, we take any disruptions in service very seriously. I also wanted to thank our customers; I was amazed at how calm and supportive they were during this time.

As of Saturday at 3pm, everything has been working and the service for all users is live. Please note that at no time was your backed up data at risk.

What follows is more technical detail on what happened and what we intend to do.

The Original Maintenance Plan
We worked on our “central authority” cluster that maintains customer metadata, handles billing, prepares restores, etc. This is unrelated to the storage pods where all the backed up data is stored.

The maintenance plan was to migrate the metadata to another server running an upgraded OS and then to update permissions on that data. Operations like this with large volumes of data take time. We estimated the time based on a previous maintenance and wrote a multi-threaded script to update the permissions in an attempt to accelerate the process.

What Happened
Due to the large data growth since our previous maintenance, we did not properly account for the time required. Then, the permissions update script failed to update all the files because there were too many threads. Rather than trying to fix the multi-threaded script in a rush, we ran the script single-threaded. This took quite a bit longer to run, but was safer than trying to rewrite code in a hurry.
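
One common way to keep a job like this from spawning too many threads is a fixed-size worker pool. The sketch below is hypothetical and is not the script we ran: the function names, the default mode, and the failure handling are assumptions for illustration, and the cap of 20 is the limit listed in the takeaways.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, List, Optional

MAX_WORKERS = 20  # cap on concurrent threads (the limit from the takeaways)


def set_permissions(path: str, mode: int) -> Optional[str]:
    """Apply `mode` to `path`; return the path if the update failed."""
    try:
        os.chmod(path, mode)
        return None
    except OSError:
        return path


def update_permissions(paths: Iterable[str], mode: int = 0o640,
                       max_workers: int = MAX_WORKERS) -> List[str]:
    """Update permissions with at most `max_workers` threads.

    Returns the paths that could not be updated, so a partial failure
    is visible instead of silently skipped.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda p: set_permissions(p, mode), paths)
        return [p for p in results if p is not None]
```

Setting `max_workers=1` gives the single-threaded behavior as a fallback without changing any other code.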

We brought the site back up on Friday afternoon, but all customers starting to back up concurrently overwhelmed the system. We brought the service down briefly and started slowly allowing customers to back up again. (During this time, restores and other services were fully functional.) By 3pm Saturday, all customers were fully operational.

Takeaways
Taking a step back, we had a few basic lessons learned:

1. Estimate better.
This is not just an “eat your vegetables” approach. We have the data to produce better estimates. Specifically, we will factor in data growth rather than using previous maintenance experience.

2. Limit the permissions-updating process to 20 threads.
Threads are good. Too many threads are not.

3. Bring the site back online in stages.
We have a lot of users. They have a lot of data. When the site comes online there is a massive flood of data and requests that strain the service. Bring users back incrementally.
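
As an illustration of bringing users back incrementally, here is a minimal sketch of one possible approach: hash each user into a stable bucket from 0 to 99 and admit only the buckets below the current rollout percentage. This is an assumption for illustration, not a description of our actual mechanism; `bucket_for` and `is_admitted` are hypothetical names.

```python
import hashlib


def bucket_for(user_id: str, buckets: int = 100) -> int:
    """Map a user ID to a stable bucket in the range [0, buckets)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets


def is_admitted(user_id: str, percent_online: int) -> bool:
    """True if this user may resume backing up at the current rollout stage."""
    return bucket_for(user_id) < percent_online


# Example stages: raise the percentage only while load stays healthy.
for stage in (5, 25, 50, 100):
    admitted = sum(is_admitted(f"user-{n}", stage) for n in range(10_000))
    print(f"{stage:3d}% stage admits {admitted} of 10,000 sample users")
```

Because the bucket depends only on the user ID, anyone admitted at an early stage stays admitted as the percentage rises.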

Again, thank you for your patience and we hope to keep helping you protect your data for a long time to come.



Scheduled maintenance – Thursday 9/23/10



UPDATE 6: All services are live! We are fully back online – new installs, trials, backups, purchases, and restores. We really appreciate how patient everyone has been with us during this time frame. Over the next couple days we will be digging in to everything that happened and working to figure out our lessons learned and how to prevent this from happening again. I will publish a blog post with the summary when we have worked through that. Again – thank you everyone for your patience and continued trust.

UPDATE 5: Website is back up! All pages of the website are now back up. However, as we are bringing all users back online, the load burst is significant. As such, many of these pages will be very slow. As a courtesy to your fellow users, please do not use the service yet unless it is urgent. We will be bringing new backups online as quickly as possible as well. Again, thank you for being patient with us.

UPDATE 4: We continue to work diligently to complete the maintenance, but it continues to take longer than expected. We will not be online at 5p US/Pacific. We will post an update when we have a better sense of timing. We have moved our central authority server database to a new, faster and larger drive shelf. The data has been migrated, but the 8 million files on the volume need to have their permissions set. This takes time to complete; however, there is no danger of data loss. Thank you for your patience.

UPDATE 3: We are still working to complete the maintenance. Your data is safe, and the system will return to normal service when it is brought back online. We anticipate this to occur at 5p US/Pacific. At that time, customers will be able to install Backblaze and restore files. Existing clients will slowly resume their backups and might show offline for a short period. Thank you, again, for your patience.

UPDATE 2: We are working diligently to complete the maintenance, but it is taking longer than expected. We will post an update when we have a better sense of timing. Thank you for your patience.

UPDATE: Maintenance is taking longer than expected. We currently estimate full service restoration at 10p US/Pacific. Thank you for your patience.

We will be doing maintenance this Thursday from 3am to 3pm US/Pacific. Much of the service will be unavailable during this time.

Our home page, other “static” pages, and the blog will be accessible. However, all “dynamic” functions will be unavailable, including the ability to download the application, subscribe to the service, access your account, and request a restore. Backups that are in progress when the maintenance starts will continue unless interrupted (such as by your computer shutting down or your broadband connection being disrupted).

We apologize for any inconvenience this may cause and will work diligently to bring all services back up ahead of schedule. If you have any questions, feel free to email us at helpme at backblaze.com.



Don’t push that button



As some of you noticed, Backblaze experienced an outage. I want to provide some detail on what happened and the current status.

The Cause
Backblaze stores your backed up data in a top-tier data center facility. Last night at 7:35 p.m., a security guard entered the facility. The door slammed, causing the protective covering to open on an “Emergency Power Off” switch and setting off alarms. While this had no impact, in a moment of confusion the guard, hoping to turn the alarms off, pressed the Big Red Button and shut off all power to that zone. At 7:36 p.m., the duty engineer escalated the situation and a resolution plan was designed. By 8:03 p.m. the power was fully restored.

Backblaze Response
As soon as the power went out, Backblaze’s monitoring systems alerted us to the issue and we mobilized the company. Most of us went immediately to the data center, while others double-teamed in support to help instantly address any questions. We then started the phased procedure of bringing the service up again: Static web content (home page, help pages), dynamic web content (account pages, restore selection, billing), and finally all of the actual cloud storage.

We could have brought everything up very quickly, but we believe it’s critical to carefully check every system first. With over 5,000 spinning hard drives, this process takes a little while. Much of the team worked diligently through the night to bring the service back as quickly as possible.

Status
The static web pages were live within minutes of the power coming back online. We ran thorough tests throughout the system and fully brought the dynamic pages up this morning. This means you can browse the entire site, sign in to your account, browse the files you have backed up, and even request (but not yet receive) a restore.

We expect to finish checking enough of the cloud storage systems later this afternoon to turn on the ability for backups to resume. At that point, most requests to restore data will also be fulfilled. However, some restores will be delayed a bit longer if they contain data on systems that we have not finished testing. As soon as we’re done, all restores will complete.

At this point, everything is progressing smoothly and we expect to have every piece of the service fully operational sometime this evening. While it is tempting to lock the Emergency Power Off switches, that would obviously defeat their purpose. However, we are looking at ways to speed up the necessary tests so that we can recover more quickly from any type of unplanned shutdown. Thank you for being patient with us as we work through this.



Backups in Baghdad:
Protecting data around the world


“Jason” (not his real name) is an electrical engineer contracting for the military. I wanted to share the comments and photos he sent us about his experience with Backblaze in Baghdad:

My being here is a once in a lifetime opportunity, and I don’t want something like a hard drive failure to sour it. My computer is my livelihood, and not having to worry about my irreplaceable data allows me to focus on the mission in support of the Coalition soldiers.

That being said, bandwidth is at a premium, and Backblaze has been great to let me know EXACTLY what’s backed up, what needs to back up, and for being able to adjust how much bandwidth it uses has also been critical.

My family will be very happy to see all the photos that I have taken over my travels and stay while in Baghdad, and I’m glad knowing that even if my computer is damaged or destroyed, I will be able to get that data back.

Here are a few of Jason’s photos of Baghdad:
[Photos: Baghdad Backup 1483, Baghdad Backup 1849, Baghdad Backup 1445]


