Leedberg.com

The online home for Greg Leedberg, since 1995.

Sunday, May 07, 2006

My Backup Strategy

My computer is the centerpoint of my life. Partly, that's because I'm a geek, and a software engineer. But I would wager that many people, including non-geeks, are in the same situation. These days, our computers are home to at least our digital photographs, emails, word processed documents, music, financial data, and lots more depending on what you do with your computer. That being the case, it's easy to see that if any one of us were to lose our computer one day, we'd also be losing a lot of important data and memories.

Which is why everyone should back up their stuff. And everyone should have a sensible backup strategy that protects against both small-scale and large-scale data loss scenarios. So today, I'm going to describe to you my personal backup strategy, which has developed over the span of my computing life, and which I now think is pretty good.

First, what do I back up? Some people back up every bit on their hard drive. I don't. What's important to me is exactly what I described above -- my pictures, documents, music, etc. Basically, my data. Not my program installations, registry settings, and so on. On my computer, I have a very strict separation of program data and user data. All of my personal data is kept on one partition (which just happens in this case to be an entire drive), and my "My Documents" directory is mapped to that partition. My Windows installation directory and all of my programs are installed to a separate partition / drive. I figure, if I lose my hard drive, I can always re-install my programs, so what's really important is to make sure I don't lose my data. This also significantly cuts down on the size of my backups.

Now that it's clear what I'm backing up, it's important to think of just what sorts of scenarios we're trying to protect our data from. The most common data loss scenario is simply a hard drive dying. This happens quite frequently. Another data loss scenario is that something physically happens to the computer itself -- a power surge, you drop the entire computer, the power supply catches fire and the case fills up with smoke, etc. This happens less frequently, but in the worst case it leaves the entire computer unusable. Another data loss scenario is that something happens to the entire house/building where you store your computer -- fire, flooding, etc. Lastly, it's worth considering the scenario of a large-scale disaster, such as an earthquake, a hurricane, or even a military attack. In those sorts of cases, your data probably won't be the first thing on your mind, but months later you'd probably start to wish that you had your old digital pictures and documents.

How do we protect against the most common scenario, a hard drive failing? For this, I have two hard drives. One has all of my programs and OS on it, and one has all of my data. However, on my program-only drive, I also have a large partition that serves as a backup for the data-only drive. Every single night, I have an rsync script that performs an incremental backup of my data onto the backup partition. Since it is incremental, I am only moving the data that has changed, not the entire 40GB of data. Obviously, this plan requires that my backup partition be the same size as my data drive, and so my program drive needs to be significantly bigger than the data drive (so it can store all of my programs, plus the data backup). You just have to take that into account when upgrading drives. This backup is performed every night, since this is the most common type of data loss. So, if my data drive failed tomorrow, I would have a backup of that drive that is no more than 24 hours old.
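To make "incremental" concrete, here is a minimal Python sketch of the idea. It is not the rsync script described above, just a stand-in that copies a file only when it is missing from the backup, has a different size, or has a newer modification time. The function names are made up for illustration, and unlike a full sync tool it never removes files from the backup.

```python
import os
import shutil


def _needs_copy(src, dst):
    """Return True if dst is missing or looks older/different than src."""
    if not os.path.exists(dst):
        return True
    s, d = os.stat(src), os.stat(dst)
    # One second of slack, since some filesystems store coarse timestamps.
    # A same-size edit made within that second would be missed.
    return s.st_size != d.st_size or s.st_mtime > d.st_mtime + 1


def incremental_backup(src_root, dst_root):
    """Copy only new or changed files from src_root into dst_root.

    Returns the list of relative paths that were actually copied.
    Files deleted from src_root are left alone in dst_root.
    """
    copied = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel_dir = os.path.relpath(dirpath, src_root)
        dst_dir = os.path.normpath(os.path.join(dst_root, rel_dir))
        os.makedirs(dst_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(dst_dir, name)
            if _needs_copy(src, dst):
                shutil.copy2(src, dst)  # copy2 also preserves the timestamp
                copied.append(os.path.normpath(os.path.join(rel_dir, name)))
    return copied
```

Running it a second time right after the first should copy nothing, which is the whole point: the nightly job only pays for what changed that day.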

With isolated drive failure covered, what if the entire computer were incapacitated? For example, the power supply catches fire, smoke fills the case, and every component is killed (thus eliminating both my data drive and the backup of the data drive). To protect against this, once a month I back up my data drive to DVD-RW media. This way, the backup is stored outside of the case, but is still accessible. I only do this once a month because this can't be automated with a script, so it requires more effort and time on my part. It's worth noting that DVD-RW (and CD-RW) media can't necessarily be trusted, so to make this backup more reliable, I actually use two different DVD-RW discs, which I alternate between each month. So I always have one disc which is no more than 1 month old, and one disc which is no more than 2 months old. This is better than using just one, only to find out when I need it that it's actually not been working for the past several months. It's also worth noting that since a 4.7GB DVD is obviously not enough to hold all of my data, this is actually just a selective backup. I leave out big media items, like ripped music files and home movies. In a crunch, I could do without those items.
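The two-disc rotation and the selective file list can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not a burning tool: the disc labels "A" and "B", the list of skipped extensions, and the function names are all assumptions, and the 4.7GB figure is the nominal capacity, so the usable space on a real disc is a bit lower.

```python
import os

DVD_CAPACITY_BYTES = 4_700_000_000  # nominal single-layer DVD capacity

# Hypothetical list of "big media" file types to leave off the monthly disc.
SKIPPED_EXTENSIONS = {".mp3", ".wav", ".avi", ".mpg", ".mov"}


def disc_for_month(month):
    """Alternate between two discs: 'A' in odd months, 'B' in even months.

    December (12) is even and January (1) is odd, so the alternation
    also holds across the new year.
    """
    return "A" if month % 2 == 1 else "B"


def select_files(files):
    """Filter (path, size_in_bytes) pairs down to the non-media files.

    Returns (selected_paths, total_bytes, fits_on_one_disc).
    """
    selected = []
    total = 0
    for path, size in files:
        ext = os.path.splitext(path)[1].lower()
        if ext in SKIPPED_EXTENSIONS:
            continue  # big media item: left out of the selective backup
        selected.append(path)
        total += size
    return selected, total, total <= DVD_CAPACITY_BYTES
```

If the third value comes back False, the selective set has outgrown a single disc and something else needs to be left out.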

So now we've covered every data loss scenario except loss of the entire building, and large-scale disaster. I use just one backup method to protect against these two scenarios. For this, once a year I back up all of my data (including media) to a set of several DVDs, and then store this DVD set as far away from my computer as I possibly can. When I was at college, for instance, I would store these discs at home. The idea is, in the more likely case (of these two scenarios) that the building is lost due to fire or flood, you've got a backup set stored somewhere outside of the building that you can fall back on. Even in the more extreme case that your entire region is affected, hopefully the set is far enough away that it is still safe. Due to the lower chance of these scenarios happening, this set is only created once a year. So, worst case, you revert back to your data as of no more than 1 year ago. That's still better than no data at all. I use DVD-Rs here because they tend to be more reliable long-term than the re-writable kind. And since you can't erase them, once they are no longer the "latest" annual backups, you can keep them with the computer just so you can have some extreme roll-back capability, possibly spanning several years of these backups.
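Putting the three tiers side by side, the trade-off is between how often a backup runs and how big a disaster it survives. The Python sketch below is just a summary of that reasoning; the tier names, scenario names, and day counts (31 for a month, 366 for a year) are labels chosen for illustration.

```python
# (tier name, worst-case age of the backup in days, scenarios it survives)
# Note: the monthly tier is selective and leaves out big media files.
TIERS = [
    ("nightly copy to second drive", 1, {"drive failure"}),
    ("monthly DVD-RW", 31, {"drive failure", "computer destroyed"}),
    (
        "yearly off-site DVD-R set",
        366,
        {"drive failure", "computer destroyed", "building lost", "regional disaster"},
    ),
]


def freshest_surviving_backup(scenario):
    """Return (tier_name, worst_case_age_days) for the most recent
    backup tier that is expected to survive the given scenario."""
    candidates = [(days, name) for name, days, covers in TIERS if scenario in covers]
    if not candidates:
        raise ValueError("no backup tier covers scenario: " + scenario)
    days, name = min(candidates)  # smallest worst-case age wins
    return name, days
```

For example, freshest_surviving_backup("computer destroyed") picks the monthly DVD-RW, since the nightly copy lives inside the same case and is lost along with the data drive.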

So, that's my backup strategy. You'll notice I don't use any special backup programs. Just rsync (a free, open source file sync'ing tool that comes with Linux and Cygwin) and any DVD/CD burning application. I think, on the whole, that backup programs are a scam.

If you don't already have a backup strategy, I hope that this will prompt you to start doing some sort of backup. And if you already do backups, I hope that maybe I've explored some scenarios you hadn't thought of, or given you some new ideas.
