If data can be lost, it will

Murphy was right. If something can go wrong, it will… eventually. Earlier this week, the water main connected to our house burst, submerging much of the laundry room in inches of water. The house’s main shut-off valve was little help, since the break was between the house and the main line from the city, so I spent several hours bailing until the water could be turned off completely.

The laundry room sits just behind my office and, thankfully, a little bit below it. The water level didn’t rise high enough to trickle into the Benchmarking Sweatshop. It did soak a few older motherboards and graphics cards, though; the laundry room also doubles as storage for the endless stream of hardware that FedEx delivers to my door.

Fortunately, the damage appears to be minimal. It certainly could have been worse. Not that long ago, my file server was sitting in what became the flood zone. But it, too, suffered a spectacular failure. While I was visiting family over the Christmas holidays, two of the three drives in the system’s RAID 5 array died. A sagging 5V rail in the PSU was to blame, and my 2TB array was toast. I probably should have been keeping a closer eye on the system, but home file servers are the sort of thing one stuffs into a closet and kind of forgets about. This one had been running for years without so much as a hiccup.

My initial response was panic. Two terabytes of data was gone: high-bitrate MP3s ripped carefully from my collection of over 500 CDs, countless digital photos, priceless home movies, a decade’s worth of TR-related files, and a healthy helping of, er, Linux ISOs that would take forever to grab off BitTorrent again.

Wait, I have backups!

A couple months earlier, I’d backed up the entire file server to a single 2TB hard drive that was sitting on the shelf in my office. It wasn’t completely up to date, but almost everything that was missing was sitting on other machines. My desktop has had its contents protected by a RAID 1 array for years, and it’s a dumping ground for most of my data. A fairly recent version of the essential stuff is also kept on my laptop and on the USB key on my keyring. There’s an old notebook drive sitting at my parents’ house loaded with my most critical data, too.

In the end, I lost only a couple of days’ worth of benchmark data and a few frantic hours. But then, I’ve always been pretty good about keeping things backed up. It all started with my high-school computer lab teacher, who would randomly turn off entire banks of machines to make sure we saved our work regularly. Thanks for the compulsive Ctrl+S tic, Mr. Knowles.

For years, my data was protected by a mix of RAID 1 on the desktop and a closet file server with its own array. Scheduled DOS batch files copied gigabytes from my desktop nightly, and the server was backed up to a separate hard drive periodically. When I moved to Windows 7, the batch files were replaced with the OS’s built-in backup routine, which has the handy ability to create an entire system image instead of just saving a selection of files.
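
For the curious, here’s a minimal sketch of what that kind of nightly copy job looks like. Everything in it is illustrative: the paths, share name, log location, and schedule are made up, and robocopy stands in for whatever the original batch files actually called.

    @echo off
    rem nightly-backup.bat -- mirror the desktop's data folder to the file server.
    rem /MIR mirrors the tree (including deletions), /R and /W keep retries short,
    rem and /LOG+ appends to a log file so a failed run is easy to spot later.
    robocopy "D:\Data" "\\fileserver\backup\desktop" /MIR /R:2 /W:5 /LOG+:"D:\Logs\backup.log"

    rem Register it as a nightly task (run once from an elevated prompt):
    rem schtasks /Create /TN "Nightly backup" /TR "D:\Scripts\nightly-backup.bat" /SC DAILY /ST 02:00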

Although I’ve considered resurrecting my file server, the home-theater PC in the living room has been filling in admirably. I tossed in an extra low-RPM hard drive, which doesn’t add much noise, and I could even do RAID if this turns into a permanent solution. It likely will, if only because that will save me the trouble of putting together—and monitoring—a new box. Plus, most of my home storage is media, which makes sense to have in the living room.

Windows can shuffle files between systems on a home network easily enough, but getting them onto an external drive is an extra step. It’s also an additional backup job on top of my nightly network copy. This creates problems for Windows 7, whose backup routine doesn’t support multiple jobs.

There are, of course, numerous external hard drives that come with their own backup software. Thing is, an external drive is really no safer than the secondary drive in my HTPC, which at least sits in a different room than my desktop. To be truly secure, data really needs to be duplicated at an off-site location. Doing that manually takes actual effort.

Fortunately, cloud-based storage has become a viable solution… provided you’re willing to trust someone else with your data. Even if you aren’t, files can be encrypted beforehand and uploaded once scrambled. Free options abound, with Dropbox, SkyDrive, and now Google Drive offering gigabytes of remote storage. None of those services have enough free capacity to meet my needs, though. Ideally, I need hundreds of gigabytes to keep all my precious data safe.
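
As a rough illustration, the scrambling step can be as simple as packing files into a password-protected archive before the sync client ever sees them. The sketch below assumes the 7-Zip command-line tool, made-up paths, and a passphrase held in an environment variable; any encryption utility would do the job.

    @echo off
    rem encrypt-then-upload.bat -- pack the photos folder into an encrypted archive
    rem before it goes anywhere near the cloud. -mhe=on also encrypts file names,
    rem so the archive listing isn't readable without the passphrase either.
    "C:\Program Files\7-Zip\7z.exe" a -t7z -mhe=on -p%BACKUP_PASSPHRASE% "D:\Staging\photos.7z" "D:\Photos"

    rem Drop the resulting .7z into the Dropbox, SkyDrive, or Google Drive folder
    rem and let the sync client upload only the scrambled copy.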

While I’m loath to shell out for online storage when I have terabytes of disk capacity sitting idle in my lab, I’m also realistic about how often off-site backups happen around here—and how many close calls I’ve had in the last few months. It’s worthwhile for me to pay to have software take care of the problem. Since our resident developer likes CrashPlan so much, I gave the free trial a shot. After a month of it sitting unobtrusively in my system tray, silently backing up files without me even noticing its presence, I sprang for two years of unlimited storage for $90. That’s less than the cost of the average terabyte hard drive, and it comes with a lot more peace of mind.

I’m not really worried about someone hax0ring my CrashPlan account and digging through my data, but it’s nice to know that the service has strong encryption and the ability to set a private password that even the company’s techs won’t know. The CrashPlan app allows local backups, too, but there’s no native support for networked shares, which is a little annoying. I’m more interested in CrashPlan’s ability to use other computers as backup sites. The app needs to be running on the target systems, but users can create their own cloud to complement—or supplant—CrashPlan’s own servers. This feature is included in the free version, although it’s limited to once-a-day backups rather than the real-time approach taken by the full-blown product.

Once my initial dump has finished uploading to CrashPlan’s servers, I’ll have three layers of protection, all fully automated. I’ll sleep better at night knowing I’m no longer the weak link for my off-site backups. But I’ll also keep refreshing my USB key and filling the occasional backup drive because, hey, you never know. If Gmail can go down, surely CrashPlan can, too.

Comments closed
    • demani
    • 7 years ago

    I use the free CrashPlan for site-to-site backup. I have a drive I seeded at home, then brought into work and set up there (I’m the gatekeeper, so I asked permission, and granted it). Anyway, I have my mother doing something similar with her best friend: they started their backups, and after the first pass moved the drives to each other’s homes and now back up to each other. So should anything happen to the house at all, the data is already at the location they’re most likely to go to (and someplace they visit regularly even if it isn’t a full-on catastrophe). And they don’t have to worry about someone else manipulating the data; if they can’t trust the other person, then they can’t trust anyone.

    • Walkintarget
    • 7 years ago

    Took me 18 years of toying around with PCs, but my backup solution is now quite solid. I’m running an HP MSS (WHS) with multiple drives, with folder redundancy set on the important folders. All data is still kept on the 5 PCs scattered about the house, but it is backed up nightly to the WHS. Worst-case scenario: I lose whatever was created in the 23 hours or so since the last backup ran.

    The WHS is in the laundry room under my workbench, a good 8″ off the ground. Unless my Boxer chews it to pieces, I am good to go. Maybe 3 times a year, I archive the user-created folders on the WHS to an external 500GB pocket drive.

    The initial investment ran me $600, and I kept thinking of all the gear I could buy for that scratch, but each year my wife manages to lose some important files (Quicken, iTunes), and each time I thank the lucky stars that I bought it and it’s running so well.

    Having said all this, and knowing how the computer Gods frown upon one crowing about his up-time, I fully expect a drive or two to die within the month. 😉

    • Nutmeg
    • 7 years ago

    My current backup scheme is to move everything across every time I buy a new computer. Hasn’t failed me yet!

    I don’t really have a lot of important data that I couldn’t easily download again though.

    • Kurotetsu
    • 7 years ago

    My current backup scheme is Microsoft SyncToy + Dropbox (soon to be replaced by SkyDrive). The only thing I have that’s really worth backing up is my KeePass Portable database, which is a few kilobytes in size. With the free 25GB SkyDrive gave me, I’m going to have to start looking for other stuff to back up….

    • esc_in_ks
    • 7 years ago

    I’ve enjoyed reading about what everyone else does for backups. Here’s my plan and how I do it. It’s not necessarily any better than anyone else’s, but it’s what makes me feel comfortable.

    I have a FreeNAS box with 4×2 TB drives in it… NOT in a RAID array on purpose. Two of the 2 TB drives rsync nightly to the other two. So, if I accidentally blow away a folder, RAID isn’t happily syncing my mistake over to the other disk. rsync goes at 2am, so I generally have time to fix my mistake when I realize it.

    Plus, RAID is a complication and when it comes to my data I want no complications. I use plain old FreeBSD UFS filesystems on my disks. I can take any individual disk out and mount it in any FreeBSD or Linux system and see my files. Can’t do that with RAID.

    I have yet another two 2 TB disks, and I rotate them with two of the ones in my FreeNAS box, storing them in my office at work about 20 miles away. I swap the two extra drives in via a nice, high-quality set of swap bays on my FreeNAS box, so that makes it easy.

    I generally get around to swapping the disks between the office and home once every 4-6 weeks, or more often if a major event happens, like a bunch of digital pictures from a trip get put onto it.

    Backups to the FreeNAS are either rsync from Linux or FreeBSD, or Windows 7 Professional’s backup (which is a piece of junk; I need a better network-capable Windows backup).

    • LoneWolf15
    • 7 years ago

    I have an HP Mediasmart server, which backs up our network-attached PCs nightly. I can bare-metal restore a backed-up client if something goes wrong. That server has an external eSATA drive that I back it up to as well.

    The last thing I have to get good at is backup solutions for my laptop and my wife’s. Hers spends most of its time at her work, and mine is connected wirelessly, which makes backup more difficult. I think I just need to buckle down and connect mine to the wired network occasionally (and have her bring hers home every so often to do the same).

    • willyolio
    • 7 years ago

    ah, law of entropy as applied to fate and destiny.

    • Deanjo
    • 7 years ago

    [quote<] Only wimps use tape backup: real men just upload their important stuff on ftp, and let the rest of the world mirror it 😉 Torvalds, Linus (1996-07-20). Post. linux.dev.kernel newsgroup. Google Groups. Retrieved on 2006-08-28. [/quote<]

    • Krogoth
    • 7 years ago

    I think the most difficult part of backing up is determining which data is worth the energy and effort of going the extra mile to back up on a frequent basis.

    Budget and space are serious considerations. Unless you are running some kind of business, it is kinda hard to justify going the extra mile to back up TBs worth of personal data when that data is mostly comprised of games, videos and [sub<]p0rn[/sub<]. I suppose the best way to answer the question is to ask yourself, “What happens if I lose that data?”

    • just brew it!
    • 7 years ago

    This reminds me that it has been quite a long time since I updated the off-site copy of my backups…

    • XDravond
    • 7 years ago

    Backup? What’s that?…

    Reminds me of what I’ve been doing when bored: saving “almost” lost data… and guess what, it could have been unnecessary if the, ehm, person had done as I said… A simple thing, really: a bunch (500+) of food/cake/cookie/etc. recipes on a USB stick, less than 10GB of data in total. So I said make a copy and save it on the computer’s HDD, since it’s a rather empty 1TB drive (also auto-backed-up…). Did the person do that when told? No. And would you guess, the USB stick didn’t break, it was lost completely, somewhere out in the city (a rather big one)…
    My first question was: why the f*ck did you have the USB stick in the rather shallow pocket where you also keep the mobile you fish out every now and then?…
    Second question: why didn’t you do a backup like I told you to, more than 5 times in the past 2 weeks?…

    But “luckily” most of the files had merely been deleted from the HDD and were possible to recover, though without file names… So now someone might listen to what I say… Or wait, I forgot, parents don’t listen to their kids, since it’s so much more fun seeing the kid work his ass off trying to find the data again among a few hundred thousand files…

    I’m not the best at backing up my data, but I use an external HDD to save a copy of my files every now and then. Plus, things like Google Music are pretty sweet for music… up to 20k songs stored for free. Sure, Google will think of something to do with them, but hey, it’s free, so I don’t complain… Also, 25GB of SkyDrive for my most important data. No, not the porn stash (it’s too big and has its own external HDD 😉 )

    • obarthelemy
    • 7 years ago

    Convincing people to do backups, even nerds, is still such a hard sell. Especially semi-nerds who want to believe RAID = backup.

    I’m glad you didn’t lose too much. Stroke of luck this time. The IT press, especially the mainstream outlets, should do a concerted “backup week” awareness campaign. I’m tired of seeing devastated people lose most everything and come to me crying for help.

    • Lazier_Said
    • 7 years ago

    Cloud storage is great for documents and email.

    Restoring even a 1TB library at 10-20Mbps would take more than a week. Running my connection at 10-20Mbps would use up my monthly data cap in less than 2 days. Many ISPs give you even less than that. Not ready for prime time by a factor of 10.

    My home library is a QNAP NAS (RAID 5; having learned more, the next iteration of this will be RAID 1) that auto-backs up to a duplicate NAS. The most important folders also exist on at least one PC and are manually copied as dated images, not overwrites, to an external 2TB disk which is off and unplugged the rest of the time.

    Yes, I learned the hard way.
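
    As a sketch of the dated-image idea described above (made-up paths; robocopy here is just one way to do it), each run lands in a fresh date-stamped folder rather than overwriting the last copy:

        @echo off
        rem dated-copy.bat -- copy the important folders into a new date-stamped
        rem folder instead of overwriting the previous copy.
        rem Grab the local date as YYYYMMDD (locale-independent via WMIC).
        for /f %%a in ('wmic os get LocalDateTime ^| find "."') do set DT=%%a
        set TODAY=%DT:~0,8%
        robocopy "D:\Important" "F:\Backups\%TODAY%" /E /R:2 /W:5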

      • mcnabney
      • 7 years ago

      Better have an off-site solution. Fire, theft, tornadoes and other disasters… these things happen.

        • bthylafh
        • 7 years ago

        Yup. Put enough distance between yourself and your backups that a flood, hurricane, or nuclear weapon can’t get both copies at once.

    • Bauxite
    • 7 years ago

    The most common storage assumption is wrong: RAID (all kinds) is not a backup at all! But it does help with availability 🙂

    A painful lesson many have yet to learn is that it will happily copy any corruption you have and obliterate your precious files. Real backups are separate both physically and logically.

    Once you realize how scary things are at a low level and you start accumulating a sizable chunk of irreplaceable data (e.g. actual creations, not your media), start looking at things like ZFS for “live” data 🙂

    For offline backup… there are various things out there, but they sure haven’t kept up with the convenience of cheap drives 🙁 All the tape types suck, optical sucks, etc. …but you gotta do something.

      • bcronce
      • 7 years ago

      How I want to do it.

      Hyper-V 3 with a ReFS 2-disk mirrored slab storage pool as the SAN device, mirrored to another Hyper-V 3 machine with the same setup, with both of those machines periodically backed up to a FreeBSD box sharing out an iSCSI block device backed by ZFS with 3-disk mirroring.

      My reduced version is a single HyperV system with 3-disk mirrored slabs with the same FreeBSD back-up device.

      I am not sure how to handle off-site back-ups because I am aiming for 10TB+.

      • The Wanderer
      • 7 years ago

      That depends on your definition of “backup”. It doesn’t provide what might be more accurately called “backup in depth”, but it certainly does protect against data loss due to some kinds of hardware failure, by providing a second copy of the data to fall back to – in other words, a “backup copy”.

      There are many definitions of the word “backup”, and while RAID certainly doesn’t satisfy many of them, it does satisfy others. I find the kneejerk repetition of “RAID is not backup”, without even a grudging nod to the limited ways in which it [i<]is[/i<] by those lesser definitions, to be exceedingly annoying.

        • bcronce
        • 7 years ago

        While I agree with a lot of what you said, I still have to say that redundancy is not a back-up.

        I guess it would be better to say that a back-up is never “live” data. Backed-up data should never be accessed except in the case of recovery.

        • Bauxite
        • 7 years ago

        “Real backups are separate both physically and logically.”

        It’s a painful lesson, so instead learn from the mistakes of others. All things fail; RAID just has better uptime.

    • ludi
    • 7 years ago

    [quote<]It all started with my high-school computer lab teacher, who would randomly turn off entire banks of machines to make sure we saved our work regularly. Thanks for the compulsive Ctrl+S tick, Mr. Knowles.[/quote<] Cruel, scarring, and therefore effective. I think I like this guy. Any chance you can dig him out of retirement to start another TR blogspace?

    • willmore
    • 7 years ago

    RAID 6, keeping the server well above floor level, a sump pump, a battery-backed backup pump, and an exact-model replacement for the main sump, new in box, right next to the whole thing. No, I don’t care what happens, we’re not doing *that* again.

    Live and learn or die trying.

      • NarwhaleAu
      • 7 years ago

      +1 for sheer preparedness. Only thing missing is zombie repellent spray.

        • willmore
        • 7 years ago

        I have a gun and lots of ammo. 🙂

          • Scrotos
          • 7 years ago

          I guess the “spray” part depends on the rate of fire! 😉

      • highlandr
      • 7 years ago

      But what about a fire?

        • willmore
        • 7 years ago

        There’s a tarp over it, and it’s in one of the most structurally sound parts of the house.

        Plus, backups. Break the really important data into 3GB chunks, store those in TrueCrypt files and process those with dvdisaster, then burn three copies and mail them to different (remote) family members for safekeeping.

        I used to design redundant systems which processed billing records for cellular billing systems.

        A machine is not more reliable than the building it is in. No single datacenter is ‘five nines’. One little tornado/flood/earthquake can ruin your whole day.

        • Alexko
        • 7 years ago

        Or an electrical surge due to a lightning strike? Or just the PSU screwing up?

        I think the best option (i.e. the best compromise of cost, safety, and simplicity) is probably a couple of external hard drives in a water- and fireproof safe.

      • mcnabney
      • 7 years ago

      RAID is uptime, NOT BACKUP. Duplicate all data offsite.

    • Chrispy_
    • 7 years ago

    Luckily I only have about 75GB of actual, valuable data. I keep it only on RAID1 arrays, and in a minimum of two different sites in case I get burgled/flooded or the building burns down. The “master copy” uses robocopy to push backups out automatically with a scheduled task, but I sync to my laptop manually in case I accidentally delete stuff and that deletion gets replicated to the other copies.

    You can’t put a price on two decades of work, photos and memories. It’s definitely worth more than a couple of discs and ten minutes of batch scripting though.
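
    A rough sketch of the two robocopy modes described above, with made-up paths: the automated push mirrors everything, deletions included, while the manual laptop copy never purges anything, so an accidental delete on the master can’t wipe out that copy.

        @echo off
        rem Automated push (run from a scheduled task): /MIR keeps the off-site
        rem copy identical to the master, so deletions are replicated there.
        robocopy "D:\Master" "\\othersite\backup\master" /MIR /R:2 /W:5

        rem Manual laptop sync: /E copies new and changed files (and subfolders),
        rem but without /MIR or /PURGE it never deletes anything already on the laptop.
        robocopy "D:\Master" "E:\LaptopCopy" /E /R:2 /W:5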

    • flip-mode
    • 7 years ago

    Shouldn’t title read “it will [i<]be[/i<]"? or maybe "If something can [i<]go[/i<] lost, it will" Bah, nevermind.

      • MrJP
      • 7 years ago

      If we’re going to be pedantic (hey, we’re on TR after all), then “If data can be lost, they will”.

        • Scrotos
        • 7 years ago

        [url<]http://dictionary.reference.com/browse/data?s=t[/url<]

        Usage note: Data is a plural of datum, which is originally a Latin noun meaning “something given.” Today, data is used in English both as a plural noun meaning “facts or pieces of information” (These data are described more fully elsewhere) and as a singular mass noun meaning “information”: Not much data is available on flood control in Brazil. It is almost always treated as a plural in scientific and academic writing. In other types of writing it is either singular or plural.

        ------------

        Blah blah living language blah blah always evolving etc.

          • MrJP
          • 7 years ago

          I knew I should have added a smiley, but thanks for the clarification.

      • Scrotos
      • 7 years ago

      It’s an implied verb. Perhaps not formal but this isn’t a formal paper, innit? And even writers of great acclaim structure language however they wish:

      [url<]http://www.perseus.tufts.edu/hopper/text?doc=Perseus%3Atext%3A1999.03.0080%3Asection%3D4%3Asubsection%3D12%3Aparagraph%3D349[/url<]

      Some nice examples here: [url<]http://www.pitt.edu/~atteberr/comp/0150/grammar/advclauses.html[/url<]

      Near the end it shows some of the implied verbs being put back. Not the exact same example, no, but just sayin’ this isn’t an alien concept.

    • Peldor
    • 7 years ago

    Man, that was a long post just to shill for Crashplan. (kidding)

    I also use their service (after Mozy bumped their prices way up). I think it took around 4 weeks of constant uploading to get it seeded. I seriously considered tethering my phone because the average upload on Verizon was faster than my cable provider at the time.

    • Left_SHifted
    • 7 years ago

    Damaged Lab 😛

      • Chrispy_
      • 7 years ago

      Surely, Dissonated?

        • dashbarron
        • 7 years ago

        Facetiously Foxed!

    • SnowboardingTobi
    • 7 years ago

    My backup scheme consists of multiple hard drives that I dock into a Plugable Technologies USB 3.0 dock. The most precious data I have gets backed up to 2 different drives. The original plan was to store one of those drives offsite somewhere and cycle them back into rotation, but I haven’t exactly gotten around to doing that. haha

    I’d love to use CrashPlan as another backup destination, but right now I have over 1.5TB (and growing) of data I’d want to back up to them, their initial seed option is only a 1TB drive, and my data isn’t very compressible. Unfortunately my upload speeds aren’t as speedy as some of the other gerbils’ on here (I’m on DSL), so uploading that extra 500GB+ would take forever.

    I’ve also thought of perhaps buying a Synology DS1512+ for use as a file server and doing RAID 6, but I’ve been waiting for the Tech Report to do an in-depth review of it *hint, hint*

      • ante9383
      • 7 years ago

      I, too, would love to read a Tech Report review of the most common NAS devices from QNAP, Synology, etc.

      All my personal files (photos, documents, music) are stored on a 1TB RAID 1 array; they are then backed up in real time to a 500GB USB 2.0 drive. In addition, the system disk is backed up weekly to a 1TB eSATA drive.

      A month ago I sprang for a 4-year unlimited data contract with CrashPlan, so all of my personal files are backed up to the cloud, too.
      Initial upload took a while (a week or so), but now it’s all backed up in real time.

      Can anybody think of something else I could be doing to keep my stuff safe?

        • bitcat70
        • 7 years ago

        Such a review would be great! While they’re at it, maybe they could throw in FreeNAS running on, say, an HP ProLiant MicroServer?

        Your data is being backed up on-site and off-site. That’s plenty. I would be happy with that, but of course, the more copies, the less chance of it disappearing. BTW, is there a way to validate your off-site backup (with CrashPlan) periodically or on demand to make sure it hasn’t been corrupted?

      • frumper15
      • 7 years ago

      If you have the option to use another computer offsite (work, family, friend, etc.) whose owner is willing to let you use some of their bandwidth (you can return the favor by letting them back up to you as well), you can actually seed your own drive with the 1.5TB of data and bring it over to their house, so you’ll have the beginning of an offsite option. From there, you can do the 1TB CrashPlan seed and finish uploading over time with some insurance already in place. I think the question might be how much of that 1.5TB is truly irreplaceable. I guess it could be a whole lotta family movies, but if it’s mostly video and music, etc., maybe you can back up everything to the easily accessible friends/family machine and send just the really important stuff (pictures, docs, etc.) to CrashPlan if your upload is truly painful. I say that because the reality is that even if you do get 1.5TB uploaded to CP, should you actually need to get at it in the future, the download will be painfully long and not particularly useful.
