That’s The Way The RAID Crumbles
Just about two years ago I built a RAID system to use as my media/file server. I’m fairly paranoid about shutting it down, since I’m afraid that I’ll lose a drive. But when the A/C went out I shut it down to prevent it from overheating.
The A/C is now back on (fortunately it turned out to be something simple: the fan start capacitor had burned out and shorted out a couple of other wires, but nothing that couldn’t be fixed), and as I feared, the RAID lost a drive on bootup. I’ve got a new drive on order, and until it gets here the system is still operational, although it’s no longer fault tolerant.
At the time I thought that a RAID setup would be a good way to prevent losing data, especially given the costs of tape. But I’m not so sure anymore. Even though it allows for a single drive failure, it still makes you nervous to pull a drive out of the array and replace it, then rebuild the array. You always feel like you’re hanging by a thread.
I’ve got some old systems lying around. I think I’m going to add some disks to one of them and run it as a backup (maybe a daily rsync or something similar). Hopefully, both systems wouldn’t die at the same time, should something happen to either one. Of course, that doesn’t take into account something like a lightning strike. What I really need is an offsite backup. Perhaps I could persuade one of my friends with high-speed Internet to let me place a system at their place in return for backing up their data to my systems (i.e., cross-site backups). Or maybe I’m just being overly paranoid. But I’m starting to feel like someone carrying a bunch of eggs in a frayed basket, so I’m going to have to do something.
Dude—
BACK UP YOUR SH**!! (Yes, I know all caps implies shouting. I AM…)
RAID is useful in that it partially covers your butt for one of the components of a data storage system that fails most often—a physical disk drive. It does not protect you, though, from the myriad single points of failure that can completely hose your system—and, incidentally, you. Right off the top of my head (and I assume here that you are using some flavor of SCSI drives), a failure in your SCSI controller can cause bad writes to ALL of your disks simultaneously… Oh, SH**!!!
The folks I work for (unidentified field office for government agency) shelled out big bucks to Dell/EMC for a really nifty multiple switch/multiple diskset fibre channel setup for our main data storage. I’m assured by their tech support that the setup is damn near bulletproof. Sure it is.
You bet your a** I back it up to tape—daily differentials, weekly fulls. Why? Well, WHEN everything goes blooey (note that I didn’t say “if”), I’ll still have a job…
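For anyone unfamiliar with the "daily differentials, weekly fulls" rotation, here is a rough Python sketch of the idea, not the actual tape setup described above. A differential copies everything changed since the last full backup, so a restore needs only the last full plus the latest differential. The function names, the choice of Sunday for the full backup, and the use of file modification times are all assumptions made for illustration.

```python
from datetime import date, datetime
from pathlib import Path


def backup_kind(day: date, full_weekday: int = 6) -> str:
    """Return "full" on the chosen weekday (6 = Sunday), else "differential"."""
    return "full" if day.weekday() == full_weekday else "differential"


def files_for_differential(root: str, last_full: datetime) -> list[Path]:
    """List files under root modified since the last full backup.

    A differential is always measured against the last full backup,
    not against the previous day's differential.
    """
    cutoff = last_full.timestamp()
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.stat().st_mtime > cutoff
    )
```

Relying on modification times is the simplest approach but not the most robust one; real backup software usually keeps its own catalog of what was saved and when.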
Yeah, I know. I’d prefer tape, or some other similar bulk storage medium. But it becomes rather expensive for someone who’s doing this on his own, rather than as part of a datacenter in a business.
I’ve gone ahead and ordered some more drives to put into a second system and as soon as I get them I’m going to set up a nightly rsync of the important directories I don’t want to lose (like my photos).
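A nightly rsync like that could be wrapped in a small script. What follows is only a sketch under assumptions, not what I actually have running: the host name and paths in the usage note are made up, and the script defaults to rsync's `--dry-run` mode so it changes nothing until that is switched off. The `-a` (archive) and `--delete` options are standard rsync flags; `--delete` is off by default here because mirroring deletions means an accidental delete on the RAID would be repeated on the backup the next night.

```python
import subprocess


def build_rsync_command(sources, destination, dry_run=True, mirror=False):
    """Build the rsync argument list for copying sources to destination."""
    cmd = ["rsync", "-a"]          # -a: archive mode (recursive, keeps times and permissions)
    if mirror:
        cmd.append("--delete")     # also remove files that no longer exist at the source
    if dry_run:
        cmd.append("--dry-run")    # report what would be done without copying anything
    cmd.extend(sources)
    cmd.append(destination)
    return cmd


def run_backup(sources, destination, dry_run=True, mirror=False):
    """Run rsync and raise CalledProcessError if it exits with an error."""
    cmd = build_rsync_command(sources, destination, dry_run, mirror)
    return subprocess.run(cmd, check=True)
```

Calling something like `run_backup(["/srv/media/photos"], "backuphost:/backup/", dry_run=False)` from a nightly cron job would do the real copy (both the path and `backuphost` are placeholders). Because the source has no trailing slash, rsync creates a `photos` directory under the destination rather than dumping its contents there directly.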
In the meantime, I’m not going to shut down the RAID system until I’ve backed it up. I’m afraid of another drive not coming back.