Backing up using big, cheap hard disks and some nifty Unix tools

We all have our favourite methods of data backup. Certainly DVD is a good archiving and distribution medium. But for routine backup, I look for reliability, low cost, ease of use, high speed, high capacity and redundancy – so that I am not reliant on just one device.

One of my favourite methods is a complete off-site dataset consisting of a hotchpotch of USB/eSATA/FireWire disks. The only disadvantage is that they live 20km away and are only updated every couple of weeks. So I keep a further two 1TB Samsung disks with recent data in my camera rucksack. These are updated daily and cost around £80.00 each. (So no excuse for not backing up!)

But considering that the hard disk is the most common point of failure, and that I am a naturally lazy git, my backup methodology needs to be really, really easy and almost instant. I have also become very reliant on my media server, and my business can’t really function well without it. So I figured I needed a more radical solution…

So I decided to build a second media server that was a near identical clone of the first. They both use:-
  • 1x “so-so” Asus P4P800 SE motherboard (c/w 2x on-board SATA), £75.00.
  • 2GB generic RAM, £70.00.
  • 5x 1TB Samsung HDs (cheap and offering very low power consumption), £80.00 each.
  • 1x Promise TX4 150 SATA card, £40.00.
  • 1x IDE Samsung DVD writer, £20.00.
  • I boot from recycled 120GB IDE drives so that five or even six SATA drives in each system can be used entirely for data.
These systems are not particularly special – except for the massive storage space. They are really just giant NAS (network attached storage) boxes. Both systems are housed in reclaimed maxi-tower cases and they both run Kubuntu Linux. So no viruses, superb performance, a nice, easy-to-use interface, zero software licensing costs and lots of really nice data-management tools. Cost: under £700 per server.
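As a quick sanity check on that figure, here is a minimal Python sketch that adds up the parts list above. The prices and quantities are the ones quoted in the list; the function names are made up for illustration, and the recycled boot drive and reclaimed case are treated as free.

```python
# Prices in GBP, copied from the parts list: (description, quantity, unit price).
PARTS = [
    ("Asus P4P800 SE motherboard", 1, 75.00),
    ("2GB generic RAM", 1, 70.00),
    ("1TB Samsung HD", 5, 80.00),
    ("Promise TX4 150 SATA card", 1, 40.00),
    ("IDE Samsung DVD writer", 1, 20.00),
]


def total_cost(parts):
    """Sum quantity * unit price over all listed parts."""
    return sum(qty * price for _, qty, price in parts)


def cost_per_tb(parts, data_tb):
    """Listed hardware cost divided by raw data capacity in TB."""
    return total_cost(parts) / data_tb


if __name__ == "__main__":
    print(f"Listed parts: GBP {total_cost(PARTS):.2f}")
    print(f"Per TB of raw data space: GBP {cost_per_tb(PARTS, 5):.2f}")
```

The listed parts come to £605, which leaves some headroom under £700 for cables, fans and other odds and ends.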

Of course you can scale this basic concept up or down to suit your particular requirements. The point is that you do not need to spend a lot of money in order to create a massive amount of reliable data storage. And if you use Linux instead of Windows, you not only save on the cost of Windows, you also get all the software you need for free. In my experience it also runs faster and more reliably than Windows does. My servers enjoy average up-times of several months. I generally only reboot to add new hardware or because an updated kernel has come out.

This solution also factors in plenty of redundancy. That is, if one server fails for whatever reason, then I can use the other machine while I fix or replace the first. Another important consideration is low cost – because I’m a tightwad! The use of low-cost components means repairs are also low-cost and relatively easily undertaken. For example, to replace a failed disk:

  1. Switch off server
  2. Remove cover
  3. Disconnect & slide out knackered HD
  4. Insert new HD & reconnect it
  5. Replace cover
  6. Switch on server
  7. Partition, name & format HD as necessary
  8. rsync with other server
  9. Have a cup of tea | cold beer | beverage of your choice!
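Steps 7 and 8 might look something like the following Python sketch. It only builds and prints the commands; it does not run them. The device name, label, mount point and peer host are all hypothetical, and the choice of `parted` with an ext4 filesystem is an assumption, since the post does not say which partitioning tool or filesystem is used.

```python
import shlex


def replacement_disk_commands(device, label, mount_point, peer_source):
    """Return the commands to prepare a new disk and refill it from the peer.

    Assumes a /dev/sdX style device (so the first partition is device + "1")
    and an ext4 filesystem; both are illustrative choices.
    """
    partition = f"{device}1"
    return [
        # Step 7: new partition table, one partition spanning the whole disk.
        ["parted", "-s", device, "mklabel", "gpt"],
        ["parted", "-s", device, "mkpart", "primary", "ext4", "0%", "100%"],
        # Step 7: format and name the partition.
        ["mkfs.ext4", "-L", label, partition],
        ["mount", partition, mount_point],
        # Step 8: copy the data back from the other server (archive mode).
        ["rsync", "-a",
         peer_source.rstrip("/") + "/", mount_point.rstrip("/") + "/"],
    ]


if __name__ == "__main__":
    # Hypothetical names throughout; check the device twice before running
    # anything like this for real, because mklabel wipes the partition table.
    for cmd in replacement_disk_commands(
        "/dev/sdf", "media3", "/srv/media3", "server1:/srv/media3"
    ):
        print(shlex.join(cmd))
```

Printing the commands first, rather than executing them, gives a chance to confirm that the device really is the new, empty disk.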

Now, rsync is a simple, easily understood Unix application (and a secure one when run over ssh) that I am already using successfully both in-house and to shuffle data between base and a number of web servers on-line. At the moment my pair of media servers are at opposite ends of the same building while I learn more about the amazing things one can do with Unix – and I do still have a lot to learn! But eventually I plan to build a clone server in a very remote location, perhaps even 1500 km away, and rsync it over the internet.
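For anyone who has not used rsync before, here is a minimal sketch of how a mirror command between two servers could be put together in Python. The paths and the host name `server2` are hypothetical, and the option choices (`-a`, `--delete`, `-z`, `--dry-run`) are one reasonable combination rather than the exact setup used here.

```python
import shlex


def rsync_mirror_command(source, dest, dry_run=True, compress=False):
    """Build an rsync command that makes dest a mirror of source."""
    # -a preserves permissions, times and symlinks; --delete removes files
    # from dest that no longer exist in source.
    cmd = ["rsync", "-a", "--delete"]
    if compress:
        cmd.append("-z")  # worth it over a slow internet link
    if dry_run:
        cmd.append("--dry-run")  # report what would change, change nothing
    # Trailing slashes mean "the contents of this directory", so the two
    # trees line up instead of nesting one inside the other.
    cmd += [source.rstrip("/") + "/", dest.rstrip("/") + "/"]
    return cmd


if __name__ == "__main__":
    # Hypothetical paths and host name.
    print(shlex.join(rsync_mirror_command("/srv/media1", "server2:/srv/media1")))
```

Note that `--delete` makes the copy exact, which also means an accidental deletion on the first server is repeated on the second at the next sync. Defaulting to a dry run is a cheap guard against that.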

Perhaps more interesting is a Unix protocol called ssh (secure shell), which allows one to remotely control any authorised Unix-like machine, anywhere on the planet. So in theory, secondary and tertiary backup servers can be placed pretty much wherever I want whilst I control them from base. Moreover, these remote boxes can be completely headless (i.e. no mouse, keyboard or monitor). All the person at the other end would ever need to do is physically change a duff hard drive. I’d control the rest from base, including partitioning & formatting the drive! Well, that’s the theory.
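To give a flavour of what controlling a headless box from base could look like, here is a small Python sketch that builds ssh command lines for a couple of routine checks. The host name and user are hypothetical, key-based login is assumed to be already set up, and nothing is executed; the commands are only printed.

```python
import shlex


def ssh_command(host, remote_argv, user=None):
    """Build an ssh command line that runs remote_argv on host."""
    target = f"{user}@{host}" if user else host
    # BatchMode=yes makes ssh fail rather than prompt for a password, which
    # suits unattended jobs. The remote command is quoted into one string
    # because ssh hands it to the remote shell.
    return ["ssh", "-o", "BatchMode=yes", target, shlex.join(remote_argv)]


if __name__ == "__main__":
    # Hypothetical host and user.
    checks = [
        ["lsblk"],                     # which disks does the box see?
        ["df", "-h", "/srv/media1"],   # how full is a given data disk?
    ]
    for remote in checks:
        print(shlex.join(ssh_command("backup2.example.org", remote, user="admin")))
```

The same wrapper could carry the partitioning and formatting commands to the remote box once someone on site has swapped the drive.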

This is not intended as a “mine’s bigger than yours” feature. Rather, I hope this helps and/or inspires someone to make better backups. Also I am not a Unix expert – more a Unix perpetual student.
