Mervyn Thomas said:
Can someone tell me how to cover the risk of a complete motherboard / hard
disk failure so that I could restore a mission critical stand alone PC.
Somewhere else I have asked questions about bootable external hard drives
and got confusing answers. Surely there must be a way to collect
everything on a drive and to be able to replace it in a new off the shelf
PC!
If it is a mission critical host, backups are not your first concern (but
definitely your second concern). The first concern is reliability in uptime
(i.e., hardware disaster recovery). For that, you should be looking for a
system with RAID 10 (RAID 1+0) or RAID 5 so the system stays up when a drive
fails and can be brought back very fast. Other factors would be dual power
supplies, or a
supply with built-in UPS, or an external UPS to keep the system up (since a
local power outage may not also cause a network outage). Surge protection
is best performed back in the electrical system, not with endpoint devices
at the host. Endpoint protection usually ends up as multiple surge devices
serving the equipment connected to one host; those devices could be 10 feet
apart, and a surge can put 400V+ across that 10-foot span of cords between
them. If the power isn't protected and the hardware isn't duplicated, then
it really is not a mission critical host.
Don't think RAID-1 (mirroring) provides data backup. It provides hardware
backup; i.e., you can use the mirrored drive when the primary drive fails,
but the mirrored drive has the SAME files as the primary drive had, so a
file deleted or corrupted on one is deleted or corrupted on the other (i.e.,
no data backup). Same for all RAID; i.e., RAID is not used for data backup.
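To make the mirroring point concrete, here is a toy sketch in Python. It is not a real RAID implementation; the MirroredVolume class and the file name are made up for illustration. It only models the fact that every write and delete lands on both drives, while a separate point-in-time copy does not change.

```python
class MirroredVolume:
    """Toy model of RAID-1: every write and delete hits both drives."""

    def __init__(self):
        self.primary = {}
        self.mirror = {}

    def write(self, name, data):
        self.primary[name] = data
        self.mirror[name] = data

    def delete(self, name):
        # A delete is mirrored just as faithfully as a write.
        self.primary.pop(name, None)
        self.mirror.pop(name, None)

    def fail_primary(self):
        # Hardware failure: the surviving mirror takes over.
        self.primary = self.mirror


volume = MirroredVolume()
volume.write("report.doc", "final draft")

# A backup is a separate point-in-time copy kept somewhere else.
backup = dict(volume.primary)

volume.delete("report.doc")   # user error
volume.fail_primary()         # then a drive failure on top of it

print("on the mirror:", "report.doc" in volume.primary)
print("in the backup:", "report.doc" in backup)
```

The mirror keeps the host running after the drive failure, but only the separate copy still has the deleted file.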
You could use logical backup programs (i.e., they back up and restore files
and do so through the file system in the operating system) but that backs up
only the users' data. Obviously if the target for the backup files is the
hard drive(s) in the host then hardware loss in the host could result in
losing your backups, so you need to back up to a different host through the
network.
Alternatively, require that all data files be created, stored, and modified
on a file server host (because nothing on the local host gets backed up).
Obviously how well this works depends on how fat the pipe to the file server
is: too little bandwidth and your users will be screaming about delays to
their work while waiting for files to update, or about contention over the
limited resources of the file server.
You could use something like RestoreIt or GoBack to provide the equivalent
of imaging that also provides incremental updates, akin to file versioning
on mainframes. That would allow the host's user to restore back to a known
state. The System Restore provided in Windows XP is not reliable.
Alternatively, you could simply save disk or partition images (but, again,
not on the same drive and preferably off the host to avoid loss when
hardware fails). Acronis TrueImage permits full disk and partition images
along with incremental updates; however, I don't know if it works well to
save the image files over a network to another host. Incremental updates
only work well when using its hidden partition on an internal hard drive, so
you could lose your incrementals along with that drive (so periodically
replace them by doing full images to removable media).
If you think using an external hard drive provides reliable backup, remember
that when the mechanicals of the hard drive fail then you lose all your
backups. You are making logical data backups to circumvent user error
(i.e., deletes caused by the user) or hardware failure. Why do backups of
your mechanical hard drive to another mechanical hard drive? An external or
auxiliary internal hard drive that is used only for backups should be used
only for short-term backups, like daily incrementals for one week, so that
when the mechanical drive fails you lose, at most, one week's worth of data.
Permanent or long-term backups should be stored on media that is separate
from the drive mechanicals, like tape, CD-R, DVD-R, Zip, or whatever.
If the backup drive's hardware fails, you can buy another one and continue
to use the removable media.
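The one-week rule for the backup drive can be sketched in a few lines of Python. This is only an illustration: the function name split_by_retention and the keep_days parameter are my own, and it assumes you have one dated incremental per day. Anything it lists as expired should already be on removable media before you delete it from the backup drive.

```python
from datetime import date, timedelta


def split_by_retention(backup_dates, today, keep_days=7):
    """Split backup dates into those still kept on the backup drive
    and those old enough to live only on removable media."""
    cutoff = today - timedelta(days=keep_days)
    keep = sorted(d for d in backup_dates if d > cutoff)
    expire = sorted(d for d in backup_dates if d <= cutoff)
    return keep, expire


if __name__ == "__main__":
    today = date(2024, 1, 10)
    dailies = [date(2024, 1, day) for day in range(1, 11)]
    keep, expire = split_by_retention(dailies, today)
    print("keep on drive:", [d.isoformat() for d in keep])
    print("removable media only:", [d.isoformat() for d in expire])
```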
NEVER do a backup without enabling its verify option. What good is a backup
that you cannot read later to retrieve the files on it? Verifying doubles
the time to do the backup, but an unverified backup can be a huge waste of
time.
You end up spending hundreds of hours doing backups only to find out later
that you cannot read from [some of] them.
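If your backup program has no verify option, you can approximate one for a plain file copy. The sketch below is an assumption on my part, not how any particular backup product verifies: it reads every file back from the backup and compares SHA-256 digests against the source. The names sha256_of and verify_backup are hypothetical, and it assumes the backup is an ordinary directory tree rather than a proprietary archive.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source_dir, backup_dir):
    """Return relative paths that are missing or differ in the backup."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        copy = backup_dir / rel
        # Reading the copy back is the whole point: it proves the
        # backup medium can actually return the data.
        if not copy.is_file() or sha256_of(src) != sha256_of(copy):
            problems.append(rel.as_posix())
    return problems
```

An empty list from verify_backup means every source file was read back from the backup with matching contents. Like any verify pass, it reads everything twice, which is where the doubled backup time comes from.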
Since you mention your host is a mission critical host but neglect to
provide ANY details of the network under which it operates, we really don't
know what you mean, how much protection would be sufficient, or what your
budget is.