OK. Most of the time, it's fine to talk about *nix covering Unix, Linux,
the BSDs, and other related OSes. But given your experience with
Solaris (which really is UNIX), and your references to older limitations
of Unix, I thought you were making a distinction.
Most of the advanced software or hardware RAID setups that I've ever
seen were on Solaris & HP-UX systems, attached to SANs. Linux boxes were
mainly self-sufficient quick-setup boxes used for specific purposes.
Linux (and probably *nix in general) won't make repairs to a filesystem
without asking first, although it will happily tidy things up based on
the journal if there's been an unclean shutdown. And if it's not happy
with the root filesystem, then that means dropping to single-user mode
to do the repair.
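For example, e2fsck won't fix anything without prompting unless you tell
it up front how to answer (device names below are just placeholders):

    e2fsck -p /dev/sda1    # "preen": only safe, routine fixes, no questions
    e2fsck -y /dev/sda1    # assume "yes" to every repair question
    e2fsck -n /dev/sda1    # read-only check, answer "no" to everything

For a damaged root filesystem the usual drill is to drop to single-user
mode (or boot a rescue disk) so the check can run while the filesystem
is unmounted or mounted read-only.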
/Any/ filesystem check which does repairs is a bit disconcerting. But
when talking about "manual" or advanced repairs, I've been thinking
about things like specifying a different superblock for ext2/3/4, or
rebuilding reiserfs trees.
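In the ext2/3/4 case, that sort of manual repair looks something like
this (disk name is only an example): mke2fs -n does a dry run that
prints where the backup superblocks live, and e2fsck -b points the
checker at one of them:

    mke2fs -n /dev/sdb1          # dry run: lists the backup superblock locations
    e2fsck -b 32768 /dev/sdb1    # repair using the backup superblock at block 32768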
This is the sort of thing that Windows' own chkdsk handles without too
many questions. That's not to say NTFS never has really bad filesystem
problems that require extra attention, but somehow chkdsk can ask one
or two questions at the start of the operation and go with it from that
point on. It may run for hours, but it does the repairs on its own
without any further input from you.
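For comparison, the whole Windows-side interaction is typically just:

    chkdsk C: /f    # fix filesystem errors on the volume
    chkdsk C: /r    # also scan for bad sectors and recover readable data

and from there it grinds away on its own (or schedules itself for the
next reboot if the volume is in use).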
I'm not sure if it's because NTFS's design allows the repair utility to
ask simpler questions than other filesystems do, or if it's because the
repair utility itself is just designed not to ask you too many
questions beyond an initial few.
I would suspect it's the latter, as Windows chkdsk only has two
filesystems to be geared for, NTFS and FAT, whereas Unix fsck has to be
made generic enough to handle several dozen filesystems, plus any that
get added at some future point without warning. That's why fsck is
usually implemented as a simple frontend that hands the real work off
to filesystem-specific utilities in the background.
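You can see that split on a typical Linux box: the fsck you run is just
a wrapper that works out the filesystem type and hands off to the
matching checker (device name is only an example):

    ls /sbin/fsck.*           # fsck.ext4, fsck.vfat, fsck.xfs, ...
    fsck -t ext4 /dev/sdb1    # effectively runs fsck.ext4 /dev/sdb1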
I haven't tried the Linux NTFS repair programs - I have only heard that
they have some limitations compared to the Windows one.
Well, I just haven't personally encountered any major issues with them
yet. I'll likely encounter something soon and totally change my mind
about it.
I don't quite agree here. There are different reasons for having
different RAID schemes, and there are advantages and disadvantages of
each. Certainly there are a few things that are useful with software
raid but not hardware raid, such as having a raid1 partition for
booting. But the ability to add or remove extra disks in a fast and
convenient way to improve the safety of a disk replacement is not an
unnecessary complication - it is a very useful feature that Linux
software raid can provide, and hardware raid systems cannot. And layered
raid setups are not over-complicated either - large scale systems
usually have layered raid, whether it be hardware, software, or a
combination.
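As a sketch of what layered raid looks like with Linux md (all device
names hypothetical), a raid1+0 is just mirrored pairs with a stripe
over them:

    mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
    mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sdc1 /dev/sdd1
    mdadm --create /dev/md10 --level=0 --raid-devices=2 /dev/md1 /dev/md2

The same idea scales up when the bottom layer is hardware raid LUNs and
the top layer is a software stripe or mirror across them.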
It's not necessary; all forms of RAID (except RAID 0 striping) are
redundant by definition. Any disk should be replaceable whether it's
hardware or software RAID, and nowadays most are hot-swappable. In
software RAID, you usually have to bring up the software RAID manager
app and go through various procedures to quiesce the failed drive
before you can remove it.
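With Linux md, for instance, those procedures boil down to a couple of
mdadm commands (array and partition names here are only examples):

    mdadm /dev/md0 --fail /dev/sdb1      # mark the drive failed, if md hasn't already
    mdadm /dev/md0 --remove /dev/sdb1    # pull it out of the array
    mdadm /dev/md0 --add /dev/sdc1       # add the replacement; the resync starts on its own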
Usually inside hardware RAID arrays, in the worst cases, you'd have to
bring up a hardware RAID manager app, and send a command to the disk
array to quiesce the failed drive, so it's not much different from
software RAID. But in the best cases, in hardware RAID, all you have to
do is go into a front control panel on the array itself to quiesce the
failed drive, or even better there might be a stop button right beside
the failed drive, next to a blinking light telling you which one has
failed.
When you replace the failed drive with a new drive, the same button
might be used to resync the new drive with the rest of its volume.
Other advanced features of software raid are there if people want them,
or not if they don't want them. If you want to temporarily add an extra
disk and change your raid5 into a raid6, software raid lets you do that
in a fast and efficient manner with an asymmetrical layout. Hardware
raid requires a full rebuild to the standard raid6 layout - and another
rebuild to go back to raid5.
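With mdadm, as I understand it, that kind of conversion is a single
grow command (all names are placeholders, and depending on the mdadm
version a backup file may be needed); asking it to preserve the layout
puts all of the new parity on the added disk instead of restriping,
which is the asymmetrical layout mentioned above:

    mdadm --grow /dev/md0 --level=6 --raid-devices=5 --layout=preserve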
I see absolutely /no/ reason to suppose that software raid should be
less reliable, or show more false failures than hardware raid.
Well, there are issues of I/O communication breakdowns, as well as
processors that are busy servicing other hardware interrupts or just
busy with general computing tasks. Something like that might be enough
for the software RAID to think a disk has gone offline and assume it's
bad. It's becoming less of a problem with multi-core processors, but
there are certain faults that can take down all of the cores at once,
such as a triple fault, and the computer just resets and restarts at
that point.
There was a time when hardware raid meant much faster systems than
software raid, especially with raid5 - but those days are long past as
cpu power has increased much faster than disk throughput (especially
since software raid makes good use of multiple processors).
Not really; hardware raid arrays are still several orders of magnitude
faster than anything you can do inside a server. If this wasn't the
case, then companies like EMC wouldn't have any business. The storage
arrays they sell can service several dozen servers simultaneously over a
SAN. Internally, they have communication channels (often optical) that
are faster than any PCI-express bus and fully redundant. The redundancy
is used to both increase performance by load-balancing data over
multiple channels, and as fail-over. A busy server missing some of the
array's i/o interrupts won't result in the volume being falsely marked
as bad.
The processors inside hardware raid units simplify the raid schemes
because it's easier to make accelerated hardware for simple schemes, and
because the processors are wimps compared to the server's main cpu. The
server's cpu will have perhaps 4 cores running at several GHz each - the
raid card will be running at a few hundred MHz but with dedicated
hardware for doing raid5 and raid6 calculations.
I'm not talking about a RAID card, I'm talking about real storage arrays.
Yousuf Khan