Newbie Question re hardware vs software RAID

  • Thread starter: Gilgamesh
You're still good for a laugh.

You know you're right.
All raid controllers and all drives are exactly identical and reliable.
All drivers and software are equally reliable so all you have to do is find it.
$400 buys you a high end raid card
All raid levels are equally well implemented in all controllers
It's OK when raid has problems 'cause there's always data recovery
a TB of data is just as easy to recover as a few gigs
It's always easy or not important to find small problems hiding in a TB of data
the only problems drives ever suffer is immediate and sudden failure
it's smart to "backup" on the same drive and machine as the data you use
speed and capacity are the only criteria everyone should use in choosing storage.

;)

You're really funny. At first I thought you were a troll but it's
clear you're just some kid shooting his mouth off 'cause he just got a
Promise card for his bday. If you put together "you bet your job"
solutions you wouldn't be able to even jokingly imply these things.
And yes "Home" use gets nearly as serious when you have that much data
(for a single user to create, manipulate, & manage anyway) that is so
important that you are nervous about the backup strategy.

For the OP I again say a good backup strategy is what you need and not
raid. If you feel you NEED to have raid it is generally uneconomical
to not do it "right." Since I'm not responsible for your data feel
free to try whatever pleases you. If your needs are more basic than I
have been assuming, nearly anything will do, just like for "kony."
 
You're really funny. At first I thought you were a troll but it's
clear you're just some kid shooting his mouth off 'cause he just got a
Promise card for his bday.

<yawn, more vague marketing drivel snipped>

Still no evidence?
Did you feel that your post was a substitute?

[HINT] It's not. [/HINT]

I realize that with no evidence you can't possibly stay
on-topic beyond providing marketing-style blurbs, but
someday if you have enough data you may realize that there
is no substitute for redundancy, and all the most expensive
SCSI (err, "Do it right" solution) does is limit the amount
of redundancy that the budget allows.

Suppose there is no budget, the sky is the limit... is SCSI
then the best choice? Possibly due to the higher number of
drives supported, none of the BS you claimed, but even more
important would be the ability to pay salary for an
administrator that knows a few FACTS instead of just
nonsense marketing, so they can effectively maintain it
without needing to rely on the "support" you claimed was a
benefit of your proposed solution.

Well I'm done with this thread, feel free to throw a few
more off-topic insults at me instead of providing any real
evidence.
 
Short list of reliability/usability concerns I have with ATA Raid:
- multiple issues surrounding disk write back cache
- quality control /more lax component rejection levels
- cable issues with PATA
- lacking advanced features with most PATA cards
- SAF-TE compliance
- NVRAM use for transactions
- on board thorough diagnostics & NVRAM logging
- Continuous background defect scanning by drives & controller
- lack of write verification or parallel transactions in RAID 5
- lack of RAID 3 or 4 as more secure alternative to RAID 5
- I question full verification in some RAID 1, 10
- multiple drive firmware issues
- really good Linux & Novell support
- ability to flash upgrade without reboot
- Online array conversion/expansion
- ECC ram, battery backup, >64/33 pci
- support in the form of highly stable & tested software and
replacement with new rather than used parts with prompt turnover &
reps who would be actually able to tell you basic low level details
about how _their_ raid levels are implemented
- SNMP traps, hot swap, hot spare, etc.
- redundant dedicated fail-over or cache-mirroring paths

A few of the newest and best SATA cards deal with a lot/most of these
issues. A company like LSI is porting decent scsi raid technology to
ATA inherited from Mylex, IBM, & Buslogic. The main issue is that it is
new, and it is unproven whether unforeseen kinks are tripping up aspects
of the port. For example the MegaRAID SATA 300-8X looks pretty
serious but it is a "first to market" SATA 2 implementation. :(
Other Hybridized ATA-only raid 'levels' from other companies are too
new to trust. It is all just too new to be seen as "tried & true"
for someone totally nervous about their data.


<yawn, more vague marketing drivel snipped>

What you snipped was the summary of your core assertions & inferences-
all of which are flat wrong & you clearly still endorse. Identifying
your lack of knowledge is not "vague marketing drivel." What IS
"marketing drivel" are notions like "raid 5 is raid 5" or "a drive is
a drive" or "having redundant data always yields complete protection."

If drive reliability was the only factor, and all drives were equal,
then all scsi or fibre arrays would be significantly less reliable per
MB than all ATA (according to array MTBF calculation). The increase
in I/Os would be severely offset by likelihood of multiple failures.
It's simply not the case that say a Promise external box is greatly
more reliable than a scsi or fcal array in say a netapp or clariion.
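
For reference, the "array MTBF calculation" mentioned above is usually the simple series model: with N drives that fail independently at the same constant rate, the mean time to the first drive failure is the single-drive MTBF divided by N. A minimal Python sketch, assuming that model; the drive counts and the 1,000,000-hour MTBF are made-up illustrative numbers, not vendor data:

```python
def mtbf_first_failure(drive_mtbf_hours: float, n_drives: int) -> float:
    """Mean time until ANY one of n identical, independent drives fails.

    Assumes a constant failure rate per drive (exponential model), so the
    failure rates simply add up: rate_array = n * rate_drive.
    """
    if n_drives < 1:
        raise ValueError("need at least one drive")
    return drive_mtbf_hours / n_drives


# Hypothetical figures, for illustration only: the same capacity built
# from a few large drives versus many small ones.
DRIVE_MTBF = 1_000_000.0  # hours, assumed equal for both drive types

few_large = mtbf_first_failure(DRIVE_MTBF, 4)    # e.g. 4 large ATA drives
many_small = mtbf_first_failure(DRIVE_MTBF, 14)  # e.g. 14 smaller SCSI drives

print(f"4 drives:  {few_large:,.0f} h to first failure")
print(f"14 drives: {many_small:,.0f} h to first failure")
```

By this model alone, the array built from more, smaller drives looks worse per MB, which is why drive-count arithmetic cannot be the whole story.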

"Reliability" is a non-specific and relative term. The prevailing
MTBF characterization of "array reliability" is both problematic &
inadequate. (You seem to be inferring MTBF is equal without providing
any specific numbers or even really any specific characterization of
your reliability criteria or measurement.)

Reliability as it applies to user expectations about a running system
includes not only an MTBF-type characterization but also the extent to
which data integrity is preserved (even with odd kinds of failure &
rare transient error) and also to some extent it overlaps with overall
data availability (including maintenance and build convenience
features/aspects). Time down or with restricted use during
maintenance means the raid has failed to fulfill its primary purpose
(availability). Just because you can fix a problem doesn't mean you
may still see the thing as 'reliable' (because while you fixed it it
was not). Time is valuable, even "free time."

Doing it "right" means using mature technology which is sophisticated
enough to adequately deal with all "reliability" needs over and above
basic fault-tolerance for certain limited events. Yes 'Historically'
there were no viable ata alternatives to scsi/fibre. Very recently
the best of current sata products seem to be closing the gap with
better scsi products, but it is not clear that the gap is fully closed
or if there is enough of a track record to truly say they are equal.

If that is what you want me to "prove" with "specifics" it belongs in
a new thread. Throwing light on your incorrect assumptions, rudeness,
and lack of knowledge has already taken us too far away from 'which
sata raid 5 cards are full firmware/ghost compatible.'

I'll say again it is always a _safe_ recommendation to say that if you
are VERY nervous and unforgiving when it comes to data you should
steer clear of "entry level" or "first-to-market" products.
Furthermore I am also suggesting avoidance of raid 5 except in the
best implementations and even then with caution. A suggestion like
scsi raid 1+0 causes someone to rethink their operating and backup
strategy (without raid) or consider how much data they actually
NEED to be so nervous about as to _require_ RAID.


If you really have evidence that all raid implementations are equal &
all drive failure & error rates and tolerance of failing media,
transient error, power failure, etc. are the same I'd still like to
see some real comparative evidence (in a new thread). The group would
benefit from a substantive comparison that would explain your belief
and demonstrate you are familiar with _both_ product types. "You're
wrong" & "I don't like your details or explanation" is hardly an
argument or a position to fight for. That's what makes you sound like
a child. If I'm so stupid, then educate me. Ranting just scares ppl
away from the thread. Pretending I am unclear or vague convinces no
one that you have a real, valid position or even that I am wrong.
 


Well I still don't agree but feel this additional post you
made was much more useful than those prior. However a lot
of the things you're promoting are not typically needed for
the environments you're suggesting as good candidates for
the "do it right... SCSI RAID solution". No immature
technology is a good choice and yet there is no assurance
that any particular, specific SCSI controller tech is more
mature than an ATA.. but it was never meant to be a direct
comparison between the two, rather that "reliability" is NOT
the same thing as features.

I've already posted that I was done with this thread but
posted again only to compliment you on taking the time to
more clearly express your concerns with the differences...
regardless of whether I happen to agree with them in the
context used.
 
I'm glad we're both calmer now and can come together for real
discussion. I don't expect everyone to always agree with me and
that's fine. This is just one last time to make sure I'm fully
understood. I sometimes wonder if my longer posts are
counter-productive since they're not really geared to the
reading/writing style of Usenet. I also think we got a little more
into word parsing than ideas and drives rather than raid. I'll try
not to be too long or repetitive. (without promises)


Well I still don't agree but feel this additional post you
made was much more useful than those prior. However a lot
of the things you're promoting are not typically needed for
the environments you're suggesting as good candidates for
the "do it right... SCSI RAID solution".

My _opinion_ is that if you don't NEED _extremely_ high availability
or _absolute_best_ data integrity protection you probably don't need
RAID. If you do NEED RAID then it should be a "top-notch"
implementation, or don't bother. I say this primarily because a card
or software that generates redundant data isn't necessarily a
foolproof fail-safe that will end up saving you time &
effort down the line or increase performance. Granted I could
probably take "ability to flash upgrade without reboot" & "redundant
dedicated fail-over or cache-mirroring paths" off the list - but I'm
not sure I would take much more off even though it's for home or SOHO
use. I guess it's fair to expect some disagreement on this. I will
characterize Linux, Novell, and SNMP support as 'optional' - depending
on the home network.

Partial explanation/support:
RAID requires a decent chunk of time doing research about it &
products, experimenting with stripe size, levels and other
configuration options, observing behavior/speed, testing recovery
scenarios/strategy (& that's in the best case scenario without
compatibility problems or discovering bugs). Overall productivity &
free time goes down significantly if you spend a lot of time setting
up and administering something you totally don't need or as insurance
for an event that may never happen during a short service life, or
which may be ill-prepared for certain failures you are unlucky enough
to have - all while reducing overall storage MTBF characterization.

The RAID must therefore deliver a very significant timesavings & data
protection benefit to offset this initial time, $$ investment (incl
parts, additional electricity, building in a better box, etc.) and
risk from increased complexity. No product can be seen as "reliable"
if it has difficulty or potential difficulty meeting its core
purposes/promises (in this case it is a data
security/reliability/availability/performance 'boosting' product).
Because the standard of function that must be met in order for it to
"keep all its promises" is so high there is much less of a problem of
something being overkill than insufficient. Being able to get a bunch
of drives to work together and generate ECC data is not really the
whole poop.
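
To illustrate why generating redundant data is not the whole story, here is a minimal Python sketch of RAID 5 style XOR parity on one stripe. It is a toy model, not any controller's actual implementation; the block contents and helper names are invented for illustration. It shows that a rebuild works when parity is consistent, and that a parity block left stale by an interrupted write (the case NVRAM transaction logging is meant to guard against) rebuilds wrong data without any error being raised.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks together byte by byte."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("blocks must have equal length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe across three data drives plus one parity block (toy data).
d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
parity = xor_blocks(d0, d1, d2)

# Drive 1 dies: its block is rebuilt from the survivors plus parity.
rebuilt = xor_blocks(d0, d2, parity)
assert rebuilt == d1

# Interrupted write: d0 is updated on disk, but power is lost before the
# matching parity update lands, so parity still describes the old d0.
d0_new = b"ZZZZ"
stale_parity = parity

# If drive 1 fails now, the rebuild silently returns garbage.
bad_rebuild = xor_blocks(d0_new, d2, stale_parity)
print(bad_rebuild == d1)  # False: wrong data, and nothing flagged it
```

The redundancy arithmetic is identical in both cases; what differs is whether the implementation guarantees that data and parity are updated together.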

Don't get me wrong RAID can be fine to use anywhere - even in home -
if you are comfortable with it and have very valuable/limited time
relative to data quantity so it has enough potential to be of real
help. But I just can't make that same big distinction between few
users/home & many users/business for general "reliability". For the
most part availability is availability and for the whole part data
integrity is data integrity regardless. You may have fewer users
relying on the data but you also have fewer human resources to manage
and secure it. Time is just as valuable when you're home. It's
probably more valuable because there is so little of it. (and yet I
keep typing...)

Call me paranoid but I'm also always suspicious of products with big
promises that try to lull me into a sense of security esp when there
are huge price discrepancies (& I mean suspicion across the whole
price range).

No immature technology is a good choice
True.

and yet there is no assurance
that any particular, specific SCSI controller tech is more
mature than an ATA..

No assurance- well OK. That's why I've been qualifying "better scsi"
or the "best scsi products" instead of claiming "all".

"Mature" is a tricky word 'cause it implies two things.
1. Track record: With many scsi product lines it's hard to argue
"track record" because the companies got bought out so often so the
product lines are interrupted. With others it is easy and they win
hands down.
2. In terms of "robustness" most scsi beat all ata hands down until
rather recently. Now it's more case by case- except at the top tier
(like some reputable san stuff, etc) which blows away the best SATA
hands down.

Now there is currently a place for SATA RAID in the enterprise, but
it's mainly near-line storage, caching for tape libraries, etc. It's
a hard sell for more important roles partly for performance and partly
for not yet being "tried & true." Robustness is generally hard to
convey and compare for a client who isn't already confident in a
"track record".

If a company (like LSI for example) is simply migrating the same
technology from SCSI to ATA it is the same & just as "mature" (as far
as "robustness" but not "track record")- provided, of course that the
entire feature set is ported and there are no kinks in the process
that haven't yet been worked out or cause them to make very large
revisions/redesigns.

Implementation of RAID levels is for the most part proprietary; it
isn't entirely standardized. So I still think different offerings
merit close scrutiny esp. from companies who haven't done this type of
thing before for 'enterprise' use- so their ATA raid design goal from
day one could very well be to sell cheap, sexy storage to
'enthusiasts' and they therefore feel different customer obligations
and pressures. (I can hear the flame being typed now) There are also
one or two mutant raid levels only available on ata, which makes me
wary of their claims of "robustness" as there is basically no
track record and it's a hard comparison. If they are indeed bad, "disk
quality" will have nothing to do with array "reliability" in those
cases.

You may not agree but for personal storage, apart from certain
performance and design differences, quality control is relaxed because
of the relative tradeoff in profitability / defect rates. For ATA
drive manufacturing the percentile component rejection rate is
generally around 5x less rigorous than scsi drives. Since ATA drives
ship at a rate of around 6 to 1 over scsi, that amounts to a huge
difference in total questionable units. I don't know which components
tend to have higher failure rates and which supply sources are less
reputable and how that fits to individual lines. Trying to pin this
down more specifically without discussing 'inside information' is not
really possible. I can only legitimately talk about anecdotal
experience about operational success- which is what I meant early in
the thread when I talked about experiential vs engineering info in
relation to your contention about drive quality & reliability.

Of course independent reliability surveys put both kinds of devices
all over the map. That's why I didn't say all scsi drives are more
reliable than all ata ones and tried to focus on a line I've had good
experience with which is corroborated by "independent" parties (to try
to synthesize a somewhat representative sample).

This 'quality' difference makes sense for many pro vs consumer
products because with larger profit margins you can afford to tighten
quality control as well as employ better programmers and engineers and
devote more resources to testing & development, etc. In addition you
have more savvy customers who demand more from the products who you
have to satisfy with more conservative products from a reliability
standpoint. You're right, though, that there is no _assurance_ that
companies will always "do the right thing" for their consumers and the
"top of the line" is often exploitative of consumers with deep
pockets. Furthermore fewer units are produced so the actual difference
in profit margin isn't exactly what it outwardly appears.

Believe me though you don't want a bad _batch_ of drives in an array.
Having drives that are less susceptible to dings and vibrations and
quieter (generally FDB over rotary voice coil) more geared to heavy
constant use, and with advanced features to preserve data integrity
(like background defect scanning), and with flexibility to make the
raid work better/more compatible like upgradeable firmware and mode
page settings all come together to make a more robust "do it right"
type solution with scsi drives. (of course you first have to agree
with my cost/benefit overview to see the necessity for this & specific
recommendations should indeed change as available product attributes
change)

but it was never meant to be a direct comparison between the two

But we have to compare the two in order to settle your objection to my
claim that a good scsi RAID 1 (or variants) using a well regarded scsi
drive family is a safer bet towards the reliability end than ATA
offerings, as well as whether ATA and SCSI drive quality & reliability
are equal:

"there isn't some unique failure point or 'lower quality' part on
ATA drives that makes them more susceptible to failure."


As far as the initial recommendation - I could have also included RAID
3 & 4 but they require better HW. I didn't mention exact cards because
with the price it kinda depends on whether what falls in his lap is
acceptable. I also wanted to slow him down because a "newbie" looking
for cheap raid 5 is in for surprises as he learns more about raid.

"reliability" is NOT the same thing as features.

Yes in terms of semantics. No in terms of the idea I'm trying to
convey.

We're going to have to agree to disagree on this point. If you have
to fiddle with it or take it down once in a while it is not "reliable"
because it is not meeting its main purpose of very high availability.
Likewise without a full feature set that ensures data integrity it
would not be "reliable" when or if corruption is generated during disk
failure, power failure, flaky devices, noise, or whatever the specific
vulnerabilities. It also isn't "reliable" if you are operating with
a controller problem & there are no diagnostics to pick it up (I have
personal experience with diagnostics mitigating loss) or with bad
configuration backup/restore features that interfere with getting an
array back on-line - (availability expectations/time expense). You
get my point - I won't go through every line item.

The main problem is that array "reliability" is inversely related to the
number of spindles and, by the same calculation, lower than that of a single disk.
These "features" that combat _all_ the reliability concerns are
vitally important for raid to bring real benefit over a single disk
and enough benefit to justify _all_ inherent costs.
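
A rough way to see why those features matter is the textbook mean-time-to-data-loss approximation for single-redundancy arrays, MTTDL ≈ MTBF² / (N × (N − 1) × MTTR), which assumes independent drive failures and ignores controller faults, bad batches, and unreadable sectors during rebuild. The Python sketch below uses invented figures (500,000-hour drives, 8 spindles) purely for illustration; the only thing varied is the repair window, e.g. a hot spare versus waiting days for a replacement.

```python
def mttdl_single_redundancy(drive_mtbf_h: float, n_drives: int,
                            mttr_h: float) -> float:
    """Approximate mean time to data loss for an array that survives one
    drive failure (e.g. RAID 5): data is lost if a second drive fails
    while the first is still being replaced and rebuilt.

    Textbook approximation; assumes independent, constant failure rates.
    """
    if n_drives < 2:
        raise ValueError("redundancy needs at least two drives")
    return drive_mtbf_h ** 2 / (n_drives * (n_drives - 1) * mttr_h)


DRIVE_MTBF_H = 500_000.0  # hypothetical drive MTBF
N = 8                     # hypothetical spindle count

first_failure = DRIVE_MTBF_H / N                           # no redundancy
hot_spare = mttdl_single_redundancy(DRIVE_MTBF_H, N, 6.0)  # rebuild starts at once
manual = mttdl_single_redundancy(DRIVE_MTBF_H, N, 72.0)    # wait days for a part

print(f"first drive failure:  {first_failure:,.0f} h")
print(f"MTTDL, 6 h repair:    {hot_spare:,.0f} h")
print(f"MTTDL, 72 h repair:   {manual:,.0f} h")
```

In this model the benefit over a bare stripe comes entirely from the redundancy actually working and the repair window staying short; anything that breaks the independence assumption, such as a bad batch of drives, makes the real figure worse.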

I've already posted that I was done with this thread but
posted again only to compliment you on taking the time to
more clearly express your concerns with the differences...
regardless of whether I happen to agree with them in the
context used.

Thank you for taking the time to read my post and comment on it
putting aside our harsh disagreement. I hope we can continue this
tone in future threads. This sounds bizarre but I'm glad we still
don't agree. We all learn in these forums by presenting & hearing
different views - so long as they are forthright and explained. (yeah
I know that sounds like insincere cheesy BS but it's not really)


Part of why I kept responding was that confronting misconceptions
about raid as well as exploring what "reliability" means is of benefit
to the group. Another was purely selfish, as I was really hoping to
force out compelling evidence that large cheap SATA raid is "proven"
and ready to replace some other more expensive installations. I'm not
saying this to keep contention alive. In fact you made me re-look at
and reconsider product lines and a few newer products seem much more
compelling than ones I saw available even a few months ago. I still
think though I'm going to wait at least another product cycle or two
before putting a new sata in the test lab again in plans for use and
recommendation. But the day is drawing closer...

I hope we haven't scared the group off continuing to discuss more
detail of raid in future threads. I may have overdone it here. I bet
everyone is sorry you pressed me so hard for "details" (if there is
anyone still reading).
 