Poor raid 1 performance?

  • Thread starter: Mark
If you compare to a non-RAID setup, RAID-1 might result in a better
throughput, for example if the connection link between the controller and
the drive is a limiting factor (ATA33 IIRC).

That is very rare and usually erroneous. Nobody wants to use disk
attachment with interface throughput lower than disk media throughput.
However, in the common case where the limiting factor is media access,
RAID-1 will result in about the same figures as non-RAID when seen from
outside (as you explained long and wide, thanks.)

That is wrong. The difference is mostly seen in random read performance, but
also in sequential read performance under some more sophisticated RAID1
controllers. Unless you meant something else by using the word "figures".
OTOH, RAID-0 may deliver a higher throughput, because it can use more of
the available bandwidth on the connection link.

No, it is mostly because controller combines media throughput of both drives
(easy to implement in RAID0 - stripes).
 
Antoine Leca said:
To make a long story short:

You wish.
In news:[email protected], Folkert Rienstra wrote:
No, you said "non-Raid".

Okay, I found the origin of the misunderstanding.

I was specifically referring to RAID variations:
] The difference between RAID 0 and RAID 1 is the distance between one
] stripe and the next on disk.

There are no stripes on RAID1. You are obsessed with stripes. Get rid of it.
I made a little mistake while replying: I wrote
] [...] which should give a small penalty with respect to a non-RAID
] configuration (assuming media access is the bottleneck, that is).
I meant "non-RAID1" but did not make it clear.

Non-RAID1 is not any clearer either.
It clearly should have been ``contiguous reading'' (or plain ``RAID-0''
if you prefer); that would have been much clearer.

Still makes no sense to me.
There is a huge difference between contiguous reading from a single
physical (ie non-RAIDed) drive and a RAIDed (logical) drive, whether
that's RAID0 or RAID1 with split reads.
I am sorry for the mistake.

OTOH, I basically assumed a 1-sector stripe (which is not real.)
Right.

Or that the stripe size is lower than available free cache on the RAID
controller (and we hope so.)

Cache has nothing to do with it either, except when speaking of
RAID on the same channel and long sequential reads. Cache-ahead in that
case allows reading data without the read command being issued yet.
When the delayed read command for the second drive finally arrives, the
data comes from cache and no extra access time is incurred because of that
necessarily delayed command.
My point, exactly.

I doubt it. The comment was about RAID1 read in stripe fashion ...
Since there are two drives, we end up at twice this throughput when data are
delivered to the main system by the RAID controller, and cannot do better
while reading sequentially.

... where this comment is about RAID0. And the RAID controller doesn't do
anything magical, it is a standard controller augmented with special firmware
and drivers. Except for hardware generated checksums everything a RAID
controller can do can be done in software also.
If you compare to a non-RAID setup, RAID-1 might result in a better
throughput, for example if the connection link between the controller
and the drive is a limiting factor (ATA33 IIRC).

Nope. That has nothing to do with it.
However, in the common case where the limiting factor is media access,
RAID-1 will result in about the same figures as non-RAID when seen from
outside

It has nothing to do with that.
(as you explained long and wide, thanks.)

Only for the continuous benchmark example. Not for the occasional single file read.
OTOH, RAID-0 may deliver a higher throughput, because it can use more of
the available bandwidth on the connection link.

Nonsense. So can RAID1.
The interface is never a bottleneck, bandwidth wise, unless there is a deliberate
interface mismatch between drives and host interface.
 
In news:p[email protected], Peter wrote:
That is wrong. Mostly seen in random read performance,

We were not speaking about random access, but about sequential.
but also in sequential read performance under some more
sophisticated RAID1 controllers.

Sorry, I cannot make sense of your point. So I shall ask plain and clear:
what is relative performance of RAID-1 versus non-RAID for (long) sequential
reads?

If I understand Folkert correctly (and he will correct me), the performances
should be about the same.

No, it is mostly because controller combines media throughput of both
drives (easy to implement in RAID0 - stripes).

No what? "combined" means "more", doesn't it?


Antoine
 
However, in the common case where the limiting factor is media
We were not speaking about random access, but about sequential.

You were not specific enough.
Sorry, I cannot make sense of your point. So I shall ask plain and clear:
what is relative performance of RAID-1 versus non-RAID for (long) sequential
reads?

Can you define "(long) sequential reads"?

There are controllers that deliver sequential read performance from a RAID1
subsystem higher than the sequential read performance of a single drive.
If I understand Folkert correctly (and he will correct me), the performances
should be about the same.



No what? "combined" means "more", doesn't it?

"No" to "because it can use more of the available bandwidth on the
connection link".
"Connection link" is per your "connection link between the controller and
the drive ... (ATA33)".

The maximum data transfer rate of a single hard disk, times the number of
disks, is the limiting factor for the maximum throughput of RAID0.
 
Folkert Rienstra wrote:
Cache has nothing to do with it either, except when speaking of
RAID on the same channel and long sequential reads.

Consider a single I/O command to read a number of sectors, say 2 stripes.
The RAID0 controller will issue two commands to each drive, each for a
single stripe. The drives will return the sectors, in order, _at the same
time_. The controller can pass down the sectors from the drive 0 directly to
the original poster, but it should buffer the sectors received from drive 1,
to pass them down _after_ all the sectors from drive 0 have been
transmitted.
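
The command splitting described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `raid0_split`, the stripe size of 128 sectors and the layout (logical stripe k stored on drive k mod N) are hypothetical choices, not the behaviour of any particular controller.

```python
def raid0_split(start, count, stripe_sectors=128, drives=2):
    """Split one logical read into per-drive (drive, drive_lba, length) commands.

    Assumed layout (hypothetical): logical stripe k lives on drive
    k % drives, in stripe row k // drives of that drive.
    """
    commands = []
    lba = start
    end = start + count
    while lba < end:
        stripe = lba // stripe_sectors          # which logical stripe we are in
        offset = lba % stripe_sectors           # position inside that stripe
        length = min(stripe_sectors - offset, end - lba)
        drive = stripe % drives
        drive_lba = (stripe // drives) * stripe_sectors + offset
        commands.append((drive, drive_lba, length))
        lba += length
    return commands


# A read of two full stripes becomes one command per drive.
print(raid0_split(0, 256))
```

With these assumed numbers, a two-stripe read turns into one 128-sector command for each drive, which the drives can then service concurrently.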

And I only pointed out that if the memory on the controller is less than a
stripe, the situation is less clear.

I doubt it. The comment was about RAID1 read in stripe fashion ...


... where this comment is about RAID0.

No, it's about RAID1 (RAID0 can perform better than twice the
half-throughput of the drive, can't it?)

Nonsense. So can RAID1.

But you pointed out earlier that RAID1 was only reading half the sectors, so
delivered half the throughput...

What I do not know is if actual drives (with their internal caches) can
deliver the same throughput when you ask them to read say 10,000 continuous
sectors, as when you ask them to read all even-numbered sectors from 100,000
to 119,998 (count is 10,000 too). Former is typical of RAID-0, latter of
RAID-1.

The interface is never a bottleneck, bandwidth wise, unless there is
a deliberate interface mismatch between drives and host interface.

Yeah, I get that; I even believe there is a design objective to keep the
interface a few years ahead of drive technology, so that devices
implementing the interface still behave "correctly" at the end of their
normal lifetime.

I seem to remember a time (around 1999) where the "usual" interface was at a
limit, and the newer drives then (ATA5-capable) were able to deliver more
bandwidth, so there was a waste; but I do not remember exactly where was the
bottleneck, ATA 40-wire or more plainly the PCI bus...


Antoine
 
Antoine said:
Peter wrote:



We were not speaking about random access, but about sequential.




Sorry, I cannot make sense of your point. So I shall ask plain and clear:
what is relative performance of RAID-1 versus non-RAID for (long) sequential
reads?
And the answer is -- it depends on the implementation, and on the workload.

Clearly, RAID1 has more hardware (seek mechanisms and on-disk R/W channels)
to deploy than a single HD, but there is no RAID1 standard which mandates
how that additional hardware will be used. A N-HD RAID1 implementation could
be up to ~N times as fast as a single HD in read performance, or it could be
slightly slower; it could be nearly as fast as a single HD in write performance,
or it could be as bad as ~N times slower.
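
To put rough numbers on that range, here is a back-of-the-envelope sketch. The function `raid1_bounds` and the 60 MB/s single-drive figure are made up for illustration, and the lower read bound is simplified to "same as one drive" rather than "slightly slower".

```python
def raid1_bounds(single_mb_s, n_drives):
    """Rough (worst, best) throughput range for an N-drive RAID1.

    Simplifying assumptions (not measurements):
      - reads: from 1x (only one mirror used) up to ~Nx (all mirrors used)
      - writes: from ~1/N (mirrors written one after another)
        up to ~1x (mirrors written in parallel)
    """
    reads = (single_mb_s, single_mb_s * n_drives)
    writes = (single_mb_s / n_drives, single_mb_s)
    return {"read": reads, "write": writes}


# Hypothetical 60 MB/s drives in a two-drive mirror.
print(raid1_bounds(60, 2))
```

Where a real array lands inside those ranges is exactly the implementation and workload question discussed here.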

One workload dependency is related to R-W ratio, since reads might be much
faster with RAID1 and since writes might be much slower with RAID1. Another
workload dependency is sequentiality, since seeks hurt RAID and non-RAID.
Another workload dependency is locality of reference, since caching matters.

Still another workload dependency for long sequential reads is how reads which
are long as seen by the app are seen by the RAID controller and by the HDs:
long reads could be either passed through intact by the OS, or they could be
split and handed off to the RAID controller approximately in parallel, or they
could be split and handed off to the RAID controller (completely or partially)
serially; the RAID controller, in turn, has the same choices of whether and how
to split its commands for handing off to the HDs; finally, for HDs with NCQ/TCQ,
the HD has some choices for how to deal with multiple outstanding commands.

All of the entities (OS/drivers, RAID controllers, HD controllers) between the
app and the magnetic stuff have finite resources; and all were designed by folks
with limited time, imagination, and budget. Implementation does matter, and
it does vary.

Sorry, but life is not simple.
 
Antoine said:
Sorry, I cannot make sense of your point. So I shall ask plain and clear:
what is relative performance of RAID-1 versus non-RAID for (long)
sequential reads?

There is no answer, because it depends a lot on the specific implementation
of RAID1. If the RAID1 controller only reads from one disk, there is no
difference. If the RAID1 controller reads from both disks, it can be up to
double the speed (because it can read different data from both disks
concurrently, adding their throughput), depending on how the read commands
come from the system. In any case, the RAID1 performance can be equal to
the RAID0 performance, because the RAID1 controller can read the data
almost exactly as a RAID0 controller would read it.
No what? "combined" means "more", doesn't it?

Exactly. In this respect, RAID1 and RAID0 are the same. The main difference
is that RAID0 is more storage-efficient whereas RAID1 is more
error-tolerant.

Gerhard
 
Antoine Leca said:
Folkert Rienstra wrote:

Consider a single I/O command to read a number of sectors, say 2 stripes.
The RAID0 controller will issue two commands to each drive, each for a
single stripe. The drives will return the sectors, in order,
_at the same time_.

No. Virtually at the same time.
The controller can pass down the sectors from the drive 0 directly to
the original poster,

But not at interface speed. It transfers them in ~50:50 bursts.
but it should buffer the sectors received from drive 1,

Nonsense. The second drive also transfers in bursts.
The second drive bursts in between the first drive's bursts.
It uses the bandwidth that isn't used by a single drive.
That's on a single channel and isn't even necessary on separate
channels. It's still true however on a PCI burst perspective.
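
The burst argument can be put into numbers with a small sketch. The helper names and the 33/66 MB/s figures below are illustrative assumptions, and protocol overhead is ignored entirely.

```python
def bus_duty_cycle(media_mb_s, interface_mb_s):
    """Fraction of time one drive keeps the shared bus busy.

    The drive produces data at its media rate but ships it in bursts
    at interface rate, so the bus is busy media/interface of the time.
    """
    return media_mb_s / interface_mb_s


def channel_can_carry(media_rates, interface_mb_s):
    """True if the bursts of all drives fit on one channel (overhead ignored)."""
    return sum(bus_duty_cycle(r, interface_mb_s) for r in media_rates) <= 1.0


# Hypothetical 33 MB/s drive on a 66 MB/s channel: the bus is busy half the time,
# leaving the other half for a second drive's bursts.
print(bus_duty_cycle(33, 66))
print(channel_can_carry([33, 33], 66))
print(channel_can_carry([50, 50], 66))
```

A drive whose media rate is half the interface rate gives the ~50:50 pattern; two drives that each exceed half the channel bandwidth no longer fit.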
to pass them down _after_ all the sectors from drive 0 have been
transmitted.

Nope. Ever heard of time sharing, multitasking?
And I only pointed out that if the memory on the controller is less than a
stripe, the situation is less clear.

You are clueless. Drives have enough buffer to buffer a single command.
Controllers have enough buffer and PCI bus bandwidth to keep up
with a drive buffer full of data at bus speed at both channels.
No, it's about RAID1 (RAID0 can perform better than twice the
half-throughput of the drive, can't it?)

Then your comments make absolutely no sense.
Then you constantly contradict yourself.
But you pointed out earlier that

.... IF ...
RAID1 was only reading half the sectors, so

.... it would have ...
delivered half the throughput...

Nope, didn't say that. I said more than that, but since you obviously
didn't understand what else I said you banned that out of your mind.
What I do not know

and lots more
is if actual drives (with their internal caches) can deliver the same
throughput when you ask them to read say 10,000 continuous sectors,

It can only read 256 sectors in one go.
as when you ask them to read all even-numbered sectors from 100,000
to 119,998 (count is 10,000 too).
Former is typical of RAID-0, latter of RAID-1.

Utter nonsense. Both read 256 (max.) contiguous sectors per go.
Yeah, I get that; I even believe there is a design objective to keep the
interface a few years ahead of drive technology, so that devices
implementing the interface still behave "correctly" at the end of their
normal lifetime.

I seem to remember a time (around 1999) where the "usual" interface was
at a limit,

There never is for a single drive.
Channels always have enough bandwidth to support both drives on the channel.
and the newer drives then (ATA5-capable) were able to deliver more
bandwidth, so there was a waste;

There is always a waste with a single drive (or a single drive's use) on a
channel because a channel must be able to support 2 drives simultaneously.
A UDMA66 bus can still fully support a single Ultra133 drive.
but I do not remember exactly where was the bottleneck,
ATA 40-wire or more plainly the PCI bus...

Neither. Just that new drives became faster than half the bandwidth available
on the then current IDE interface.

The same goes for SCSI except that they calculate with 4 drives per channel.
 
Gerhard Fiedler said:
There is no answer, because it depends a lot on the specific implementation
of RAID1. If the RAID1 controller only reads from one disk, there is no
difference. If the RAID1 controller reads from both disks, it can be up to
double the speed (because it can read different data from both disks
concurrently, adding their throughput), depending on how the read commands
come from the system. In any case, the RAID1 performance can be equal to
the RAID0 performance,
because the RAID1 controller can read the data
almost exactly as a RAID0 controller would read it.

Or even better. The RAID1 controller isn't limited by (fixed) stripes. A file
that is split 50:50 over 2 drives is read faster than a file that is unequally
split over (an uneven number of) stripes because of a specific stripe size.
 
Antoine Leca said:
In news:p[email protected], Peter wrote:

We were not speaking about random access, but about sequential.


Sorry, I cannot make sense of your point.

Yes, you appear to be particularly thick. I have said the same thing
over and over but you appear to be unable to put that into your skull.
So I shall ask plain and clear:
what is relative performance of RAID-1 versus non-RAID for (long) sequential
reads?

RAID-0 like if it can split those reads between the members of the RAID-1.
If I understand Folkert correctly

Apparently never.
(and he will correct me),

Yes, that appears to be insurmountable.
the performances should be about the same.

Nope, never said that.
That only applies to full drive benchmarks or full
drive copying, not individual/incidental file copying.

No on the connection link.
"combined" means "more", doesn't it?

And double is more so therefore more is double?
You aren't ticking right.
 
Peter wrote:
"No" to "because it can use more of the available bandwidth on the
connection link".

Since it surely "can use more", you are arguing about the "because". So the
debate is about pristine cause and contributing cause, and I shall certainly
add nothing to such debate.

"Connection link" is per your "connection link between the controller
and the drive ... (ATA33)".

Yes, that was my idea.
The maximum data transfer rate of a single hard disk, times the number of
disks, is the limiting factor for the maximum throughput of RAID0.

Yes. To me, it does not seem incompatible with what I wrote (I did not write
"will deliver", for instance).


Antoine
 
Folkert said:
Or even better.

Of course... but I didn't want to complicate matters. Since my initial
question about this, I have learned quite a bit here, and it seems my guess
(at that time more intuition than knowledge) that this is possible even
though many don't seem to think so wasn't far off.

Anyway, as Bob said, it depends a lot on a large number of factors,
including the RAID controller, the application, the OS, ... in short
everything between the user wanting to see the data and the disk having the
data stored.

But my initial question has long been answered... it /is/ possible for
RAID1 to have equal (or, as you point out, superior) read performance
compared to RAID0.

Gerhard
 
Bob said:
And the answer is -- it depends on the implementation, and on the
workload.

Clearly, RAID1 has more hardware (seek mechanisms and on-disk R/W
channels) to deploy than a single HD, but there is no RAID1 standard
which mandates how that additional hardware will be used. A N-HD RAID1
implementation could be up to ~N times as fast as a single HD in read
performance, or it could be slightly slower; it could be nearly as fast
as a single HD in write performance, or it could be as bad as ~N times
slower. [...]

Thanks, that was a good summary of the whole issue. Shouldn't take a thread
with dozens or hundreds of messages to clear this up... :)

Gerhard
 
Antoine Leca said:
Peter wrote:

Since it surely "can use more", you are arguing about the "because". So the
debate is about pristine cause and contributing cause, and I shall certainly
add nothing to such debate.

It is not clear what you mean by "pristine cause" and what by "contributing
cause", therefore I agree with your conclusion to "add nothing to such
debate".
Yes, that was my idea.


Yes. To me, it does not seem incompatible with what I wrote (I did not write
"will deliver", for instance).

Perception is all relative.
May, could, most, some, should .....whatever.
 
Folkert Rienstra wrote:
No. Virtually at the same time.

If you want. I do not see how it affects, but I agree your position is more
correct.

But not at interface speed. It transfers them in ~50:50 bursts.

I am not sure I got which interface (up or down, or considering both) you
are talking about. I assume the bottleneck is the drive transmitting,
correct? So the drive sends bursts over the link to the RAID controller. Is
that what you meant by "not at interface speed"?

The second drive also transfers in bursts.
The second drive bursts in between the first drive's bursts.

I do not see why there would be a need for such temporal dichotomy, assuming
of course that both drives are not together on the same physical link.
For example, I imagine both are connected using SATA to different channels.

OTOH, if you are considering SCSI-linked drives, I could see your point.

Nope. Ever heard of time sharing, multitasking?

Hmm, really it is about overlapping DMA, but I got your point.
Yes, I stand corrected, there is no need for the RAID controller to deliver
the sectors in sequential order, so no need for intermediate cache (so a
restriction I was making was in fact unnecessary).

Then your comments make absolutely no sense.

Looks like so if I judge from your posts. BTW, I appreciate the effort you
are making to try to give them sense.

Then you constantly contradict yourself.

However, this is not correct. I certainly know what I intended to write
(what you are understanding of it being different); and I consider I never
contradicted myself. I acknowledged the points I had wrong (there are some)
and as a consequence my perception evolved, but that is not contradiction,
it is learning.
I understand it could be quite difficult from an external point of view to skim
the thread and have a clear perception of my idea.

It can only read 256 sectors in one go.

http://www.maxtor.com/_files/maxtor/en_us/documentation/
white_papers/big_drives_white_papers.pdf

says (page 1, 4th paragraph) there is an extension to the ATA protocol which
allows very large number of sectors. I do not know how it stands in reality
(marketing arguments, actual use of it in OS drivers, etc.) though.

I have a very remote grasp of SCSI, but it seems it allows up to 65,535
blocks to be transferred in Read(10) or Read(16) (?)


But while considering only 256 sectors, you did not answer.
Will an actual drive deliver the same throughput when you ask it to read 256
continuous sectors, as when you ask it to read all even-numbered sectors
from 100,000 to 100,510 (count is 256 too)?



Antoine
 
Antoine said:
But while considering only 256 sectors, you did not answer.
Will an actual drive deliver the same throughput when you ask it to read 256
continuous sectors, as when you ask it to read all even-numbered sectors
from 100,000 to 100,510 (count is 256 too)?
I don't know of any HDs that support a command to read every other sector.
IIRC, the specs for SCSI and ATA (SATA and PATA) only allow specifying a
starting sector and a count of the sequential sectors to be read.

Even if a hypothetical HD could be told to read every other sector, it would
read from the platter at half-speed, since it takes just as much time to
skip a sector as to read it.

{Not true if the read is satisfied from the HD's cache, of course. In that
case, the answer would be "implementation-dependent".}
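
The half-speed argument works out as follows in a small sketch. The function name and the 60 MB/s media rate are hypothetical, and the drive's cache is deliberately ignored.

```python
def effective_rate(media_mb_s, wanted, passed_over):
    """Useful throughput when the head must pass over unwanted sectors.

    Passing over a sector takes as long as reading it, so only
    wanted / (wanted + passed_over) of the media rate is useful data.
    Cache effects are not modelled.
    """
    return media_mb_s * wanted / (wanted + passed_over)


# 256 contiguous sectors: nothing is skipped.
print(effective_rate(60, 256, 0))
# Even-numbered sectors 100,000 to 100,510: 256 wanted, 255 passed over.
print(effective_rate(60, 256, 255))
```

With the assumed 60 MB/s drive, the every-other-sector pattern delivers roughly half the contiguous rate, which is the point made above.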
 
Antoine Leca said:
Folkert Rienstra wrote:

If you want. I do not see how it affects, but I agree your position is more
correct.



I am not sure I got which interface (up or down, or considering both) you
are talking about.

The IDE interface.
I assume the bottleneck is the drive transmitting, correct?

Yes, the media rate.
so the drive sends bursts over the link to the RAID controller.

Yes, it sends data over the interface at media rate in bursts of IDE rate.
Is that what you meant by "not at interface speed"?

Right, only in bursts of interface speed.
I do not see why there would be a need for such temporal dichotomy, assuming
of course that both drives are not together on the same physical link.

IDE is a 2 device bus interface.
For example, I imagine both are connected using SATA to different channels.

Yes, I made a comment about that later, but as usual you snipped that again.
OTOH, if you are considering SCSI-linked drives, I could see your point.

No difference if you use dual channel SCSI.
Hmm, really it is about overlapping DMA, but I got your point.
Yes, I stand corrected, there is no need for the RAID controller to deliver
the sectors in sequential order, so no need for intermediate cache (so a
restriction I was making was in fact unnecessary).



Looks like so if I judge from your posts. BTW, I appreciate the effort you
are making to try to give them sense.



However, this is not correct. I certainly know what I intended to write
(what you are understanding of it being different); and I consider I never
contradicted myself. I acknowledged the points I had wrong (there are some)
and as a consequence my perception evolved, but that is not contradiction,
it is learning.
I understand it could be quite difficult from an external point of view to skim
the thread and have a clear perception of my idea.



http://www.maxtor.com/_files/maxtor/en_us/documentation/
white_papers/big_drives_white_papers.pdf

says (page 1, 4th paragraph) there is an extension to the ATA protocol which
allows very large number of sectors. I do not know how it stands in reality
(marketing arguments, actual use of it in OS drivers, etc.) though.

That's for consumer appliances.
I have a very remote grasp of SCSI, but it seems it allows up to 65,535
blocks to be transferred in Read(10) or Read(16) (?)


But while considering only 256 sectors, you did not answer.
Will an actual drive deliver the same throughput when you ask it to read 256
continuous sectors, as when you ask it to read all even-numbered sectors
from 100,000 to 100,510 (count is 256 too)?

That question is ridiculous.
 
Yes I did. You just didn't recognize what the answer was.

That has been answered before and beaten to death. No point in doing it again.
I don't know of any HDs that support a command to read every other sector.
IIRC, the specs for SCSI and ATA (SATA and PATA) only allow
specifying a starting sector and a count of the sequential sectors to be read.

Which is what I have said before too. He just won't listen.
Even if a hypothetical HD could be told to read every other sector, it would
read from the platter at half-speed, since it takes just as much time to skip
a sector as to read it.

Said that too, earlier.
{Not true if the read is satisfied from the HD's cache, of course.

On that hypothetical drive that can read every other sector in a single read.
On the real drive they come from as many reads as there are sectors to read.
 
In news:[email protected], Bob Willard wrote:
I don't know of any HDs that support a command to read every other
sector.

Neither do I. But I believe (could be wrong) there is the possibility to
send "some" commands to be "queued" (not sure that is the correct term),
that is, without waiting for the preceding one to be completed.
With that feature, one can send several otherwise independent commands in a
row, for example, 256 commands to read one sector each (the even-numbered
ones). As a result, you are asking the drive to read all even-numbered
sectors from 100,000 to 100,510.

Even if a hypothetical HD could be told to read every other sector,
it would read from the platter at half-speed, since it takes just as
much time to skip a sector as to read it.

That is what I thought too, and it seems we all agree at this point.


It seemed to me that a sequential read (issued by the OS as various I/O
commands) from a RAID-1 array will be seen by one drive inside the array
just this way, as a queue of successive commands to read the
even-numbered sectors.

(And please do not tell me it is silly to cut into such small units as
sectors, and that it would be much more efficient to use bigger "block": it
was my point initially behind the word "stripe", admittedly misused here;
but it appeared it was misunderstood so I needed to simplify the scheme.
Vocabulary is a big problem for beginners.)


Now, if the OS is not stupid/silly, it will group the read commands into a
_unique_ one, for 512 sectors here (and the RAID controller can do it also,
just in case); and then the controller will only issue one command to the
drive, for half the sectors (one drive will provide sectors 100,000 to
100,255; and the second 100,256 to 100,511).
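
That split can be sketched like this. The name `raid1_split_read` and the "one contiguous chunk per mirror" policy are assumptions for illustration; actual controllers may balance reads quite differently.

```python
def raid1_split_read(start, count, mirrors=2):
    """Divide one contiguous read into one contiguous chunk per mirror.

    Every mirror holds a full copy, so the LBA on the drive is the same
    as the logical LBA. Returns (drive, lba, length) tuples.
    """
    base, extra = divmod(count, mirrors)
    commands = []
    lba = start
    for drive in range(mirrors):
        # Spread any remainder over the first `extra` drives.
        length = base + (1 if drive < extra else 0)
        if length:
            commands.append((drive, lba, length))
        lba += length
    return commands


# The 512-sector example: drive 0 gets 100,000..100,255, drive 1 gets 100,256..100,511.
print(raid1_split_read(100000, 512))
```

Each drive then sees a single contiguous command for half the data, instead of a long queue of single-sector reads.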

*Maybe* this was the point that Folkert was trying to make "initially"
(when he said I was too much focused ("fascinated") on hardware and was
missing the overall picture, "IO-commands").

Could very well have been so, as seen when I re-read the thread (for
example, what Folkert said to Gerhard Fiedler in
I understand
it is almost exactly what I wrote just above.)

{Not true if the read is satisfied from the HD's cache, of course.
In that case, the answer would be "implementation-dependent".}

Of course. Which is why I tried to avoid this effect by referring to the
vagueness of "long" reads, thinking it will be clear.


Antoine
 
Folkert Rienstra wrote:
IDE is a 2 device bus interface.

For my enlightenment, how common are RAID controllers using (paired) IDE
devices?

I learned reading
http://www.storagereview.com/guide2000/ref/hdd/perf/raid/conf/ctrlMultiple.html,
but it seems a bit old (2000?), particularly when it comes to disk
interfaces.

Where can I find present-day information?

Yes, I made a comment about that later, but as usual you snipped that
again.

About "single channel" vs. "separate channel", right?
I am sorry, I did not (then) match the terminology, now I did better (read
the above article, for once.)

No difference if you use dual channel SCSI.

I was assuming this would allow two transfers to occur simultaneously,
wouldn't it?



Thanks for your explanations, and your patience. As you can read in the
other post, I perhaps finally understood your point.


Antoine
 