RAID 5


*eguy

Hi all,

RAID 5 needs at least 3 disks. If I have 3 disks configured as RAID 5 and one
disk fails, does the array still run as RAID 5 with only two disks left? If
not, do I have to have a hot spare that switches in automatically when one
disk fails?
Can you give me an example? Say I have data A: with three disks, the system
writes A1 to disk 1, A2 to disk 2, and Ap to disk 3. Now that we only have
two disks, how can A be written?

Thanks,

guy
 
*eguy said:
Hi all,

RAID 5 needs at least 3 disks. If I have 3 disks configured as RAID 5 and one
disk fails, does the array still run as RAID 5 with only two disks left? If
not, do I have to have a hot spare that switches in automatically when one
disk fails?
Can you give me an example? Say I have data A: with three disks, the system
writes A1 to disk 1, A2 to disk 2, and Ap to disk 3. Now that we only have
two disks, how can A be written?

RAID 5 works by taking a chunk of data, splitting it into 2 chunks,
calculating parity information (forming a 3rd chunk), and placing each chunk
on a separate drive.

In the event of a single drive failure, any 2 chunks can be used to
determine what was in the 3rd chunk. If chunk A and chunk B were data, then
chunk C would have been the parity information. If A and B survive, C might
be used just to check the integrity of A and B. If A and C survive, B can be
calculated (and the same is true if B and C survive).
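
To make the arithmetic concrete, here is a minimal Python sketch of that
scheme; the split_stripe and recover names are illustrative only, not from
any real RAID implementation:

    # Minimal sketch: 2 data chunks plus 1 XOR parity chunk per stripe.
    def xor_bytes(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))

    def split_stripe(data: bytes):
        """Split data into two halves and compute their XOR parity."""
        half = len(data) // 2
        a1, a2 = data[:half], data[half:]
        ap = xor_bytes(a1, a2)            # parity chunk
        return a1, a2, ap

    def recover(survivor1: bytes, survivor2: bytes) -> bytes:
        """Any missing chunk is the XOR of the two surviving chunks."""
        return xor_bytes(survivor1, survivor2)

    a1, a2, ap = split_stripe(b"RAIDDEMO")    # 8 bytes -> 4 + 4 + 4
    assert recover(a2, ap) == a1              # lose disk 1: rebuild A1
    assert recover(a1, ap) == a2              # lose disk 2: rebuild A2
    assert recover(a1, a2) == ap              # lose disk 3: rebuild Ap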

A hot spare is very useful in situations where you don't monitor the
hardware health regularly (or can't take down the machine for maintenance)
and are worried about a double failure.

With a failed drive in the RAID 5 array, the system continues doing what it
has always done, but it can't write to the failed drive. Still, 2 of the 3
chunks survive, even for newly written data, and integrity is maintained.

During the recovery period, the system uses the information from the 2
surviving drives to recalculate the missing data on the failed (and by now
replaced) drive, just as it would during normal system operation with a
failed drive still in place.
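
A rough sketch of that rebuild loop, under the same toy XOR model as above
(the rebuild_drive helper is hypothetical; real controllers operate on raw
disk blocks, not Python lists):

    # Sketch: rebuild a replaced drive stripe by stripe from the survivors.
    def xor_bytes(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))

    def rebuild_drive(survivor_a, survivor_b):
        """Each missing chunk is the XOR of the two surviving chunks."""
        return [xor_bytes(a, b) for a, b in zip(survivor_a, survivor_b)]

    # Per-stripe chunks still held on the two surviving disks:
    disk1 = [b"\x01\x02", b"\x0a\x0b"]
    disk2 = [b"\x04\x08", b"\x0c\x0d"]
    disk3 = rebuild_drive(disk1, disk2)   # [b"\x05\x0a", b"\x06\x06"]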
 
With a failed drive in the RAID 5 array, the system continues doing what it
has always done, but it can't write to the failed drive. Still, 2 of the 3
chunks survive, even for newly written data, and integrity is maintained.

I'm still not clear on your explanation above. The system is able to write
data to the remaining two disks, but is the parity chunk included? It used to
write A1 to disk 1, A2 to disk 2, and Ap to disk 3, and now there is no
disk 3 any more. What will the system do in this situation? I think I'm still
thinking about this the wrong way.

Please help clarify this for me.
Thanks,

eguy
 
I'm still not clear on your explanation above. The system is able to write
data to the remaining two disks, but is the parity chunk included? It used
to write A1 to disk 1, A2 to disk 2, and Ap to disk 3, and now there is no
disk 3 any more. What will the system do in this situation? I think I'm
still thinking about this the wrong way.

Please help clarify this for me.
Thanks,

eguy

No, the parity chunk is not included when you are missing a disk. It just
writes out the actual data. Then, when you put a new drive in the system
to replace the failed one, it will calculate the data that needs to be
written to it to regenerate the RAID set. The other thing to remember is
that the parity info is written to different drives for different data. That
way, with 3 drives, each drive will be 2/3 data and 1/3 parity.
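
A toy sketch of that degraded write path in Python (the chunk values, the
xor_bytes helper, and the choice of which drive failed are all made up for
illustration):

    # Sketch: with one drive down, only the surviving chunks get written;
    # the missing chunk is recomputed once a replacement drive is installed.
    def xor_bytes(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))

    a1, a2 = b"\x10\x20", b"\x01\x02"
    ap = xor_bytes(a1, a2)                  # parity for this stripe

    failed = 2                              # drive holding ap this stripe
    stripe = [a1, a2, ap]
    written = [c for i, c in enumerate(stripe) if i != failed]

    # After the failed drive is replaced, regenerate the skipped chunk:
    regenerated = xor_bytes(*written)
    assert regenerated == ap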

Leonard Severt

Windows 2000 Server Setup Team
 
*eguy said:
I'm still not clear on your explanation above. The system is able to write
data to the remaining two disks, but is the parity chunk included? It used
to write A1 to disk 1, A2 to disk 2, and Ap to disk 3, and now there is no
disk 3 any more. What will the system do in this situation? I think I'm
still thinking about this the wrong way.


Your description is RAID 4. Both RAID 4 and RAID 5 work similarly (splitting
data into chunks and calculating a parity chunk), but RAID 5 intersperses the
parity information across all drives, while RAID 4 has a dedicated parity
drive. RAID 5 rotates the parity placement in "round-robin" fashion; RAID 4
does not. Using your nomenclature, A1 would go to disk 1 some of the time, to
disk 2 some of the time, and to disk 3 some of the time. In a 3-drive RAID 5
array, it would look something like:

A1 A2 Ap
B2 Bp B1
Cp C1 C2
D1 D2 Dp

Compared to RAID 4, which would look something like:

A1 A2 Ap
B1 B2 Bp
C1 C2 Cp
D1 D2 Dp

The parity information still gets calculated, and RAID 5 still intersperses
the chunks, but drops the chunk that would have been placed on the failed
drive (sometimes 1, sometimes 2, and sometimes p).
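
For reference, here is a small Python sketch that prints both layouts; the
exact rotation order is an assumption chosen to match the tables above (real
implementations differ between left- and right-symmetric variants):

    # Sketch: chunk placement for a 3-drive RAID 5 (rotating parity)
    # versus RAID 4 (dedicated parity drive). Labels match the tables.
    stripes, drives = "ABCD", 3

    print("RAID 5:")
    for s, name in enumerate(stripes):
        p = (drives - 1 - s) % drives        # parity position rotates
        row = [""] * drives
        row[p] = f"{name}p"
        for k in range(1, drives):           # data chunks wrap after parity
            row[(p + k) % drives] = f"{name}{k}"
        print(" ".join(row))

    print("RAID 4:")
    for name in stripes:
        print(" ".join(f"{name}{c}" for c in ("1", "2", "p")))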

The parity calculation uses XOR (exclusive OR). In binary math, comparing OR
to XOR:

1 XOR 1 = 0
1 XOR 0 = 1
0 XOR 1 = 1
0 XOR 0 = 0

1 OR 1 = 1
1 OR 0 = 1
0 OR 1 = 1
0 OR 0 = 0

Given the data split into 2 chunks:

100 010

The parity chunk would be:

110

If you lose the 2nd chunk (010 in this example), using the XOR function on
the surviving data recalculates the missing information.

100 XOR 110 = 010
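
That arithmetic is easy to verify in Python, using binary literals for the
3-bit chunks in the example:

    a1, a2 = 0b100, 0b010
    ap = a1 ^ a2                       # parity: 0b110
    assert a1 ^ ap == a2               # 100 XOR 110 recovers 010
    assert a2 ^ ap == a1               # 010 XOR 110 recovers 100
    print(format(a1 ^ ap, "03b"))      # -> 010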

This is why RAID 5 can survive a single disk failure. If a RAID 5 array were
to suffer a multi-disk failure, not enough information would survive to
recalculate the missing data. The RAID recovery procedure ends up being
relatively simple math. Users and SysAdmins may not even be aware that a
drive failure took place, because the system is still running and the
information is still accessible. Using hot-swap or hot-spare drives, the
SysAdmin can even perform the recovery while the system is in use (with a
noticeable and understandable performance degradation).
 