
00:02

PROFESSOR: A property shared by all of the RAID systems that we've looked at up to
now is that they can only withstand the failure of a single drive at a time. So that
drive has to be replaced before a second drive fails. In other words, it has to be
replaced within the mean time to failure of the drives within the system.

In this video, we'll examine a few cases in which the system can withstand the
failure of two disks. As long as the failed disks are replaced before a third
fails, the system can continue to operate. The first of these is known as RAID
level six. It makes use of two independent parity functions.

One, which we can refer to as P, could be the same as the simple, straightforward XOR
function that we described for RAID levels four and five. But the second would be an
independent function, perhaps based on, say, a Reed-Solomon code. So it uses a
different error-checking scheme.
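
To make that concrete, here is a minimal Python sketch of the two-parity idea, not the exact scheme from the lecture: P is the plain XOR parity, and Q is a Reed-Solomon-style weighted sum over GF(2^8). The field polynomial 0x11D and generator 2 are common conventions, assumed here for illustration.

    # Sketch: two independent parities over the same data strips.
    # P is the plain XOR used by RAID 4/5; Q is a Reed-Solomon-style
    # weighted sum over GF(2^8). Polynomial 0x11D and generator 2 are
    # typical choices, assumed here for illustration.

    def gf_mul(a, b, poly=0x11D):
        """Multiply two bytes in GF(2^8)."""
        result = 0
        while b:
            if b & 1:
                result ^= a
            a <<= 1
            if a & 0x100:
                a ^= poly
            b >>= 1
        return result

    def raid6_parity(strips):
        """strips: equal-length byte strings, one per data disk."""
        p = bytearray(len(strips[0]))
        q = bytearray(len(strips[0]))
        coeff = 1                            # g^0, g^1, ... with generator g = 2
        for strip in strips:
            for j, byte in enumerate(strip):
                p[j] ^= byte                 # simple XOR parity
                q[j] ^= gf_mul(coeff, byte)  # independent second parity
            coeff = gf_mul(coeff, 2)
        return bytes(p), bytes(q)

    # Three data strips produce one P strip and one Q strip, which is
    # why RAID 6 needs two extra disks.
    p, q = raid6_parity([b"\x01\x02", b"\x10\x20", b"\x55\xAA"])

Because P and Q are computed by independent functions, losing any two of the strips (data or parity) still leaves enough information to rebuild the rest.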

So in this case, if a disk fails, and the corresponding parity that's required to
reconstruct the data on that disk is also missing, then we can use the alternate
parity, sort of as a backup. And note that as the number of disks in the system
increases, the probability of failure increases as well.

So the mean time to failure is divided by N, where N is the number of disks in the
entire system. With RAID level six, we need to make use of two additional disks
in order to accommodate the two separate parity functions that are included.
Another scheme combines two of the previous levels that we examined.

The first is called RAID01, or in some cases zero plus one. That's because the
disks that make up the array are organized as two RAID0 groups. We have two
RAID0 systems, each of which is the mirror image of the other. So from the point of
view of the mirror controller, it sees two logical disks.

Both are mirror images; another way of saying it is that one is the shadow
of the other. So if a disk in one of these RAID0 systems fails, then the system can
continue to operate using the alternate copy. Once the failed disk is
replaced, it's a matter of copying the content from the remaining copy onto the
replacement disk.

Once that's done, the system is back to fully operational mode. The four
disks within each RAID0 system are made to appear as a single disk, by the
stripe controller, to the mirror controller. An alternate scheme also combines
RAID 1 and 0: it's called RAID10, or one plus zero.

In this case, we have a group of disks organized as mirrored pairs, where each pair
holds two mirror images of the same data. So the stripe controller thinks there are
four separate disks in this particular case, because each mirror controller makes
its mirrored pair appear to be a single disk to the stripe controller.

So now if a disk fails within one of these mirrored pairs, the system can continue
to operate. The only way the system becomes inoperable is if both mirror images
within one pair fail. For this system, we could sustain more than one failure and
continue operating, as long as the multiple failures are not within the same
mirrored pair.

Comparing these two approaches, we see that in the RAID01 system, there are three
separate controllers required: two stripe controllers and a single mirror
controller. In the RAID10 system, which has the same number of disks as this
RAID01, there are five separate controllers, because we need a separate
controller for each mirrored pair, plus one stripe controller that manages the
four logical disks that it sees.

So it would be more expensive, in terms of the number of controllers, to implement
RAID10. In terms of reliability, though, with RAID01, even if only one disk within
one of the two RAID0 mirror images fails, that entire group would be marked as
inoperable, and the system continues operating using the alternate mirror image.

Once the disk that failed within one of the RAID0 systems is replaced, since that
group appears as a single logical drive, all of the data from the alternate image
would have to be copied onto those four disks, even though only one of them had
failed. In order for the RAID10 system to stop operating, the two disks that failed
would have to be within the same mirror image, or mirrored pair.

So up to four disks could fail here, one from each of the mirrored pairs, and the
stripe controller would still see all four logical disks as available. Whereas in
the RAID01 system, if multiple disks failed, they would have to be within the same
RAID0 mirror image in order for the system to continue operating.
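
As a rough illustration of that difference, here is a small Python sketch that checks which failure patterns each hybrid layout survives. The eight-disk grouping below is an assumed example, not a layout taken from the lecture.

    # Sketch: which failure patterns each hybrid layout survives.
    # Disks 0..7; the grouping is an assumed eight-disk example.

    RAID01_GROUPS = [{0, 1, 2, 3}, {4, 5, 6, 7}]     # two striped groups, mirrored
    RAID10_PAIRS  = [{0, 4}, {1, 5}, {2, 6}, {3, 7}] # four mirrored pairs, striped

    def raid01_alive(failed):
        # Survives as long as at least one whole striped group is untouched.
        return any(not (group & failed) for group in RAID01_GROUPS)

    def raid10_alive(failed):
        # Survives as long as no mirrored pair has lost both disks.
        return all(pair - failed for pair in RAID10_PAIRS)

    print(raid01_alive({0, 5}))          # False: both striped groups are degraded
    print(raid10_alive({0, 5}))          # True: each pair still has one copy
    print(raid10_alive({0, 1, 2, 3}))    # True: one failure per pair is tolerated

The sketch shows the point made above: RAID10 keeps running with up to one failure in every pair, while RAID01 fails as soon as both mirror images have a failure somewhere.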

So once a disk fails and a replacement is made, there's a certain amount of time
needed to reconstruct and rewrite the data onto that replacement disk. For
RAID01, that is more time consuming, due to the need to write all four disks
even though only one failed within the group.

Whereas with RAID10, we only need to copy the content of the remaining operable
disk within a mirrored pair onto the replacement disk within that pair. So there's
less data to rewrite in that case. Here's a summary that compares the various RAID
levels. Recall that for RAID0, there are no redundant check disks, because there's
no duplication or error-correcting information stored.

So the aim here is simply to improve performance by allowing parallel access to
the records that comprise a file, with each record coming from a different disk. In
situations where reliability, or redundancy, is not important, that would be an
attractive alternative due to the relatively low expense.
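
To illustrate how striping spreads a file across the array, here is a minimal Python sketch that maps a logical block number to a disk and an offset. The four-disk count and round-robin mapping are assumptions for illustration only.

    # Sketch: where a logical block lands under plain RAID 0 striping.
    # Four disks and round-robin placement are assumed for illustration.

    NUM_DISKS = 4

    def raid0_location(logical_block):
        disk = logical_block % NUM_DISKS      # blocks rotate across the disks
        offset = logical_block // NUM_DISKS   # position within that disk
        return disk, offset

    # Consecutive blocks land on different disks, so a large read can
    # proceed on all four spindles in parallel.
    for block in range(8):
        print(block, raid0_location(block))
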

RAID1 is seldom used because of the need to fully duplicate all of the data disks.
So if we have N disks, we'll have 2N total, because each data disk has to have an
exact mirror image. And we previously saw that for RAID level 2 there's a Hamming
code used as the basis of the check data.

That requires more time to compute, and the fact that the interleaving occurs
at the bit level makes this less attractive. So it is seldom, really never, used
commercially. RAID level 3 reduces the number of extra check disks down to
a single disk, because it computes a simple exclusive-OR parity function across
the various strips within a single stripe to generate the parity that's written
on that check disk.
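
The sketch below shows that XOR idea on a toy scale in Python: the parity strip is the byte-wise XOR of the data strips, and the same XOR recovers any single missing strip. The strip contents are made-up illustration values.

    # Sketch: single XOR parity strip over a stripe, and recovery of one
    # lost strip. The byte values are made-up illustration data.

    from functools import reduce

    def xor_strips(strips):
        return bytes(reduce(lambda a, b: a ^ b, column)
                     for column in zip(*strips))

    data = [b"\x11\x22", b"\x33\x44", b"\x55\x66"]   # three data strips
    parity = xor_strips(data)                        # written to the check disk

    # If strip 1 is lost, XOR the survivors with the parity to rebuild it.
    rebuilt = xor_strips([data[0], data[2], parity])
    assert rebuilt == data[1]
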

Recall that RAID level three employs bit-level interleaving. The accesses are all
ganged together so that the disks move in unison, which favors accessing the
content of an entire stripe. So if we're writing large amounts of data, or reading
large amounts of data, and all of the blocks reside within a single stripe, then
this would be an effective way of retrieving the data.

With RAID level four, the accesses are allowed to move independently, so they don't
have to be at the same relative position on each disk at the same time. But even
here, the parity is still concentrated on a single disk drive. It's simply
computed on a block-level basis, rather than a bit-level basis.

Having the parity concentrated on a single drive can preclude doing two writes
to different blocks if they reside within different stripes. That's because
updating those two blocks would require updating two different parity
blocks on the single check disk, and that disk can't support two writes at the same
time.
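
One standard way to see that bottleneck is the small-write update rule: the new parity can be computed from the old data, the new data, and the old parity, so every small write must also read and rewrite a parity block on the check disk. The Python sketch below shows that rule with made-up block values.

    # Sketch: the read-modify-write rule for updating a parity block.
    # new_parity = old_parity XOR old_data XOR new_data, byte by byte.

    def update_parity(old_parity, old_data, new_data):
        return bytes(p ^ od ^ nd
                     for p, od, nd in zip(old_parity, old_data, new_data))

    # Every small write touches the parity disk once, so two writes to
    # blocks in different stripes both queue up behind the single
    # RAID 4 check disk.
    new_p = update_parity(b"\x0f", b"\x01", b"\x05")
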

RAID level five still uses only a single disk's worth of parity. But rather than
concentrating the parity on a single drive, the parity is distributed among all of
the drives within the array. So that makes it possible, in some cases, to do
multiple writes in parallel, even if they don't write to blocks within the same
stripe.
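
A common way to distribute the parity is to rotate the disk that holds the parity block from one stripe to the next; the particular rotation below (parity starting on the last disk and moving left) is just one assumed convention, shown here as a Python sketch.

    # Sketch: rotating the parity block across disks, stripe by stripe.
    # Five disks and this particular rotation are assumed conventions.

    NUM_DISKS = 5

    def parity_disk(stripe):
        # Parity for stripe 0 sits on the last disk, then rotates left.
        return (NUM_DISKS - 1 - stripe) % NUM_DISKS

    for stripe in range(5):
        print("stripe", stripe, "-> parity on disk", parity_disk(stripe))

Because the parity blocks for different stripes live on different disks, two small writes to different stripes can often proceed at the same time.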

As long as the data block and the corresponding parity block do not reside on the
same disk, the writes can be performed in parallel. Recall that each disk can only
do one read or one write at a time. And with RAID6, we introduce a second parity
function that is independent of the first.

So if a disk fails, and the parity required to reconstruct the missing data block
is not available due to the failure, then an alternate disk can be used that
contains the backup, or secondary, parity function. So this allows the RAID6
system to withstand up to two disks failing before one is replaced.

Also realize that as the number of disks in the array increases, the mean time to
failure, or the average time before a disk fails, will be cut down by a factor
of N, where N is the number of disks in the array. So these systems have to have
replacements made before an additional disk fails and makes the entire system
inoperable.
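
As a rough numeric illustration of that factor of N, with made-up figures:

    # Sketch: expected time until some disk in the array fails,
    # using a made-up per-disk MTTF figure.

    disk_mttf_hours = 100_000    # assumed MTTF of one disk
    num_disks = 20               # assumed array size

    array_mttf_hours = disk_mttf_hours / num_disks
    print(array_mttf_hours)      # 5000.0 hours until the first expected failure
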

So in the levels below RAID6, a single disk failure is the limit, whereas with
RAID6, up to two can fail. These systems are sometimes referred to as hot
swappable, because they can continue operating even during the replacement
procedure. New disks can be substituted, and the data can be reconstructed while
the system is still operating.

Although it may operate at a reduced performance level. That concludes our
coverage of RAID systems.
