
Monday, March 24, 2008

KNOW MORE ABOUT RAID 2 and RAID 3

When introducing the RAID levels we are sometimes asked: 'and what about RAID 2 and RAID 3?'. The early work on RAID began at a time when disks were not yet very reliable: bit errors were possible that could lead to a written 'one' being read as a 'zero' or a written 'zero' being read as a 'one'. In RAID 2 the Hamming code is used, so that redundant information is stored in addition to the actual data. This additional data permits the recognition of read errors and to some degree also makes it possible to correct them. Today, comparable functions are performed by the controller of each individual hard disk, which means that RAID 2 no longer has any practical significance.
Like RAID 4 or RAID 5, RAID 3 stores parity data. RAID 3 distributes the data of a block amongst all the disks of the RAID 3 system so that, in contrast to RAID 4 or RAID 5, all disks are involved in every read or write access. RAID 3 only permits the reading and writing of whole blocks, thus dispensing with the write penalty that occurs in RAID 4 and RAID 5. The writing of individual blocks of a parity group is thus not possible. In addition, in RAID 3 the rotation of the individual hard disks is synchronized so that the data of a block can truly be written simultaneously. RAID 3 was for a long time called the recommended RAID level for sequential write and read profiles such as data mining and video processing. Current hard disks come with a large cache of their own, which means that they can temporarily store the data of an entire track, and they have significantly higher rotation speeds than the hard disks of the past. As a result of these innovations, other RAID levels are now also suitable for sequential load profiles, meaning that RAID 3 is becoming less and less important.

A comparison of the RAID levels

The various RAID levels raise the question of which RAID level should be used when. Table 2.1 compares the criteria of fault-tolerance, write performance, read performance and space requirement for the individual RAID levels. The evaluation of the criteria can be found in the discussion in the previous sections.

CAUTION PLEASE: The comparison of the various RAID levels discussed in this section is only applicable to the theoretical basic forms of the RAID level in question. In practice, manufacturers of disk subsystems have design options in
• the selection of the internal physical hard disks;
• the I/O technique used for the communication within the disk subsystem;
• the use of several I/O channels;
• the realization of the RAID controller;
• the size of the cache; and
• the cache algorithms themselves.
The performance data of the specific disk subsystem must be considered very carefully
for each individual case. For example, in the previous chapter measures were discussed that greatly reduce the write penalty of RAID 4 and RAID 5. Specific RAID controllers may implement these measures, but they do not have to.

Table 2.1 The table compares the theoretical basic forms of the various RAID levels. In practice there are very marked differences in the quality of the implementation of RAID controllers.

RAID level    Fault-tolerance    Read performance    Write performance    Space requirement
RAID 0        none               good                very good            minimal
RAID 1        high               poor                poor                 high
RAID 10       very high          very good           good                 high
RAID 4        high               good                very very poor       low
RAID 5        high               good                very poor            low
Subject to the above warning, RAID 0 is the choice for applications for which the
maximum write performance is more important than protection against the failure of a
disk. Examples are the storage of multimedia data for film and video production and the
recording of physical experiments in which the entire series of measurements has no value
if not all measured values can be recorded. In this case it is more beneficial to record all
of the measured data on a RAID 0 array first and then, after the experiment, copy it onto,
for example, a RAID 5 array. In databases, RAID 0 is used as a fast store for segments in
which intermediate results for complex requests are to be temporarily stored. However, as
a rule hard disks tend to fail at the most inconvenient moment so database administrators
only use RAID 0 if it is absolutely necessary, even for temporary data.
With RAID 1, performance and capacity are limited because only two physical hard
disks are used. RAID 1 is therefore a good choice for small databases for which the
configuration of a virtual RAID 5 or RAID 10 disk would be too large. A further important
field of application for RAID 1 is in combination with RAID 0.
RAID 10 is used in situations where high write performance and high fault-tolerance
are called for. For a long time it was recommended that database log files be stored on
RAID 10. Databases record all changes in log files so this application has a high write
component. After a system crash the restarting of the database can only be guaranteed if all
log files are fully available. Manufacturers of storage systems disagree as to whether this
recommendation is still valid as there are now fast RAID 4 and RAID 5 implementations.
RAID 4 and RAID 5 save disk space at the expense of a poorer write performance.
For a long time the rule of thumb was to use RAID 5 where the ratio of read operations
to write operations is 70 : 30. At this point we wish to repeat that there are now storage
systems on the market with excellent write performance that store the data internally using RAID 4 or RAID 5.

KNOW MORE ABOUT DIFFERENT RAID LEVELS


RAID has developed since its original definition in 1987. Due to technical progress some RAID levels are now practically meaningless, whilst others have been modified or added at a later date. This section introduces the RAID levels that are currently the most significant in practice. We will not cover manufacturer-specific variants or variants that deviate only slightly from the basic forms described in the following.

RAID 0: block-by-block striping
RAID 0 distributes the data that the server writes to the virtual hard disk onto one physical hard disk after another block-by-block (block-by-block striping). Figure 2.9 shows a RAID array with four physical hard disks. In Figure 2.9 the server writes the blocks A, B, C, D, E, etc. onto the virtual hard disk one after the other. The RAID controller distributes the sequence of blocks onto the individual physical hard disks: it writes the first block, A, to the first physical hard disk, the second block, B, to the second physical hard disk, block C to the third and block D to the fourth. Then it begins to write to the first physical hard disk once again, writing block E to the first disk, block F to the second, and so on.

[Figure 2.9, RAID 0 (striping): as in all RAID levels, the server sees only the virtual hard disk. The RAID controller distributes the write operations of the server amongst several physical hard disks. Parallel writing means that the performance of the virtual hard disk is higher than that of the individual physical hard disks.]

RAID 0 increases the performance of the virtual hard disk as follows: the individual hard disks can exchange data with the RAID controller via the I/O channel significantly more quickly than they can write to or read from the rotating disk. In Figure 2.9 the RAID controller sends the first block, block A, to the first hard disk. This disk takes some time to write the block to the physical medium. Whilst the first disk is writing the first block, the RAID controller is already sending the second block, block B, to the second hard disk and block C to the third hard disk. In the meantime the first two physical hard disks are still engaged in depositing their respective blocks onto the disk surface. If the RAID controller now sends block E to the first hard disk, then this disk will have written block A at least partially, if not entirely, to the physical hard disk.

In the example, the throughput can thus be approximately quadrupled: individual hard disks currently (2003) achieve a throughput of around 50 MByte/s, so the four physical hard disks achieve a total throughput of around 4 × 50 MByte/s ≈ 200 MByte/s. Current I/O techniques such as SCSI or Fibre Channel achieve a throughput of 160 MByte/s or 200 MByte/s. If the RAID array consisted of just three physical hard disks, the total throughput of the hard disks would be the limiting factor. If, on the other hand, the RAID array consisted of five physical hard disks, the I/O path would be the limiting factor. With five or more hard disks, therefore, performance increases are only possible if the hard disks are connected to different I/O paths so that the load can be striped not only over several physical hard disks, but also over several I/O paths.

RAID 0 increases the performance of the virtual hard disk, but not its fault-tolerance. If a physical hard disk is lost, all the data on the virtual hard disk is lost. To be precise, therefore, the 'R' for 'Redundant' in RAID is incorrect in the case of RAID 0, with 'RAID 0' standing instead for 'zero redundancy'.
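The mapping from logical blocks to physical disks is simple arithmetic. The following Python sketch is purely illustrative (the four-disk layout matches the example above; the function name and block numbering are assumptions of mine, not part of the original text):

# Minimal sketch of RAID 0 block-by-block striping over four physical disks.
NUM_DISKS = 4

def stripe_location(logical_block: int) -> tuple[int, int]:
    """Map a logical block of the virtual disk to (physical disk, block offset on that disk)."""
    return logical_block % NUM_DISKS, logical_block // NUM_DISKS

# Blocks A, B, C, D go to disks 0-3; block E wraps around to disk 0 again, and so on.
for logical, name in enumerate("ABCDEFGH"):
    disk, offset = stripe_location(logical)
    print(f"block {name}: disk {disk}, offset {offset}")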

RAID 1: block-by-block mirroring

In contrast to RAID 0, in RAID 1 fault-tolerance is of primary importance. The basic form of RAID 1 brings together two physical hard disks to form a virtual hard disk by mirroring the data on the two physical hard disks. If the server writes a block to the virtual hard disk, the RAID controller writes this block to both physical hard disks. The individual copies are also called mirrors. Normally, two or sometimes three copies of the data are kept (three-way mirror).

In normal operation with pure RAID 1, performance increases are only possible in read operations. After all, when reading the data the load can be divided between the two disks. However, this gain is very low in comparison to RAID 0. When writing with RAID 1 it tends to be the case that reductions in performance may even have to be taken into account. This is because the RAID controller has to send the data to both hard disks. This disadvantage can be disregarded for an individual write operation, since the capacity of the I/O channel is significantly higher than the maximum write speed of the two hard disks put together. However, the I/O channel is under twice the load, which hinders other data traffic using the I/O channel at the same time.

[Figure, RAID 1 (mirroring): as in all RAID levels, the server sees only the virtual hard disk. The RAID controller duplicates each of the server's write operations onto two physical hard disks. After the failure of one physical hard disk the data can still be read from the other disk.]
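As a toy illustration of the mirroring idea, the sketch below (the dictionary-backed "disks" and function names are hypothetical, not from the text) duplicates every write onto both copies and shows a read being served from the surviving copy after a failure:

# Minimal sketch of RAID 1 (mirroring) with two dictionary-backed "disks".
mirror_0: dict[int, bytes] = {}
mirror_1: dict[int, bytes] = {}

def raid1_write(block: int, data: bytes) -> None:
    # The controller duplicates every write, which doubles the load on the I/O channel.
    mirror_0[block] = data
    mirror_1[block] = data

def raid1_read(block: int, failed_disk: int | None = None) -> bytes:
    # Reads can be served by either copy; after one disk fails, the other still holds the data.
    if failed_disk != 0:
        return mirror_0[block]
    return mirror_1[block]

raid1_write(0, b"block A")
assert raid1_read(0, failed_disk=0) == b"block A"   # data survives the loss of mirror 0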

RAID 0+1/RAID 10: striping and mirroring combined

The problem with RAID 0 and RAID 1 is that they increase either performance (RAID 0) or fault-tolerance (RAID 1). However, it would be nice to have both performance and fault-tolerance. This is where RAID 0+1 and RAID 10 come into play. These two RAID levels combine the ideas of RAID 0 and RAID 1.

RAID 0+1 and RAID 10 each represent a two-stage virtualization hierarchy. Figure 2.11 shows the principle behind RAID 0+1 (mirrored stripes). In the example, eight physical hard disks are used. The RAID controller initially brings together each set of four physical hard disks by means of RAID 0 (striping) to form a total of two virtual hard disks that are only visible within the RAID controller. In the second level, it consolidates these two virtual hard disks into a single virtual hard disk by means of RAID 1 (mirroring); only this virtual hard disk is visible to the server.

In RAID 10 (striped mirrors) the sequence of RAID 0 (striping) and RAID 1 (mirroring) is reversed in relation to RAID 0+1 (mirrored stripes). Figure 2.12 shows the principle underlying RAID 10, based again on eight physical hard disks. In RAID 10 the RAID controller initially brings together the physical hard disks in pairs by means of RAID 1 (mirroring) to form a total of four virtual hard disks that are only visible within the RAID controller. In the second stage, the RAID controller consolidates these four virtual hard disks into a single virtual hard disk by means of RAID 0 (striping). Here too, only this last virtual hard disk is visible to the server.

[Figure 2.11, RAID 0+1 (mirrored stripes): as in all RAID levels, the server sees only the virtual hard disk. Internally, the RAID controller realizes the virtual disk in two stages: in the first stage it brings together every four physical hard disks by means of RAID 0 (striping) into one virtual hard disk that is only visible within the RAID controller; in the second stage it consolidates these two virtual hard disks by means of RAID 1 (mirroring) to form the hard disk that is visible to the server.]

[Figure 2.12, RAID 10 (striped mirrors): as in all RAID levels, the server sees only the virtual hard disk. Here too, we proceed in two stages; the sequence of striping and mirroring is reversed in relation to RAID 0+1. In the first stage the controller links every two physical hard disks by means of RAID 1 (mirroring) to a virtual hard disk, which it unifies by means of RAID 0 (striping) in the second stage to form the hard disk that is visible to the server.]

In both RAID 0+1 and RAID 10 the server sees only a single hard disk, which is larger, faster and more fault-tolerant than a physical hard disk. We now have to ask the question: which of the two RAID levels, RAID 0+1 or RAID 10, is preferable?

The question can be answered by considering that when using RAID 0 the failure of a hard disk leads to the loss of the entire virtual hard disk. In the example relating to RAID 0+1 (Figure 2.11) the failure of a physical hard disk is thus equivalent to the effective failure of four physical hard disks (Figure 2.13). If one of the other four physical hard disks is then lost, the data is lost. In principle it is sometimes possible to reconstruct the data from the remaining disks, but the RAID controllers available on the market cannot do this particularly well.

In the case of RAID 10, on the other hand, after the failure of an individual physical hard disk, the additional failure of a further physical hard disk, with the exception of the corresponding mirror, can be withstood. RAID 10 thus has a significantly higher fault-tolerance than RAID 0+1. In addition, the cost of restoring the RAID system after the failure of a hard disk is much lower in the case of RAID 10 than in RAID 0+1. In RAID 10 only one physical hard disk has to be recreated. In RAID 0+1, on the other hand, a virtual hard disk must be recreated that is made up of four physical disks. However, the cost of recreating the defective hard disk can be significantly reduced if a physical hard disk is exchanged as a preventative measure when the number of read errors starts to increase. In this case it is sufficient to copy the data from the old disk to the new one.

However, things look different if the performance of RAID 0+1 is compared with the performance of RAID 10. In Section 5.1 we discuss a case study in which the use of RAID 0+1 is advantageous.
RAID 4 and RAID 5: parity instead of mirroring
RAID 10 provides excellent performance at a high level of fault-tolerance. The problem with this is that mirroring using RAID 1 means that all data is written to the physical hard disks twice; RAID 10 thus doubles the required storage capacity.

[Figure: In RAID 10 (striped mirrors) the consequences of the failure of a physical hard disk are not as serious as in RAID 0+1 (mirrored stripes). All virtual hard disks remain intact, and the restoration of the data from the failed hard disk is simple.]

The idea of RAID 4 and RAID 5 is to replace all mirror disks of RAID 10 with a single parity hard disk. The accompanying figure shows the principle of RAID 4 based upon five physical hard disks. The server again writes the blocks A, B, C, D, E, etc. to the virtual hard disk sequentially. The RAID controller stripes the data blocks over the first four physical hard disks. Instead of mirroring all data onto a further four physical hard disks, as in RAID 10, the RAID controller calculates a parity block for every four blocks and writes this onto the fifth physical hard disk. For example, the RAID controller calculates the parity block P_ABCD for the blocks A, B, C and D. If one of the four data disks fails, the RAID controller can reconstruct the data of the defective disk using the three other data disks and the parity disk. In comparison to the examples for RAID 0+1 and RAID 10, RAID 4 saves three physical hard disks. As in all other RAID levels, the server again sees only the virtual disk, as if it were a single physical hard disk.

[Figure: RAID 4 (parity disk) is designed to reduce the storage requirement of RAID 0+1 and RAID 10. In the example, the data blocks are distributed over four physical hard disks by means of RAID 0 (striping). Instead of mirroring all data once again, only a parity block is stored for each four blocks.]

From a mathematical point of view the parity block is calculated with the aid of the logical XOR operator (exclusive OR). In the example, the equation P_ABCD = A XOR B XOR C XOR D applies.

The space saving offered by RAID 4 and by RAID 5, which remains to be discussed, comes at a price in relation to RAID 10. Changing a data block changes the value of the associated parity block. This means that each write operation to the virtual hard disk requires (1) the physical writing of the data block, (2) the recalculation of the parity block and (3) the physical writing of the newly calculated parity block. This extra cost for write operations in RAID 4 and RAID 5 is called the write penalty of RAID 4 or the write penalty of RAID 5.

The cost of recalculating the parity block is relatively low due to the mathematical properties of the XOR operator. If the block A is overwritten by a new block A′ and Δ = A XOR A′ is the difference between the old and new data block, then the new parity block P′ can be calculated simply from the old parity block P and Δ, i.e. P′ = P XOR Δ. Proof of this property can be found in Appendix A. Therefore, if P_ABCD is the parity block for the data blocks A, B, C and D, then after the data block A has been changed the new parity block can be calculated without knowing the remaining blocks B, C and D. However, the old block A must first be read into the controller before it is overwritten on the physical hard disk, so that the controller can calculate the difference Δ.

When processing write commands for RAID 4 and RAID 5 arrays, RAID controllers use these mathematical properties of the XOR operation for the recalculation of the parity block. Suppose a server changes block D on the virtual hard disk. The RAID controller reads the old data block and the associated parity block from the disks in question into its cache. Then it uses the XOR operation to calculate the difference between the old and the new data block, i.e. Δ = D XOR D′, and from this the new parity block P′_ABCD by means of P′_ABCD = P_ABCD XOR Δ. Therefore it is not necessary to read in all four associated data blocks to recalculate the parity block. To conclude the write operation to the virtual hard disk, the RAID controller writes the new data block and the recalculated parity block onto the physical hard disks in question.

Good RAID 4 and RAID 5 implementations are capable of reducing the write penalty even further for certain load profiles. For example, if large data quantities are written sequentially, then the RAID controller can calculate the parity blocks from the data flow without reading the old parity block from the disk. If, for example, the blocks E, F, G and H are written in one go, then the controller can calculate the parity block P_EFGH from them and write it without having previously read in the old value. Likewise, a RAID controller with a suitably large cache can hold frequently changed parity blocks in the cache after writing them to disk, so that the next time one of the data blocks in question is changed there is no need to read in the parity block. In both cases the I/O load is lower than in the case of RAID 10: in the example only five physical blocks need to be written instead of the eight that RAID 10 would require.

RAID 4 saves all parity blocks onto a single physical hard disk. For the example this means that the write operations for the data blocks are distributed over four physical hard disks, but the parity disk has to handle the same number of write operations all on its own. The parity disk therefore becomes the performance bottleneck of RAID 4 if there is a high number of write operations.
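The XOR arithmetic above can be checked directly. The sketch below is an illustration with made-up block contents (the helper name xor_blocks is mine, not from the text): it computes a parity block, reconstructs a lost data block from the surviving blocks, and applies the shortcut P′ = P XOR Δ for an individual block update.

# Sketch of RAID 4/5 parity arithmetic with the XOR operator.
def xor_blocks(*blocks: bytes) -> bytes:
    """Bytewise XOR of equally sized blocks."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)

A, B, C, D = (bytes([x]) * 4 for x in (1, 2, 3, 4))   # four data blocks (dummy contents)
P_ABCD = xor_blocks(A, B, C, D)                        # parity block written to the parity disk

# Reconstruction: if the disk holding C fails, C = A XOR B XOR D XOR P.
assert xor_blocks(A, B, D, P_ABCD) == C

# Write penalty shortcut: overwrite D with D_new without reading A, B and C.
D_new = bytes([9]) * 4
delta = xor_blocks(D, D_new)                # Δ = D XOR D_new (the old block must be read first)
P_new = xor_blocks(P_ABCD, delta)           # P_new = P XOR Δ
assert P_new == xor_blocks(A, B, C, D_new)  # same result as recomputing parity from scratch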
To get around this performance bottleneck, RAID 5 distributes the parity blocks over all hard disks. The figure illustrates the procedure. As in RAID 4, the RAID controller writes the parity block P_ABCD for the blocks A, B, C and D onto the fifth physical hard disk.
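The post breaks off here, but the distribution idea can be sketched briefly. The following illustration assumes one common rotation scheme (parity moving back by one disk per stripe); the actual placement pattern is controller-specific and not specified in the text.

# Sketch of a possible RAID 5 parity placement over five physical disks.
# The rotation pattern below is an assumption for illustration only.
NUM_DISKS = 5

def raid5_layout(stripe: int) -> dict[int, str]:
    """Return the role (parity or data block number) of each disk within one stripe."""
    parity_disk = (NUM_DISKS - 1 - stripe) % NUM_DISKS  # parity moves to a different disk each stripe
    layout, data_index = {}, 0
    for disk in range(NUM_DISKS):
        if disk == parity_disk:
            layout[disk] = "parity"
        else:
            layout[disk] = f"data {stripe * (NUM_DISKS - 1) + data_index}"
            data_index += 1
    return layout

# Stripe 0 keeps parity on the fifth disk (index 4), as in the RAID 4 example;
# later stripes move it to disks 3, 2, 1, 0, so no single disk absorbs all parity writes.
for stripe in range(5):
    print(f"stripe {stripe}: {raid5_layout(stripe)}")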