KNOW MORE ABOUT RAID 2 and RAID 3
When introducing the RAID levels we are sometimes asked: 'and what about RAID 2 and RAID 3?'. The early work on RAID began at a time when disks were not yet very reliable: bit errors were possible that could lead to a written 'one' being read as 'zero' or a written 'zero' being read as 'one'. In RAID 2 the Hamming code is used, so that redundant information is stored in addition to the actual data. This additional data permits the recognition of read errors and to some degree also makes it possible to correct them. Today, comparable functions are performed by the controller of each individual hard disk, which means that RAID 2 no longer has any practical significance.
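To make the idea of 'recognizing and correcting read errors' concrete, here is a minimal sketch of a Hamming(7,4) code in Python. It is an illustration only: the text does not say which Hamming variant a RAID 2 system would use, and the function names are hypothetical. Each of the seven bits of a codeword can be pictured as being stored on a separate disk.

```python
# Illustrative Hamming(7,4) code: 4 data bits are protected by 3 parity bits.
# Assumption: this specific variant and all function names are chosen for the
# example; they are not taken from any RAID 2 implementation.

def hamming74_encode(data):
    """Encode [d1, d2, d3, d4] into a 7-bit codeword (positions 1..7)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_correct(codeword):
    """Return (corrected codeword, error position); position 0 means no error."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3  # syndrome gives the 1-based bit position
    if position:
        c[position - 1] ^= 1  # flip the faulty bit back
    return c, position

def hamming74_decode(codeword):
    """Extract the 4 data bits from a (corrected) codeword."""
    return [codeword[2], codeword[4], codeword[5], codeword[6]]

if __name__ == "__main__":
    written = hamming74_encode([1, 0, 1, 1])
    read = list(written)
    read[4] ^= 1  # simulate a written 'zero' being read as 'one'
    repaired, position = hamming74_correct(read)
    print("error at position", position, "data", hamming74_decode(repaired))
```

A code of this kind corrects any single flipped bit per codeword. Two flipped bits in the same codeword would be miscorrected, which is one reason the text says errors can be corrected only 'to some degree'.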
Like RAID 4 or RAID 5, RAID 3 stores parity data. RAID 3 distributes the data of a block amongst all the disks of the RAID 3 system so that, in contrast to RAID 4 or RAID 5, all disks are involved in every read or write access. RAID 3 only permits the reading and writing of whole blocks, thus dispensing with the write penalty that occurs in RAID 4 and RAID 5. The writing of individual blocks of a parity group is thus not possible. In addition, in RAID 3 the rotation of the individual hard disks is synchronized so that the data of a block can truly be written simultaneously. RAID 3 was for a long time called the recommended RAID level for sequential write and read profiles such as data mining and video processing. Current hard disks come with a large cache of their own, which means that they can temporarily store the data of an entire track, and they have significantly higher rotation speeds than the hard disks of the past. As a result of these innovations, other RAID levels are now suitable for sequential load profiles, meaning that RAID 3 is becoming less and less important.
2.5.6 A comparison of the RAID levels
The various RAID levels raise the question of which RAID level should be used when. Table 2.1 compares the criteria of fault-tolerance, write performance, read performance and space requirement for the individual RAID levels. The evaluation of the criteria can be found in the discussion in the previous sections.
CAUTION PLEASE: The comparison of the various RAID levels discussed in this section is only applicable to the theoretical basic forms of the RAID level in question. In practice, manufacturers of disk subsystems have design options in
• the selection of the internal physical hard disks;
• the I/O technique used for the communication within the disk subsystem;
• the use of several I/O channels;
• the realization of the RAID controller;
• the size of the cache; and
• the cache algorithms themselves.
The performance data of the specific disk subsystem must be considered very carefully for each individual case. For example, in the previous chapter measures were discussed that greatly reduce the write penalty of RAID 4 and RAID 5. Specific RAID controllers may implement these measures, but they do not have to.
Table 2.1 The table compares the theoretical basic forms of the various RAID levels. In practice there are very marked differences in the quality of the implementation of RAID controllers.
RAID level   Fault-tolerance   Read performance   Write performance   Space requirement
RAID 0       none              good               very good           minimal
RAID 1       high              poor               poor                high
RAID 10      very high         very good          good                high
RAID 4       high              good               very very poor      low
RAID 5       high              good               very poor           low
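The 'space requirement' column of Table 2.1 can be made more tangible with a small calculation. The following Python sketch estimates which fraction of the raw capacity remains for user data in the theoretical basic form of each RAID level. The function name and the minimum disk counts for RAID 0, RAID 10, RAID 4 and RAID 5 are assumptions made for this example; real disk subsystems may reserve additional space, for instance for hot spare disks.

```python
# Rough estimate of usable capacity per RAID level (theoretical basic forms).
# Assumption: usable_fraction and the minimum disk counts for RAID 0, 10, 4
# and 5 are illustrative choices, not taken from any product.

def usable_fraction(level: str, n_disks: int) -> float:
    """Return the fraction of raw capacity available for user data."""
    if level == "RAID 0":
        if n_disks < 2:
            raise ValueError("striping needs at least 2 disks")
        return 1.0  # no redundancy, nothing is lost
    if level == "RAID 1":
        if n_disks != 2:
            raise ValueError("this sketch models RAID 1 as exactly 2 disks")
        return 0.5  # every block is stored twice
    if level == "RAID 10":
        if n_disks < 4 or n_disks % 2:
            raise ValueError("RAID 10 needs an even number of at least 4 disks")
        return 0.5  # striping over mirrored pairs
    if level in ("RAID 4", "RAID 5"):
        if n_disks < 3:
            raise ValueError("this sketch assumes at least 3 disks")
        return (n_disks - 1) / n_disks  # one disk's worth of parity
    raise ValueError(f"unknown RAID level: {level}")

if __name__ == "__main__":
    disk_size_gb = 500  # hypothetical disk size
    for level, n in [("RAID 0", 8), ("RAID 1", 2), ("RAID 10", 8), ("RAID 5", 8)]:
        usable = usable_fraction(level, n) * n * disk_size_gb
        print(f"{level} with {n} disks: {usable:.0f} GB usable")
```

The sketch shows why the table rates RAID 4 and RAID 5 as 'low' in space requirement: the share of capacity spent on parity shrinks as disks are added, whereas mirroring always costs half of the raw capacity.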
Subject to the above warning, RAID 0 is the choice for applications for which the
maximum write performance is more important than protection against the failure of a
disk. Examples are the storage of multimedia data for film and video production and the
recording of physical experiments in which the entire series of measurements has no value
if all measured values cannot be recorded. In this case it is more beneficial to record all
of the measured data on a RAID 0 array first and then copy it after the experiment, for
example on a RAID 5 array. In databases, RAID 0 is used as a fast store for segments in
which intermediate results for complex requests are to be temporarily stored. However, as
a rule hard disks tend to fail at the most inconvenient moment so database administrators
only use RAID 0 if it is absolutely necessary, even for temporary data.
With RAID 1, performance and capacity are limited because only two physical hard
disks are used. RAID 1 is therefore a good choice for small databases for which the
configuration of a virtual RAID 5 or RAID 10 disk would be too large. A further important
field of application for RAID 1 is in combination with RAID 0.
RAID 10 is used in situations where high write performance and high fault-tolerance
are called for. For a long time it was recommended that database log files be stored on
RAID 10. Databases record all changes in log files so this application has a high write
component. After a system crash the restarting of the database can only be guaranteed if all
log files are fully available. Manufacturers of storage systems disagree as to whether this
recommendation is still valid as there are now fast RAID 4 and RAID 5 implementations.
RAID 4 and RAID 5 save disk space at the expense of a poorer write performance.
For a long time the rule of thumb was to use RAID 5 where the ratio of read operations
to write operations is 70 : 30. At this point we wish to repeat that there are now storage
systems on the market with excellent write performance that store the data internally using RAID 4 or RAID 5.
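The effect of the read/write ratio can be estimated with a little arithmetic. The following Python sketch assumes the commonly quoted textbook figures for the basic forms: one disk operation per write for RAID 0, two for RAID 1 and RAID 10, and four for a small write on RAID 4 or RAID 5 (read old data, read old parity, write new data, write new parity). These penalty values, the function name and the raw I/O rate are assumptions for illustration. As stressed above, caches and optimized controllers can change the picture considerably.

```python
# Back-of-the-envelope estimate of host-visible I/O rate under a write penalty.
# Assumption: the penalty values are textbook figures for the basic RAID forms;
# effective_iops and the raw rate of 1000 are hypothetical.

WRITE_PENALTY = {
    "RAID 0": 1,
    "RAID 1": 2,
    "RAID 10": 2,
    "RAID 4": 4,
    "RAID 5": 4,
}

def effective_iops(raw_iops: float, read_fraction: float, write_penalty: int) -> float:
    """Host I/Os per second that raw_iops disk operations per second can serve."""
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    write_fraction = 1.0 - read_fraction
    # Each host read costs one disk operation, each host write costs
    # write_penalty disk operations.
    disk_ops_per_host_io = read_fraction + write_fraction * write_penalty
    return raw_iops / disk_ops_per_host_io

if __name__ == "__main__":
    raw = 1000.0  # hypothetical aggregate disk operations per second
    for level, penalty in WRITE_PENALTY.items():
        rate = effective_iops(raw, read_fraction=0.7, write_penalty=penalty)
        print(f"{level}: about {rate:.0f} host I/Os per second at 70 : 30")
```

Under these assumptions a 70 : 30 workload on RAID 5 needs 0.7 + 0.3 × 4 = 1.9 disk operations per host I/O, compared with 1.3 on RAID 10. The more writes a workload contains, the wider this gap becomes.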