FREE TUTOR ON STORAGE LUN MASKING AND AVAILABILITY OF DISK SUBSYSTEMS
So-called LUN masking brings us to the third important function – after instant copy and remote mirroring – that intelligent disk subsystems offer over and above RAID. LUN masking limits access to the hard disks that the disk subsystem exports to the connected servers.
A disk subsystem makes the storage capacity of its internal physical hard disks available to servers by permitting access to individual physical hard disks, or to virtual hard disks created using RAID, via the connection ports. Based upon the SCSI protocol, all hard disks – physical and virtual – that are visible outside the disk subsystem are also known as LUNs (Logical Unit Numbers).
Without LUN masking every server would see all hard disks that the disk subsystem provides.
Figure 2.23: A disk subsystem without LUN masking to which three servers are connected. Each server sees all hard disks that the disk subsystem exports.
As a result, considerably more hard disks are visible to each server than is necessary.
In particular, each server also sees the hard disks that are required by applications running on other servers. This means that the individual servers must be very carefully configured. In Figure 2.23, erroneously formatting the disk LUN 3 from server 1 would destroy the data of the application that runs on server 3. In addition, some operating systems are very greedy: when booting up they try to claim every hard disk that carries the signature (label) of a foreign operating system.
Without LUN masking, therefore, the use of the hard disks must be very carefully configured in the operating systems of the participating servers. LUN masking brings order to this chaos by assigning the externally visible hard disks to servers. As a result, it limits the visibility of exported disks within the disk subsystem. Compared with the chaos of Figure 2.23, each server now sees only the hard disks that it actually requires. LUN masking thus acts as a filter between the exported hard disks and the accessing servers.
It is now no longer possible to destroy data that belongs to applications that run on another server. Configuration errors are still possible, but the consequences are no longer so devastating. Furthermore, configuration errors can now be traced more quickly, since the information is bundled within the disk subsystem instead of being distributed over all servers.
We differentiate between port-based LUN masking and server-based LUN masking.
Port-based LUN masking is the 'poor man's LUN masking'; it is found primarily in low-end disk subsystems. In port-based LUN masking the filter works only at the granularity of a port. This means that all servers connected to the disk subsystem via the same port see the same disks.
Server-based LUN masking offers more flexibility. In this approach every server sees only the hard disks assigned to it, regardless of which port it is connected via or which other servers are connected via the same port.
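
The difference between no masking, port-based masking and server-based masking can be illustrated with a small Python sketch. This is only a toy model: the class DiskSubsystem, the port and server names, and the LUN assignments are assumptions made up for illustration, not the configuration interface of any real disk subsystem.

```python
from dataclasses import dataclass, field


@dataclass
class DiskSubsystem:
    """Toy model of a disk subsystem that exports LUNs to servers."""
    exported_luns: set
    # Hypothetical masking tables; real subsystems use their own formats.
    port_masks: dict = field(default_factory=dict)    # port name -> allowed LUNs
    server_masks: dict = field(default_factory=dict)  # server name -> allowed LUNs

    def visible_without_masking(self):
        # No filter: every server sees every exported LUN.
        return sorted(self.exported_luns)

    def visible_port_based(self, port):
        # Filter at port granularity: all servers on a port see the same LUNs.
        return sorted(self.exported_luns & self.port_masks.get(port, set()))

    def visible_server_based(self, server):
        # Filter per server, independent of the port the server is attached to.
        return sorted(self.exported_luns & self.server_masks.get(server, set()))


# Assumed example layout: server1 and server2 share port_a, server3 uses port_b.
subsystem = DiskSubsystem(
    exported_luns={0, 1, 2, 3},
    port_masks={"port_a": {0, 1, 2}, "port_b": {3}},
    server_masks={"server1": {0}, "server2": {1, 2}, "server3": {3}},
)

print(subsystem.visible_without_masking())         # all exported LUNs
print(subsystem.visible_port_based("port_a"))      # shared by server1 and server2
print(subsystem.visible_server_based("server1"))   # only server1's own LUN
```

In this sketch, port-based masking still lets server1 see the LUNs of server2 because both sit on the same port, whereas server-based masking separates them. A server with no entry in the table sees nothing, which is a design choice of the sketch rather than a statement about real products.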
AVAILABILITY OF DISK SUBSYSTEMS
Disk subsystems are assembled from standard components, which have a limited fault-tolerance. In this chapter we have shown how these standard components are combined in order to achieve a level of fault-tolerance for the entire disk subsystem that lies significantly above the fault-tolerance of the individual components. Today, disk subsystems can be constructed so that they can withstand the failure of any component without data being lost or becoming inaccessible. We can also say that such disk subsystems have no 'single point of failure'.
The following list describes the individual measures that can be taken to increase the availability of data:
• The data is distributed over several hard disks using RAID processes and supplemented by further data for error correction. After the failure of a physical hard disk, the data of the defective hard disk can be reconstructed from the remaining data and the additional data.
• Individual hard disks store the data using the so-called Hamming code. The Hamming code allows data to be correctly restored even if individual bits are changed on the hard disk. Self-diagnosis functions in the disk controller continuously monitor the rate of bit errors and the physical variables (temperature sensors, spindle vibration sensors). In the event of an increase in the error rate, hard disks can be replaced before data is lost.
• Each internal physical hard disk can be connected to the controller via two internal I/O channels. If one of the two channels fails, the other can still be used.
• The controller in the disk subsystem can be realized by several controller instances. If one of the controller instances fails, one of the remaining instances takes over the tasks of the defective instance.
• Other auxiliary components such as power supplies, batteries and fans can often be duplicated so that the failure of one of the components is unimportant. When connecting the power supply it should be ensured that the individual power cables are at least connected through different fuses. Ideally, the individual power cables would be supplied via different external power networks; however, in practice this is seldom realizable.
• Server and disk subsystem are connected together via several I/O channels. If one of the channels fails, the remaining ones can still be used.
• Instant copies can be used to protect against logical errors. For example, it would be possible to create an instant copy of a database every hour. If a table is 'accidentally' deleted, then the database could revert to the last instant copy in which the database is still complete.
• Remote mirroring protects against physical damage. If, for whatever reason, the original data can no longer be accessed, operation can continue using the data copy that was generated using remote mirroring.
This list shows that disk subsystems can guarantee the availability of data to a very high degree. Despite everything, in practice it is sometimes necessary to shut down and switch off a disk subsystem. In such cases, it can be very tiresome to coordinate all project groups on a common maintenance window, especially if they are distributed over different time zones.
Further important factors for the availability of an entire IT system are the availability of the applications or the application server itself and the availability of the connection between application servers and disk subsystems. Chapter 6 shows how multipathing can improve the connection between servers and storage systems and how clustering can increase the fault-tolerance of applications.
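
To make the Hamming code item in the list above more concrete, here is a minimal Python sketch of the classic Hamming(7,4) code, which protects four data bits with three parity bits and can correct any single flipped bit. It is an illustration of the principle only: the function names are made up for this example, and the error-correcting codes actually used inside a given hard disk are an implementation detail of that disk and may well differ.

```python
def hamming74_encode(data):
    """Encode 4 data bits as a 7-bit codeword (positions 1..7, parity at 1, 2, 4)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4   # covers positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # covers positions 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Return (data_bits, syndrome); a non-zero syndrome is the corrected position."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4
    if syndrome:
        # The syndrome is the 1-based position of the flipped bit; flip it back.
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], syndrome


original = [1, 0, 1, 1]
stored = hamming74_encode(original)

damaged = list(stored)
damaged[4] ^= 1  # simulate a single bit changing on the disk (position 5)

recovered, position = hamming74_decode(damaged)
print(recovered == original, position)
```

A code of this kind corrects only a single flipped bit per codeword; two or more errors in the same codeword would be miscorrected, which is one reason why monitoring the bit error rate, as described in the list, remains useful.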