Free Tutorials on Storage Virtualization
Although the cost of storage has fallen considerably in recent years, the need for storage has at the same time risen so immensely that we can observe a real data explosion. The administrative costs associated with these quantities of data should not, however, increase to the same degree. The introduction of storage networks is a first step towards remedying the disadvantages of the server-centric IT architecture (Section 1.1). Whereas in smaller environments the use of storage networks is completely adequate for the mastery of data, practical experience has shown that, in large environments, a storage network alone is not sufficient to efficiently manage the ever-increasing volumes of data.

In this chapter we introduce storage virtualization in the storage network, an approach that has the potential to get to grips with the management of large quantities of data. The basic idea behind storage virtualization is to move the virtualization functions from the servers (volume manager, file systems) and disk subsystems (caching, RAID, instant copy, remote mirroring, LUN masking) into the storage network (Figure 5.1). This creates a new virtualization entity which, as a result of its central position in the storage network, spans all servers and storage systems and can thus centrally manage all available storage resources. This new virtualization in the storage network permits the full utilization of the potential of a storage network with regard to the efficient use of resources and data, the improvement of performance and protection against failures.

As an introduction to storage virtualization we first recapitulate the I/O path from the disk to the main memory: Section 5.1 contrasts the virtualization variants discussed so far once again. Then we describe the difficulties relating to storage administration and the requirements of data and data users that occur in a storage network, for which storage virtualization aims to provide a solution (Section 5.2). We then define the term 'storage virtualization' and consider the concept of storage virtualization in more detail (Section 5.3). We will see that storage virtualization requires a virtualization entity. The requirements for this virtualization entity are defined and some implementation considerations investigated (Section 5.4). Then we consider the two different forms of virtualization (on block and file level) (Section 5.5), before going on to consider on which different levels a virtualization entity can be positioned in the storage network (server, storage device or network) and the advantages and disadvantages of each (Section 5.6). We back this up by revisiting some examples of virtualization methods that have already been discussed. Finally, we introduce two new virtualization approaches – symmetric and asymmetric storage virtualization – in which the virtualization entity is positioned in the storage network (Section 5.7).

5.1 ONCE AGAIN: VIRTUALIZATION IN THE I/O PATH

The structure of Chapters 2, 3 and 4 was based upon the I/O path from the hard disk to the main memory (Figure 1.7). Consequently, several sections of these chapters discuss different aspects of virtualization. This section consolidates the various realization locations for storage virtualization that we have presented so far. After that we move on to virtualization inside the storage network.

Virtualization is the name given to functions such as RAID, caching, instant copies and remote mirroring (a brief code sketch of the underlying idea follows the list below). The objectives of virtualization are:
• improvement of availability (fault-tolerance)
• improvement of performance
• improvement of scalability
• improvement of maintainability.
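All of these functions rest on the same principle: a mapping layer translates the logical view of storage that the servers see into the physical resources that actually hold the data. The following Python sketch (hypothetical class names, heavily simplified) illustrates this idea at block level; a real virtualization entity would additionally handle caching, redundancy and failover.

```python
class PhysicalDisk:
    """A physical disk modelled as a flat array of fixed-size blocks."""
    def __init__(self, name, num_blocks):
        self.name = name
        self.blocks = [None] * num_blocks

class VirtualVolume:
    """Maps logical block addresses onto (disk, physical block) pairs.

    Servers address only the logical block space; the mapping table can
    be changed (e.g. for data migration) without the servers noticing.
    """
    def __init__(self):
        self.mapping = {}  # logical block -> (PhysicalDisk, physical block)

    def map_block(self, logical, disk, physical):
        self.mapping[logical] = (disk, physical)

    def write(self, logical, data):
        disk, physical = self.mapping[logical]
        disk.blocks[physical] = data

    def read(self, logical):
        disk, physical = self.mapping[logical]
        return disk.blocks[physical]

# Usage: one logical volume spread over two physical disks.
d1, d2 = PhysicalDisk("disk1", 100), PhysicalDisk("disk2", 100)
vol = VirtualVolume()
vol.map_block(0, d1, 42)   # logical block 0 lives on disk1
vol.map_block(1, d2, 7)    # logical block 1 lives on disk2
vol.write(0, b"hello")
assert vol.read(0) == b"hello"
```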
At various points in the previous chapters we encountered virtualization functions. Figure 5.2 illustrates the I/O path from the CPU to the storage system and shows at which points of the I/O path virtualization is realized. We have already discussed in detail virtualization within a disk subsystem (Chapter 2) and, based upon the example of volume manager and file system, in the main memory and CPU (Chapter 4). The host bus adapter and the storage network itself should be mentioned as further possible realization locations for virtualization functions.

Virtualization in the disk subsystem has the advantage that tasks are moved from the computer to the disk subsystem, thus freeing up the computer. The functions are realized at the point where the data is stored: at the hard disks. Measures such as mirroring (RAID 1) and instant copies load only the disk subsystem itself (Figure 5.3). This additional cost is not even visible on the I/O channel between computer and disk subsystem. The communication between servers and other devices on the same I/O bus is thus not impaired.

Virtualization in the storage network has the advantage that the capacity of all available storage resources (e.g. disks and tapes) is centrally managed (Figure 5.4). This reduces the costs for the management of and access to storage resources and permits a more efficient utilization of the available hardware. For example, a cache server installed in the storage network can serve various disk subsystems. Depending upon the load on the individual disk subsystems, sometimes one and sometimes the other requires more cache. If virtualization is realized only within the disk subsystems, the cache of one disk subsystem that currently has a lower load cannot be used to support a different disk subsystem operating at a higher load. A further advantage of virtualization within the storage network is that functions such as caching, instant copy and remote mirroring can be used even with cheaper disk subsystems (JBODs, RAID arrays).
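The benefit of pooling cache in the storage network can be made concrete with a small sketch. The Python fragment below is illustrative only (the names SharedCache and DiskSubsystem are invented for this example): a single LRU cache sits in front of two disk subsystems, and because the capacity is shared, the subsystem under the higher load automatically ends up occupying more of the cache, which separate per-subsystem caches cannot achieve.

```python
from collections import OrderedDict

class DiskSubsystem:
    """Stand-in for a disk subsystem; read() represents a slow disk access."""
    def __init__(self, name):
        self.name = name
    def read(self, block):
        return f"{self.name}:{block}"   # dummy block contents

class SharedCache:
    """A single LRU cache serving several disk subsystems.

    Because the capacity is pooled, the busier subsystem naturally
    holds more of the cache entries at any given moment.
    """
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()    # (subsystem name, block) -> data

    def read(self, subsystem, block):
        key = (subsystem.name, block)
        if key in self.entries:
            self.entries.move_to_end(key)       # hit: mark most recently used
            return self.entries[key]
        data = subsystem.read(block)            # miss: fetch from the subsystem
        self.entries[key] = data
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)    # evict least recently used
        return data

cache = SharedCache(capacity=1000)
a, b = DiskSubsystem("A"), DiskSubsystem("B")
cache.read(a, 17)   # if A is busier, more of the 1000 slots fill with A's blocks
cache.read(b, 17)
```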
The I/O card provides the option of realizing RAID between the server and the disk subsystem (Figure 5.5). As a result, virtualization takes place between the I/O bus and the host I/O bus. This frees up the computer just like virtualization in the disk subsystem; however, for many operations the I/O buses between computer and disk subsystem are more heavily loaded.

Virtualization in the main memory can take place either within the operating system in the volume manager or in low-level applications such as file systems or databases (Figure 5.6). Like all virtualization locations described previously, virtualization in the volume manager takes place at block level; the structure of the data is not known. However, with virtualization in the volume manager the cost of the virtualization is fully passed on to the computer: internal and external buses in particular are now more heavily loaded. On the other hand, the CPU load for volume manager mirroring can generally be disregarded.

The alternative approach of realizing copying functions such as remote mirroring and instant copy in system-near applications such as file systems and databases is of interest. The applications know the structure of the data and can therefore sometimes perform the copying functions significantly more efficiently than when the structure is not known.
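Instant copy has come up several times now; a common block-level realization is copy-on-write, which the following simplified Python sketch demonstrates (invented names, no error handling): taking the copy is instantaneous, and old block contents are preserved lazily, only when a block is overwritten for the first time afterwards. A file system, knowing which blocks belong to which files, can restrict such copying to the data that actually matters, which is the efficiency advantage mentioned above.

```python
class Volume:
    """Source volume with a copy-on-write instant copy.

    take_snapshot() returns immediately; no data is copied up front.
    Old contents are saved only when a block is first overwritten.
    """
    def __init__(self, num_blocks):
        self.blocks = [b""] * num_blocks
        self.snapshot = None   # block -> content at snapshot time

    def take_snapshot(self):
        self.snapshot = {}     # instant: just an empty change map

    def write(self, block, data):
        if self.snapshot is not None and block not in self.snapshot:
            self.snapshot[block] = self.blocks[block]  # preserve old data once
        self.blocks[block] = data

    def read_snapshot(self, block):
        if block in self.snapshot:
            return self.snapshot[block]    # block changed since snapshot
        return self.blocks[block]          # block unchanged: share the data

vol = Volume(8)
vol.write(0, b"v1")
vol.take_snapshot()            # returns immediately
vol.write(0, b"v2")            # first write after snapshot saves b"v1"
assert vol.read_snapshot(0) == b"v1"
assert vol.blocks[0] == b"v2"
```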
It is not possible to give any general recommendation regarding the best location for the realization of virtualization. Instead it is necessary to consider the requirements of resource consumption, fault-tolerance, performance, scalability and ease of administration for the specific individual case when selecting the realization location. From the point of view of the resource load on the server it is beneficial to realize the virtualization as close as possible to the hard disk. However, to increase performance when performance requirements are high, it is necessary to virtualize within the main memory: only thus can the load be divided amongst several host bus adapters and host I/O buses. Virtualization in the volume manager is also beneficial from the point of view of fault-tolerance (Section 6.3.3). As is the case with many design decisions, the requirements of simple maintainability and high performance are in conflict in the selection of the virtualization location.

In Section 2.5.3 we discussed how both the performance and the fault-tolerance of disk subsystems can be increased by the combination of RAID 0 (striping) and RAID 1 (mirroring), with the striping of mirrored disks being more beneficial in terms of fault-tolerance and maintainability than the mirroring of striped disks. However, things are different if data is distributed over several disk subsystems. In terms of high fault-tolerance and simple maintainability it is often better to mirror the blocks in the volume manager and then stripe them within the disk subsystems by means of RAID 0 or RAID 5 (Figure 5.7). This has the advantage that the application can be kept in operation if an entire disk subsystem fails. In applications with very high performance requirements for write throughput, it is not always feasible to mirror in the volume manager because the blocks have to be transferred through the host I/O bus twice, as shown in Figure 5.7. If the host I/O bus is the performance bottleneck, then it is better to stripe the blocks in the volume manager and only mirror them within the disk subsystems (Figure 5.8). The number of blocks transferred over the host I/O bus can thus be halved at the expense of fault-tolerance.
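The halving can be checked with simple arithmetic. The sketch below (a hypothetical helper, counting block transfers only) compares the two layouts: mirroring in the volume manager sends every block to two disk subsystems and therefore across the host I/O bus twice, whereas striping in the volume manager with subsystem-internal mirroring sends each block across the bus once.

```python
def host_bus_transfers(num_blocks, mirror_in_volume_manager):
    """Blocks crossing the host I/O bus for one logical write.

    Mirroring in the volume manager (Figure 5.7): every block is sent
    to two disk subsystems, so it crosses the host I/O bus twice.
    Striping in the volume manager with mirroring inside the disk
    subsystems (Figure 5.8): each block crosses the bus only once.
    """
    return num_blocks * (2 if mirror_in_volume_manager else 1)

assert host_bus_transfers(100, mirror_in_volume_manager=True) == 200
assert host_bus_transfers(100, mirror_in_volume_manager=False) == 100
```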