Friday, March 28, 2008

SHARED DISK FILE SYSTEMS and Case study: the General Parallel File System (GPFS)


The greatest performance limitation of NAS servers and self-configured file servers is that each file must pass twice through the internal buses of the file server before it arrives at the computer where it is required (Figure 4.7). Even DAFS and its alternatives, such as NFS over RDMA, cannot get around this 'eye of the needle'. With storage networks it is possible for several computers to access a storage device simultaneously. The I/O bottleneck in the file server can therefore be circumvented if all clients fetch the files from the disk directly via the storage network (Figure 4.11). The difficulty here: today's file systems treat their storage devices as local. They concentrate upon caching and the aggregation of I/O operations; they increase performance by reducing the number of disk accesses needed.

So-called shared disk file systems can deal with this problem. They incorporate special algorithms that synchronize the simultaneous accesses of several computers to common disks. As a result, shared disk file systems make it possible for several computers to access files simultaneously without causing version conflicts. To achieve this, shared disk file systems must synchronize write accesses in addition to providing the functions of local file systems. As in local file systems, it must be ensured that new files are written to different areas of the hard disk. It must also be ensured that stale cache entries are marked as invalid. Let us assume that two computers each have a file in their local cache and one of the computers changes the file. If the second computer subsequently reads the file again, it must not take the now invalid copy from the cache.

The great advantage of shared disk file systems is that the computers accessing files and the storage devices in question communicate with each other directly. The diversion via a central file server, which represents the bottleneck in conventional network file systems and also in DAFS and RDMA-enabled NFS, is no longer necessary. In addition, the load on the CPU of the accessing machine is reduced, because communication via Fibre Channel places less of a load on the processor than communication via IP and Ethernet. For sequential access to large files this can more than make up for the extra cost of access synchronization. On the other hand, for applications with many small files, or with many random accesses within the same file, it should be checked whether the use of a shared disk file system is really worthwhile.

One side-effect of file sharing over the storage network is that the availability of the shared disk file system can be better than that of conventional network file systems, because a central file server is no longer needed. If one machine in the shared disk file system cluster fails, the other machines can carry on working. The availability of the underlying storage devices therefore largely determines the availability of shared disk file systems.

 Case study: the General Parallel File System (GPFS)

We have decided at this point to introduce a product of our employer, IBM, for once. The General Parallel File System (GPFS) is a shared disk file system that has for many years been used on cluster computers of the type RS/6000 SP (currently IBM eServer Cluster 1600). We believe that this section on GPFS illustrates the requirements of a shared disk file system very nicely. The reason for introducing GPFS at this point is quite simply that it is the shared disk file system that we know best.

The RS/6000 SP is a cluster computer. It was, for example, used for Deep Blue, the computer that beat the chess champion Garry Kasparov. An RS/6000 SP consists of up to 512 conventional AIX computers that can also be connected together via a so-called high performance switch (HPS). The individual computers of an RS/6000 SP are also called nodes.

GPFS was originally based upon so-called Virtual Shared Disks (Figure 4.12). The VSD subsystem makes hard disks that are physically connected to one computer visible to the other nodes of the SP, so that several nodes can access the same physical hard disk. The VSD subsystem ensures consistency at block level, which means that a block is either written completely or not written at all. From today's perspective we could say that VSDs emulate the function of a storage network. In more recent versions of GPFS the VSD layer can be replaced by an SSA SAN or a Fibre Channel SAN.
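
The block-level contract can be illustrated in a few lines of Python. This is a deliberately simplified, hypothetical sketch, not the actual VSD implementation: it keeps the "disk" in memory, serializes access with a lock, and rejects partial writes as a crude stand-in for the rule that a block is written completely or not at all.

```python
# Hypothetical sketch of a VSD-like block store with all-or-nothing
# writes per block. The class name, dict-based storage and fixed block
# size are assumptions made for this example.

import threading

BLOCK_SIZE = 4096

class VirtualSharedDisk:
    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]
        self.lock = threading.Lock()

    def read_block(self, idx):
        with self.lock:
            return self.blocks[idx]

    def write_block(self, idx, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("partial block writes are rejected: a block "
                             "is written completely or not at all")
        with self.lock:
            self.blocks[idx] = bytes(data)   # the block is replaced in one step

disk = VirtualSharedDisk(num_blocks=128)
disk.write_block(0, b"\x01" * BLOCK_SIZE)    # accepted: a complete block
# disk.write_block(1, b"partial")            # would raise ValueError
```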

GPFS uses the VSDs to ensure the consistency of the file system, i.e. to ensure that the metadata structure of the file system is maintained; for example, no file name is allocated twice. Furthermore, GPFS implements some RAID functions, such as the striping and mirroring of data and metadata.
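
As a rough illustration of striping, the following hypothetical sketch maps the logical blocks of a file round-robin across several disks, so that a large sequential transfer is spread over all disks and host bus adapters at once. Real GPFS striping is considerably more sophisticated (configurable block sizes, failure groups); this only shows the basic RAID 0 idea.

```python
# RAID 0 style placement: logical block i of a file lands on disk
# (i % num_disks), at stripe position (i // num_disks) on that disk.

def stripe_location(logical_block, num_disks):
    disk = logical_block % num_disks        # round-robin across disks
    offset = logical_block // num_disks     # stripe index on that disk
    return disk, offset

# A 10-block file over 4 disks: blocks 0..3 hit disks 0..3 in parallel.
for block in range(10):
    print(block, stripe_location(block, 4))
```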

Figure 4.12 illustrates two benefits of shared disk file systems. First, they can use RAID 0 to stripe the data over several hard disks, host bus adapters and even disk subsystems, which means that shared disk file systems can achieve a very high throughput. All applications that have an at least partially sequential access pattern profit from this. Second, the location of the application becomes independent of the location of the data. In Figure 4.12 the system administrator can start applications on the four GPFS nodes that have the most resources (CPU, main memory, buses) available at the time. A so-called workload manager can move applications from one node to another depending upon load. In conventional file systems this is not possible: applications have to run on the nodes on which the file system is mounted, since access via a network file system such as NFS or CIFS is generally too slow.

The unusual thing about GPFS is that there is no individual file server. Each node in the GPFS cluster can mount a GPFS file system. For end users and applications a GPFS file system behaves, apart from its significantly better performance, like a conventional local file system.

GPFS introduces the so-called node set as an additional management unit. Several node sets can exist within a GPFS cluster, with a single node only ever being able to belong to at most one node set (Figure 4.13). GPFS file systems are only ever visible within a node set, and several GPFS file systems can be active in every node set.
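
These membership rules can be captured in a small data-model sketch. Everything here (class and method names, string identifiers) is invented for illustration; it merely encodes the two constraints just described: a node belongs to at most one node set, and a file system is visible only within its node set.

```python
# Illustrative data model for GPFS node sets, not a real GPFS API.

class GPFSCluster:
    def __init__(self):
        self.node_to_set = {}       # node name -> node set name
        self.fs_to_set = {}         # file system name -> node set name

    def add_node(self, node, node_set):
        if node in self.node_to_set:
            raise ValueError(f"{node} already belongs to a node set")
        self.node_to_set[node] = node_set

    def create_fs(self, fs, node_set):
        self.fs_to_set[fs] = node_set   # several file systems per set are fine

    def can_mount(self, node, fs):
        return self.node_to_set.get(node) == self.fs_to_set.get(fs)

cluster = GPFSCluster()
cluster.add_node("node1", "setA")
cluster.add_node("node2", "setB")
cluster.create_fs("/gpfs/data", "setA")
print(cluster.can_mount("node1", "/gpfs/data"))  # True
print(cluster.can_mount("node2", "/gpfs/data"))  # False: different node set
```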

The GPFS Daemon must run on every node in the GPFS cluster. GPFS is realized as a distributed application, with all nodes in a GPFS cluster having the same rights and duties. In addition, depending upon the configuration of the GPFS cluster, the GPFS Daemon must take on further administrative functions over and above the normal tasks of a file system:

• Configuration Manager

In every node set one GPFS Daemon takes on the role of the Configuration Manager. The Configuration Manager determines the File System Manager for every file system and monitors the so-called quorum. The quorum is a common procedure in distributed systems for maintaining the consistency of a distributed application in the event of a network split. For GPFS, more than half of the nodes of a node set must be active. If the quorum is lost in a node set, the GPFS file system is automatically deactivated (unmounted) on all nodes of the node set. A minimal quorum check is sketched after this list.

• File System Manager

Every file system has its own File System Manager. Its tasks include the following:

– configuration changes of the file system;
– management of the hard disk blocks;
– token administration;
– management and monitoring of quotas; and
– security services.

Token administration is particularly worth highlighting. One of the design objectives of GPFS is the support of parallel applications that read and modify common files from different nodes. Like every file system, GPFS buffers files or file fragments in order to increase performance. GPFS uses a token mechanism to synchronize the cache entries on the various computers in the event of parallel write and read accesses (Figure 4.14). However, this synchronization only ensures that GPFS behaves precisely in the same way as a local file system that is mounted on just one computer. This means that in GPFS, as in every file system, parallel applications still have to synchronize their accesses to common files themselves, for example by means of locks.

• Metadata Manager

Finally, one GPFS Daemon takes on the role of the Metadata Manager for every open file. GPFS guarantees the consistency of the metadata of a file by allowing only the Metadata Manager to change a file's metadata. Generally, the GPFS Daemon of the node on which the file has been open for the longest is the Metadata Manager for the file. The assignment of the Metadata Manager of a file to a node can change depending on the access behaviour of the applications.
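
The quorum rule ("more than half of the nodes of a node set must be active") is simple to state in code. The following check is illustrative only, not GPFS's implementation; in reality the daemons determine activity by exchanging heartbeats, whereas here the counts are simply passed in.

```python
# Illustrative quorum check. If quorum is lost, the file system is
# unmounted on all nodes of the node set.

def has_quorum(active_nodes: int, node_set_size: int) -> bool:
    """More than half of the nodes of the node set must be active."""
    return active_nodes > node_set_size // 2

# An 8-node node set split 4/4 by a network failure: neither half has
# quorum, so neither half may keep the file system mounted. This is
# what prevents the two halves from modifying the shared disks
# independently of each other.
print(has_quorum(5, 8))  # True
print(has_quorum(4, 8))  # False
```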

The example of GPFS shows that a shared disk file system has to achieve a great deal more than a conventional local file system, which is managed on only one computer. GPFS has been used successfully on the RS/6000 SP for some years. The complexity of shared disk file systems is illustrated by the fact that IBM is only gradually porting GPFS to other operating systems, such as Linux, which IBM supports strategically, and to new I/O technologies such as Fibre Channel.
