Distributed File Systems: An Overview

A brief overview of the various types of distributed filesystems.


  1. Distributed File Systems: An Overview
     Anant Narayanan, Cluster and Grid Computing, Vrije Universiteit, 11 March 2009
  2. Outline
     • Introduction
     • Classification: Storage, Fault Tolerance, Applications
     • Lustre: Overview, Architecture, Implementation
  3. Definition
     • A distributed file system allows access to files located on a remote host
     • Access is transparent, as though the client were working directly on the host
     • Typically, clients do not have access to the underlying block storage
     • Clients and servers interact over the network using a protocol
  4. Why do we need them?
     • Distributed applications usually require a common data store
     • A common store makes it easier to keep data consistent
     • Access control is possible on both the server and the client, depending on how the protocol is designed
     • Allows for implementation of replication and fault tolerance
  5. Storage: Block Oriented
     • The "usual" meaning of a file system
     • Deals with storing data on a block basis
     • Most distributed file systems are based on this at the lowest level
     • Examples: ext3, NTFS, HFS+
  6. Storage: Record Oriented
     • Were used on mainframes and minicomputers
     • Fetch and put whole records, and seek to record boundaries
     • Have a lot in common with today's databases
     • Examples: Files-11, Virtual Storage Access Method (VSAM)
  7. Storage: Object Oriented
     • Splits file metadata from file data
     • File data is further split into objects
     • Objects are stored on object storage servers
     • May or may not have a block-oriented file system at the lowest layer
     • Examples: Lustre, XtreemFS
  8. Fault Tolerance: High Availability
     • Replication
     • Parallel striping
     • Examples: Btrfs, Coda, GlusterFS
  9. Applications: Clusters
     • Shared disk systems (GFS)
     • Distributed disk systems (Lustre)
     • Typically used on local networks
     • Fast network access
  10. Applications: Grids or Clouds
      • Dynamic nature
      • Must deal with heterogeneity
      • Must deal with virtual organizations (VOs) in grids and service-level agreements (SLAs) in clouds
      • Examples: XtreemFS, Dynamo
  11. Lustre: Overview
      • The name combines "Linux" and "Cluster"
      • Distributed, parallel, fault-tolerant, object-based file system
      • Tens of thousands of nodes
      • Petabytes of storage capacity
      • Hundreds of gigabytes per second of throughput
      • Without compromising on speed or security
      • Scales from small workgroup clusters, to large-scale multi-site clusters, to supercomputers
  12. Architecture: Layout
      Lustre clusters contain three kinds of systems:
      • File system clients, which can be used to access the file system
      • Object storage servers (OSS), which provide file I/O service
      • Metadata servers (MDS), which manage the names and directories in the file system
      [Figure 1. Systems in a Lustre cluster. Labels: pool of clustered MDS servers (1-100), with MDS 1 (active) and MDS 2 (standby) and MDS disk storage containing metadata targets (MDT); OSS servers (1-1000s) with object storage targets (OST) on commodity storage or on enterprise-class storage arrays and SAN fabric, where shared storage enables OSS failover; Lustre clients (1-100,000); simultaneous support of multiple network types (Elan, Myrinet, InfiniBand, GigE) through a router.]
  13. Architecture: Components
      • File system clients: used to access the file system
      • Object storage servers (OSS): provide file I/O service and deal with block storage
      • Metadata servers (MDS): manage the names and directories in the file system and deal with authentication
  14. Architecture: Characteristics
      The following table shows the characteristics associated with each of the three types of systems.

      | System  | Typical number of systems | Performance                    | Required attached storage      | Desirable hardware characteristics   |
      |---------|---------------------------|--------------------------------|--------------------------------|--------------------------------------|
      | Clients | 1–100,000                 | 1 GB/sec I/O, 1000 metadata ops | None                           | None                                 |
      | OSS     | 1–1000                    | 500 MB/sec–2.5 GB/sec          | File system capacity/OSS count | Good bus bandwidth                   |
      | MDS     | 2 (in the future 2–100)   | 3000–15,000 metadata ops/sec   | 1–2% of file system capacity   | Adequate CPU power, plenty of memory |

      The storage attached to the servers is partitioned, optionally organized with logical volume management (LVM), and formatted as file systems. The Lustre OSS and MDS servers read, write, and modify data in the format imposed by these file systems.
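The storage column of the table above is simple arithmetic, which a short sketch can make concrete. The function name and the example cluster (1000 TB across 50 OSS) are hypothetical; only the two rules, capacity divided by OSS count and 1–2% of capacity for the MDS, come from the table.

```python
def size_lustre_storage(capacity_tb, oss_count, mdt_fraction=0.02):
    """Return (storage per OSS in TB, MDS storage in TB).

    Hypothetical helper: it applies the two rules of thumb from the
    characteristics table and nothing else.
    """
    if oss_count < 1:
        raise ValueError("at least one OSS is required")
    if not 0.01 <= mdt_fraction <= 0.02:
        raise ValueError("the table suggests 1-2% of capacity for the MDS")
    per_oss_tb = capacity_tb / oss_count   # file system capacity / OSS count
    mdt_tb = capacity_tb * mdt_fraction    # 1-2% of file system capacity
    return per_oss_tb, mdt_tb


if __name__ == "__main__":
    per_oss, mdt = size_lustre_storage(capacity_tb=1000, oss_count=50)
    print(f"storage per OSS: {per_oss:.1f} TB, MDS storage: {mdt:.1f} TB")
```

Real deployments would also account for failover pairs and file-count-driven metadata growth, which this sketch ignores.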
  15. Architecture: Heterogeneous?
      • MDS and OSS may store the actual data on ext3 or ZFS block file systems
      • InfiniBand, TCP/IP over Ethernet, and Myrinet are supported network types
      • Multiple CPU architectures: x86, x86_64, PPC
      • Requires a patched Linux kernel
  16. Implementation: Setup
      MDS:
          mkfs.lustre --mdt --mgs --fsname=large-fs /dev/sda
          mount -t lustre /dev/sda /mnt/mdt
      OSS1:
          mkfs.lustre --ost --fsname=large-fs --mgsnode=mds@tcp0 /dev/sdb
          mount -t lustre /dev/sdb /mnt/ost1
      Client:
          mount -t lustre mds.your.org:/large-fs /mnt/lustre-client
  17. Implementation: Networking
      • LNET abstracts over multiple supported networks
      • Provides the communication infrastructure required by Lustre
      • Takes care of abstracting over fail-over servers and load balancing
      • Provides support for Remote Direct Memory Access (RDMA)
      • Provides an end-to-end throughput of 100 MB per second on Gigabit Ethernet networks
      • Up to 1.5 GB per second on InfiniBand
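To put the two throughput figures above in perspective, the sketch below estimates how long a bulk transfer would take on each network. The constant and function names and the 1 TB example are assumptions for illustration; the rates are the ones quoted on the slide, and the estimate ignores protocol overhead and contention.

```python
# End-to-end rates quoted for LNET, in bytes per second (decimal units).
GIGE_BYTES_PER_SEC = 100e6        # 100 MB/s on Gigabit Ethernet
INFINIBAND_BYTES_PER_SEC = 1.5e9  # up to 1.5 GB/s on InfiniBand


def transfer_seconds(size_bytes, bytes_per_sec):
    """Idealized transfer time: size divided by sustained throughput."""
    if bytes_per_sec <= 0:
        raise ValueError("throughput must be positive")
    return size_bytes / bytes_per_sec


if __name__ == "__main__":
    one_tb = 1e12  # hypothetical 1 TB data set
    print(f"GigE:       {transfer_seconds(one_tb, GIGE_BYTES_PER_SEC):.0f} s")
    print(f"InfiniBand: {transfer_seconds(one_tb, INFINIBAND_BYTES_PER_SEC):.0f} s")
```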
  18. Implementation: Where are the files?
      Traditional UNIX disk file systems use inodes, which contain lists of block numbers where the file data for the inode is stored. Similarly, one inode exists on the MDT for each file in the Lustre file system. However, in the Lustre file system, the inode on the MDT does not point to data blocks, but instead points to one or more objects associated with the files. This is illustrated in Figure 5.
      [Figure 5. MDS inodes point to objects; ext3 inodes point to data. Left: a file on the MDT whose inode holds extended attributes referencing obj1 on oss1, obj2 on oss2, and obj3 on oss3. Right: an ordinary ext3 file whose inode holds data block pointers (direct, indirect, double indirect) to data blocks.]
      These objects are implemented as files on the OST file systems and contain file data. Figure 6 shows how a file open operation transfers the object pointers from the MDS ...
  19. Implementation: Striping and Replication
      • One object per MDS inode implies "unstriped" data
      • Multiple objects per MDS inode imply that the file has been split, similar to RAID 0
      • These stripes may be duplicated across several OSS
      • Duplication provides fault tolerance and high availability
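The RAID 0 comparison above can be sketched as a mapping from a byte offset in the file to one of the file's objects. This is a minimal illustration of round-robin striping in general, not Lustre's own layout code; the names `StripeLocation` and `locate` and the 1 MiB stripe size are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StripeLocation:
    object_index: int   # which of the file's objects holds the byte
    object_offset: int  # byte offset inside that object


def locate(file_offset, stripe_size, stripe_count):
    """Map a file offset to (object, offset) under round-robin striping.

    Hypothetical helper: stripes are dealt to objects in turn, as in RAID 0.
    With stripe_count == 1 the file is "unstriped" and maps to one object.
    """
    if file_offset < 0 or stripe_size <= 0 or stripe_count <= 0:
        raise ValueError("offset must be >= 0; size and count must be > 0")
    stripe_number = file_offset // stripe_size     # index of the stripe in the file
    object_index = stripe_number % stripe_count    # round-robin over the objects
    stripes_before = stripe_number // stripe_count # earlier stripes in this object
    object_offset = stripes_before * stripe_size + file_offset % stripe_size
    return StripeLocation(object_index, object_offset)


if __name__ == "__main__":
    MIB = 1024 * 1024  # assumed stripe size for the example
    for offset in (0, MIB, 3 * MIB + 5):
        print(offset, locate(offset, stripe_size=MIB, stripe_count=3))
```

Because consecutive stripes land on different objects, and the objects may live on different OSS, large sequential reads and writes can proceed in parallel.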
  20. Conclusion
      • Reliable, scalable, and performant file system
      • Open architecture and protocols
      BUT
      • Does not handle dynamic addition or removal of servers
      • Does not provide the kind of access control and security that a VO might need
  21. What next?
      • Other distributed file systems and implementation details
      • Grids (XtreemFS), Clouds (Dynamo)
      • Questions?
  22. References
      You may be required to register on the Sun website to access these documents!
      • Datasheet: http://www.sun.com/software/products/lustre/datasheet.pdf
      • Scalable Cluster Filesystem Whitepaper: http://www.sun.com/offers/docs/LustreFileSystem.pdf
      • LNET: http://www.sun.com/offers/docs/lustre_networking.pdf
      • Lustre Documentation Index: http://manual.lustre.org/index.php?title=Main_Page
      • Lustre Publications Index: http://wiki.lustre.org/index.php?title=Lustre_Publications