Distributed File Systems (DFS)
A distributed file system is any file system that allows access to files from multiple hosts shared via a
computer network. This makes it possible for multiple users on multiple machines to share files and
storage resources. Distributed file systems differ in their performance, mutability of content, handling of
concurrent writes, handling of permanent or temporary loss of nodes or storage, and their policy of storing
content.
Let us first understand what a Big Data project comprises, so that there is no doubt in differentiating
between a traditional project and a Big Data based project. Traditionally, projects have required an
RDBMS for managing data. A Big Data project differs from an RDBMS-based project primarily because of
the distributed nature of data management in Big Data systems.
During the analysis phase of a Big Data project, one of the most important decisions is the choice of
distributed storage system. Distributed storage is relatively easy to understand but complex to configure
and manage. Contrast this with standalone systems, where hard drives are locally attached to a machine
and accessed from that machine itself, whether it runs Windows, Linux or any other operating system.
Examples of locally managed data include a laptop's C: drive, a USB drive, SAN-connected storage, or the
internal drives of desktop and server class machines.
When choosing storage for a Big Data based system, several factors need to be analysed. Also, more than
one distributed file system may be incorporated.
Types of Distributed File Systems
There are several types of Distributed File Systems (DFSs), and many of them are put into production
depending on the use case. Some DFSs are more popular, while others have been applied to specific
applications. The following is a list of a few important ones:
 Ceph
 GlusterFS
 Lustre
 MooseFS
These have been chosen based on their popularity, use in production, and regular updates.
Distributed file systems differ in their performance, mutability of content, handling of concurrent writes,
handling of permanent or temporary loss of nodes or storage, and their policy of storing content. Let us
now see more details of each of these.
Ceph
Ceph is an open source distributed file system developed by Sage Weil.
Architecture
Ceph is a totally distributed system. Unlike HDFS, to ensure scalability Ceph provides dynamic
distributed metadata management using a metadata server cluster (MDS) and stores data and metadata in
Object Storage Devices (OSDs). MDSs manage the namespace, security and consistency of the
system and serve metadata queries, while OSDs perform I/O operations.
Cache consistency
Ceph is a near-POSIX system in which reads must reflect an up-to-date version of data and writes must
reflect a particular order of occurrences. It allows concurrent read and write access using a simple
locking mechanism called capabilities. These capabilities, granted by the MDSs, give clients the ability to
read, read and cache, write, or write and buffer data. Ceph also provides some tools to relax consistency.
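To make the idea of capabilities concrete, here is a minimal Python sketch of an MDS-style grant table; the class and capability names are hypothetical illustrations, not Ceph's actual implementation or API. A client may only cache reads or buffer writes when the corresponding capability has been granted.

```python
from enum import Flag, auto

class Cap(Flag):
    """Illustrative capability bits; names are hypothetical, not Ceph's."""
    READ = auto()
    CACHE = auto()      # may keep read data in a local cache
    WRITE = auto()
    BUFFER = auto()     # may buffer writes before flushing them

class MiniMDS:
    def __init__(self):
        self.grants = {}  # (client, path) -> Cap

    def grant(self, client, path, caps):
        self.grants[(client, path)] = caps

    def revoke(self, client, path, caps):
        self.grants[(client, path)] = self.grants.get((client, path), Cap(0)) & ~caps

    def allowed(self, client, path, caps):
        return caps in self.grants.get((client, path), Cap(0))

mds = MiniMDS()
mds.grant("client-1", "/data/file.txt", Cap.READ | Cap.CACHE)
print(mds.allowed("client-1", "/data/file.txt", Cap.CACHE))   # True: caching permitted
print(mds.allowed("client-1", "/data/file.txt", Cap.BUFFER))  # False: writes must go through
```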
Load Balancing
Ceph ensures load balancing at two levels: data and metadata. First, it maintains a counter for each
piece of metadata, which lets it track access frequency. Thus, Ceph is able to replicate or move the
most frequently accessed metadata from an overloaded MDS to another one. Second, Ceph uses
weighted devices. For example, assume two storage resources with weights of one and two respectively;
the second will store twice as much data as the first. Ceph monitors the disk space used by each OSD
and moves data to balance the system according to the weight associated with each OSD. For example, if
a new device is added, Ceph is able to move data from OSDs that have become unbalanced to this new
device.
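As an illustration of weight-based placement, the following Python sketch distributes objects across OSDs in proportion to their weights, so a device with weight 2 receives roughly twice as much data as one with weight 1. It is a simplified stand-in for Ceph's placement logic, not its real algorithm, and the OSD names and weights are made up.

```python
import hashlib

# Hypothetical OSD names and weights; the weight-2 device should hold ~2x the data.
osds = {"osd.0": 1, "osd.1": 2}

def place(object_name, osd_weights):
    """Map an object to an OSD, with probability proportional to weight."""
    total = sum(osd_weights.values())
    # Deterministic pseudo-random value in [0, total) derived from the object name.
    h = int(hashlib.md5(object_name.encode()).hexdigest(), 16) % total
    cursor = 0
    for osd, weight in sorted(osd_weights.items()):
        cursor += weight
        if h < cursor:
            return osd

counts = {o: 0 for o in osds}
for i in range(30000):
    counts[place(f"obj-{i}", osds)] += 1
print(counts)  # roughly 10000 vs 20000, matching the 1:2 weights
```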
Fault Detection
Ceph uses the same fault detection model as HDFS. OSDs periodically exchange heartbeats to detect
failures. Ceph provides a small cluster of monitors which holds a copy of the cluster map. Each OSD can
send a failure report to any monitor, and the unresponsive OSD is marked as down. Each OSD can also
query an up-to-date version of the cluster map from the cluster of monitors. Thus, a formerly failed
OSD can rejoin the system.
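The heartbeat-and-monitor pattern can be sketched as follows. This is a toy Python model under an assumed grace period, not Ceph's monitor protocol: OSDs report peers they cannot reach, and the monitor marks an OSD down once it is reported and its last heartbeat is stale.

```python
import time

HEARTBEAT_GRACE = 20.0  # seconds; assumed value for illustration only

class MiniMonitor:
    """Toy monitor holding a copy of the cluster map (OSD id -> 'up'/'down')."""
    def __init__(self, osd_ids):
        self.cluster_map = {osd: "up" for osd in osd_ids}
        self.last_heartbeat = {osd: time.time() for osd in osd_ids}

    def heartbeat(self, osd):
        self.last_heartbeat[osd] = time.time()
        self.cluster_map[osd] = "up"   # a formerly failed OSD can rejoin

    def report_failure(self, reporter, suspect):
        # Accept the report only if the suspect's heartbeat is older than the grace period.
        if time.time() - self.last_heartbeat[suspect] > HEARTBEAT_GRACE:
            self.cluster_map[suspect] = "down"

    def get_map(self):
        return dict(self.cluster_map)

mon = MiniMonitor(["osd.0", "osd.1", "osd.2"])
mon.last_heartbeat["osd.2"] -= 60          # simulate an OSD that went silent
mon.report_failure("osd.0", "osd.2")
print(mon.get_map())                        # osd.2 is marked 'down'
```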
Refer to the following URL for installing Ceph:
https://github.com/ceph/ceph
GlusterFS
GlusterFS is an open source distributed file system originally developed by the Gluster core team. It was
acquired by Red Hat in 2011 and is currently in development and production. GlusterFS is a scale-out
network-attached storage file system, and has found applications in cloud computing, streaming media
services, and content delivery networks.
Architecture
GlusterFS is different from the other DFSs. It has a client-server design in which there is no metadata
server. Instead, GlusterFS stores data and metadata on several devices attached to different servers.
The set of devices is called a volume which can be configured to stripe data into blocks and/or replicate
them. Thus, blocks will be distributed and/or replicated across several devices inside the volume.
Naming
GlusterFS does not manage metadata in a dedicated and centralised server; instead, it locates files
algorithmically using the Elastic Hashing Algorithm (EHA) to provide a global name space. EHA uses a
hash function to convert a file's pathname into a fixed-length, uniform and unique value. Each storage
device is assigned a set of values, allowing the system to store a file based on its value. For example,
assume there are two storage devices, disk1 and disk2, which respectively store files with values from 1 to
20 and from 21 to 40. The file Data.txt is converted to the value 30; therefore, it will be stored on disk2.
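The idea can be illustrated with a short Python sketch. This is a simplification, not the real Elastic Hashing Algorithm, and the value ranges mirror the hypothetical disk1/disk2 example above: hash the pathname into a value, then pick the device whose assigned range contains it.

```python
import hashlib

# Hypothetical value ranges, mirroring the example: disk1 holds 1-20, disk2 holds 21-40.
ranges = {"disk1": range(1, 21), "disk2": range(21, 41)}

def path_to_value(pathname, buckets=40):
    """Hash a pathname into a fixed-length value in 1..buckets."""
    digest = hashlib.sha1(pathname.encode()).hexdigest()
    return int(digest, 16) % buckets + 1

def locate(pathname):
    """Return the device whose value range contains the pathname's hash value."""
    value = path_to_value(pathname)
    for disk, value_range in ranges.items():
        if value in value_range:
            return disk, value

print(locate("/exports/Data.txt"))  # e.g. ('disk2', 30) if the hash happens to land at 30
```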
API and client access
GlusterFS provides a REST API and a mount module (FUSE or the mount command) to give clients
access to the system. Clients interact with a volume, sending files to it and retrieving files from it.
Cache consistency
GlusterFS does not use client side caching, thus, avoiding cache consistency issues.
Replication and synchronisation
Unlike the other DFSs, GlusterFS does not replicate data file by file; it relies on a RAID 1 style approach. It
makes several copies of a whole storage device onto others inside the same volume using synchronous
writes. That is, a volume is composed of several subsets, each containing an initial storage disk and its
mirrors. Therefore, the number of storage devices in a volume must be a multiple of the desired
replication factor. For example, assume we have four storage devices and the replication factor is fixed at
two. The first storage device will be replicated onto the second and thus forms a subset. In the same way,
the third and fourth storage devices form another subset.
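Here is a minimal Python sketch of how bricks could be grouped into replica subsets, assuming a replica count of two and hypothetical brick names; in practice the volume layout is configured through GlusterFS's own tooling rather than code like this.

```python
def build_subsets(bricks, replica):
    """Group bricks into replica subsets; brick count must be a multiple of replica."""
    if len(bricks) % replica != 0:
        raise ValueError("number of bricks must be a multiple of the replica count")
    return [bricks[i:i + replica] for i in range(0, len(bricks), replica)]

def write_block(block, subset):
    """Synchronous replication: the block is written to every brick in the subset."""
    return {brick: block for brick in subset}

bricks = ["server1:/brick", "server2:/brick", "server3:/brick", "server4:/brick"]
subsets = build_subsets(bricks, replica=2)
print(subsets)   # [['server1:/brick', 'server2:/brick'], ['server3:/brick', 'server4:/brick']]
print(write_block(b"data", subsets[0]))
```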
Load balancing
GlusterFS uses a hash function to convert files' pathnames into values and assigns several logical
storage units to different sets of values. This is done uniformly and ensures that each logical storage unit
holds roughly the same number of files. Thus, when a disk is added or removed, logical storage units can
be moved to a different physical device in order to balance the system.
Fault detection
In GlusterFS, when a server becomes unavailable, it is removed from the system and no I/O operations to
it can be performed.
Refer to the following URL for installing GlusterFS:
https://github.com/gluster/glusterfs
Lustre File System
Lustre is a parallel distributed file system, generally used for large-scale cluster computing. The name
Lustre is a portmanteau of Linux and cluster. Lustre file system software is available under the GNU
General Public License (version 2 only) and provides high performance file systems for computer clusters
ranging in size from small workgroup clusters to large-scale, multi-site clusters.
Because Lustre file systems have high performance capabilities and open licensing, they are often used in
supercomputers. Since June 2005, Lustre has consistently been used by at least half of the top ten, and
more than 60 of the top 100, fastest supercomputers in the world, including the world's No. 2 and No. 3
ranked TOP500 supercomputers in 2014, Titan and Sequoia.
Lustre file systems are scalable and can be part of multiple computer clusters with tens of thousands of
client nodes, tens of petabytes (PB) of storage on hundreds of servers, and more than a terabyte per
second (TB/s) of aggregate I/O throughput. This makes Lustre file systems a popular choice for
businesses with large data centers, including those in industries such as meteorology, simulation, oil and
gas, life science, rich media, and finance.
Architecture
Lustre is a centralised distributed file system which differs from the other DFSs covered here in that it does
not keep its own copies of data and metadata. Instead, it stores metadata on shared storage called a
Metadata Target (MDT) attached to two Metadata Servers (MDSs), thus offering active/passive failover.
MDSs are the servers that handle metadata requests. Data are managed in the same way: they are split
into objects and distributed across several shared Object Storage Targets (OSTs), which can be attached
to several Object Storage Servers (OSSs) to provide active/active failover. OSSs are the servers that
handle I/O requests.
Naming
Lustre's single global name space is provided to users by the MDS. Like HDFS or MooseFS, Lustre uses
inodes, together with extended attributes, to map a file object name to its corresponding OSTs. Therefore,
clients are informed of which OSTs they should query for each requested piece of data.
API and client access
Lustre provides tools to mount the entire file system in user space (FUSE). File transfers and query
processing are done in the same way as in the other centralised DFSs: clients first ask the MDS to locate
data and then perform I/O directly with the appropriate OSTs.
Cache consistency
Lustre implements a Distributed Lock Manager (DLM) to manage cache consistency. This is a
sophisticated locking mechanism which allows Lustre to support concurrent read and write access. For
example, Lustre can choose to grant a write-back lock in a directory with little contention. A write-back lock
allows clients to cache a large number of updates in memory and commit them to disk later, thus
reducing data transfers. In a highly concurrently accessed directory, by contrast, Lustre can choose to
perform each request one by one to provide strong data consistency. More details on the DLM are
available in the Lustre documentation.
Replication and synchronisation
Lustre does not ensure replication. Instead, it relies on independent software.
Load balancing
Lustre provides simple tools for load balancing. When an OST is unbalanced, that is, it uses more disk
space than other OSTs, the MDS will choose OSTs with more free space to store new data.
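A simplified Python sketch of this allocation policy is shown below, using hypothetical OST names and free-space figures; it is not the MDS's actual allocator, only an illustration of preferring emptier OSTs for new objects.

```python
# Free space per OST in GB; the names and values are made up for illustration.
ost_free_space = {"OST0000": 120, "OST0001": 740, "OST0002": 310}

def choose_osts(free_space, stripe_count):
    """Prefer the OSTs with the most free space for a new file's objects."""
    ranked = sorted(free_space, key=free_space.get, reverse=True)
    return ranked[:stripe_count]

print(choose_osts(ost_free_space, stripe_count=2))  # ['OST0001', 'OST0002']
```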
Fault detection
Lustre is different from the other DFSs because faults are detected during clients' operations (metadata
queries or data transfers) instead of during server-to-server communication. When a client requests a file,
it first contacts the MDS. If the latter is not available, the client asks an LDAP server which other MDS it
can contact. The system then performs a failover: the first MDS is marked as dead, and the second
becomes the active one. OST failures are handled in the same way: if a client sends data to an OST that
does not respond within a certain time, it is redirected to another OST.
Refer to the following URL for installing the Lustre File System:
https://gist.github.com/joshuar/4e283308c932ec62fc05
MooseFS
Moose File System (MooseFS) is an open source, POSIX-compliant distributed file system developed by
Core Technology. MooseFS aims to be a fault-tolerant, highly available, high-performance, scalable,
general-purpose network distributed file system for data centers. Initially proprietary software, it was
released to the public as open source on May 5, 2008.
Architecture
MooseFS behaves much like HDFS. It has a master server managing metadata and several chunk servers
storing and replicating data blocks. One difference is that MooseFS provides failover between the master
server and the metalogger servers: machines that periodically download metadata from the master so that
one of them can be promoted to become the new master in case of failure.
Naming
MooseFS manages the namespace as HDFS does. It stores metadata (permissions, last access time, and
so on) and the hierarchy of files and directories in the master's main memory, while keeping a persistent
copy on the metalogger. It provides users with a global name space of the system.
API and client access
MooseFS clients access the file system by mounting the name space (using the mfsmount command) in
their local file system. They can perform the usual operations by communicating with the master server
which redirects them to the chunk servers in a seamless way. mfsmount is based on the FUSE
mechanism.
Replication and synchronisation
In MooseFS, each file has a goal, that is, a specific number of copies to be maintained. When a client
writes data, the master server sends it a list of chunk servers in which the data will be stored. The client
then sends the data to the first chunk server, which instructs the other ones to replicate the file
synchronously.
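The write path described above can be sketched in Python as follows. This is a toy model with hypothetical server names, not MooseFS's actual protocol: the master returns as many chunk servers as the file's goal, the client sends the chunk to the first server, and that server forwards it synchronously down the chain.

```python
chunk_servers = ["cs1", "cs2", "cs3", "cs4"]
stored = {cs: {} for cs in chunk_servers}      # what each chunk server holds

def master_pick_servers(goal):
    """Master returns a list of chunk servers, one per desired copy."""
    return chunk_servers[:goal]

def chunk_server_store(server, chain, chunk_id, data):
    stored[server][chunk_id] = data
    if chain:                                   # forward synchronously to the next server
        chunk_server_store(chain[0], chain[1:], chunk_id, data)

def client_write(chunk_id, data, goal):
    targets = master_pick_servers(goal)
    chunk_server_store(targets[0], targets[1:], chunk_id, data)
    return targets

print(client_write("chunk-42", b"payload", goal=3))            # ['cs1', 'cs2', 'cs3']
print([cs for cs in chunk_servers if "chunk-42" in stored[cs]])  # three copies exist
```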
Load balancing
MooseFS provides load balancing. When a server is unavailable due to a failure, some data no longer
meet their goal. In this case, the system stores additional replicas on other chunk servers. Conversely,
files can exceed their goal; in that case, the system removes a copy from a chunk server. MooseFS
also maintains a data version number so that, if a server becomes available again, mechanisms allow it
to update the data it stores. Finally, it is possible to dynamically add new data servers, which will be used
to store new files and replicas.
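Goal enforcement can be pictured as a periodic reconciliation step. The Python sketch below uses assumed data structures rather than MooseFS internals: compare each chunk's live replica count with its goal, pick servers for missing replicas, and mark surplus ones for removal.

```python
def reconcile(replicas, goal, available_servers):
    """Return (servers_to_add, servers_to_remove) so the replica count matches the goal."""
    to_add, to_remove = [], []
    if len(replicas) < goal:
        candidates = [s for s in available_servers if s not in replicas]
        to_add = candidates[:goal - len(replicas)]
    elif len(replicas) > goal:
        to_remove = list(replicas)[goal:]
    return to_add, to_remove

# 'cs3' has failed, so chunk-7 is under its goal of 3; chunk-9 is over its goal of 1.
alive = ["cs1", "cs2", "cs4", "cs5"]
print(reconcile(["cs1", "cs2"], goal=3, available_servers=alive))         # (['cs4'], [])
print(reconcile(["cs1", "cs2", "cs4"], goal=1, available_servers=alive))  # ([], ['cs2', 'cs4'])
```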
Fault detection
In MooseFS, when a server becomes unavailable, it is put in quarantine and no I/O operations to it can be
performed.
Refer to the following URL for installing MooseFS:
https://github.com/moosefs/moosefs
The following table summarises popular and widely used Distributed File Systems. It is essential to
understand that each of these may be suited to specific use cases; some of them are functionally similar
and serve the same purpose.
| Client | Written in | License | Access API |
|---|---|---|---|
| Ceph | C++ | LGPL | librados (C, C++, Python, Ruby), S3, Swift, FUSE |
| BeeGFS | C / C++ | FRAUNHOFER FS (FhGFS) EULA, GPLv2 client | POSIX |
| GlusterFS | C | GPLv3 | libglusterfs, FUSE, NFS, SMB, Swift, libgfapi |
| Infinit | C++ | Proprietary (to be open sourced in 2016) | FUSE, Installable File System, NFS/SMB, POSIX, CLI, SDK (libinfinit) |
| ObjectiveFS | C | Proprietary | POSIX, FUSE |
| MooseFS | C | GPLv2 | POSIX, FUSE |
| Quantcast File System | C | Apache License 2.0 | C++ client, FUSE (C++ server: MetaServer and ChunkServer are both in C++) |
| Lustre | C | GPLv2 | POSIX, liblustre, FUSE |
| OpenAFS | C | IBM Public License | Virtual file system, Installable File System |
| Tahoe-LAFS | Python | GNU GPL 2+ and other | HTTP (browser or CLI), SFTP, FTP, FUSE via SSHFS, pyfilesystem |
| HDFS | Java | Apache License 2.0 | Java and C client, HTTP |
| XtreemFS | Java, C++ | BSD License | libxtreemfs (Java, C++), FUSE |
| Ori | C, C++ | MIT | libori, FUSE |

Table: Features of Distributed File Systems