Google File System
Presented by: Ayedi Hedi Abderrahim Habiba
Introduction
Design Overview
System Interaction
Fault Tolerance and Diagnosis
Conclusion
FS “File System”
▸ A file system is the part of an operating system that manages how and where data is stored, accessed, and organized on a storage disk.
▸ A traditional file system is confined to one operating system.
DFS “Distributed File System”
▸ As the name suggests, a distributed file system is a file system whose data is spread across multiple locations (machines).
▸ It lets programs access and store remote files exactly as they do local ones; this property is called transparency.
“Commodity” hardware
▸ Commodity servers are inexpensive and can easily be scaled out horizontally with the right software; they work in tandem to deliver the greatest amount of computation at the lowest cost.
▸ Using commodity hardware, we have to take into consideration that disk failures, network failures, server crashes, and OS bugs are routine events.
What is the “Google File System”?
GFS “Google File System”
A scalable distributed file system (DFS) built from cheap commodity components, designed to store and serve large, distributed, data-intensive applications, so that it meets the rapidly growing demands of Google's data processing.
Design Overview
GFS Assumptions and Key Challenges
- Hardware failures are the norm: the system must constantly monitor itself and detect, tolerate, and recover from failures.
- The system must manage huge files efficiently, typically 100 MB or larger.
- The system must implement well-defined semantics for multiple clients that simultaneously append data to the same file.
- High sustained bandwidth matters more than low latency.
- The workload consists of large streaming reads, small random reads, and sequential writes that append data to files.
GFS Interface
GFS supports the usual file operations: create, delete, open, close, read, and write.
It adds two special operations:
- snapshot: makes a copy of a file or directory tree at low cost.
- record append: lets multiple clients append data to the same file concurrently, while guaranteeing the atomicity of each individual append.
Files are organized hierarchically in directories and identified by pathnames. Example of an absolute pathname: /d1/d2/.../dn/leaf
(A toy sketch of this interface follows.)
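To make the interface concrete, here is a minimal in-memory sketch of a GFS-style client in Python. GFSClient and all of its methods are hypothetical stand-ins; the real client library is not public, so only the shape of the operations above is meaningful here.

```python
# Toy in-memory stand-in for a GFS-style client; every name here is
# illustrative, not the real (unpublished) GFS client library.

class GFSClient:
    def __init__(self):
        self.files: dict[str, bytearray] = {}   # pathname -> contents

    def create(self, path: str) -> None:
        self.files[path] = bytearray()

    def delete(self, path: str) -> None:
        del self.files[path]

    def read(self, path: str, offset: int, length: int) -> bytes:
        return bytes(self.files[path][offset:offset + length])

    def write(self, path: str, offset: int, data: bytes) -> None:
        self.files[path][offset:offset + len(data)] = data

    def record_append(self, path: str, data: bytes) -> int:
        """Atomic append; GFS returns the offset it chose for the record."""
        f = self.files[path]
        offset = len(f)
        f.extend(data)
        return offset

    def snapshot(self, src: str, dst: str) -> None:
        """Instant copy; the real system uses copy-on-write, not a byte copy."""
        self.files[dst] = bytearray(self.files[src])

client = GFSClient()
client.create("/d1/d2/leaf")               # hierarchical pathnames
off = client.record_append("/d1/d2/leaf", b"record-1\n")
print(client.read("/d1/d2/leaf", off, 9))  # b'record-1\n'
```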
Google File System
Architecture
GFS Architecture
A GFS cluster consists of a single master and multiple chunkservers
and is accessed by multiple clients .
Diagram from Gsf paper
12
GFS Architecture
Chunkservers and chunks
- Each file is split into multiple fixed-size chunks; the chunk size is 64 MB.
- Each chunk is identified by a globally unique, immutable 64-bit “chunk handle”.
- Chunkservers store chunks on local disks as plain Linux files.
- Each chunk is replicated on multiple chunkservers (default: 3) for data availability and reliability.
(The offset-to-chunk translation is sketched below.)
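As a quick illustration of the fixed 64 MB chunk size, here is how a client could translate a byte offset within a file into a chunk index; the helper names are ours, not the paper's.

```python
CHUNK_SIZE = 64 * 1024 * 1024   # fixed 64 MB chunk size

def chunk_index(byte_offset: int) -> int:
    """Which chunk of the file holds this byte."""
    return byte_offset // CHUNK_SIZE

def chunk_offset(byte_offset: int) -> int:
    """Where the byte lives inside that chunk."""
    return byte_offset % CHUNK_SIZE

# A read at byte 200,000,000 lands 65,782,272 bytes into chunk 2:
assert chunk_index(200_000_000) == 2
assert chunk_offset(200_000_000) == 65_782_272
```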
GFS Architecture
Master
The master maintains all file system metadata in its memory:
- Access control information.
- The file and chunk namespaces, the mapping from files to chunk IDs, and the locations of each chunk's replicas (requested from the chunkservers rather than stored persistently).
(A rough sketch of these tables follows.)
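A rough sketch of those in-memory tables, assuming plain Python dicts; the real master uses compact custom data structures.

```python
from dataclasses import dataclass, field

@dataclass
class ChunkInfo:
    handle: int                                         # 64-bit chunk handle
    replicas: list[str] = field(default_factory=list)   # chunkserver addresses

@dataclass
class MasterState:
    # file namespace: pathname -> ordered chunk handles of that file
    namespace: dict[str, list[int]] = field(default_factory=dict)
    # chunk handle -> replica locations; learned from chunkservers at
    # startup and via heartbeats, never stored persistently
    chunks: dict[int, ChunkInfo] = field(default_factory=dict)

    def locate(self, path: str, chunk_index: int) -> ChunkInfo:
        """Answer a client's 'where is chunk i of this file?' request."""
        return self.chunks[self.namespace[path][chunk_index]]

state = MasterState()
state.namespace["/d1/leaf"] = [42]
state.chunks[42] = ChunkInfo(handle=42, replicas=["cs1", "cs2", "cs3"])
print(state.locate("/d1/leaf", 0).replicas)   # ['cs1', 'cs2', 'cs3']
```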
GFS Architecture
Master operations
- Controls system-wide activities such as garbage collection and chunk migration between chunkservers (rebalancing), for better disk-space usage and load balancing.
- Periodically scans its entire state in the background and re-replicates chunks whenever the number of available replicas falls below a goal.
- Controls all chunk placement and monitors chunkserver status with regular HeartBeat messages.
Does the master have its hands full?
No. Clients never read or write file data through the master; they ask it only for metadata and then exchange data directly with the chunkservers, so there is no data-plane bottleneck.
What if the master crashes?
The master recovers its file system state by replaying its append-only operation log:
- The operation log is a historical record of all critical metadata changes, replicated to multiple remote machines.
- The log is checkpointed regularly; after a crash, the master recovers by loading the latest checkpoint and replaying only the log records written after it.
(A toy version of this recovery is sketched below.)
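A toy version of that recovery path, under the simplifying assumption that each log record is one JSON line describing a namespace mutation; the record format here is invented for illustration.

```python
import json

def recover(checkpoint: dict, log_lines: list[str]) -> dict:
    """Load the latest checkpoint, then replay the records logged after it."""
    state = {path: list(chunks) for path, chunks in checkpoint.items()}
    for line in log_lines:
        record = json.loads(line)
        if record["op"] == "create":
            state[record["path"]] = []
        elif record["op"] == "delete":
            state.pop(record["path"], None)
        elif record["op"] == "add_chunk":
            state[record["path"]].append(record["handle"])
    return state

# Replaying the same log always rebuilds the same state, which is also why
# replicating the log to remote machines is enough for fault tolerance.
log = ['{"op": "create", "path": "/a"}',
       '{"op": "add_chunk", "path": "/a", "handle": 42}']
print(recover({}, log))   # {'/a': [42]}
```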
System Interaction
System Interaction
Mutations and leases
- Mutation: an operation that changes the contents or metadata of a chunk (a write or an append); each mutation is performed at all of the chunk's replicas.
- Lease: granted by the master to one replica, the primary, to maintain a coherent mutation order across replicas: the primary picks a serial order for all mutations, and every replica applies them in that order.
(This ordering is sketched below.)
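A minimal sketch of that ordering, with hypothetical class names: the primary assigns each mutation a serial number, and every replica applies mutations in serial order.

```python
class Primary:
    """Lease holder: picks one serial order for all concurrent mutations."""
    def __init__(self):
        self.next_serial = 0

    def order(self, mutation: bytes) -> tuple[int, bytes]:
        serial, self.next_serial = self.next_serial, self.next_serial + 1
        return serial, mutation

class Replica:
    """Applies mutations strictly in the primary's serial order."""
    def __init__(self):
        self.applied: list[tuple[int, bytes]] = []

    def apply(self, serial: int, mutation: bytes) -> None:
        self.applied.append((serial, mutation))

primary, secondary = Primary(), Replica()
for m in (b"write-1", b"append-2"):       # two concurrent mutations
    secondary.apply(*primary.order(m))
print(secondary.applied)   # [(0, b'write-1'), (1, b'append-2')]
```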
System Interaction
Data flow
- The flow of data is decoupled from the flow of control to use the network efficiently: data transfer is pipelined over TCP connections.
- Data is pushed linearly along a carefully picked chain of chunkservers; each machine forwards the data to the “closest” machine in the network topology that has not yet received it.
(A back-of-the-envelope model of this pipeline follows.)
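The paper models the pipeline as follows: pushing B bytes through R replicas takes roughly B/T + R*L, where T is per-link throughput and L is per-hop latency, because each server forwards bytes as soon as they start arriving rather than store-and-forwarding whole chunks. A quick check in Python, with link numbers that are our own example values:

```python
def pipelined_transfer_time(B: float, R: int, T: float, L: float) -> float:
    """Elapsed time to push B bytes through a chain of R replicas."""
    return B / T + R * L

B = 1 * 1024 * 1024        # 1 MB of chunk data
T = 100e6 / 8              # 100 Mbps full-duplex link, in bytes per second
L = 0.001                  # 1 ms per hop (example value)
print(f"{pipelined_transfer_time(B, 3, T, L) * 1000:.0f} ms")  # ~87 ms
```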
System Interaction
Write
1. The client asks the master which chunkserver holds the current lease for the chunk and where the other replicas are located.
2. The master replies with the identity of the primary and the locations of the other (secondary) replicas.
3. The client pushes the data to all replicas.
4. Once all replicas have acknowledged receiving the data, the client sends a write request to the primary.
5. The primary forwards the write request to all secondary replicas.
6. The secondaries reply to the primary indicating that they have completed the operation.
7. The primary replies to the client.
(The seven steps are sketched in code below.)
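Here is the same seven-step protocol as a toy Python sketch, with the network replaced by direct method calls; Master, Chunkserver, and client_write are illustrative names only.

```python
class Master:
    def __init__(self, primary: str, secondaries: list[str]):
        self.primary, self.secondaries = primary, secondaries

    def lease_info(self, chunk: int):
        # Steps 1-2: tell the client who holds the lease and where replicas are.
        return self.primary, self.secondaries

class Chunkserver:
    def __init__(self):
        self.buffer: dict[int, bytes] = {}   # data pushed but not yet applied
        self.stored: dict[int, bytes] = {}   # committed chunk contents

    def push(self, chunk: int, data: bytes) -> None:
        self.buffer[chunk] = data            # step 3: buffer only

    def commit(self, chunk: int) -> str:
        self.stored[chunk] = self.buffer.pop(chunk)
        return "ok"

def client_write(master: Master, servers: dict[str, Chunkserver],
                 chunk: int, data: bytes) -> str:
    primary, secondaries = master.lease_info(chunk)           # steps 1-2
    for name in [primary, *secondaries]:                      # step 3
        servers[name].push(chunk, data)
    servers[primary].commit(chunk)                            # step 4
    acks = [servers[s].commit(chunk) for s in secondaries]    # steps 5-6
    return "ok" if all(a == "ok" for a in acks) else "retry"  # step 7

servers = {name: Chunkserver() for name in ("cs1", "cs2", "cs3")}
master = Master(primary="cs1", secondaries=["cs2", "cs3"])
print(client_write(master, servers, chunk=42, data=b"record"))  # ok
```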
System Interaction
Read
1. Using the fixed chunk size, the client translates the (file name, byte offset) it wants into a chunk index, and asks the master for that chunk's handle and replica locations.
2. The client caches this reply and requests the data from the closest replica; further reads of the same chunk need no interaction with the master.
(The read path is sketched below.)
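A matching sketch of the read path, again with hypothetical toy classes standing in for the real services:

```python
CHUNK_SIZE = 64 * 1024 * 1024

class Master:
    """Serves only metadata; file data never flows through it."""
    def __init__(self):
        self.namespace = {"/f": [(42, ["cs1", "cs2", "cs3"])]}

    def locate(self, path: str, index: int):
        return self.namespace[path][index]   # (chunk handle, replica locations)

class Chunkserver:
    def __init__(self, chunks: dict[int, bytes]):
        self.chunks = chunks

    def read_chunk(self, handle: int, offset: int, length: int) -> bytes:
        return self.chunks[handle][offset:offset + length]

def gfs_read(master, servers, path: str, offset: int, length: int) -> bytes:
    handle, replicas = master.locate(path, offset // CHUNK_SIZE)  # ask master once
    closest = replicas[0]              # client picks the closest replica
    return servers[closest].read_chunk(handle, offset % CHUNK_SIZE, length)

servers = {n: Chunkserver({42: b"hello, gfs"}) for n in ("cs1", "cs2", "cs3")}
print(gfs_read(Master(), servers, "/f", 0, 5))   # b'hello'
```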
Fault Tolerance and Diagnosis
Fault Tolerance and Diagnosis
High availability
GFS keeps the overall system highly available with simple yet effective strategies:
- Fast recovery and replication: both the master and the chunkservers are designed to restore their state and start in seconds, no matter how they terminated (operation log, checkpoints, and chunk replication).
- Data integrity: corruption of stored data is detected with checksumming; each chunkserver independently verifies the integrity of its own copies by maintaining checksums.
- Diagnostic tools: GFS generates detailed diagnostic logs that have helped immeasurably in problem isolation, debugging, and performance analysis.
(A small checksum sketch follows.)
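To illustrate the data-integrity point: the paper keeps a 32-bit checksum for every 64 KB block of a chunk. A small sketch using CRC-32 as the (assumed) checksum function:

```python
import zlib

BLOCK = 64 * 1024   # checksum granularity inside a chunk

def block_checksums(chunk: bytes) -> list[int]:
    """One 32-bit checksum per 64 KB block."""
    return [zlib.crc32(chunk[i:i + BLOCK]) for i in range(0, len(chunk), BLOCK)]

def verify(chunk: bytes, expected: list[int]) -> bool:
    """Re-computed before returning data to a reader or another chunkserver."""
    return block_checksums(chunk) == expected

data = b"x" * (3 * BLOCK)
sums = block_checksums(data)
assert verify(data, sums)
assert not verify(b"y" + data[1:], sums)   # corruption caught; read elsewhere
```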
Conclusion
- GFS provides fault tolerance, reliability, scalability, availability, and high performance to large networks of connected nodes.
- Google generates huge amounts of data that must be stored, and GFS was built to meet that demand.
- GFS features include:
▸ Fault tolerance
▸ Replication of critical data
▸ Automatic and efficient data recovery
▸ High aggregate throughput
▸ Reduced client-master interaction thanks to the large (64 MB) chunk size
▸ Namespace management and locking
▸ Thank you!
