3. Introduction:
More than 15,000 commodity-class PCs.
Multiple clusters distributed worldwide.
Thousands of queries served per second.
One query reads hundreds of MB of data.
One query consumes tens of billions of CPU cycles.
Google stores dozens of copies of the entire Web!
Conclusion: Need a large, distributed, highly fault-tolerant
file system.
4. Architecture:
A GFS cluster consists of a single master and
multiple chunkservers, and is accessed by multiple
clients.
5. Master
Manages the namespace and metadata
Manages chunk creation, replication, and placement
Performs the snapshot operation to create a copy of a file or directory tree
Performs checkpointing and logging of metadata changes
Chunkservers
Store chunk data and a checksum for each block
On startup/failure recovery, report their chunks to the master
Periodically report a subset of their chunks to the master (so the master
can detect chunks that are no longer needed)
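
A minimal Python sketch of the chunk-report behaviour described above. The in-memory chunk table, report format, and sampling fraction are illustrative assumptions, not the real GFS wire protocol.

    import random

    # Hypothetical in-memory view of the chunks this chunkserver holds
    # (chunk handle -> path of the chunk file on local disk).
    local_chunks = {
        "chunk-0001": "/data/chunk-0001",
        "chunk-0002": "/data/chunk-0002",
    }

    def full_report():
        """Sent on startup / failure recovery: every chunk we hold."""
        return {"type": "full", "chunks": sorted(local_chunks)}

    def periodic_report(fraction=0.5):
        """Sent periodically: a random subset of chunks. The master compares
        it against its file->chunk mapping and replies with handles that are
        no longer referenced, so the chunkserver can delete them."""
        handles = list(local_chunks)
        k = max(1, int(len(handles) * fraction))
        return {"type": "partial", "chunks": random.sample(handles, k)}

    print(full_report())
    print(periodic_report())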
Metadata
Types of metadata: file and chunk namespaces, mapping from files to
chunks, and the locations of each chunk's replicas
Easy and efficient for the master to scan periodically.
Periodic scanning is used to implement chunk garbage collection,
re-replication, and chunk migration.
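
A rough sketch of the three metadata tables and of one periodic scan over them. The table contents, names, and replication factor are assumptions for illustration, not the paper's actual data structures.

    REPLICATION_FACTOR = 3  # assumed target number of replicas per chunk

    # 1. File and chunk namespaces (here just a set of full pathnames).
    namespace = {"/logs/web/part-0", "/logs/web/part-1"}

    # 2. Mapping from files to chunks (ordered list of chunk handles).
    file_to_chunks = {
        "/logs/web/part-0": ["chunk-0001", "chunk-0002"],
        "/logs/web/part-1": ["chunk-0003"],
    }

    # 3. Locations of each chunk's replicas (chunkserver addresses).
    chunk_locations = {
        "chunk-0001": ["cs1:7000", "cs2:7000", "cs3:7000"],
        "chunk-0002": ["cs1:7000", "cs2:7000"],            # under-replicated
        "chunk-0003": ["cs4:7000", "cs5:7000", "cs6:7000"],
    }

    def periodic_scan():
        """One pass over the in-memory metadata: find garbage chunks (no
        longer referenced by any file) and chunks needing re-replication."""
        referenced = {h for chunks in file_to_chunks.values() for h in chunks}
        garbage = [h for h in chunk_locations if h not in referenced]
        under_replicated = [h for h, locs in chunk_locations.items()
                            if len(locs) < REPLICATION_FACTOR]
        return garbage, under_replicated

    print(periodic_scan())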
6. System Interactions:
Read Algorithm
1. Application originates the read request
2. GFS client translates the request from
(filename, byte range) -> (filename, chunk
index), and sends it to master
3. Master responds with chunk handle and
replica locations (i.e. chunkservers where
the replicas are stored)
4. Client picks a location and sends the
(chunk handle, byte range) request to the
location
5. Chunkserver sends requested data to the
client
6. Client forwards the data to the application
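
A minimal client-side sketch of the read path above, using small in-memory stand-ins for the master and a chunkserver. The lookup_chunk/read method names and the random replica choice are assumptions for illustration.

    import random

    CHUNK_SIZE = 64 * 1024 * 1024  # GFS chunks are 64 MB

    class FakeChunkserver:
        """Stand-in for a chunkserver holding chunk data in memory."""
        def __init__(self, chunks):
            self.chunks = chunks                       # handle -> bytes
        def read(self, handle, offset, length):
            return self.chunks[handle][offset:offset + length]

    class FakeMaster:
        """Stand-in for the master's (filename, chunk index) lookup."""
        def __init__(self, mapping):
            self.mapping = mapping                     # (filename, index) -> (handle, replicas)
        def lookup_chunk(self, filename, chunk_index):
            return self.mapping[(filename, chunk_index)]

    def gfs_read(master, filename, offset, length):
        chunk_index = offset // CHUNK_SIZE                              # step 2: byte range -> chunk index
        handle, replicas = master.lookup_chunk(filename, chunk_index)   # step 3: handle + replica locations
        chunkserver = random.choice(replicas)                           # step 4: pick one replica
        return chunkserver.read(handle, offset % CHUNK_SIZE, length)    # steps 5-6: data back to the app

    cs = FakeChunkserver({"chunk-0001": b"hello, gfs"})
    master = FakeMaster({("/logs/part-0", 0): ("chunk-0001", [cs])})
    print(gfs_read(master, "/logs/part-0", 7, 3))   # b'gfs'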
7. Write Algorithm
1. Application originates the request
2. GFS client translates request from (filename,
data) -> (filename, chunk index), and sends it to
master
3. Master responds with chunk handle and (primary
+ secondary) replica locations
4. Client pushes write data to all locations. Data is
stored in chunkservers’ internal buffers
5. Client sends write command to primary
6. Primary determines serial order for data
instances stored in its buffer and writes the
instances in that order to the chunk
7. Primary sends the serial order to the
secondaries and tells them to perform the write
8. Secondaries respond to the primary
9. Primary responds back to the client
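
A corresponding sketch of the write path. Steps 2-3 mirror the read path, so this focuses on steps 4-9; the push_data/apply methods and the single-write "serial order" are simplified assumptions.

    CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB chunks

    class FakeReplica:
        """Stand-in for a chunkserver replica: buffers pushed data, then
        applies writes in the order chosen by the primary."""
        def __init__(self):
            self.buffer = {}            # handle -> pending data
            self.chunk = bytearray()
        def push_data(self, handle, data):
            self.buffer[handle] = data                      # step 4: data sits in an internal buffer
        def apply(self, handle, offset):
            data = self.buffer.pop(handle)
            self.chunk[offset:offset + len(data)] = data    # write at the agreed offset

    def gfs_write(offset_in_chunk, data, primary, secondaries, handle):
        # Step 4: client pushes the data to every replica (primary + secondaries).
        for replica in [primary] + secondaries:
            replica.push_data(handle, data)
        # Steps 5-7: the primary picks the serial order (trivial here: one write),
        # applies it locally, then tells the secondaries to apply the same order.
        primary.apply(handle, offset_in_chunk)
        for replica in secondaries:
            replica.apply(handle, offset_in_chunk)
        # Steps 8-9: secondaries acknowledge; the primary then acknowledges the client.
        return "ok"

    primary, s1, s2 = FakeReplica(), FakeReplica(), FakeReplica()
    print(gfs_write(0, b"record-1", primary, [s1, s2], "chunk-0001"))
    print(bytes(primary.chunk), bytes(s2.chunk))   # b'record-1' on every replica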
8. Master Operation
Namespace Management and Locking:
o GFS maps each full pathname to its metadata in a lookup table.
o Each master operation acquires a set of locks.
o Locking scheme allows concurrent mutations in the same directory.
o Locks are acquired in a consistent total order to prevent deadlock.
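
A small sketch of that locking rule: take locks on every prefix of the pathname, always in a fixed (sorted) order so two operations never deadlock. The lock table and helper names are illustrative; the read/write lock distinction GFS uses is omitted to keep the sketch short.

    import threading
    from collections import defaultdict

    # One lock per full pathname (GFS distinguishes read locks on ancestor
    # directories from a write lock on the leaf; a plain Lock is used here).
    path_locks = defaultdict(threading.Lock)

    def locks_for(pathname):
        """Return every prefix path of `pathname`, e.g.
        /a/b/c -> ['/a', '/a/b', '/a/b/c']."""
        parts = pathname.strip("/").split("/")
        return ["/" + "/".join(parts[:i + 1]) for i in range(len(parts))]

    def with_namespace_locks(pathname, operation):
        # Acquire in a consistent total order (sorted by path) to prevent deadlock.
        needed = sorted(locks_for(pathname))
        acquired = []
        try:
            for p in needed:
                path_locks[p].acquire()
                acquired.append(p)
            return operation()
        finally:
            for p in reversed(acquired):
                path_locks[p].release()

    # Mutations of /home/user/a and /home/user/b only conflict on the leaf
    # paths once read/write locks are distinguished, so they can proceed
    # concurrently in the same directory.
    print(with_namespace_locks("/home/user/a", lambda: "created /home/user/a"))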
Replica Placement:
o Maximizes reliability, availability and network bandwidth utilization.
o Spreads chunk replicas across racks.
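
A toy placement policy in the spirit described above, assuming each chunkserver is tagged with its rack and current disk usage: prefer the least-loaded server on each distinct rack, so a single rack failure cannot take out all replicas. The data layout and field names are assumptions.

    def place_replicas(chunkservers, replication_factor=3):
        """chunkservers: list of dicts like {"addr": ..., "rack": ..., "used": ...}.
        Returns up to replication_factor addresses, spread across racks."""
        by_rack = {}
        for cs in sorted(chunkservers, key=lambda c: c["used"]):   # least-loaded first
            by_rack.setdefault(cs["rack"], []).append(cs)

        chosen = []
        # First pass: one replica per rack.
        for rack, servers in by_rack.items():
            if len(chosen) < replication_factor:
                chosen.append(servers.pop(0))
        # Second pass: if there are fewer racks than replicas, reuse racks.
        leftovers = [cs for servers in by_rack.values() for cs in servers]
        for cs in sorted(leftovers, key=lambda c: c["used"]):
            if len(chosen) < replication_factor:
                chosen.append(cs)
        return [cs["addr"] for cs in chosen]

    servers = [
        {"addr": "cs1", "rack": "r1", "used": 10},
        {"addr": "cs2", "rack": "r1", "used": 5},
        {"addr": "cs3", "rack": "r2", "used": 7},
        {"addr": "cs4", "rack": "r3", "used": 20},
    ]
    print(place_replicas(servers))   # ['cs2', 'cs3', 'cs4'] - one per rack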
9. Fault Tolerance
High availability:
Fast recovery.
Chunk replication.
Master replication.
Data Integrity:
Chunkservers use checksumming.
Each chunk is broken up into 64 KB blocks, each with its own checksum.
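
A simple sketch of per-block checksumming over 64 KB blocks, using CRC32 as an assumed 32-bit checksum (the specific checksum algorithm is not stated in these slides).

    import zlib

    BLOCK_SIZE = 64 * 1024  # 64 KB blocks, each with its own checksum

    def checksum_blocks(chunk_data):
        """Return a 32-bit checksum for every 64 KB block of a chunk."""
        return [zlib.crc32(chunk_data[i:i + BLOCK_SIZE])
                for i in range(0, len(chunk_data), BLOCK_SIZE)]

    def verify_read(chunk_data, stored_checksums):
        """Before returning data to a client, the chunkserver re-checksums the
        blocks it read and compares against the stored values; a mismatch
        means the block is corrupt and must be fetched from another replica."""
        return checksum_blocks(chunk_data) == stored_checksums

    data = b"x" * (200 * 1024)              # a 200 KB chunk -> 4 blocks
    sums = checksum_blocks(data)
    print(verify_read(data, sums))          # True
    corrupted = data[:-1] + b"y"
    print(verify_read(corrupted, sums))     # False: last block fails its check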
10. Latest Advancement
Gmail - an easily configurable email
service with 15 GB of storage.
Blogger - a free web-based service that helps consumers
publish on the web without writing code or installing
software.
Google "next generation corporate software"
- a smaller version of the Google software, modified
for private use.
11. Conclusion
GFS meets Google's storage requirements:
Incremental growth
Routine monitoring of component failures
Data optimizations via special operations
Simple architecture
Fault tolerance
12. References
Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung, "The Google File
System," ACM SIGOPS Operating Systems Review, Volume 37, Issue 5.
Sean Quinlan and Kirk McKusick, "GFS: Evolution on Fast-forward,"
Communications of the ACM, Vol. 53.
Naushad Uzzman, "Survey on Google File System," SIGOPS conference,
University of Rochester.