Google File System
Slide notes:

  • Design goals: performance, scalability, reliability, availability.
  • Master metadata: namespace, access-control information, chunk locations, and the file-to-chunk mapping; file data never moves through the master, which is used only for metadata.
  • Typical file system block sizes, for comparison with 64 MB chunks: 512 B, 1 KB, 2 KB.
  • Not caching file data avoids cache-coherence issues.

Google File System: Presentation Transcript

  • 1. Google File System. Dhan V Sagar, CB.EN.P2CSE13007, M.Tech CSE I
  • 2. Introduction
    • Thousands of queries served per second
    • One query reads hundreds of MB of data
    • One query consumes tens of billions of CPU cycles
    • Google apps store dozens of copies of the entire Web
    • These workloads overwhelm traditional file systems, so Google needs a large, distributed, highly fault-tolerant file system
  • 3. GFS: Architecture (architecture diagram)
  • 4. A GFS cluster consists of:
    • A single master
    • Multiple chunkservers
    • Multiple clients
  • 5. Master Server (see the metadata sketch after the transcript)
    • Manages metadata
    • Performs checkpointing and logs changes to metadata
    • Manages chunk creation, replication, and placement
    • Performs garbage collection
    • Periodically communicates with chunkservers via HeartBeat messages
  • 6. Master Server
    • A single point of failure? Shadow masters mitigate this
    • The design keeps the master's involvement to a minimum
  • 7. Chunk
    • Files are divided into fixed-size chunks
    • Chunkservers store chunks on local disk as Linux files
    • Chunks are replicated for reliability
    • Each chunk is identified by a unique 64-bit chunk handle
    • Chunk size: 64 MB, much larger than typical file system block sizes
  • 8. Large Chunk Size (see the chunk-index sketch after the transcript)
    • Reduces interaction between client and master
    • Reduces network overhead by keeping a persistent TCP connection to the chunkserver
    • Reduces the size of the metadata stored on the master
  • 9. Client
    • Sends control (metadata) requests to the master server
    • Caches metadata
    • Sends data requests directly to chunkservers
    • Never caches file data
  • 10. Client Read (see the read sketch after the transcript)
    • Client sends the master: read(file name, chunk index)
    • Master replies with: chunk ID, chunk version number, and the locations of the replicas
    • Client sends the "closest" chunkserver replica: read(chunk ID, byte range); "closest" is determined by IP address on a simple rack-based network topology
    • Chunkserver replies with the data
  • 11. Client Read (diagram)
  • 12. Client Read (diagram)
  • 13. Client Write (steps 1-9; see the write sketch after the transcript)
    1. Application originates the request
    2. GFS client translates the request and sends it to the master
    3. Master responds with the chunk handle and the replica locations
  • 14. Client Write
    4. Client pushes the write data to all replica locations; the data is stored in each chunkserver's internal buffers
  • 15. Client Write
    5. Client sends the write command to the primary
    6. Primary determines a serial order for the data instances in its buffer and writes them to the chunk in that order
    7. Primary sends the serial order to the secondaries and tells them to perform the write
  • 16. Client Write
    8. Secondaries respond back to the primary
    9. Primary responds back to the client
  • 17. File Deletion (see the deletion sketch after the transcript)
    • When a client deletes a file, the master records the deletion in its log and renames the file to a hidden name that includes the deletion timestamp
    • The master scans the file namespace in the background and erases the in-memory metadata of expired hidden files
    • The master scans the chunk namespace in the background and removes unreferenced chunks from the chunkservers
  • 18. References
    • The Google File System. Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung.
  • 19. Thank You
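
The sketches referenced from the slides above follow. They are illustrative Python only, not GFS code (GFS itself is a C++ system), and every class, field, and function name in them is hypothetical. First, the metadata sketch for slide 5: the master holds only metadata in memory, logs every metadata mutation, and learns chunk locations from chunkserver HeartBeat messages rather than persisting them.

    import time

    class MasterState:
        """Hypothetical sketch of the master's in-memory metadata (slide 5)."""

        def __init__(self):
            self.namespace = {}        # file name -> list of chunk handles
            self.chunk_locations = {}  # chunk handle -> set of chunkserver addresses
            self.op_log = []           # append-only log of metadata mutations

        def log(self, record):
            # Metadata changes are logged so the master can recover by
            # replaying the log from its latest checkpoint.
            self.op_log.append((time.time(), record))

        def heartbeat(self, chunkserver, held_chunks):
            # Chunk locations are rebuilt from periodic HeartBeat messages,
            # not persisted: each chunkserver reports the chunks it holds.
            for handle in held_chunks:
                self.chunk_locations.setdefault(handle, set()).add(chunkserver)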
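The chunk-index sketch for slides 7 and 8: fixed 64 MB chunks let a client compute locally which chunk a byte offset falls in, so it contacts the master once per chunk rather than once per read. A minimal sketch, assuming a hypothetical chunk_coordinates helper:

    CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, as on slide 7

    def chunk_coordinates(byte_offset):
        # Map a byte offset within a file to (chunk index, offset within
        # chunk). The client computes this locally and only asks the master
        # for chunk locations once per chunk (slide 8).
        return byte_offset // CHUNK_SIZE, byte_offset % CHUNK_SIZE

    # Example: byte 150,000,000 falls 15,782,272 bytes into chunk 2.
    assert chunk_coordinates(150_000_000) == (2, 150_000_000 - 2 * CHUNK_SIZE)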
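The read sketch for slide 10, again hypothetical: one metadata round trip to the master, then the data request goes straight to a chunkserver, so file data never flows through the master. The master and replica objects stand in for RPC stubs, and network_distance is a placeholder for the rack-based closeness metric.

    CHUNK_SIZE = 64 * 1024 * 1024

    def gfs_read(master, file_name, byte_offset, length):
        # 1. Control request to the master: (file name, chunk index).
        chunk_index = byte_offset // CHUNK_SIZE
        chunk_id, version, replicas = master.lookup(file_name, chunk_index)
        # 2. Pick the "closest" replica; slide 10 determines closeness by
        #    IP address on a simple rack-based topology.
        replica = min(replicas, key=lambda r: r.network_distance)
        # 3. Data request goes directly to the chunkserver.
        return replica.read(chunk_id, byte_offset % CHUNK_SIZE, length)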
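The write sketch for slides 13-16. The point it illustrates is that data flow (step 4) is decoupled from control flow (steps 5-9), and that the primary replica imposes one serial order on mutations that every secondary applies identically; all method names are stand-ins.

    def gfs_write(master, file_name, chunk_index, data):
        # Steps 1-3: client asks the master for the chunk handle, the
        # primary replica, and the secondary replicas.
        handle, primary, secondaries = master.lookup_for_write(file_name, chunk_index)

        # Step 4: push the data to all replicas, where it sits in each
        # chunkserver's internal buffer.
        for replica in [primary] + secondaries:
            replica.buffer_data(handle, data)

        # Steps 5-7: the write command goes to the primary, which chooses a
        # serial order for the buffered data, applies it to its own chunk,
        # and forwards the order to the secondaries.
        serial_order = primary.commit(handle)
        acks = [s.apply(handle, serial_order) for s in secondaries]

        # Steps 8-9: secondaries ack the primary; primary acks the client.
        return all(acks)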
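Finally, the deletion sketch for slide 17, reusing the hypothetical MasterState from the first sketch: deleting a file is just a logged rename, and two background scans later reclaim metadata and chunks. The three-day grace period is the default mentioned in the GFS paper.

    import time

    GRACE_PERIOD = 3 * 24 * 3600  # paper default: hidden files live three days

    def delete_file(master, name):
        # Deletion is logged and the file is renamed to a hidden name that
        # embeds the deletion timestamp; nothing is reclaimed yet.
        master.log(("delete", name))
        hidden = ".deleted.{}.{}".format(name, int(time.time()))
        master.namespace[hidden] = master.namespace.pop(name)

    def namespace_scan(master, now):
        # Background scan of the file namespace: erase the in-memory
        # metadata of hidden files older than the grace period.
        for hidden in [n for n in master.namespace if n.startswith(".deleted.")]:
            if now - int(hidden.rsplit(".", 1)[1]) > GRACE_PERIOD:
                del master.namespace[hidden]

    def chunk_scan(master):
        # Background scan of the chunk namespace: chunks referenced by no
        # file are garbage; telling chunkservers to drop them is elided.
        live = {h for chunks in master.namespace.values() for h in chunks}
        for orphan in set(master.chunk_locations) - live:
            master.chunk_locations.pop(orphan)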