Google File System


Published on

Google File System

1 Comment
1 Like
  • very nice ppt
    Are you sure you want to  Yes  No
    Your message goes here
No Downloads
Total Views
On Slideshare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Google File System

  1. 1. Avinash Kumar BE Computer-2 Roll No-40
  2. 2. Contents  Introduction to GFS  System Architecture  System Features  Working of GFS  Latest advancement  Conclusion  Questions
  3. 3. Introduction  More than 15,000 commodity-class PC's.  Multiple clusters distributed worldwide.  Thousands of queries served per second.  One query reads 100's of MB of data.  One query consumes 10's of billions of CPU cycles.  Google stores dozens of copies of the entire Web! Conclusion: Need large, distributed, highly fault tolerant file system.
  4. 4. System Architecture A GFS cluster consists of a single master and multiple chunk-servers and is accessed by multiple clients.
  5. 5. Large Chunk  GFS uses large chunk: 64MB (1G = 1024 MB = 16 chunks)  Stored as a plain Linux file, which will be lazily extended up to 64MB.  Opt to many read and write on a given chunk  Reduces network overhead by keeping a connection to the chunk server.  See also Map-Reduce, Big-Table.
  6. 6. Architecture (cont’d) Chunkserver Files are divided into fixed-size chunks (64 MB) Each chunk is identified by an immutable and globally unique 64 bit chunkhandle assigned by the master at the time of chunkcreation Chunkservers store chunks on local disks as Linux files and read or write chunk data specified by a chunkhandle For reliability, each chunk is replicated on multiple chunkservers. (default 3 replicas) GFS Client GFS client code linked into each application implements the file system API and communicates with the master and chunkservers to read or write data on behalf of the
  7. 7. System Metadata The master stores three major types of metadata: The file and chunk namespaces The mapping from files to chunks The locations of each chunk’s replicas All metadata is kept in the master’s memory The first two types are also kept persistent by logging mutations to an operation log stored on the master’s local disk and replicated on remote machines. The master does not store third type persistently. Instead, it asks each chunkserver about its chunks at master startup
  8. 8. System Features Page Rank- Probability that a random surfer visits the site Citations (Back links) • How is Page Rank calculated?? • PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)) where, PR -> Page Rank of a page T1….Tn -> Pages that point to Page A (citations) d -> Damping Factor (0<d<1) C(A) -> No. of Links going out from A Page Rank of a page depends on-  Number of pages pointing to it.  Page Rank of the page that points to it.
  9. 9. System Features  Anchor Text- text associated with the link Association with the page the link is on • • Association with the page the link points to( unique to Google) Advantages: • Anchors contain more information than the pages themselves • Documents that cannot be indexed can be displayed  Other Features: Proximity of location information in search for all hits • Track of visual presentation details •
  10. 10. System Anatomy
  11. 11. Working Of GFS
  12. 12. Google Query Evaluation Parse the query. 1. Convert words into wordIDs. 2. Seek to the start of the doclist in the short barrel for every word. 3. Scan through the doclists until there is a document that matches all the 4. search terms. Compute the rank of that document for the query. 5. If in the short barrels and at the end of any doclist, seek to the start of 6. the doclist in the full barrel for every word and go to step 4. If we are not at the end of any doclist go to step 4. 7. Sort the documents that have matched by rank and return the top k.
  13. 13. Client Read  Client sends master:  read(file name, chunk index)  Master’s reply:  chunk ID, chunk version number, locations of replicas  Client sends “closest” chunkserver w/replica:  read(chunk ID, byte range)  “Closest” determined by IP address on simple rack- based network topology  Chunkserver replies with data
  14. 14. Client Write  Some chunkserver is primary for each chunk  Master grants lease to primary (typically for 60 sec.)  Leases renewed using periodic heartbeat messages between master and chunkservers  Client asks server for primary and secondary replicas for each chunk  Client sends data to replicas in daisy chain  Pipelined: each replica forwards as it receives  Takes advantage of full-duplex Ethernet links
  15. 15. Client Write (2)  All replicas acknowledge data write to client  Client sends write request to primary  Primary assigns serial number to write request, providing ordering  Primary forwards write request with same serial number to secondaries  Secondaries all reply to primary after completing write  Primary replies to client
  16. 16. Client Write (3)
  17. 17. What Happen If the Master Reboots?  Replays log from disk  Recovers namespace (directory) information  Recovers file-to-chunk-ID mapping  Asks chunkservers which chunks they hold  Recovers chunk-ID-to-chunkserver mapping  If chunk server has older chunk, it’s stale  Chunk server down at lease renewal  If chunk server has newer chunk, adopt its version number  Master may have failed while granting lease
  18. 18. What Happen if Chunkserver Fails?  Master notices missing heartbeats  Master decrements count of replicas for all chunks on dead chunkserver  Master re-replicates chunks missing replicas in background  Highest priority for chunks missing greatest number of replicas
  19. 19. Latest Advancement  gMail - An easily configurable email service with 1GB of web space.  Blogger- A free web-based service that helps consumers publish on the web without writing code or installing software.  Google “next generation corporate s/w” - A smaller version of the google software, modified for private use.
  20. 20. Conclusion  Success: used actively by Google to support search service and other applications  Availability and recoverability on cheap hardware  High throughput by decoupling control and data  Supports massive data sets and concurrent appends  Semantics not transparent to apps  Must verify file contents to avoid inconsistent regions, repeated appends (at-least-once semantics)  Performance not good for all apps  Assumes read-once, write-once workload (no client caching!)
  21. 21. ANY QUESTION ?
  1. A particular slide catching your eye?

    Clipping is a handy way to collect important slides you want to go back to later.