Big table



  1. BigTable: A Distributed Storage System for Structured Data
     Presented by Manuel Correa
     Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber
     Google, Inc.
  2. Agenda
     • Problems in RDBMS & NoSQL Databases
     • What is BigTable?
     • Data Model
     • Implementation
     • HBase – Hadoop project
     • Quick Demo
     • Performance and Evaluations
     • Real Applications
     • Questions
  3. Problems in RDBMS
     • RDBMS do not scale well to large data sets (petabytes)
     • They do not scale horizontally
       – Replication and clustering
       – RDBMS were not designed to be distributed
     • Schemas are rigid
     • Joins do not scale well
  4. NoSQL Databases – Not Only SQL
     • Non-relational data storage
     • No joins; one simple schema that accommodates large datasets
     • NoSQL databases are designed to be distributed
     • NoSQL is not a replacement for RDBMS!
     • Examples:
       – Column-family DB: HBase (Hadoop)
       – Document DB: MongoDB
       – Graph DB: Neo4j
  5. Who's using NoSQL?
  6. What is BigTable?
     • BigTable is a distributed storage system for managing structured data
     • Designed by Google Inc. and described in a paper published in 2006
     • BigTable was designed to scale to petabytes of data across thousands of machines
     • BigTable has provided a flexible, widely applicable, scalable, highly available, high-performance solution for many Google products
  7. BigTable Data Model
     • BigTable is a sparse, distributed, persistent, multidimensional sorted map
     • The map is indexed by row key, column key, and timestamp. Each value is an uninterpreted array of bytes
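The map described on this slide can be sketched in a few lines of Python. This is a toy, single-process model under stated assumptions: the names `table`, `put`, and `get` are illustrative and are not part of any BigTable API.

```python
# Toy model only: one dict keyed by (row key, column key, timestamp) -> bytes.
# Cells that were never written take no space, which is what "sparse" means here.
table = {}

def put(row: str, column: str, timestamp: int, value: bytes) -> None:
    """Store one version of one cell."""
    table[(row, column, timestamp)] = value

def get(row: str, column: str):
    """Return the most recent version of a cell, or None if the cell is empty."""
    versions = [(ts, v) for (r, c, ts), v in table.items() if r == row and c == column]
    if not versions:
        return None
    return max(versions, key=lambda pair: pair[0])[1]

put("com.cnn.www", "contents:", 3, b"<html>v3</html>")
put("com.cnn.www", "contents:", 5, b"<html>v5</html>")
latest = get("com.cnn.www", "contents:")
```

A real implementation keeps the keys sorted and spread over many machines; the dict only shows the shape of the key and the value.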
  8. BigTable Model – Rows
     • Row keys are arbitrary strings (up to 64 KB)
     • Every read or write of data under a single row key is atomic
     • BigTable keeps data in lexicographic order by row key
     • A table is dynamically partitioned into row ranges called tablets
     • A tablet is the basic unit of distribution and load balancing
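Because rows are sorted and each tablet covers a contiguous row range, finding the tablet for a row reduces to a binary search over the range boundaries. A minimal sketch, assuming hypothetical split points and a hypothetical helper `tablet_for`:

```python
import bisect

# Hypothetical split points: tablet i holds all row keys up to and including
# tablet_end_keys[i]; one extra, final tablet holds every key after the last split.
tablet_end_keys = ["com.cnn.www", "com.google.www", "org.wikipedia.en"]

def tablet_for(row_key: str) -> int:
    """Return the index of the tablet whose row range contains row_key."""
    return bisect.bisect_left(tablet_end_keys, row_key)

first = tablet_for("com.cnn.www")        # equal to the first end key
second = tablet_for("com.google.maps")   # sorts after "com.cnn.www"
last = tablet_for("zzz.example")         # sorts after every split point
```

Python compares strings by code point, which matches lexicographic byte order for ASCII keys such as these.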
  9. BigTable Model – Columns
     • Column keys are grouped into sets called column families
     • A column family is the basic unit of access control
     • All data stored in a column family is usually of the same type
     • A column family contains several column keys
     • A column key is named family:qualifier
  10. BigTable Model – Timestamps
      • Each cell in BigTable can contain multiple versions of the same data
      • Each version is indexed by a timestamp
      • The timestamp is a 64-bit integer
      • Versions of a cell are stored in decreasing timestamp order, so that the most recent version can be read first
      • BigTable garbage-collects unused versions (the client can specify an expiration policy for each column family)
  11. BigTable Model – Timestamps
      • Each cell in BigTable is indexed by timestamp
        – Different versions of the same data are maintained
        – Versions are ordered by decreasing timestamp, so the most recent comes first
        – The system's garbage collector takes care of unused versions
      • Example: WebTable
        – The contents column family of a Web page holds different versions of the page
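The versioning rules on the two timestamp slides can be sketched as follows. This is a toy model: `MAX_VERSIONS` stands in for a "keep only the last n versions" policy, and the names `cells`, `write`, and `read_latest` are assumptions made for illustration.

```python
from collections import defaultdict

MAX_VERSIONS = 3  # hypothetical per-column-family policy: keep the last 3 versions

# (row key, column key) -> list of (timestamp, value), newest first
cells = defaultdict(list)

def write(row: str, column: str, timestamp: int, value: bytes) -> None:
    versions = cells[(row, column)]
    versions.append((timestamp, value))
    versions.sort(key=lambda v: v[0], reverse=True)  # decreasing timestamp order
    del versions[MAX_VERSIONS:]                      # garbage-collect older versions

def read_latest(row: str, column: str):
    versions = cells.get((row, column))
    return versions[0][1] if versions else None

for ts in (3, 5, 6, 8):
    write("com.cnn.www", "contents:", ts, b"page at t%d" % ts)

kept = [ts for ts, _ in cells[("com.cnn.www", "contents:")]]
```

After the four writes only the three newest timestamps remain, and a read returns the newest one without scanning the rest.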
  12. BigTable Model – Example: WebTable
      • The row key is the URL with its hostname components reversed
      • Pages in the same domain are grouped together in contiguous rows, and so tend to fall in the same tablet
      • Anchor is a column family with two column keys
      • Timestamped versions (t3, t5, t6, t8, t9) are kept in the cells of each column
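The effect of reversing the hostname can be seen with a small sketch. The helper `row_key` and the sample URLs are assumptions for illustration; only the reversed-hostname idea comes from the slide.

```python
from urllib.parse import urlsplit

def row_key(url: str) -> str:
    """Build a WebTable-style row key: reversed hostname followed by the path."""
    parts = urlsplit(url)
    reversed_host = ".".join(reversed(parts.hostname.split(".")))
    return reversed_host + parts.path

urls = [
    "http://www.cnn.com/index.html",
    "http://maps.google.com/index.html",
    "http://money.cnn.com/",
]
# Sorting the keys places both cnn.com pages next to each other.
sorted_keys = sorted(row_key(u) for u in urls)
```

With plain URLs as keys, `maps.google.com` would sort between the two cnn.com hosts; with reversed hostnames the domain becomes a shared key prefix.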
  13. BigTable Implementation
      • BigTable uses the distributed Google File System (GFS)
      • MapReduce inputs and outputs can be stored in BigTable
      • BigTable uses the Google SSTable file format to store its data internally
      • SSTable: provides a persistent, ordered, immutable map from keys to values
        – Internally, an SSTable contains a sequence of blocks (~64 KB each)
        – A block index is stored at the end of the SSTable
        – With the index in memory, a lookup can be performed with a single disk seek
        – An SSTable can also be loaded entirely into memory, so that scans and lookups happen in memory
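The block-plus-index layout can be sketched in memory. `ToySSTable` is a hypothetical class written for this example; it is not the real file format, and the tiny `block_size` replaces the ~64 KB blocks.

```python
import bisect

class ToySSTable:
    """Immutable sorted map split into fixed-size blocks with a block index."""

    def __init__(self, items: dict, block_size: int = 2):
        ordered = sorted(items.items())
        self.blocks = [ordered[i:i + block_size]
                       for i in range(0, len(ordered), block_size)]
        # Block index: the last key of each block, kept in memory.
        self.index = [block[-1][0] for block in self.blocks]

    def get(self, key):
        # Binary search on the in-memory index picks the only block that can hold key.
        i = bisect.bisect_left(self.index, key)
        if i == len(self.blocks):
            return None
        # Scanning this one block stands in for the single disk read.
        for k, v in self.blocks[i]:
            if k == key:
                return v
        return None

sst = ToySSTable({"a": 1, "b": 2, "c": 3, "d": 4, "e": 5})
value = sst.get("c")
```

The index search touches no block data, so only one block has to be fetched per lookup.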
  14. BigTable Implementation
      • BigTable relies on a highly available, persistent distributed lock service called Chubby
      • A Chubby service consists of five active replicas, one of which is elected master. The Paxos algorithm keeps the replicas in sync
      • Chubby provides a namespace of directories and files, each of which can be used as a lock. Each read or write of a file is atomic
      • BigTable uses Chubby to:
        – Ensure that there is at most one Master node active at any time
        – Store the bootstrap location of BigTable data
        – Discover Tablet servers and finalize Tablet server deaths
        – Store BigTable schema information
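The "at most one Master" guarantee follows from an exclusive lock on a well-known file. The sketch below is a single-process stand-in, not the Chubby API: `ToyLockService`, its methods, and the lock path are all hypothetical.

```python
class ToyLockService:
    """In-memory stand-in for a lock service: one exclusive holder per file path."""

    def __init__(self):
        self._holders = {}

    def try_acquire(self, path: str, owner: str) -> bool:
        # setdefault records owner only if nobody holds the lock yet.
        return self._holders.setdefault(path, owner) == owner

    def release(self, path: str, owner: str) -> None:
        if self._holders.get(path) == owner:
            del self._holders[path]

locks = ToyLockService()
MASTER_LOCK = "/bigtable/master-lock"  # hypothetical path

a_is_master = locks.try_acquire(MASTER_LOCK, "server-a")
b_is_master = locks.try_acquire(MASTER_LOCK, "server-b")  # refused while a holds it
locks.release(MASTER_LOCK, "server-a")
b_after_release = locks.try_acquire(MASTER_LOCK, "server-b")
```

The real service adds replication, sessions, and lease expiry so that a lock held by a crashed server is eventually released.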
  15. BigTable Implementation
      • Three major components:
        – Master server
          • Assigns tablets to tablet servers
          • Handles schema changes: table and column family creation
        – Tablet servers
          • Each manages a set of tablets (typically between 10 and 1000)
          • Handle read/write requests
        – A library, linked into every client
          • Clients communicate directly with tablet servers
          • Clients cache tablet locations. On a cache miss or stale entry, the client looks the location up through the tablet location hierarchy and refreshes its local cache; it does not go through the Master
  16. BigTable Implementation
      • A three-level hierarchy stores tablet location information (analogous to a B+ tree)
      • A Chubby file contains the location of the root tablet
      • The root tablet contains the locations of all tablets of a special METADATA table
      • Each METADATA tablet contains the locations of a set of user tablets
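The three-level lookup can be sketched with nested range tables. All names and values here (`chubby_file`, `tablets`, the server names, the split keys, and the `"\uffff"` sentinel) are assumptions for illustration; real METADATA rows are keyed differently.

```python
import bisect

def lookup(entries, key):
    """entries: sorted list of (end_key, target). Return the target of the
    first range whose end key is >= key."""
    end_keys = [end for end, _ in entries]
    return entries[bisect.bisect_left(end_keys, key)][1]

LAST = "\uffff"  # sentinel end key that sorts after every key used here

chubby_file = "root"  # level 1: the Chubby file names the root tablet
tablets = {
    # level 2: the root tablet maps row ranges to METADATA tablets
    "root": [("m", "meta-1"), (LAST, "meta-2")],
    # level 3: METADATA tablets map row ranges to servers holding user tablets
    "meta-1": [("com.cnn.www", "server-a"), ("m", "server-b")],
    "meta-2": [("org.wikipedia.en", "server-c"), (LAST, "server-d")],
}

def locate(row_key: str) -> str:
    root = tablets[chubby_file]
    metadata_tablet = tablets[lookup(root, row_key)]
    return lookup(metadata_tablet, row_key)

server = locate("com.google.maps")
```

A client would cache the result of `locate` and repeat the walk only when a cached location turns out to be missing or stale.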
  17. BigTable Implementation – Tablet Serving
      • Updates are committed to a commit log that stores redo records. Authorization is checked against a Chubby file
      • Recently committed updates are stored in memory in the memtable; older updates are stored in SSTables
      • To recover a tablet, a tablet server reads its metadata from the METADATA table and rebuilds the memtable by applying the updates committed since the redo points
      • Read operations are served from a merged view of the memtable and the SSTables. Authorization is checked against a Chubby file
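The write, read, and recovery paths on this slide can be sketched with a small class. `ToyTablet` is hypothetical, plain dicts stand in for SSTables, and the whole log is replayed because this toy has no redo points.

```python
class ToyTablet:
    """Toy tablet: commit log + memtable + a list of immutable SSTables."""

    def __init__(self, sstables=None, log=None):
        self.sstables = list(sstables or [])  # dicts standing in for SSTables, oldest first
        self.commit_log = list(log or [])     # redo records as (key, value) pairs
        self.memtable = {}
        # Recovery: rebuild the memtable by replaying the redo records.
        for key, value in self.commit_log:
            self.memtable[key] = value

    def write(self, key, value):
        self.commit_log.append((key, value))  # log the redo record first
        self.memtable[key] = value            # then apply it in memory

    def read(self, key):
        # Merged view: memtable first, then SSTables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):
            if key in sstable:
                return sstable[key]
        return None

tablet = ToyTablet(sstables=[{"a": 1}])
tablet.write("b", 2)
tablet.write("a", 3)  # shadows the older value stored in the SSTable

# Simulate a server restart: only the SSTables and the commit log survive.
recovered = ToyTablet(sstables=tablet.sstables, log=tablet.commit_log)
```

Reading `"a"` from `recovered` returns the value from the replayed log rather than the older SSTable entry.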
  18. HBase – The Hadoop DB
      • The project's goal is to host very large tables (billions of rows by millions of columns) on clusters of commodity hardware
      • HBase is an open-source, distributed, versioned, column-oriented store modeled after Google's BigTable
      • HBase includes:
        – Convenient base classes for backing Hadoop MapReduce jobs with HBase tables, including Cascading, Hive, and Pig source and sink modules
        – Query predicate push-down via server-side scan and get filters
        – Optimizations for real-time queries
        – A Thrift gateway and a RESTful Web service that supports XML, Protobuf, and binary data encoding options
        – An extensible JRuby-based (JIRB) shell
  19. HBase – The Hadoop DB
      • Demo
  20. BigTable: Performance/Evaluations
  21. BigTable: Real Applications
  22. BigTable
      • Questions?