Bigtable: A Distributed Storage System for Structured Data
Presenters: Yunming Zhang, Conglong Li
Saturday, September 21, 13
References
SoCC 2010 keynote slides, Jeff Dean, Google
Introduction to Distributed Computing, Winter 2008, University of Washington
Motivation
Lots of (semi) structured data at Google
URLs
Contents, crawl metadata, links
Per-user data:
User preference settings, search results
Scale is large
Billions of URLs, hundreds of millions of users
Existing commercial databases don't meet the
requirements
Goals
Store and manage all the state reliably and efficiently
Allow asynchronous processes to update different
pieces of data continuously
Very high read/write rates
Efficient scans over all or interesting subsets of
data
Often want to examine data changes over time
BigTable vs. GFS
GFS provides raw data storage
We need:
More sophisticated storage
Key-value mapping
Flexible enough to be useful
Store semi-structured data
Reliable, scalable, etc.
BigTable
Bigtable is a distributed storage system for managing
large-scale structured data
Wide applicability
Scalability
High performance
High availability
Overview
Data Model
API
Implementation Structures
Optimizations
Performance Evaluation
Applications
Conclusions
Data Model
Sparse
Sorted
Multidimensional
Cell
Contains multiple versions of the data
A cell is located by row key, column key, and
timestamp
Treats data as an uninterpreted array of bytes, which lets
clients serialize various forms of structured and
semi-structured data
Supports automatic garbage collection per column
family for managing versioned data
Row
Row key is an arbitrary string
Access to column data in a row is atomic
Row creation is implicit upon storing data
Rows ordered lexicographically
Rows close together lexicographically usually reside
on one or a small number of machines
Columns
Columns are grouped into Column Families:
family:optional_qualifier
Column family
Has associated type information
Data in a column family is usually of the same type
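The data model slides above can be sketched as a sparse, sorted map from (row key, column key, timestamp) to uninterpreted bytes. This is only an illustrative sketch, not Bigtable's implementation: `TinyTable`, `put`, `get`, `scan_rows`, and the `max_versions` limit (a stand-in for per-column-family garbage collection) are all hypothetical names chosen here.

```python
import bisect


class TinyTable:
    """Toy sparse, sorted, multidimensional map (illustrative only)."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions  # stand-in for per-family version GC
        self.cells = {}      # (row, column) -> [(timestamp, value)], newest first
        self.row_keys = []   # row keys kept in lexicographic order

    def put(self, row, column, timestamp, value):
        if not isinstance(value, bytes):
            raise TypeError("values are uninterpreted bytes")
        i = bisect.bisect_left(self.row_keys, row)
        if i == len(self.row_keys) or self.row_keys[i] != row:
            self.row_keys.insert(i, row)  # row creation is implicit
        versions = self.cells.setdefault((row, column), [])
        versions.append((timestamp, value))
        versions.sort(key=lambda tv: tv[0], reverse=True)
        del versions[self.max_versions:]  # drop the oldest versions

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before timestamp (newest overall if None)."""
        for ts, value in self.cells.get((row, column), []):
            if timestamp is None or ts <= timestamp:
                return value
        return None  # sparse: absent cells cost nothing and read as None

    def scan_rows(self, start, end):
        """Row keys in [start, end), in lexicographic order."""
        lo = bisect.bisect_left(self.row_keys, start)
        hi = bisect.bisect_left(self.row_keys, end)
        return self.row_keys[lo:hi]
```

Column keys follow the `family:optional_qualifier` form, for example `table.put("com.cnn.www", "anchor:cnnsi.com", 1, b"CNN")`.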
API
Metadata operations
Create/delete tables, column families, change
metadata, modify access control list
Writes (atomic)
Set(), DeleteCells(), DeleteRow()
Reads
Scanner: read arbitrary cells in a BigTable
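The write and read operations listed above can be sketched as follows. This is a hedged Python analogue, not the real client library: the `Table`, `RowMutation`, `apply`, and `scan` names and signatures are assumptions made for illustration, and a single lock stands in for whatever per-row mechanism a real server uses.

```python
import threading


class RowMutation:
    """Buffers writes to a single row until Table.apply() is called."""

    def __init__(self, row_key):
        self.row_key = row_key
        self.ops = []

    def set(self, column, value):
        self.ops.append(("set", column, value))

    def delete_cells(self, column):
        self.ops.append(("delete_cells", column, None))

    def delete_row(self):
        self.ops.append(("delete_row", None, None))


class Table:
    """Toy in-memory table: row key -> {column key -> value}."""

    def __init__(self):
        self.rows = {}
        self._lock = threading.Lock()  # one lock for simplicity

    def apply(self, mutation):
        """Apply all buffered operations for one row as a single atomic step."""
        with self._lock:
            row = dict(self.rows.get(mutation.row_key, {}))  # work on a copy
            for op, column, value in mutation.ops:
                if op == "set":
                    row[column] = value
                elif op == "delete_cells":
                    row.pop(column, None)
                elif op == "delete_row":
                    row.clear()
            if row:
                self.rows[mutation.row_key] = row
            else:
                self.rows.pop(mutation.row_key, None)

    def scan(self, start_row="", end_row=None, family=None):
        """Yield (row, column, value) in row order, optionally for one family."""
        with self._lock:
            snapshot = {k: dict(v) for k, v in self.rows.items()}
        for row_key in sorted(snapshot):
            if row_key < start_row or (end_row is not None and row_key >= end_row):
                continue
            for column in sorted(snapshot[row_key]):
                if family is None or column.split(":", 1)[0] == family:
                    yield row_key, column, snapshot[row_key][column]
```

Because every operation in a `RowMutation` targets the same row, readers see either none or all of its changes.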
Tablets
Large tables broken into tablets at row boundaries
Tablet holds contiguous range of rows
Clients can often choose row keys for locality
Aim for ~100MB to 200MB of data per tablet
Serving machine responsible for ~100 tablets
Fast recovery:
100 machines each pick up 1 tablet from the failed machine
Fine-grained load balancing:
Migrate tablets away from overloaded machine
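As a rough sketch of how a table broken at row boundaries maps row keys to tablets, the following keeps a sorted list of tablet end keys and uses binary search to find the owner. `TabletMap`, its methods, and the server names are hypothetical; real splitting is driven by tablet size, which this sketch leaves out.

```python
import bisect


class TabletMap:
    """Maps row keys to tablets; each tablet covers a contiguous row range."""

    def __init__(self):
        # Tablet i holds rows in (end_keys[i-1], end_keys[i]]; the last tablet
        # has no end key and extends to the end of the table.
        self.end_keys = []        # sorted end keys of all but the last tablet
        self.servers = ["ts-0"]   # servers[i] serves tablet i (made-up names)

    def tablet_index(self, row_key):
        return bisect.bisect_left(self.end_keys, row_key)

    def server_for(self, row_key):
        return self.servers[self.tablet_index(row_key)]

    def split(self, split_key, new_server):
        """Split the tablet containing split_key at that row boundary."""
        i = self.tablet_index(split_key)
        if i < len(self.end_keys) and self.end_keys[i] == split_key:
            return  # already a boundary, nothing to do
        self.end_keys.insert(i, split_key)
        self.servers.insert(i + 1, new_server)  # upper half goes to new_server
```

Migrating a tablet away from an overloaded machine is then just reassigning one entry of `servers`, which is why small tablets give fine-grained load balancing.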
Tablets and Splitting
System Structure
Master
Metadata operations
Load balancing
Keep track of live tablet servers
Master failure
Tablet server
Accepts reads and writes to data
System Structure (diagrams; labels: read/write, Metadata operations)
Locating Tablets
Three-level hierarchical lookup scheme for tablets
A tablet's location is the ip:port of its server, stored in META tables
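A minimal sketch of a hierarchical location lookup, assuming the layout usually described for Bigtable: a well-known location for the root tablet, a root tablet that indexes the META tablets, and META tablets that index user tablets. All addresses, keys, and function names below are made up for illustration.

```python
import bisect


def lookup(index, key):
    """index is a sorted list of (end_key, value); return the value of the
    first entry whose end_key >= key."""
    end_keys = [end for end, _ in index]
    i = bisect.bisect_left(end_keys, key)
    if i == len(index):
        raise KeyError(key)
    return index[i][1]


# Level 1: a well-known place records which server holds the root tablet.
ROOT_LOCATION = "10.0.0.1:9000"

# Level 2: the root tablet maps row ranges to META tablet servers.
# Level 3: each META tablet maps row ranges to user tablet servers.
# Every entry is (end_key, ip:port); the data is invented for this sketch.
SERVERS = {
    "10.0.0.1:9000": [("m", "10.0.0.2:9000"), ("\xff", "10.0.0.3:9000")],
    "10.0.0.2:9000": [("f", "10.0.1.1:9000"), ("m", "10.0.1.2:9000")],
    "10.0.0.3:9000": [("t", "10.0.1.3:9000"), ("\xff", "10.0.1.4:9000")],
}


def locate_tablet(row_key):
    """Return the ip:port of the tablet server responsible for row_key."""
    meta_server = lookup(SERVERS[ROOT_LOCATION], row_key)
    return lookup(SERVERS[meta_server], row_key)
```

A real client would cache the result so that most reads and writes skip the lookup entirely.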
Tablet Representation and Serving
Append-only tablet log
SSTable on GFS
A sorted map from string to string
All data for a row is stored contiguously, so a row lookup reads one range
Memtable write buffer
A read must merge SSTable data with recent writes still held in the memtable
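The write and read paths on this slide can be sketched as below. It is a simplified model under stated assumptions: the `Tablet` class and its fields are hypothetical, SSTables are plain sorted lists, and a linear scan stands in for the block index a real SSTable would use.

```python
class Tablet:
    """Toy tablet: append-only log, in-memory memtable, immutable SSTables."""

    def __init__(self):
        self.log = []        # append-only commit log of (key, value)
        self.memtable = {}   # recent writes, not yet flushed to an SSTable
        self.sstables = []   # list of sorted [(key, value), ...], oldest first

    def write(self, key, value):
        self.log.append((key, value))  # log first, then buffer in memory
        self.memtable[key] = value

    def read(self, key):
        """Newest value wins: check the memtable, then SSTables newest first."""
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):
            for k, v in sstable:
                if k == key:
                    return v
        return None

    def scan(self):
        """Merged, sorted view across all SSTables and the memtable."""
        merged = {}
        for sstable in self.sstables:   # oldest first, newer values overwrite
            merged.update(sstable)
        merged.update(self.memtable)
        return sorted(merged.items())
```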
Compaction
Tablet state is represented as a set of immutable compacted
SSTable files, plus the tail of the log
Minor compaction:
When the in-memory buffer fills up, freeze it and write it out
as a new SSTable
Major compaction:
Periodically compact all SSTables for a tablet into a new base
SSTable on GFS
Storage from deletions is reclaimed at this point
Produces new SSTables
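The two compaction kinds can be sketched as pure functions over plain data. This is an illustration only: the function names are hypothetical, and representing a deletion as a `None` value (`TOMBSTONE`) is an assumption of this sketch.

```python
TOMBSTONE = None  # marker for a deleted key (assumption of this sketch)


def minor_compaction(memtable, sstables):
    """Freeze the memtable and write it out as a new immutable SSTable.

    Returns (fresh empty memtable, SSTable list with the new table appended).
    """
    frozen = tuple(sorted(memtable.items()))
    return {}, sstables + [frozen]


def major_compaction(sstables):
    """Merge all SSTables into one base SSTable, dropping deleted entries."""
    merged = {}
    for sstable in sstables:         # oldest first, newer values overwrite
        merged.update(sstable)
    base = tuple(sorted((k, v) for k, v in merged.items() if v is not TOMBSTONE))
    return [base]
```

Deleted data still occupies space after a minor compaction, because the tombstone is just another entry; only the major compaction physically drops it.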
Locality Groups
Clients can group multiple column families together
into a locality group
A separate SSTable is generated for each locality group
Enables more efficient reads
Can be declared to be in-memory
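To illustrate why a separate SSTable per locality group makes reads cheaper, the sketch below partitions cells by group and reads one column family by touching only its group's SSTable. The family names, the `LOCALITY_GROUPS` assignment, and both functions are hypothetical.

```python
from collections import defaultdict

# Hypothetical assignment of column families to locality groups.
LOCALITY_GROUPS = {"language": "meta", "checksum": "meta", "contents": "content"}


def build_sstables(cells):
    """Partition cells {(row, column): value} into one sorted SSTable per group."""
    groups = defaultdict(dict)
    for (row, column), value in cells.items():
        family = column.split(":", 1)[0]
        groups[LOCALITY_GROUPS[family]][(row, column)] = value
    return {name: sorted(table.items()) for name, table in groups.items()}


def read_family(sstables, family):
    """Read one family by scanning only its locality group's SSTable."""
    group = sstables.get(LOCALITY_GROUPS[family], [])
    return [(key, value) for key, value in group
            if key[1].split(":", 1)[0] == family]
```

A scan over small metadata families never has to read past large page contents stored in the other group.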
Compression
Many opportunities for compression
Similar values in columns and cells
Within each SSTable for a locality group, encode
compressed blocks
Keep blocks small for random access
Exploit the fact that many values are very similar
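Block-wise compression can be sketched as follows: each block is compressed on its own, so a random read decompresses one small block rather than the whole file. Python's `zlib` is used purely as a stand-in codec, and the block size and function names are assumptions of this sketch.

```python
import zlib

BLOCK_SIZE = 64 * 1024  # illustrative block size


def compress_blocks(data, block_size=BLOCK_SIZE):
    """Compress each block independently and record where each one starts."""
    blocks, index, offset = [], [], 0
    for start in range(0, len(data), block_size):
        packed = zlib.compress(data[start:start + block_size])
        index.append((offset, len(packed)))  # (offset, length) in the output
        blocks.append(packed)
        offset += len(packed)
    return b"".join(blocks), index


def read_block(compressed, index, block_number):
    """Decompress one block without touching the others."""
    offset, length = index[block_number]
    return zlib.decompress(compressed[offset:offset + length])
```

Smaller blocks make random reads cheaper but give the compressor less repeated content to exploit per block, which is the trade-off the slide points at.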
Commit Log and Recovery
Single commit log file per tablet server
Reduces the number of concurrent file writes to GFS
Tablet recovery
Redo points in the log
Replay the logged operations made since the last
persistent state
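Recovery from a shared commit log can be sketched as a filtered replay. The record layout `(sequence_number, tablet, key, value)`, the `redo_point` argument, and the function name are assumptions of this sketch, not the actual log format.

```python
def recover_tablet(sstable_state, commit_log, tablet, redo_point):
    """Rebuild a tablet's state by replaying log entries after its redo point.

    sstable_state: dict holding the tablet's last persistent state.
    commit_log: server-wide list of (sequence_number, tablet, key, value),
                shared by every tablet on the server, in write order.
    """
    memtable = {}
    for seq, owner, key, value in commit_log:
        # Skip other tablets' entries and anything already persisted.
        if owner == tablet and seq > redo_point:
            memtable[key] = value   # re-apply in the original order
    state = dict(sstable_state)
    state.update(memtable)
    return state
```

The cost of sharing one log shows up here: recovering a single tablet means reading past entries that belong to every other tablet the failed server hosted.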
Performance Evaluation
Test environment
Based on a GFS cell with 1786 machines
Two 400 GB IDE hard drives in each machine
Two-level tree-shaped switched network
Performance Tests
Random Read/Write
Sequential Read/Write
Single tablet-server performance
Random reads are the slowest
Each read transfers a 64 KB SSTable block from GFS to read a 1000-byte value
Random and sequential writes perform better
Each tablet server appends all writes to a single commit log
Group commit
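The random-read penalty is simple arithmetic on the two sizes quoted above. The helper below is a hypothetical name used only to make the ratio explicit.

```python
BLOCK_BYTES = 64 * 1024   # SSTable block fetched from GFS per random read
VALUE_BYTES = 1000        # bytes the benchmark actually wants per read


def read_amplification(block_bytes=BLOCK_BYTES, value_bytes=VALUE_BYTES):
    """Bytes moved over the network per useful byte returned."""
    return block_bytes / value_bytes


print(f"bytes transferred per useful byte: {read_amplification():.1f}")
```

Roughly 65 bytes cross the network for every byte the client asked for, which is why a smaller block size helps workloads dominated by random reads.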
Performance Scaling
Performance didn’t scale linearly
Load imbalance in multiple server configurations
Larger data transfer overhead
Google Analytics
A service that analyzes traffic patterns at web sites
Raw Click Table
Row for each end-user session
Row key is (website name, time)
Summary Table
Generated from the raw click table by MapReduce jobs that extract recent session data
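A row key built from (website name, time) keeps each site's sessions contiguous and in chronological order, because rows sort lexicographically. The sketch below assumes a `#` separator and zero-padded epoch seconds; both are illustrative choices, not the actual encoding.

```python
def session_row_key(website, session_start):
    """Row key = website name + zero-padded session start time (epoch seconds)."""
    return f"{website}#{session_start:010d}"


def sessions_for(row_keys, website):
    """All keys for one website, relying only on sorted order and the prefix."""
    prefix = website + "#"
    return [key for key in sorted(row_keys) if key.startswith(prefix)]
```

Zero padding matters: without it, a start time of 20 would sort before 3 as a string.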
Google Earth
Use one table for preprocessing and one for serving
Different latency requirements (disk vs memory)
Each row in the imagery table represents a single
geographic segment
Column family to store data source
One column for each raw image
Very sparse
Personalized Search
Row key is a unique userid
A column family for each type of user action
Replicated across Bigtable clusters to increase
availability and reduce latency
Conclusions
Bigtable provides highly scalable, high-performance,
highly available, and flexible storage for structured
data.
It provides a low-level read/write interface for
other frameworks to build on
It has enabled Google to deal with large-scale data
efficiently
41
Saturday, September 21, 13

More Related Content

What's hot

GOOGLE BIGTABLE
GOOGLE BIGTABLEGOOGLE BIGTABLE
GOOGLE BIGTABLE
Tomcy Thankachan
 
Google Big Table
Google Big TableGoogle Big Table
Google Big Table
Omar Al-Sabek
 
Google - Bigtable
Google - BigtableGoogle - Bigtable
Google - Bigtable
영원 서
 
Google BigTable
Google BigTableGoogle BigTable
Bigtable: A Distributed Storage System for Structured Data
Bigtable: A Distributed Storage System for Structured DataBigtable: A Distributed Storage System for Structured Data
Bigtable: A Distributed Storage System for Structured Data
elliando dias
 
Bigtable
BigtableBigtable
Bigtable
Amir Payberah
 
Bigtable
BigtableBigtable
Bigtable
zafargilani
 
Bigtable
BigtableBigtable
Big table
Big tableBig table
The Google Bigtable
The Google BigtableThe Google Bigtable
The Google Bigtable
Romain Jacotin
 
BigTable And Hbase
BigTable And HbaseBigTable And Hbase
BigTable And Hbase
Edward Yoon
 
Bigtable and Dynamo
Bigtable and DynamoBigtable and Dynamo
Bigtable and Dynamo
Iraklis Psaroudakis
 
Cloud Technology: Virtualization
Cloud Technology: VirtualizationCloud Technology: Virtualization
8. column oriented databases
8. column oriented databases8. column oriented databases
8. column oriented databases
Fabio Fumarola
 
Google cluster architecture
Google cluster architecture Google cluster architecture
Google cluster architecture
Abhijeet Desai
 
Column oriented database
Column oriented databaseColumn oriented database
Column oriented database
Kanike Krishna
 
Row or Columnar Database
Row or Columnar DatabaseRow or Columnar Database
Row or Columnar Database
Biju Nair
 
Bigtable and Boxwood
Bigtable and BoxwoodBigtable and Boxwood
Bigtable and Boxwood
Evan Weaver
 
Write intensive workloads and lsm trees
Write intensive workloads and lsm treesWrite intensive workloads and lsm trees
Write intensive workloads and lsm trees
Tilak Patidar
 
Rise of Column Oriented Database
Rise of Column Oriented DatabaseRise of Column Oriented Database
Rise of Column Oriented Database
Suvradeep Rudra
 

What's hot (20)

GOOGLE BIGTABLE
GOOGLE BIGTABLEGOOGLE BIGTABLE
GOOGLE BIGTABLE
 
Google Big Table
Google Big TableGoogle Big Table
Google Big Table
 
Google - Bigtable
Google - BigtableGoogle - Bigtable
Google - Bigtable
 
Google BigTable
Google BigTableGoogle BigTable
Google BigTable
 
Bigtable: A Distributed Storage System for Structured Data
Bigtable: A Distributed Storage System for Structured DataBigtable: A Distributed Storage System for Structured Data
Bigtable: A Distributed Storage System for Structured Data
 
Bigtable
BigtableBigtable
Bigtable
 
Bigtable
BigtableBigtable
Bigtable
 
Bigtable
BigtableBigtable
Bigtable
 
Big table
Big tableBig table
Big table
 
The Google Bigtable
The Google BigtableThe Google Bigtable
The Google Bigtable
 
BigTable And Hbase
BigTable And HbaseBigTable And Hbase
BigTable And Hbase
 
Bigtable and Dynamo
Bigtable and DynamoBigtable and Dynamo
Bigtable and Dynamo
 
Cloud Technology: Virtualization
Cloud Technology: VirtualizationCloud Technology: Virtualization
Cloud Technology: Virtualization
 
8. column oriented databases
8. column oriented databases8. column oriented databases
8. column oriented databases
 
Google cluster architecture
Google cluster architecture Google cluster architecture
Google cluster architecture
 
Column oriented database
Column oriented databaseColumn oriented database
Column oriented database
 
Row or Columnar Database
Row or Columnar DatabaseRow or Columnar Database
Row or Columnar Database
 
Bigtable and Boxwood
Bigtable and BoxwoodBigtable and Boxwood
Bigtable and Boxwood
 
Write intensive workloads and lsm trees
Write intensive workloads and lsm treesWrite intensive workloads and lsm trees
Write intensive workloads and lsm trees
 
Rise of Column Oriented Database
Rise of Column Oriented DatabaseRise of Column Oriented Database
Rise of Column Oriented Database
 

Viewers also liked

App Engine overview (Android meetup 06-10)
App Engine overview (Android meetup 06-10)App Engine overview (Android meetup 06-10)
App Engine overview (Android meetup 06-10)
jasonacooper
 
Bigtable a distributed storage system
Bigtable a distributed storage systemBigtable a distributed storage system
Bigtable a distributed storage system
Devyani Vaidya
 
Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3
Eric Evans
 
Cassandra Explained
Cassandra ExplainedCassandra Explained
Cassandra Explained
Eric Evans
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache Cassandra
DataStax
 
Cassandra NoSQL Tutorial
Cassandra NoSQL TutorialCassandra NoSQL Tutorial
Cassandra NoSQL Tutorial
Michelle Darling
 

Viewers also liked (6)

App Engine overview (Android meetup 06-10)
App Engine overview (Android meetup 06-10)App Engine overview (Android meetup 06-10)
App Engine overview (Android meetup 06-10)
 
Bigtable a distributed storage system
Bigtable a distributed storage systemBigtable a distributed storage system
Bigtable a distributed storage system
 
Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3Cassandra By Example: Data Modelling with CQL3
Cassandra By Example: Data Modelling with CQL3
 
Cassandra Explained
Cassandra ExplainedCassandra Explained
Cassandra Explained
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache Cassandra
 
Cassandra NoSQL Tutorial
Cassandra NoSQL TutorialCassandra NoSQL Tutorial
Cassandra NoSQL Tutorial
 

Similar to Big table presentation-final

Sadcw 6e chapter12
Sadcw 6e chapter12Sadcw 6e chapter12
Sadcw 6e chapter12
Matthew McKenzie
 
Comparing sql and nosql dbs
Comparing sql and nosql dbsComparing sql and nosql dbs
Comparing sql and nosql dbs
Vasilios Kuznos
 
B036407011
B036407011B036407011
B036407011
theijes
 
Microsoft SQL Server - Reduce Your Cost and Improve your Agility Presentation
Microsoft SQL Server - Reduce Your Cost and Improve your Agility PresentationMicrosoft SQL Server - Reduce Your Cost and Improve your Agility Presentation
Microsoft SQL Server - Reduce Your Cost and Improve your Agility Presentation
Microsoft Private Cloud
 
Ch 7 Physical D B Design
Ch 7  Physical D B  DesignCh 7  Physical D B  Design
Ch 7 Physical D B Design
guest8fdbdd
 
Basic and Introduction to DBMS Unit 1 of AU
Basic and Introduction to DBMS Unit 1 of AUBasic and Introduction to DBMS Unit 1 of AU
Basic and Introduction to DBMS Unit 1 of AU
infant2404
 
Fundamentals of database system - Database System Concepts and Architecture
Fundamentals of database system - Database System Concepts and ArchitectureFundamentals of database system - Database System Concepts and Architecture
Fundamentals of database system - Database System Concepts and Architecture
Mustafa Kamel Mohammadi
 
Presentation on Databases in the Cloud
Presentation on Databases in the CloudPresentation on Databases in the Cloud
Presentation on Databases in the Cloud
moshfiq
 
Speed layer : Real time views in LAMBDA architecture
Speed layer : Real time views in LAMBDA architecture Speed layer : Real time views in LAMBDA architecture
Speed layer : Real time views in LAMBDA architecture
Tin Ho
 
NoSQL
NoSQLNoSQL
Srds Pres011120
Srds Pres011120Srds Pres011120
Srds Pres011120
Rudolf Husar
 
2004-11-13 Supersite Relational Database Project: (Data Portal?)
2004-11-13 Supersite Relational Database Project: (Data Portal?)2004-11-13 Supersite Relational Database Project: (Data Portal?)
2004-11-13 Supersite Relational Database Project: (Data Portal?)
Rudolf Husar
 
S18 das
S18 dasS18 das
Data Warehousing
Data WarehousingData Warehousing
Data Warehousing
SHIKHA GAUTAM
 
hpc2013_20131223
hpc2013_20131223hpc2013_20131223
hpc2013_20131223
Ryohei Kobayashi
 
Climbing the beanstalk
Climbing the beanstalkClimbing the beanstalk
Climbing the beanstalk
gordonyorke
 
17-NoSQL.pptx
17-NoSQL.pptx17-NoSQL.pptx
17-NoSQL.pptx
levichan1
 
Facade
FacadeFacade
Facade
Louis Zhang
 
Data management in cloud study of existing systems and future opportunities
Data management in cloud study of existing systems and future opportunitiesData management in cloud study of existing systems and future opportunities
Data management in cloud study of existing systems and future opportunities
Editor Jacotech
 
Windows Azure - Uma Plataforma para o Desenvolvimento de Aplicações
Windows Azure - Uma Plataforma para o Desenvolvimento de AplicaçõesWindows Azure - Uma Plataforma para o Desenvolvimento de Aplicações
Windows Azure - Uma Plataforma para o Desenvolvimento de Aplicações
Comunidade NetPonto
 

Similar to Big table presentation-final (20)

Sadcw 6e chapter12
Sadcw 6e chapter12Sadcw 6e chapter12
Sadcw 6e chapter12
 
Comparing sql and nosql dbs
Comparing sql and nosql dbsComparing sql and nosql dbs
Comparing sql and nosql dbs
 
B036407011
B036407011B036407011
B036407011
 
Microsoft SQL Server - Reduce Your Cost and Improve your Agility Presentation
Microsoft SQL Server - Reduce Your Cost and Improve your Agility PresentationMicrosoft SQL Server - Reduce Your Cost and Improve your Agility Presentation
Microsoft SQL Server - Reduce Your Cost and Improve your Agility Presentation
 
Ch 7 Physical D B Design
Ch 7  Physical D B  DesignCh 7  Physical D B  Design
Ch 7 Physical D B Design
 
Basic and Introduction to DBMS Unit 1 of AU
Basic and Introduction to DBMS Unit 1 of AUBasic and Introduction to DBMS Unit 1 of AU
Basic and Introduction to DBMS Unit 1 of AU
 
Fundamentals of database system - Database System Concepts and Architecture
Fundamentals of database system - Database System Concepts and ArchitectureFundamentals of database system - Database System Concepts and Architecture
Fundamentals of database system - Database System Concepts and Architecture
 
Presentation on Databases in the Cloud
Presentation on Databases in the CloudPresentation on Databases in the Cloud
Presentation on Databases in the Cloud
 
Speed layer : Real time views in LAMBDA architecture
Speed layer : Real time views in LAMBDA architecture Speed layer : Real time views in LAMBDA architecture
Speed layer : Real time views in LAMBDA architecture
 
NoSQL
NoSQLNoSQL
NoSQL
 
Srds Pres011120
Srds Pres011120Srds Pres011120
Srds Pres011120
 
2004-11-13 Supersite Relational Database Project: (Data Portal?)
2004-11-13 Supersite Relational Database Project: (Data Portal?)2004-11-13 Supersite Relational Database Project: (Data Portal?)
2004-11-13 Supersite Relational Database Project: (Data Portal?)
 
S18 das
S18 dasS18 das
S18 das
 
Data Warehousing
Data WarehousingData Warehousing
Data Warehousing
 
hpc2013_20131223
hpc2013_20131223hpc2013_20131223
hpc2013_20131223
 
Climbing the beanstalk
Climbing the beanstalkClimbing the beanstalk
Climbing the beanstalk
 
17-NoSQL.pptx
17-NoSQL.pptx17-NoSQL.pptx
17-NoSQL.pptx
 
Facade
FacadeFacade
Facade
 
Data management in cloud study of existing systems and future opportunities
Data management in cloud study of existing systems and future opportunitiesData management in cloud study of existing systems and future opportunities
Data management in cloud study of existing systems and future opportunities
 
Windows Azure - Uma Plataforma para o Desenvolvimento de Aplicações
Windows Azure - Uma Plataforma para o Desenvolvimento de AplicaçõesWindows Azure - Uma Plataforma para o Desenvolvimento de Aplicações
Windows Azure - Uma Plataforma para o Desenvolvimento de Aplicações
 

Recently uploaded

GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
Neo4j
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
Matthew Sinclair
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
tolgahangng
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
danishmna97
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
Kari Kakkonen
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
Neo4j
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
innovationoecd
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
名前 です男
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
Neo4j
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
Neo4j
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
Matthew Sinclair
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
Octavian Nadolu
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Safe Software
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
Daiki Mogmet Ito
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
Kumud Singh
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
Zilliz
 

Recently uploaded (20)

GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
みなさんこんにちはこれ何文字まで入るの?40文字以下不可とか本当に意味わからないけどこれ限界文字数書いてないからマジでやばい文字数いけるんじゃないの?えこ...
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
 
20240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 202420240609 QFM020 Irresponsible AI Reading List May 2024
20240609 QFM020 Irresponsible AI Reading List May 2024
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
 

Big table presentation-final

  • 1. A Distributed Storage System for Structured Data Bigtable Presenter: Yunming Zhang Conglong Li Saturday, September 21, 13
  • 2. References SOCC 2010 Key Note Slides Jeff Dean Google Introduction to Distributed Computing, Winter 2008 University of Washington 2 Saturday, September 21, 13
  • 3. Motivation Lots of (semi) structured data at Google URLs Contents, crawl metadata, links Per-user data: User preference settings, search results Scale is large Billions of URLs, hundreds of million of users, Existing Commercial database doesn’t meet the requirements 3 Saturday, September 21, 13
  • 4. Store and manage all the state reliably and efficiently Allow asynchronous processes to update different pieces of data continuously Very high read/write rates Efficient scans over all or interesting subsets of data Often want to examine data changes over time Goals 4 Saturday, September 21, 13
  • 5. BigTable vs. GFS GFS provides raw data storage We need: More sophisticated storage Key - value mapping Flexible enough to be useful Store semi-structured data Reliable, scalable, etc. 5 Saturday, September 21, 13
  • 6. BigTable Bigtable is a distributed storage system for managing large scale structured data Wide applicability Scalability High performance High availability 6 Saturday, September 21, 13
  • 7. Overview Data Model API Implementation Structures Optimizations Performance Evaluation Applications Conclusions 7 Saturday, September 21, 13
  • 9. Cell: Contains multiple versions of the data. A value is located by row key, column key, and timestamp. Data is treated as an uninterpreted array of bytes, which lets clients serialize various forms of structured and semi-structured data. Supports automatic garbage collection per column family to manage versioned data.
  • 10. Goals: Store and manage all the state reliably and efficiently. Allow asynchronous processes to update different pieces of data continuously. Very high read/write rates. Efficient scans over all or interesting subsets of data. Often want to examine data changes over time.
  • 11. Row: Row key is an arbitrary string. Access to column data in a row is atomic. Row creation is implicit upon storing data. Rows are ordered lexicographically. Rows close together lexicographically usually reside on one or a small number of machines.
  • 12. Columns: Columns are grouped into column families and named family:optional_qualifier. A column family has associated type information; data in a column family is usually of the same type.
  • 13. Overview: Data Model, API, Implementation Structures, Optimizations, Performance Evaluation, Applications, Conclusions.
  • 14. API: Metadata operations: create/delete tables and column families, change metadata, modify access control lists. Writes (atomic): Set(), DeleteCells(), DeleteRow(). Reads: Scanner reads arbitrary cells in a Bigtable.
  • 15. Overview: Data Model, API, Implementation Structures, Optimizations, Performance Evaluation, Applications, Conclusions.
  • 16. Tablets: Large tables are broken into tablets at row boundaries. A tablet holds a contiguous range of rows. Clients can often choose row keys for locality. Aim for ~100 MB to 200 MB of data per tablet. Each serving machine is responsible for ~100 tablets. Fast recovery: 100 machines each pick up 1 tablet from a failed machine. Fine-grained load balancing: migrate tablets away from an overloaded machine.
  • 18. System Structure: Master: metadata operations; load balancing; keeps track of live tablet servers; master failure. Tablet server: accepts reads and writes to data.
  • 22. Locating Tablets: 3-level hierarchical lookup scheme for tablets. A location is the IP address and port of the relevant server, stored in META tables.
  • 23. Tablet Representation and Serving: Append-only tablet log. SSTable on GFS: a sorted map of string to string, so all the data for a row is contiguous. Memtable: in-memory write buffer. When a read comes in, the SSTable data has to be merged with the recent values in the memtable that have not yet been written to an SSTable.
  • 26. Compaction: Tablet state is represented as a set of immutable compacted SSTable files plus the tail of the log. Minor compaction: when the in-memory buffer fills up, it is frozen and written out as a new SSTable. Major compaction: periodically compact all SSTables for a tablet into a new base SSTable on GFS; storage from deletions is reclaimed at this point. Produces new tables.
  • 27. Overview: Data Model, API, Implementation Structures, Optimizations, Performance Evaluation, Applications, Conclusions.
  • 28. Goals: Reliable system for storing and managing all the state. Allow asynchronous processes to update different pieces of data continuously. Very high read/write rates. Efficient scans over all or interesting subsets of data. Often want to examine data changes over time.
  • 29. Locality Groups: Clients can group multiple column families together into a locality group. A separate SSTable is generated for each locality group. Enables more efficient reads. Can be declared to be in-memory.
  • 30. Compression: Many opportunities for compression: similar values in columns and cells. Within each SSTable for a locality group, encode compressed blocks. Keep blocks small for random access. Exploit the fact that many values are very similar.
  • 31. Goals: Reliable system for storing and managing all the state. Allow asynchronous processes to update different pieces of data continuously. Very high read/write rates. Efficient scans over all or interesting subsets of data. Often want to examine data changes over time.
  • 32. Commit Log and Recovery: A single commit log file per tablet server reduces the number of concurrent file writes to GFS. Tablet recovery: starting from the redo points in the log, perform the same set of operations from the last persistent state.
  • 33. Overview: Data Model, API, Implementation Structures, Optimizations, Performance Evaluation, Applications, Conclusions.
  • 34. Performance Evaluation: Test environment: based on GFS with 1876 machines; 400 GB IDE hard drives in each machine; two-level tree-shaped switched network. Performance tests: random read/write, sequential read/write.
  • 35. Single Tablet-Server Performance: Random reads are the slowest: a 64 KB SSTable block is transferred over GFS to read 1000 bytes. Random and sequential writes perform better: the server appends writes to a single commit log and uses group commit.
  • 36. Performance Scaling: Performance did not scale linearly. Load imbalance in multiple-server configurations. Larger data transfer overhead.
  • 37. Overview: Data Model, API, Implementation Structures, Optimizations, Performance Evaluation, Applications, Conclusions.
  • 38. Google Analytics: A service that analyzes traffic patterns at web sites. Raw click table: a row for each end-user session; row key is (website name, time). Summary table: extracts recent session data using MapReduce jobs.
  • 39. Google Earth: Uses one table for preprocessing and one for serving, with different latency requirements (disk vs. memory). Each row in the imagery table represents a single geographic segment. A column family stores the data source, with one column for each raw image. Very sparse.
  • 40. Personalized Search: Row key is a unique user ID. A column family for each type of user action. Replicated across Bigtable clusters to increase availability and reduce latency.
  • 41. Conclusions: Bigtable provides highly scalable, high-performance, highly available, and flexible storage for structured data. It provides a low-level read/write interface for other frameworks to build on. It has enabled Google to deal with large-scale data efficiently.
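
The data model on slides 9 to 12 (row key, family:qualifier column key, timestamp, per-family garbage collection, lexicographic row order) can be illustrated with a small in-memory sketch. This is not Bigtable's API: the class `TinyTable`, its method names, and the "keep the newest N versions" policy are assumptions made for illustration only.

```python
from collections import defaultdict


class TinyTable:
    """Toy model of a sparse, sorted, multidimensional map (hypothetical)."""

    def __init__(self, max_versions):
        # max_versions: {column family: number of versions kept per cell}
        self.max_versions = dict(max_versions)
        # row key -> column key ("family:qualifier") -> {timestamp: bytes}
        self.rows = defaultdict(lambda: defaultdict(dict))

    def set(self, row, column, timestamp, value):
        family = column.split(":", 1)[0]
        if family not in self.max_versions:
            raise KeyError(f"unknown column family: {family}")
        versions = self.rows[row][column]  # the row is created implicitly
        versions[timestamp] = value
        # Per-family garbage collection: drop all but the newest N versions.
        keep = self.max_versions[family]
        for old in sorted(versions, reverse=True)[keep:]:
            del versions[old]

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before `timestamp` (or the newest)."""
        versions = self.rows.get(row, {}).get(column, {})
        candidates = [t for t in versions if timestamp is None or t <= timestamp]
        return versions[max(candidates)] if candidates else None

    def scan(self, start_row="", end_row=None):
        """Yield row keys in lexicographic order within [start_row, end_row)."""
        for row in sorted(self.rows):
            if row >= start_row and (end_row is None or row < end_row):
                yield row


table = TinyTable({"contents": 2, "anchor": 1})
table.set("com.cnn.www", "contents:", 1, b"<html>v1")
table.set("com.cnn.www", "contents:", 2, b"<html>v2")
table.set("com.cnn.www", "contents:", 3, b"<html>v3")
table.set("com.cnn.www", "anchor:cnnsi.com", 5, b"CNN")
table.set("com.bbc.www", "contents:", 4, b"<html>bbc")

latest = table.get("com.cnn.www", "contents:")
as_of_2 = table.get("com.cnn.www", "contents:", timestamp=2)
collected = table.get("com.cnn.www", "contents:", timestamp=1)
row_order = list(table.scan())
```

Because only two versions are kept for the `contents` family, the write at timestamp 1 is discarded once timestamp 3 arrives, which is the kind of versioned-data cleanup slide 9 mentions. Empty cells take no space, which is what makes the map sparse.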
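
Slides 16 and 22 describe tablets as contiguous row ranges found through a hierarchical lookup. A minimal sketch of that idea, assuming each index entry stores the last row key of a range plus a target: only two index levels are modeled, how the root itself is found is left out, and `RangeIndex`, `locate`, the sentinel key, and the addresses are all hypothetical.

```python
import bisect

LAST = "\uffff"  # assumed sentinel that sorts after every real row key


class RangeIndex:
    """Maps a row key to the entry whose range ends at or after that key."""

    def __init__(self, entries):
        # entries: iterable of (end_row, target); ranges are contiguous and
        # each range includes its end_row.
        self.entries = sorted(entries, key=lambda e: e[0])
        self.end_rows = [end for end, _ in self.entries]

    def lookup(self, row):
        # First entry whose end_row is >= row holds the range containing row.
        i = bisect.bisect_left(self.end_rows, row)
        if i == len(self.entries):
            raise KeyError(row)
        return self.entries[i][1]


def locate(root, row):
    """Root index -> META-level index -> 'ip:port' of the serving machine."""
    meta = root.lookup(row)
    return meta.lookup(row)


meta_a = RangeIndex([("c", "10.0.0.1:9000"), ("g", "10.0.0.2:9000")])
meta_b = RangeIndex([("p", "10.0.0.3:9000"), (LAST, "10.0.0.4:9000")])
root = RangeIndex([("g", meta_a), (LAST, meta_b)])

where_apple = locate(root, "apple")
where_horse = locate(root, "horse")
where_zebra = locate(root, "zebra")
```

Keeping row ranges sorted is what lets each level answer a lookup with a binary search, and it is also why rows that are close lexicographically tend to land on the same server.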
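
The read and compaction path on slides 23 and 26 can be sketched the same way. This toy `TinyTablet` is an assumption-laden illustration, not the real implementation: it keeps everything in memory, ignores timestamps and GFS, and uses a hypothetical `TOMBSTONE` marker to stand in for deletions.

```python
TOMBSTONE = object()  # hypothetical deletion marker


class TinyTablet:
    """Toy tablet: commit log, memtable, and immutable sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit
        self.log = []        # append-only commit log
        self.memtable = {}   # recent writes not yet in any SSTable
        self.sstables = []   # immutable sorted runs, oldest first

    def write(self, key, value):
        self.log.append((key, value))  # log first, then apply
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.minor_compaction()

    def delete(self, key):
        self.write(key, TOMBSTONE)

    def minor_compaction(self):
        # Freeze the full memtable and write it out as a new sorted run.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}

    def major_compaction(self):
        # Merge every run into one base run; newer runs overwrite older ones,
        # and deleted keys are dropped, which reclaims their storage.
        merged = {}
        for run in self.sstables:
            merged.update(run)
        live = sorted((k, v) for k, v in merged.items() if v is not TOMBSTONE)
        self.sstables = [tuple(live)]

    def read(self, key):
        # Merge view: memtable first, then runs from newest to oldest.
        sources = [self.memtable] + [dict(run) for run in reversed(self.sstables)]
        for source in sources:
            if key in source:
                value = source[key]
                return None if value is TOMBSTONE else value
        return None


tablet = TinyTablet(memtable_limit=2)
tablet.write("a", b"1")
tablet.write("b", b"2")      # memtable is full: minor compaction
tablet.write("a", b"3")      # newer value lives only in the memtable
read_a = tablet.read("a")
read_b = tablet.read("b")
tablet.delete("b")           # fills the memtable again: second run
runs_before = len(tablet.sstables)
tablet.major_compaction()
runs_after = len(tablet.sstables)
```

Reads consult the newest source first, so a value still in the memtable shadows older copies in the SSTables. Only the major compaction can safely discard deletion markers, because it is the only step that sees every run at once.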