Bigtable

A Distributed Storage System for
         Structured Data

        Authors: Fay Chang et al.

        Presenter: Zafar Gilani


Outline
•   Introduction
•   Data model
•   Implementation
•   Performance evaluation
•   Conclusions






A distributed storage system ..
• .. for managing structured data.
• Used for demanding workloads, such as:
   – Throughput-oriented batch processing.
   – Serving latency-sensitive data to the client.
• Dynamic control over data layout and format, instead of
  a full relational model.
• Data locality properties (revisited briefly later).





Bigtable has achieved several goals
• Wide applicability: used for 60+ Google
  products, including:
  – Google Analytics, Google Code, Google Earth,
    Google Maps and Gmail.
• Scalability (explained later under evaluation).
• High performance.
• High availability.


  Data model
  • Essentially a sparse, distributed, persistent
    multi-dimensional sorted map.
  • The map is indexed by a row key, column key
    and a timestamp.
  • Atomic reads and writes over a single row.
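
As a rough illustration of the sorted-map idea above, here is a minimal in-memory sketch. The class name `SparseSortedMap` and its methods are hypothetical, chosen only for this example; they are not Bigtable's API, and the sketch does not model distribution, persistence or per-row atomicity.

```python
import bisect


class SparseSortedMap:
    """Toy stand-in for a (row, column, timestamp) -> value map (hypothetical)."""

    def __init__(self):
        self._keys = []    # sorted list of (row, column, -timestamp)
        self._values = {}  # key tuple -> value; absent cells take no space

    def put(self, row, column, timestamp, value):
        key = (row, column, -timestamp)  # negated so newer versions sort first
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before `timestamp` (latest if None)."""
        bound = float("-inf") if timestamp is None else -timestamp
        i = bisect.bisect_left(self._keys, (row, column, bound))
        if i < len(self._keys) and self._keys[i][:2] == (row, column):
            return self._values[self._keys[i]]
        return None

    def scan_rows(self, start_row, end_row):
        """Yield (row, column, timestamp, value) for start_row <= row < end_row."""
        i = bisect.bisect_left(self._keys, (start_row,))
        while i < len(self._keys) and self._keys[i][0] < end_row:
            row, column, neg_ts = self._keys[i]
            yield row, column, -neg_ts, self._values[self._keys[i]]
            i += 1


# Illustrative data only; the timestamps and the second value are made up.
table = SparseSortedMap()
table.put("com.bbc.www", "anchor:bbcworld.com", 1, "BBC")
table.put("com.bbc.www", "anchor:bbcworld.com", 2, "BBC World")
table.put("org.example.www", "anchor:bbcworld.com", 1, "Example")
latest = table.get("com.bbc.www", "anchor:bbcworld.com")
older = table.get("com.bbc.www", "anchor:bbcworld.com", timestamp=1)
```

Keeping the keys in one sorted list is what makes row-range scans cheap: `scan_rows` walks a contiguous slice instead of probing every cell.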
  [Figure: a table with axes labelled Columns and Rows.]


Row and column range
Rows:
• Row range dynamically partitioned into tablets.
• Data kept in lexicographic order by row key.
• Allows data locality.
Columns:
• Column keys grouped into column families.
• Data within a family is usually of the same type.
• Allows access control and disk or memory accounting.




               Enables reasoning about data locality
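
A small sketch of how lexicographic row order and tablet boundaries give locality. It assumes row keys are reversed hostnames, as the “com.bbc.www” key in this deck suggests. The helper names, the host list and the split key are hypothetical.

```python
import bisect


def row_key_for_host(hostname):
    """Reverse the hostname components so pages of one domain sort together."""
    return ".".join(reversed(hostname.split(".")))


def tablet_for_row(row_key, split_keys):
    """Return the index of the tablet that owns row_key.

    split_keys is a sorted list of hypothetical tablet end keys: tablet i holds
    the rows up to and including split_keys[i]; the last tablet holds the rest.
    """
    return bisect.bisect_left(split_keys, row_key)


hosts = ["www.bbc.com", "maps.google.com", "news.bbc.com", "www.example.org"]
rows = sorted(row_key_for_host(h) for h in hosts)
splits = ["com.bbc.www"]  # assumed boundary between tablet 0 and tablet 1
placement = {row: tablet_for_row(row, splits) for row in rows}
```

With these assumed inputs, both bbc.com rows are adjacent in sort order and fall in the same tablet, so a scan over one domain touches few tablets.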

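[Figure: example table with axes labelled Columns and Rows; rows are grouped
 into Tablets. Anchor is a column family.
 Row key: “com.bbc.www”.
 Column keys: “anchor:bbcworld.com”, “anchor:weather” [label truncated in the source].
 Cell values: “BBC”, “BBC.com”.]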


Bigtable uses several other technologies
• Google File System (GFS) to store log and data files.
• SSTable file format to store Bigtable data.
• Chubby, a distributed lock service.




For more details on these technologies, refer to section 4 of the paper.




  Implementation
Master responsibilities:
- Assign tablets to tablet servers.
- Track the addition and removal of tablet servers.
- Balance tablet server load.
- Garbage collection (GC).
- Schema changes.

[Figure: architecture diagram. Labels: INTERNET, CLIENT, MASTER, TABLET SERVERS.
 Clients communicate directly with tablet servers.]


How is data stored?
A three-level hierarchy, similar to a B+ tree.






Location hierarchy
• A Chubby file contains the location of the root tablet.
• The root tablet contains the locations of all tablets in the
  Metadata table.
• The Metadata table stores the locations of the actual tablets.

  If the location of a tablet is unknown or incorrect, the client moves up
  the hierarchy (Metadata -> Root -> Chubby).
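
The lookup can be sketched as below. Everything here is a simplified stand-in: `CHUBBY_FILE`, `TABLETS`, the server names and the `Client` class are hypothetical, the cache is keyed by exact row key rather than by row range, and each index lookup is counted as one simulated round trip.

```python
import bisect

# Level 1: a stand-in for the Chubby file that names the root tablet.
CHUBBY_FILE = {"root_tablet": "root"}

# Levels 2 and 3: each index is a sorted list of (end_key, target).
TABLETS = {
    "root": [("m", "meta-1"), ("\uffff", "meta-2")],
    "meta-1": [("com.bbc.www", "tabletserver-a"), ("m", "tabletserver-b")],
    "meta-2": [("\uffff", "tabletserver-c")],
}


def find_entry(index, key):
    """Return the target of the first entry whose end key is >= key."""
    end_keys = [end for end, _ in index]
    i = bisect.bisect_left(end_keys, key)
    if i == len(index):
        raise KeyError(key)
    return index[i][1]


class Client:
    def __init__(self):
        self.root = None        # cached name of the root tablet
        self.meta_cache = {}    # row key -> Metadata tablet name
        self.tablet_cache = {}  # row key -> tablet server name
        self.round_trips = 0    # simulated network round trips

    def locate(self, row_key):
        if row_key in self.tablet_cache:
            return self.tablet_cache[row_key]
        if row_key not in self.meta_cache:
            if self.root is None:  # top of the hierarchy: read Chubby
                self.root = CHUBBY_FILE["root_tablet"]
                self.round_trips += 1
            self.meta_cache[row_key] = find_entry(TABLETS[self.root], row_key)
            self.round_trips += 1
        server = find_entry(TABLETS[self.meta_cache[row_key]], row_key)
        self.round_trips += 1
        self.tablet_cache[row_key] = server
        return server

    def invalidate(self, row_key):
        """Drop a location that turned out to be wrong (e.g. tablet moved)."""
        self.tablet_cache.pop(row_key, None)


client = Client()
first = client.locate("com.bbc.news")
cold_trips = client.round_trips
client.locate("com.bbc.news")
warm_trips = client.round_trips
client.invalidate("com.bbc.news")
client.locate("com.bbc.news")
after_invalidate_trips = client.round_trips
```

In this sketch a cold cache costs three lookups (Chubby, root, Metadata), a warm cache costs none, and a stale tablet entry is repaired by stepping up one level only.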


How is data served?






Tablet serving

[Figure: tablet representation. Surviving labels: Persistent, Compactions.]

Compactions occur regularly. Advantages:
- Shrink memory usage.
- Reduce the amount of data read from the log during recovery.
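
The two advantages can be seen in a toy model. This is a sketch under assumptions: the `Tablet` class, its `memtable_limit` parameter and the in-memory lists standing in for the commit log and SSTable files are all hypothetical, and only a minor compaction (memtable to a new SSTable) is modelled.

```python
from types import MappingProxyType


class Tablet:
    """Toy tablet: commit log + in-memory memtable + immutable SSTables."""

    def __init__(self, memtable_limit=2):
        self.log = []       # redo records not yet covered by an SSTable
        self.memtable = {}  # recent writes, held in memory
        self.sstables = []  # read-only sorted snapshots, newest last
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.log.append((key, value))  # 1. record the mutation in the log
        self.memtable[key] = value     # 2. apply it to the memtable
        if len(self.memtable) >= self.memtable_limit:
            self.minor_compaction()

    def minor_compaction(self):
        # Freeze the memtable into a sorted, read-only snapshot.
        snapshot = MappingProxyType(dict(sorted(self.memtable.items())))
        self.sstables.append(snapshot)
        self.memtable = {}  # memory usage shrinks
        self.log.clear()    # these records never need replaying again

    def read(self, key):
        # Merged view: memtable first, then SSTables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):
            if key in sstable:
                return sstable[key]
        return None

    def recover(self):
        """Rebuild the memtable from the log; return records replayed."""
        self.memtable = {}
        for key, value in self.log:
            self.memtable[key] = value
        return len(self.log)


tablet = Tablet(memtable_limit=2)
tablet.write("a", 1)
tablet.write("b", 2)  # reaches the limit and triggers a minor compaction
tablet.write("a", 3)  # newer value lives only in the memtable and the log
replayed = tablet.recover()
```

After the compaction only one log record is left to replay, which is the recovery saving the slide refers to.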


Benchmarks for performance evaluation
• Scan:
  – Scans over values in a row range.
• Random reads from memory.
• Random reads/writes:
  – R keys read/written in random order, spread over N clients.
• Sequential reads/writes:
  – Keys 0 to R-1 read/written in order, spread over N
    clients.
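
One way to picture how R keys might be spread over N clients is sketched below. This is an assumption-laden illustration, not the benchmark's actual scheduler: the function names are hypothetical, the sequential split uses one contiguous range per client, and the random order comes from a seeded shuffle.

```python
import random


def sequential_ranges(r, n_clients):
    """Split keys 0..r-1 into n_clients contiguous, near-equal ranges."""
    base, extra = divmod(r, n_clients)
    ranges, start = [], 0
    for c in range(n_clients):
        size = base + (1 if c < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


def random_keys(r, n_clients, seed=0):
    """Give each client a share of the r keys in shuffled order."""
    keys = list(range(r))
    random.Random(seed).shuffle(keys)  # seeded so runs are repeatable
    return [keys[c::n_clients] for c in range(n_clients)]


seq = sequential_ranges(10, 3)
rnd = random_keys(10, 3)
```

Either way every key is touched exactly once; only the order in which a client visits keys differs, and that order decides how often a fetched block can be reused.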



Performance evaluation
• Scan uses a single RPC call to fetch many values and shows the
  best performance.
• Sequential reads are better than random reads, since each
  fetched block is used to serve the next requests.
• Random read shows the worst performance. Fetching a 64 KB
  block for every 1000-byte value is expensive.
• Aggregate throughput is not linear, but scales well.
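
A back-of-the-envelope calculation shows why random reads are costly compared with sequential reads. It assumes a 64 KB block and 1000-byte values as stated above, and ignores key overhead, caching and compression.

```python
BLOCK_BYTES = 64 * 1024  # size of one fetched block
VALUE_BYTES = 1000       # size of one value in the benchmark


def bytes_fetched_per_value(values_served_per_block):
    """Bytes transferred per value when one block serves this many reads."""
    return BLOCK_BYTES / values_served_per_block


# Random read: each fetched block serves a single value.
random_cost = bytes_fetched_per_value(1)
amplification = random_cost / VALUE_BYTES

# Sequential read: one block serves every whole value it contains.
values_per_block = BLOCK_BYTES // VALUE_BYTES
sequential_cost = bytes_fetched_per_value(values_per_block)

print(f"random:     {random_cost:.0f} bytes per value ({amplification:.1f}x)")
print(f"sequential: {sequential_cost:.0f} bytes per value")
```

Under these assumptions a random read moves roughly 65 times more data than it uses, while a sequential read moves only slightly more than one value's worth.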






Conclusions
• Bigtable: highly scalable and available, without
  compromising performance.
• Flexibility for Google – designed using their
  own data model.
• Custom design gives Google the ability to
  remove or minimize bottlenecks.
• Related work:
  – Apache HBase (open source)
  – Boxwood (though targeted at a lower/FS level)



B+ Trees
• A tree with sorted data for:
  – Efficient insertion, retrieval and removal of
    records.
• All records are stored at the leaf level; interior
  nodes store only keys.
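
A minimal lookup sketch shows the split between interior nodes and leaves. The `Leaf` and `Interior` classes and the sample tree are hypothetical. The tree is built by hand, so insertion, removal and node splitting are not shown.

```python
import bisect


class Leaf:
    def __init__(self, keys, records):
        self.keys = keys        # sorted keys
        self.records = records  # records[i] belongs to keys[i]


class Interior:
    def __init__(self, keys, children):
        self.keys = keys          # separator keys only, no records
        self.children = children  # len(children) == len(keys) + 1


def search(node, key):
    """Walk from the root to a leaf, then look the key up in that leaf."""
    while isinstance(node, Interior):
        # Child i holds keys < keys[i]; child i + 1 holds keys >= keys[i].
        node = node.children[bisect.bisect_right(node.keys, key)]
    i = bisect.bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return node.records[i]
    return None


root = Interior(
    [20, 40],
    [
        Leaf([5, 10], ["r5", "r10"]),
        Leaf([20, 30], ["r20", "r30"]),
        Leaf([40, 50], ["r40", "r50"]),
    ],
)
found = search(root, 30)
missing = search(root, 45)
```

The tablet location hierarchy has the same shape: the upper levels hold only keys that route a lookup, and the lowest level holds the actual entries.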




                                                     32

More Related Content

What's hot

What's hot (18)

Big table
Big tableBig table
Big table
 
Bigtable
BigtableBigtable
Bigtable
 
Big table presentation-final
Big table presentation-finalBig table presentation-final
Big table presentation-final
 
Big table
Big tableBig table
Big table
 
Bigtable and Dynamo
Bigtable and DynamoBigtable and Dynamo
Bigtable and Dynamo
 
Summary of "Google's Big Table" at nosql summer reading in Tokyo
Summary of "Google's Big Table" at nosql summer reading in TokyoSummary of "Google's Big Table" at nosql summer reading in Tokyo
Summary of "Google's Big Table" at nosql summer reading in Tokyo
 
GOOGLE BIGTABLE
GOOGLE BIGTABLEGOOGLE BIGTABLE
GOOGLE BIGTABLE
 
Google Bigtable paper presentation
Google Bigtable paper presentationGoogle Bigtable paper presentation
Google Bigtable paper presentation
 
Google Bigtable Paper Presentation
Google Bigtable Paper PresentationGoogle Bigtable Paper Presentation
Google Bigtable Paper Presentation
 
Bigtable
BigtableBigtable
Bigtable
 
Cloud Technology: Virtualization
Cloud Technology: VirtualizationCloud Technology: Virtualization
Cloud Technology: Virtualization
 
Bigtable
BigtableBigtable
Bigtable
 
6 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/26 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/2
 
Bigtable and Boxwood
Bigtable and BoxwoodBigtable and Boxwood
Bigtable and Boxwood
 
Mysql database
Mysql databaseMysql database
Mysql database
 
NOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLNOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQL
 
MySQL 8 Server Optimization Swanseacon 2018
MySQL 8 Server Optimization Swanseacon 2018MySQL 8 Server Optimization Swanseacon 2018
MySQL 8 Server Optimization Swanseacon 2018
 
Key-Value NoSQL Database
Key-Value NoSQL DatabaseKey-Value NoSQL Database
Key-Value NoSQL Database
 

Viewers also liked

Dynamo and BigTable in light of the CAP theorem
Dynamo and BigTable in light of the CAP theoremDynamo and BigTable in light of the CAP theorem
Dynamo and BigTable in light of the CAP theoremGrisha Weintraub
 
Temadeinvestigacion 130402203353-phpapp02
Temadeinvestigacion 130402203353-phpapp02Temadeinvestigacion 130402203353-phpapp02
Temadeinvestigacion 130402203353-phpapp02Camilo López
 
BigTable And Hbase
BigTable And HbaseBigTable And Hbase
BigTable And HbaseEdward Yoon
 
Aprendizaje de Maquina y Aplicaciones
Aprendizaje de Maquina y AplicacionesAprendizaje de Maquina y Aplicaciones
Aprendizaje de Maquina y AplicacionesEdgar Marca
 
Dynamo and BigTable - Review and Comparison
Dynamo and BigTable - Review and ComparisonDynamo and BigTable - Review and Comparison
Dynamo and BigTable - Review and ComparisonGrisha Weintraub
 
The Google File System (GFS)
The Google File System (GFS)The Google File System (GFS)
The Google File System (GFS)Romain Jacotin
 
Cassandra: Open Source Bigtable + Dynamo
Cassandra: Open Source Bigtable + DynamoCassandra: Open Source Bigtable + Dynamo
Cassandra: Open Source Bigtable + Dynamojbellis
 
ROCKING YOUR SEAT AT THE BIG TABLE - ROB BAILEY
ROCKING YOUR SEAT AT THE BIG TABLE - ROB BAILEYROCKING YOUR SEAT AT THE BIG TABLE - ROB BAILEY
ROCKING YOUR SEAT AT THE BIG TABLE - ROB BAILEYSupport Driven
 
I've (probably) been using Google App Engine for a week longer than you have
I've (probably) been using Google App Engine for a week longer than you haveI've (probably) been using Google App Engine for a week longer than you have
I've (probably) been using Google App Engine for a week longer than you haveSimon Willison
 
Big table por Matias tesoriero
Big table por Matias tesorieroBig table por Matias tesoriero
Big table por Matias tesorieromtesoriero
 
Mallorca MUG: MapReduce y Aggregation Framework
Mallorca MUG: MapReduce y Aggregation FrameworkMallorca MUG: MapReduce y Aggregation Framework
Mallorca MUG: MapReduce y Aggregation FrameworkEmilio Torrens
 
Map reduce (from Google)
Map reduce (from Google)Map reduce (from Google)
Map reduce (from Google)Sri Prasanna
 

Viewers also liked (18)

Dynamo and BigTable in light of the CAP theorem
Dynamo and BigTable in light of the CAP theoremDynamo and BigTable in light of the CAP theorem
Dynamo and BigTable in light of the CAP theorem
 
Temadeinvestigacion 130402203353-phpapp02
Temadeinvestigacion 130402203353-phpapp02Temadeinvestigacion 130402203353-phpapp02
Temadeinvestigacion 130402203353-phpapp02
 
BigTable And Hbase
BigTable And HbaseBigTable And Hbase
BigTable And Hbase
 
The Google Bigtable
The Google BigtableThe Google Bigtable
The Google Bigtable
 
Aprendizaje de Maquina y Aplicaciones
Aprendizaje de Maquina y AplicacionesAprendizaje de Maquina y Aplicaciones
Aprendizaje de Maquina y Aplicaciones
 
Cloud Computing y MapReduce
Cloud Computing y MapReduceCloud Computing y MapReduce
Cloud Computing y MapReduce
 
Dynamo and BigTable - Review and Comparison
Dynamo and BigTable - Review and ComparisonDynamo and BigTable - Review and Comparison
Dynamo and BigTable - Review and Comparison
 
Gfs vs hdfs
Gfs vs hdfsGfs vs hdfs
Gfs vs hdfs
 
The Google File System (GFS)
The Google File System (GFS)The Google File System (GFS)
The Google File System (GFS)
 
Cassandra: Open Source Bigtable + Dynamo
Cassandra: Open Source Bigtable + DynamoCassandra: Open Source Bigtable + Dynamo
Cassandra: Open Source Bigtable + Dynamo
 
ROCKING YOUR SEAT AT THE BIG TABLE - ROB BAILEY
ROCKING YOUR SEAT AT THE BIG TABLE - ROB BAILEYROCKING YOUR SEAT AT THE BIG TABLE - ROB BAILEY
ROCKING YOUR SEAT AT THE BIG TABLE - ROB BAILEY
 
I've (probably) been using Google App Engine for a week longer than you have
I've (probably) been using Google App Engine for a week longer than you haveI've (probably) been using Google App Engine for a week longer than you have
I've (probably) been using Google App Engine for a week longer than you have
 
Google's BigTable
Google's BigTableGoogle's BigTable
Google's BigTable
 
Big table por Matias tesoriero
Big table por Matias tesorieroBig table por Matias tesoriero
Big table por Matias tesoriero
 
Mallorca MUG: MapReduce y Aggregation Framework
Mallorca MUG: MapReduce y Aggregation FrameworkMallorca MUG: MapReduce y Aggregation Framework
Mallorca MUG: MapReduce y Aggregation Framework
 
Map reduce (from Google)
Map reduce (from Google)Map reduce (from Google)
Map reduce (from Google)
 
MapReduce en Hadoop
MapReduce en HadoopMapReduce en Hadoop
MapReduce en Hadoop
 
HDFS
HDFSHDFS
HDFS
 

Similar to Bigtable

8. column oriented databases
8. column oriented databases8. column oriented databases
8. column oriented databasesFabio Fumarola
 
Xldb2011 wed 1415_andrew_lamb-buildingblocks
Xldb2011 wed 1415_andrew_lamb-buildingblocksXldb2011 wed 1415_andrew_lamb-buildingblocks
Xldb2011 wed 1415_andrew_lamb-buildingblocksliqiang xu
 
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...Qian Lin
 
Large scale computing with mapreduce
Large scale computing with mapreduceLarge scale computing with mapreduce
Large scale computing with mapreducehansen3032
 
Intro to Big Data and NoSQL
Intro to Big Data and NoSQLIntro to Big Data and NoSQL
Intro to Big Data and NoSQLDon Demcsak
 
7. Key-Value Databases: In Depth
7. Key-Value Databases: In Depth7. Key-Value Databases: In Depth
7. Key-Value Databases: In DepthFabio Fumarola
 
Relational and non relational database 7
Relational and non relational database 7Relational and non relational database 7
Relational and non relational database 7abdulrahmanhelan
 
Performance Management in ‘Big Data’ Applications
Performance Management in ‘Big Data’ ApplicationsPerformance Management in ‘Big Data’ Applications
Performance Management in ‘Big Data’ ApplicationsMichael Kopp
 
Sql Performance Tuning For Developers
Sql Performance Tuning For DevelopersSql Performance Tuning For Developers
Sql Performance Tuning For Developerssqlserver.co.il
 
NOSQL, CouchDB, and the Cloud
NOSQL, CouchDB, and the CloudNOSQL, CouchDB, and the Cloud
NOSQL, CouchDB, and the Cloudboorad
 
North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911Ines Sombra
 
KIISE:SIGDB Workshop presentation.
KIISE:SIGDB Workshop presentation.KIISE:SIGDB Workshop presentation.
KIISE:SIGDB Workshop presentation.Kyong-Ha Lee
 
DevNation Atlanta
DevNation AtlantaDevNation Atlanta
DevNation Atlantaboorad
 
Building a highly scalable and available cloud application
Building a highly scalable and available cloud applicationBuilding a highly scalable and available cloud application
Building a highly scalable and available cloud applicationNoam Sheffer
 
Active/Active Database Solutions with Log Based Replication in xDB 6.0
Active/Active Database Solutions with Log Based Replication in xDB 6.0Active/Active Database Solutions with Log Based Replication in xDB 6.0
Active/Active Database Solutions with Log Based Replication in xDB 6.0EDB
 
Spil Games: outgrowing an internet startup
Spil Games: outgrowing an internet startupSpil Games: outgrowing an internet startup
Spil Games: outgrowing an internet startupart-spilgames
 
Challenges and Opportunities of Big Data Genomics
Challenges and Opportunities of Big Data GenomicsChallenges and Opportunities of Big Data Genomics
Challenges and Opportunities of Big Data GenomicsYasin Memari
 
Codemotion 2015 Infinispan Tech lab
Codemotion 2015 Infinispan Tech labCodemotion 2015 Infinispan Tech lab
Codemotion 2015 Infinispan Tech labUgo Landini
 

Similar to Bigtable (20)

8. column oriented databases
8. column oriented databases8. column oriented databases
8. column oriented databases
 
Xldb2011 wed 1415_andrew_lamb-buildingblocks
Xldb2011 wed 1415_andrew_lamb-buildingblocksXldb2011 wed 1415_andrew_lamb-buildingblocks
Xldb2011 wed 1415_andrew_lamb-buildingblocks
 
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
 
Large scale computing with mapreduce
Large scale computing with mapreduceLarge scale computing with mapreduce
Large scale computing with mapreduce
 
Intro to Big Data and NoSQL
Intro to Big Data and NoSQLIntro to Big Data and NoSQL
Intro to Big Data and NoSQL
 
7. Key-Value Databases: In Depth
7. Key-Value Databases: In Depth7. Key-Value Databases: In Depth
7. Key-Value Databases: In Depth
 
Relational and non relational database 7
Relational and non relational database 7Relational and non relational database 7
Relational and non relational database 7
 
Lecture3.ppt
Lecture3.pptLecture3.ppt
Lecture3.ppt
 
Performance Management in ‘Big Data’ Applications
Performance Management in ‘Big Data’ ApplicationsPerformance Management in ‘Big Data’ Applications
Performance Management in ‘Big Data’ Applications
 
NoSQL
NoSQLNoSQL
NoSQL
 
Sql Performance Tuning For Developers
Sql Performance Tuning For DevelopersSql Performance Tuning For Developers
Sql Performance Tuning For Developers
 
NOSQL, CouchDB, and the Cloud
NOSQL, CouchDB, and the CloudNOSQL, CouchDB, and the Cloud
NOSQL, CouchDB, and the Cloud
 
North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911
 
KIISE:SIGDB Workshop presentation.
KIISE:SIGDB Workshop presentation.KIISE:SIGDB Workshop presentation.
KIISE:SIGDB Workshop presentation.
 
DevNation Atlanta
DevNation AtlantaDevNation Atlanta
DevNation Atlanta
 
Building a highly scalable and available cloud application
Building a highly scalable and available cloud applicationBuilding a highly scalable and available cloud application
Building a highly scalable and available cloud application
 
Active/Active Database Solutions with Log Based Replication in xDB 6.0
Active/Active Database Solutions with Log Based Replication in xDB 6.0Active/Active Database Solutions with Log Based Replication in xDB 6.0
Active/Active Database Solutions with Log Based Replication in xDB 6.0
 
Spil Games: outgrowing an internet startup
Spil Games: outgrowing an internet startupSpil Games: outgrowing an internet startup
Spil Games: outgrowing an internet startup
 
Challenges and Opportunities of Big Data Genomics
Challenges and Opportunities of Big Data GenomicsChallenges and Opportunities of Big Data Genomics
Challenges and Opportunities of Big Data Genomics
 
Codemotion 2015 Infinispan Tech lab
Codemotion 2015 Infinispan Tech labCodemotion 2015 Infinispan Tech lab
Codemotion 2015 Infinispan Tech lab
 

More from zafargilani

6 intelligent-placement-of-datacenters
6 intelligent-placement-of-datacenters6 intelligent-placement-of-datacenters
6 intelligent-placement-of-datacenterszafargilani
 
Assignment 1-mtat
Assignment 1-mtatAssignment 1-mtat
Assignment 1-mtatzafargilani
 
5 state-of-cloud-applications-and-platforms
5 state-of-cloud-applications-and-platforms5 state-of-cloud-applications-and-platforms
5 state-of-cloud-applications-and-platformszafargilani
 
1 logical data models for cc arch
1 logical data models for cc arch1 logical data models for cc arch
1 logical data models for cc archzafargilani
 
2 rest-elevator-pitch
2 rest-elevator-pitch2 rest-elevator-pitch
2 rest-elevator-pitchzafargilani
 
1 distributed-systems-template-modified
1 distributed-systems-template-modified1 distributed-systems-template-modified
1 distributed-systems-template-modifiedzafargilani
 

More from zafargilani (7)

6 intelligent-placement-of-datacenters
6 intelligent-placement-of-datacenters6 intelligent-placement-of-datacenters
6 intelligent-placement-of-datacenters
 
Assignment 1-mtat
Assignment 1-mtatAssignment 1-mtat
Assignment 1-mtat
 
5 state-of-cloud-applications-and-platforms
5 state-of-cloud-applications-and-platforms5 state-of-cloud-applications-and-platforms
5 state-of-cloud-applications-and-platforms
 
1 logical data models for cc arch
1 logical data models for cc arch1 logical data models for cc arch
1 logical data models for cc arch
 
3 apache-avro
3 apache-avro3 apache-avro
3 apache-avro
 
2 rest-elevator-pitch
2 rest-elevator-pitch2 rest-elevator-pitch
2 rest-elevator-pitch
 
1 distributed-systems-template-modified
1 distributed-systems-template-modified1 distributed-systems-template-modified
1 distributed-systems-template-modified
 

Recently uploaded

presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024SynarionITSolutions
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 

Recently uploaded (20)

presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 

Bigtable

  • 1. Bigtable A Distributed Storage System for Structured Data Authors: Fay Chang et. al. Presenter: Zafar Gilani 1
  • 2. Bigtable Outline • Introduction • Data model • Implementation • Performance evaluation • Conclusions 2
  • 3. Bigtable A distributed storage system .. • .. for managing structured data. • Used for demanding workloads, such as: – Throughput-oriented batch processing. – Serving latency-sensitive data to the client. • Dynamic control over data layout and format instead of a full relational model. • Data locality properties (revisit later briefly). 3
  • 4. Bigtable Bigtable has achieved several goals • Wide applicability: used for 60+ Google products, including: – Google Analytics, Google Code, Google Earth, Google Maps and Gmail. • Scalability (explain later under evaluation). • High performance. • High availability. 4
  • 5. Bigtable Outline • Introduction • Data model • Implementation • Performance evaluation • Conclusions 5
  • 6. Bigtable Data model • Essentially a sparse, distributed, persistent multi-dimensional sorted map. • The map is indexed by a row key, column key and a timestamp. • Atomic reads and writes over a single row. Columns Rows 6
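The data model on slide 6 can be pictured as a map keyed by (row key, column key, timestamp). Below is a minimal in-memory sketch in Python. The class name `TinyTable` and its methods are hypothetical and only illustrate the indexing scheme; they are not Bigtable's API, and nothing here is distributed or persistent.

```python
from collections import defaultdict


class TinyTable:
    """Toy sparse map: (row key, column key, timestamp) -> value."""

    def __init__(self):
        # row -> column -> {timestamp: value}; cells that were never
        # written take no space, which is what "sparse" means here.
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, timestamp, value):
        self._rows[row][column][timestamp] = value

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before `timestamp` (newest overall if None)."""
        versions = self._rows.get(row, {}).get(column, {})
        candidates = [t for t in versions if timestamp is None or t <= timestamp]
        return versions[max(candidates)] if candidates else None

    def scan(self, start_row, end_row):
        """Yield (row, column, newest value) for start_row <= row < end_row, rows in sorted order."""
        for row in sorted(self._rows):
            if start_row <= row < end_row:
                for column in sorted(self._rows[row]):
                    yield row, column, self.get(row, column)
```

For example, after `put("com.bbc.www", "anchor:bbcworld.com", 1, "BBC")` and a second `put` of "BBC.com" at timestamp 2, a plain `get` returns the newer value while `get(..., timestamp=1)` returns the older one. Which value belongs to which cell in the slide's figure is an assumption made for this example.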
  • 7. Bigtable Row and column range • Row range dynamically partitioned into tablets. • Data in lexicographic order. • Allows data locality. • Column keys grouped into column families. • Each family has the same type. • Allows access control and disk or memory accounting. 7
  • 8. Bigtable Row and column range • Row range dynamically partitioned into tablets. • Data in lexicographic order. • Allows data locality. • Column keys grouped into column families. • Each family has the same type. • Allows access control and disk or memory accounting. Enables reasoning about data locality 8
  • 10. Columns Rows Anchor is a column family 10
  • 11. Columns Tablets “anchor:bbcworld.com” “anchor:weather” “com.bbc.www” “BBC” “BBC.com” 11
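Slides 7 to 11 describe rows kept in lexicographic order and split into tablets by row range. The sketch below shows one way a row key could be mapped to a tablet, assuming each tablet is identified by its inclusive end row key. The split points and both function names are made up for illustration.

```python
import bisect


def reverse_domain(hostname):
    """'www.bbc.com' -> 'com.bbc.www', so pages of one domain sort next to each other."""
    return ".".join(reversed(hostname.split(".")))


def tablet_for_row(split_points, row_key):
    """Return the index of the tablet whose row range contains row_key.

    split_points is a sorted list of tablet end keys (inclusive). Rows that
    sort after the last split point fall into one final tablet.
    """
    return bisect.bisect_left(split_points, row_key)
```

With hypothetical split points `["com.cnn", "org"]`, the keys "com.bbc.www" and "com.bbc.news" both land in tablet 0. This is the locality property: rows that are adjacent in sort order are usually served from the same tablet.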
  • 12. Bigtable Outline • Introduction • Data model • Implementation • Performance evaluation • Conclusions 12
  • 13. Bigtable Bigtable uses several other technologies • Google File System to store log and data files. • SSTable file format to store Bigtable data. • Chubby, a distributed lock service. For more details on these technologies, refer to section 4 of the paper. 13
  • 14. Bigtable Implementation Master responsibilities: - Assign tablets to tablet servers - Detect the addition and removal of tablet servers - Balance tablet server load - Garbage collection (GC) - Schema changes. Diagram labels: Internet, client, master, tablet servers; clients communicate directly with tablet servers. 14
  • 15. Bigtable How is data stored? A three-level hierarchy, similar to B+ trees. 15
  • 16. Bigtable Location hierarchy Chubby file contains location of the root tablet. 16
  • 17. Bigtable Location hierarchy Root tablet contains the locations of all tablets in the Metadata table. 17
  • 18. Bigtable Location hierarchy Metadata table stores locations of actual tablets. 18
  • 19. Bigtable Location hierarchy Client moves up the hierarchy (Metadata -> Root -> Chubby) if the location of a tablet is unknown or incorrect. 19
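The three-level location hierarchy from slides 15 to 19 can be sketched as nested sorted lookups with a client-side cache. This is a simplified illustration under stated assumptions: the data layout, `Client`, and `_find` are hypothetical, and the step-by-step "move up" retry is collapsed into one full walk starting from the Chubby file.

```python
import bisect


def _find(entries, row_key):
    """entries: sorted list of (end_row, value); return the value whose range covers row_key.

    The last entry catches every row that sorts after the earlier end keys.
    """
    end_keys = [end for end, _ in entries]
    i = bisect.bisect_left(end_keys, row_key)
    return entries[min(i, len(entries) - 1)][1]


class Client:
    def __init__(self, chubby):
        self.chubby = chubby   # {"root": root tablet entries}, stands in for the Chubby file
        self.cache = {}        # row key -> tablet server name
        self.lookups = 0       # number of full walks through the hierarchy

    def locate(self, row_key):
        """Return the tablet server for row_key, using the cache when possible."""
        if row_key in self.cache:
            return self.cache[row_key]
        self.lookups += 1
        root_entries = self.chubby["root"]               # level 1: Chubby file -> root tablet
        metadata_entries = _find(root_entries, row_key)  # level 2: root tablet -> Metadata tablet
        server = _find(metadata_entries, row_key)        # level 3: Metadata tablet -> user tablet
        self.cache[row_key] = server
        return server

    def invalidate(self, row_key):
        """Drop a cached location that turned out to be incorrect."""
        self.cache.pop(row_key, None)
```

A repeated `locate` for the same key is answered from the cache. After `invalidate`, the next `locate` walks the hierarchy again.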
  • 20. Bigtable How is data served? 20
  • 22. Bigtable Tablet serving: compactions. Compactions occur regularly. Advantages: - Shrink memory usage. - Reduce the amount of data read from the log during recovery. 22
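The two advantages on slide 22 can be seen in a small sketch. It assumes writes are appended to a commit log and buffered in memory, and that a compaction freezes the buffer into an immutable sorted file. `TabletSketch`, the flush threshold, and the in-memory lists standing in for the log and SSTable files are all hypothetical simplifications.

```python
class TabletSketch:
    def __init__(self, flush_threshold=3):
        self.flush_threshold = flush_threshold
        self.commit_log = []   # writes not yet captured in a sorted file
        self.memtable = {}     # recent writes, held in memory
        self.sstables = []     # immutable sorted files, oldest first

    def write(self, key, value):
        self.commit_log.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.compact()

    def compact(self):
        """Freeze the in-memory buffer into a sorted, immutable file and start fresh."""
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}     # shrinks memory usage
        self.commit_log = []   # these writes no longer need replaying on recovery

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):   # the newest file wins
            for k, v in table:
                if k == key:
                    return v
        return None
```

After the third write with the default threshold, both the buffer and the log are empty, yet every key is still readable from the sorted file.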
  • 23. Bigtable Outline • Introduction • Data model • Implementation • Performance evaluation • Conclusions 23
  • 24. Bigtable Benchmarks for performance evaluation • Scan: – Scans over values in a row range. • Random reads from memory. • Random reads/writes: – R keys to be read/written, spread over N clients. • Sequential reads/writes: – Keys 0 to R-1 to be read/written, spread over N clients. 24
  • 25. Bigtable Performance evaluation Scan uses a single RPC call and shows the best performance. 25
  • 26. Bigtable Performance evaluation Sequential reads perform better than random reads, since each fetched block is used to serve the requests that follow. 26
  • 27. Bigtable Performance evaluation Random reads show the worst performance. Fetching a 64 KB block for every 1000-byte read is expensive. 27
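The figures on slides 26 and 27 can be checked with a little arithmetic. The sketch below assumes 64 KB means 64 × 1024 bytes and ignores any per-row or per-block overhead; the function names are illustrative.

```python
BLOCK_SIZE_BYTES = 64 * 1024   # block fetched per random read (assumes 1 KB = 1024 bytes)
VALUE_SIZE_BYTES = 1000        # bytes actually needed per read


def read_amplification(block_size=BLOCK_SIZE_BYTES, value_size=VALUE_SIZE_BYTES):
    """Bytes fetched per byte actually used by one random read."""
    return block_size / value_size


def reads_served_per_block(block_size=BLOCK_SIZE_BYTES, value_size=VALUE_SIZE_BYTES):
    """How many whole values one fetched block can serve to sequential reads."""
    return block_size // value_size
```

Under these assumptions a random read fetches about 65 times more data than it uses, while a sequential workload can serve roughly 65 reads from each block it fetches.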
  • 28. Bigtable Performance evaluation Throughput does not grow linearly with the number of tablet servers, but it scales well. 28
  • 29. Bigtable Outline • Introduction • Data model • Implementation • Performance evaluation • Conclusions 29
  • 30. Bigtable Conclusions • Bigtable: highly scalable and available, without compromising performance. • Flexibility for Google – designed using their own data model. • Custom design gives Google the ability to remove or minimize bottlenecks. • Related work: – Apache HBase (open source) – Boxwood (though targeted at a lower/FS level) 30
  • 31. Bigtable A Distributed Storage System for Structured Data Authors: Fay Chang et al. Presenter: Zafar Gilani 31
  • 32. Bigtable B+ Trees • A tree with sorted data for: – Efficient insertion, retrieval and removal of records. • All records are stored at the leaf level, only keys stored in interior nodes. 32
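The B+ tree property on slide 32 (records only in leaves, keys only in interior nodes) can be shown with a lookup-only sketch. The `Leaf`, `Interior`, and `search` names are illustrative, and insertion and removal with node splitting are left out.

```python
import bisect


class Leaf:
    def __init__(self, keys, records):
        self.keys = keys          # sorted keys
        self.records = records    # records[i] belongs to keys[i]


class Interior:
    def __init__(self, keys, children):
        # children[i] holds keys < keys[i]; the last child holds keys >= keys[-1].
        self.keys = keys
        self.children = children


def search(node, key):
    """Follow separator keys down to a leaf, then look the key up there."""
    while isinstance(node, Interior):
        node = node.children[bisect.bisect_right(node.keys, key)]
    i = bisect.bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return node.records[i]
    return None
```

Interior nodes only steer the search; the record itself is always read from a leaf. In the location hierarchy, the root and Metadata tablets play the steering role in a similar way.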