Introduction to Maxtable


Xue Yingfei
http://code.google.com/p/maxtable/
Agenda

  Architecture Overview
  Key Features
  Maxtable Query Language (MQL)
  Operation and Maintenance
  Future Works




5 Mar 2012                         2
Architecture Overview ( 1 )

     Maxtable consists of three components:
     1.      Metadata server: Provides the global namespace for all the tables in the
             system and keeps the B-tree structure in memory.
     2.      Ranger server: Holds some ranges of the data. The default size of one range
             is about 100 GB.
     3.      Client library: Linked with applications, it enables them to
             read and write data stored in Maxtable.
     Figure: the components in the system and how they relate to one another.




Architecture Overview ( 2 )

     How is a table stored on disk?




      One SSTable = 4 MB of data.
      One Tablet = 25K SSTables = one range = 100 GB.
      One Table = 42K Tablets.
      So one table can hold more than 4 PB of data. To hold more, we can increase the
      block size or use two tablet levels to store the index data.
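
The capacity figures above can be checked with a few lines of arithmetic. This is only a sketch: it assumes decimal units (1 MB = 10^6 bytes) and reads "25K" and "42K" as 25,000 and 42,000, neither of which the slide states explicitly. The constant names are illustrative.

```python
# Capacity check for the storage hierarchy described on the slide.
# Assumption: decimal units and K = 1,000; the slide does not say which it uses.
MB = 10**6
GB = 10**9
PB = 10**15

SSTABLE_BYTES = 4 * MB          # one SSTable = 4 MB
SSTABLES_PER_TABLET = 25_000    # one tablet = 25K SSTables
TABLETS_PER_TABLE = 42_000      # one table = 42K tablets

tablet_bytes = SSTABLE_BYTES * SSTABLES_PER_TABLET
table_bytes = tablet_bytes * TABLETS_PER_TABLE

print(f"tablet: {tablet_bytes / GB:.0f} GB")   # tablet: 100 GB
print(f"table:  {table_bytes / PB:.1f} PB")    # table:  4.2 PB
```

Under these assumptions a full table holds about 4.2 PB, which matches the "more than 4 PB" claim.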




Architecture Overview ( 3 )

        How does Maxtable work?
     •       Maxtable stores data in tables, sorted by a primary key (the first column).
     •       Table data has two types: varchar (string) and int (number).
     •       Scaling is achieved by automatically splitting tables into contiguous ranges and
             assigning them to different physical machines.
     •       A Maxtable cluster has two types of servers: Ranger Servers, which hold
             some ranges of the data, and Meta Servers, which handle metadata management
             and oversee the Ranger Servers.
     •       A single Ranger Server may hold many contiguous ranges; the Meta Server is
             responsible for distributing them intelligently.
     •       When a single range fills up, it is split in half (middle split). The top half
             stays in the current range and a new range is allocated for the lower half.
             Both ranges stay on the current Ranger Server until that server becomes
             overloaded; the Rebalancer then triggers the Meta Server to reassign some
             ranges from the overloaded Ranger Servers to other Ranger Servers that have
             enough space.
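
The middle split described above can be sketched as follows. This is an illustration, not Maxtable's implementation: the `Range` class, its row-count `capacity`, and the reading of "top half" as the first half in key order are all assumptions made for the example.

```python
import bisect


class Range:
    """Hypothetical in-memory range: rows kept sorted by primary key."""

    def __init__(self, capacity, rows=None):
        self.capacity = capacity          # assumed limit, counted in rows
        self.rows = list(rows or [])      # sorted list of (key, value) pairs

    def insert(self, key, value):
        # Keep rows ordered by key so the range stays contiguous.
        bisect.insort(self.rows, (key, value))

    def is_full(self):
        return len(self.rows) >= self.capacity

    def middle_split(self):
        """Keep the first half here; return a new range with the second half."""
        mid = len(self.rows) // 2
        new_range = Range(self.capacity, self.rows[mid:])
        self.rows = self.rows[:mid]
        return new_range


current = Range(capacity=4)
for key in ["d", "a", "c", "b"]:
    current.insert(key, key.upper())

if current.is_full():
    lower = current.middle_split()
    print([k for k, _ in current.rows])  # ['a', 'b']
    print([k for k, _ in lower.rows])    # ['c', 'd']
```

Both ranges start out on the same server in this sketch; moving one of them elsewhere is the rebalancer's job.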


Key Features ( 1 )

  Scalability:
     • New ranger nodes can be added as storage needs grow; the system
       automatically adapts to the new nodes while running the rebalance.
  Data writes:
     • When an application inserts data, the write can be cached at the Ranger server and
       the cache is flushed periodically. For consistency, applications force one data
       log to be flushed to disk.
  SSTable Map:
     • This feature reduces the data consistency control needed and improves write
       performance. It uses an innovative method that resolves conflicts between
       concurrent writes without any locking.
  Cache All Data:
     • Maxtable can cache all the metadata in the Metaserver and the hot data in the
       Ranger server.
  Re-balancing:
     • A tool rebalances the tablets among the Ranger servers to help
       balance the workload among nodes.
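
One simple way to picture rebalancing is a greedy loop that moves tablets from the busiest Ranger server to the idlest one until the tablet counts are level. The slides do not describe Maxtable's actual policy, so the `rebalance` function below and its count-based notion of load are assumptions for illustration only.

```python
def rebalance(assignment):
    """Greedy sketch: level tablet counts across ranger servers.

    `assignment` maps a ranger name to its list of tablet names and must
    contain at least one ranger. It is modified in place. Returns the list
    of (tablet, source, destination) moves that were made.
    """
    moves = []
    while True:
        busiest = max(assignment, key=lambda r: len(assignment[r]))
        idlest = min(assignment, key=lambda r: len(assignment[r]))
        # Stop once no server holds more than one tablet above another.
        if len(assignment[busiest]) - len(assignment[idlest]) <= 1:
            return moves
        tablet = assignment[busiest].pop()
        assignment[idlest].append(tablet)
        moves.append((tablet, busiest, idlest))


cluster = {
    "ranger1": ["t1", "t2", "t3", "t4", "t5"],
    "ranger2": [],          # a newly added node
    "ranger3": ["t6"],
}
rebalance(cluster)
print({name: len(tablets) for name, tablets in cluster.items()})
# {'ranger1': 2, 'ranger2': 2, 'ranger3': 2}
```

A real rebalancer would more likely weigh data size or request load than tablet count, but the shape of the loop would be similar.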
Key Features ( 2 )

  Index:
     • Maxtable automatically builds one unique index for each table on the first column.
  Recovery:
     • Maxtable implements write-ahead logging (WAL) to make sure writes are
       safe. It can recover a crashed server by replaying its log.
  Failover:
     • The Metaserver maintains a heartbeat with each Ranger server. When the Metaserver
       detects that a Ranger server is unreachable, it fails over the data service hosted
       on the crashed Ranger server to another Ranger server, which continues serving
       those ranges.
  Metadata Consistency Checking (MCC):
     • Data checking tools ensure that data is consistent between the Metaserver and the
       Ranger servers.
  Backend Storage:
     • Maxtable's backend storage can be a distributed file system; currently it can use
       KFS as its backend.
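
The recovery idea above (log first, then replay the log after a crash) can be sketched in a few lines. This is a generic write-ahead-log illustration, not Maxtable's log format: the `TinyStore` class, the JSON-lines record layout, and the file name are assumptions for the example.

```python
import json
import os
import tempfile


class TinyStore:
    """Hypothetical key-value store with a write-ahead log."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.cache = {}                     # in-memory data, lost on a crash
        self._replay()                      # rebuild state from the log
        self.log = open(log_path, "a", encoding="utf-8")

    def _replay(self):
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as f:
            for line in f:
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    break                   # torn final record from a crash mid-write
                self.cache[record["key"]] = record["value"]

    def put(self, key, value):
        # Append and force the log record to disk before applying the write.
        self.log.write(json.dumps({"key": key, "value": value}) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        self.cache[key] = value

    def close(self):
        self.log.close()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ranger.log")
    store = TinyStore(path)
    store.put("adidas", 1000)
    store.close()                # stand-in for a crash: the cache object is discarded

    recovered = TinyStore(path)  # a fresh server replays the log
    print(recovered.cache)       # {'adidas': 1000}
    recovered.close()
```

Because each record is on disk before the in-memory update, any write acknowledged by `put` survives a restart.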

Key Features ( 3 )

  Range Query
     • Supports range queries on the index column or on non-index columns.
     • Supports AND and OR in the WHERE clause.
     • Splits the work across all the range nodes in a cluster.
  Sharding
     • Automatic sharding: tablets are distributed over the Ranger servers.
     • Manual sharding: scans all tablets and splits those that have
       at least two blocks containing data. Customers who want better scaling can
       shard tablets manually.
     • Manual sharding is generally followed by a rebalance operation,
       because sharding may create new tablets.




Maxtable Query Language ( 1 )

  CREATE TABLE
     • Create one table.
         – create table table_name (column1 type1, ..., columnx typex)
         – create table blogdata (key varchar, num int, createtime varchar, comment varchar)
  INSERT
     • Insert one data row.
         – insert into table_name (column1_value,...columnx_value)
         – insert into blogdata (adidas, 1000, 2011-10-11, good)
  SELECT
     • Select one row by the default key column.
         – select table_name (column1_value)
         – select blogdata (adidas)
  SELECTRANGE
     • Select a range of rows by a user-specified key range.
         – selectrange table_name (column1_value1, column1_value2)
         – selectrange blogdata (adidas, lining)

Maxtable Query Language ( 2 )

  SELECTWHERE
     • Select data by the WHERE clause
         – selectwhere table_name where columnX_name(columnX_value1, columnX_value2) and
           columnY_name(columnY_value1, columnY_value2)
  SELECTCOUNT
     • Get the number of rows that match the WHERE clause
         – selectcount table_name where columnX_name(columnX_value1, columnX_value2) and
           columnY_name(columnY_value1, columnY_value2)
  SELECTSUM
     • Get the sum of one column's values over the rows that match the WHERE clause
         – selectsum (column_name) table_name where columnX_name(columnX_value1, columnX_value2)
           and columnY_name(columnY_value1, columnY_value2)
  DELETE
     • Delete one row
         – delete table_name (column1_value)
  DROP TABLE
     • Drop one table
         – drop table_name
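
The WHERE-clause commands above are given only as templates. As a sketch, the helpers below fill those templates in for the blogdata table from the earlier slide. The function names are hypothetical and are not part of the Maxtable client API; they only format strings. The slides also do not say what the two values in `column(value1, value2)` mean, though they presumably bound a range.

```python
def _where(conditions):
    # Each condition is (column_name, value1, value2), joined with "and"
    # exactly as in the slide's template.
    return " and ".join(f"{col}({v1}, {v2})" for col, v1, v2 in conditions)


def selectwhere(table, conditions):
    return f"selectwhere {table} where {_where(conditions)}"


def selectcount(table, conditions):
    return f"selectcount {table} where {_where(conditions)}"


def selectsum(column, table, conditions):
    return f"selectsum ({column}) {table} where {_where(conditions)}"


conds = [("num", 500, 2000), ("createtime", "2011-01-01", "2011-12-31")]
print(selectwhere("blogdata", conds))
# selectwhere blogdata where num(500, 2000) and createtime(2011-01-01, 2011-12-31)
print(selectsum("num", "blogdata", conds))
# selectsum (num) blogdata where num(500, 2000) and createtime(2011-01-01, 2011-12-31)
```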
Maxtable Query Language ( 3 )

 The following commands are for administrators.
  SHARDING
     • Shard one table
         – sharding table_name
  MCC CHECKRANGER
     • Check the state of the rangers
         – mcc checkranger
  MCC CHECKTABLE
     • Check the data of a table
         – mcc checktable table_name
  REBALANCE
     • Rebalance the data load across the rangers
         – rebalance table_name




Operation and Maintenance

  Platform requirements
     • http://code.google.com/p/maxtable/wiki/Platform
  How to build
     • http://code.google.com/p/maxtable/wiki/03HowToInstall
     • http://code.google.com/p/maxtable/wiki/05HowToBuildWithKFSFacer
  How to deploy
     • http://code.google.com/p/maxtable/wiki/04HowToDeploy
  How to use the client API
     • http://code.google.com/p/maxtable/wiki/08ClientSampleCode




Future Works

  Implement a master-slave setup for the metaserver.
  Support secondary indexes.
  Support the join operation.
  Compaction and compression.




Contact Information

  yingfei.xue@gmail.com




                  Thanks




Securing your Kubernetes cluster_ a step-by-step guide to success !
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 

Introduction To Maxtable

  • 1. Introduction to Maxtable
       Xue Yingfei
       http://code.google.com/p/maxtable/
  • 2. Agenda
       • Architecture Overview
       • Key Features
       • Maxtable Query Language (MQL)
       • Operation and Maintenance
       • Future Works
       5 Mar 2012
  • 3. Architecture Overview (1)
       Maxtable consists of three components:
       1. Metadata server: provides the global namespace for all the tables in the system and keeps the B-tree structure in memory.
       2. Ranger server: holds some ranges of the data; the default size of one range is about 100 GB.
       3. Client library: linked with applications, enabling them to read and write data stored in Maxtable.
       This slide shows which components are in the system and how they relate to one another.
  • 4. Architecture Overview (2)
       How is a table stored on disk?
       • One SSTable = 4 MB of data.
       • One tablet = 25K SSTables = one range = 100 GB.
       • One table = 42K tablets.
       So one table can hold more than 4 PB of data. To hold more, we can increase the block size or use two tablet levels to store the index data.
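The capacity figures on this slide can be checked with a few lines of arithmetic. This is only a sketch: it assumes "K" means 1000 and that the units are decimal (MB, GB, PB), which the slide does not state. With binary units the result still exceeds 4 PB.

```python
# Capacity arithmetic from the slide.
# Assumption: K = 1000 and decimal units; the slide does not say which it uses.
MB = 10**6
GB = 10**9
PB = 10**15
K = 1000

sstable_bytes = 4 * MB                  # one SSTable = 4 MB
tablet_bytes = 25 * K * sstable_bytes   # one tablet = 25K SSTables = one range
table_bytes = 42 * K * tablet_bytes     # one table = 42K tablets

print(tablet_bytes / GB)  # 100.0
print(table_bytes / PB)   # 4.2
```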
  • 5. Architecture Overview (3)
       How does Maxtable work?
       • Maxtable stores data in tables, sorted by a primary key (the first column).
       • Table data has two types: varchar (string) and int (number).
       • Scaling is achieved by automatically splitting tables into contiguous ranges and assigning them to different physical machines.
       • A Maxtable cluster has two types of servers: Ranger Servers, which hold some ranges of the data, and Meta Servers, which handle metadata management and oversee the Ranger Servers.
       • A single Ranger Server may hold many contiguous ranges; the Meta Server is responsible for farming them out in an intelligent way.
       • If a single range fills up, it is split in half (middle split). The top half stays in the current range, and a new range is allocated to hold the lower half. Both ranges stay on the current Ranger Server until that server becomes overloaded; the Rebalancer then triggers the Meta Server to reassign some ranges from the overloaded Ranger Servers to other Ranger Servers that have enough space.
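The middle split described on this slide can be sketched as follows. This is a toy model, not Maxtable's implementation: the `Ranger` class, `middle_split`, and the key-count threshold are hypothetical names chosen for illustration, and a range is modelled as a sorted Python list rather than a set of SSTables. Which half the slide calls "top" is not specified, so the sketch simply keeps the first half in place.

```python
from bisect import insort


def middle_split(sorted_keys):
    """Split one full range at its midpoint into two contiguous ranges."""
    mid = len(sorted_keys) // 2
    return sorted_keys[:mid], sorted_keys[mid:]


class Ranger:
    """Hypothetical ranger holding contiguous, sorted key ranges."""

    def __init__(self, max_keys_per_range):
        self.max_keys = max_keys_per_range
        self.ranges = [[]]

    def insert(self, key):
        # Pick the last range whose first key is <= key (default: first range).
        idx = 0
        for i, keys in enumerate(self.ranges):
            if keys and keys[0] <= key:
                idx = i
        insort(self.ranges[idx], key)
        # A range that overflows is split in the middle; both halves
        # stay on this ranger until a rebalance moves them elsewhere.
        if len(self.ranges[idx]) > self.max_keys:
            first, second = middle_split(self.ranges[idx])
            self.ranges[idx:idx + 1] = [first, second]


ranger = Ranger(max_keys_per_range=4)
for key in ["d", "a", "c", "b", "e", "f"]:
    ranger.insert(key)
print(ranger.ranges)  # [['a', 'b'], ['c', 'd', 'e', 'f']]
```

Rebalancing across servers is left out; in the slide's description that step is driven by the Meta Server once a Ranger Server is overloaded.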
  • 6. Key Features (1)
       • Scalability: new ranger nodes can be added as storage needs increase; the system adapts to the new nodes automatically while the rebalance runs.
       • Data writes: when an application inserts data, the write can be cached at the Ranger server, and the cache is flushed periodically. For consistency, applications force one data log to be flushed to disk.
       • SSTable Map: this feature reduces the data consistency control and improves write performance. It uses an innovative method that does not need any lock mutation for multiple writes to resolve conflicts between writes.
       • Cache All Data: Maxtable can cache all the metadata in the Metaserver and the hot data in the Ranger server.
       • Re-balancing: a tool rebalances the tablets among Ranger servers to balance the workload among nodes.
  • 7. Key Features (2)
       • Index: Maxtable automatically builds one unique index for each table on the first column.
       • Recovery: Maxtable implements write-ahead logging (WAL) to make writes safe. It can recover a crashed server by replaying its log.
       • Failover: the Metaserver maintains a heartbeat with each Ranger server. When the Metaserver detects that a Ranger server is unreachable, it fails over the data service located on the crashed Ranger server to another Ranger server, which continues the service for those ranges.
       • Metadata Consistency Checking (MCC): data checking tools that ensure data consistency between the Metaserver and the Ranger servers.
       • Backend storage: Maxtable's backend storage can be a distributed file system; currently it can use KFS as its backend.
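The recovery idea (log the write first, replay the log after a crash) can be illustrated with a minimal sketch. The `TinyStore` class and its JSON-lines log format are assumptions made for this example; the slides do not describe Maxtable's actual log layout, and torn writes at the end of the log are not handled here.

```python
import json
import os
import tempfile


class TinyStore:
    """Hypothetical key-value store with a write-ahead log."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self.replay()

    def replay(self):
        # Rebuild the in-memory state by re-applying every logged write in order.
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path) as log:
            for line in log:
                record = json.loads(line)
                self.data[record["key"]] = record["value"]

    def put(self, key, value):
        # Append to the log and force it to disk before touching memory.
        with open(self.log_path, "a") as log:
            log.write(json.dumps({"key": key, "value": value}) + "\n")
            log.flush()
            os.fsync(log.fileno())
        self.data[key] = value


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "wal.log")
    store = TinyStore(path)
    store.put("adidas", 1000)
    store.put("lining", 2000)
    # Simulate a crash: drop the in-memory state and start again from the log.
    recovered = TinyStore(path).data
print(recovered)  # {'adidas': 1000, 'lining': 2000}
```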
  • 8. Key Features (3)
       • Range query
         - Supports range queries on the index column or a non-index column.
         - Supports AND and OR in the WHERE clause.
         - Splits the work over all the range nodes in a cluster.
       • Sharding
         - Automatic sharding: distributes tablets over range servers.
         - Manual sharding: scans all the tablets and splits those that have at least two blocks containing data. Customers who want better scaling can shard tablets manually.
         - Manual sharding is generally followed by a rebalance operation, because sharding may create new tablets.
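Splitting a range query over all the range nodes can be pictured as a scatter-gather: each node scans its own sorted keys and the partial results are concatenated. The sketch below is an assumption-laden illustration, not Maxtable code: `scan_node` and `range_query` are hypothetical names, nodes are plain sorted lists listed in key order, and the bounds are treated as inclusive, which the slides do not specify.

```python
from bisect import bisect_left, bisect_right
from concurrent.futures import ThreadPoolExecutor


def scan_node(sorted_keys, low, high):
    """Return the keys in [low, high] held by one node (inclusive bounds assumed)."""
    return sorted_keys[bisect_left(sorted_keys, low):bisect_right(sorted_keys, high)]


def range_query(nodes, low, high):
    """Fan the scan out to every node, then concatenate the partial results."""
    with ThreadPoolExecutor() as pool:
        parts = pool.map(lambda keys: scan_node(keys, low, high), nodes)
        return [key for part in parts for key in part]


# Three hypothetical range nodes holding contiguous key ranges.
nodes = [["adidas", "asics"], ["kappa", "lining"], ["nike", "puma"]]
print(range_query(nodes, "asics", "nike"))  # ['asics', 'kappa', 'lining', 'nike']
```

Because the nodes hold contiguous ranges in key order and `pool.map` preserves input order, the concatenated result is already sorted.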
  • 9. Maxtable Query Language (1)
       • CREATE TABLE: create one table.
         - create table table_name (column1 type1, ..., columnx typex)
         - create table blogdata (key varchar, num int, createtime varchar, comment varchar)
       • INSERT: insert one data row.
         - insert into table_name (column1_value, ..., columnx_value)
         - insert into blogdata (adidas, 1000, 2011-10-11, good)
       • SELECT: select one row by the default key column.
         - select table_name (column1_value)
         - select blogdata (adidas)
       • SELECTRANGE: select the rows in a user-specified key range.
         - selectrange table_name (column1_value1, column1_value2)
         - selectrange blogdata (adidas, lining)
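To make the statement shapes above concrete, here is a small sketch that assembles MQL strings from Python values. The helper functions are hypothetical and are not part of the Maxtable client library; they only format text according to the syntax on this slide and do not send anything to a server.

```python
def create_table(table, columns):
    """Build a CREATE TABLE statement from (name, type) pairs."""
    cols = ", ".join(f"{name} {ctype}" for name, ctype in columns)
    return f"create table {table} ({cols})"


def insert(table, values):
    """Build an INSERT statement; values are listed in column order."""
    return f"insert into {table} ({', '.join(str(v) for v in values)})"


def select(table, key):
    """Build a SELECT on the default key column."""
    return f"select {table} ({key})"


def selectrange(table, low, high):
    """Build a SELECTRANGE over the key range low..high."""
    return f"selectrange {table} ({low}, {high})"


print(create_table("blogdata", [("key", "varchar"), ("num", "int"),
                                ("createtime", "varchar"), ("comment", "varchar")]))
print(insert("blogdata", ["adidas", 1000, "2011-10-11", "good"]))
print(select("blogdata", "adidas"))
print(selectrange("blogdata", "adidas", "lining"))
```

The four printed lines reproduce the blogdata examples from the slide.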
  • 10. Maxtable Query Language (2)
       • SELECTWHERE: select data by the WHERE clause.
         - selectwhere table_name where columnX_name(columnX_value1, columnX_value2) and columnY_name(columnY_value1, columnY_value2)
       • SELECTCOUNT: get the number of rows matching the WHERE clause.
         - selectcount table_name where columnX_name(columnX_value1, columnX_value2) and columnY_name(columnY_value1, columnY_value2)
       • SELECTSUM: get the total of one column over the rows matching the WHERE clause.
         - selectsum (column_name) table_name where columnX_name(columnX_value1, columnX_value2) and columnY_name(columnY_value1, columnY_value2)
       • DELETE: delete one row.
         - delete table_name (column1_value)
       • DROP TABLE: drop one table.
         - drop table_name
  • 11. Maxtable Query Language (3)
       The following commands are for administrators.
       • SHARDING: shard one table.
         - sharding table_name
       • MCC CHECKRANGER: check the state of the rangers.
         - mcc checkranger
       • MCC CHECKTABLE: check the data of a table.
         - mcc checktable table_name
       • REBALANCE: rebalance the data load over the rangers.
         - rebalance table_name
  • 12. Operation and Maintenance
       • Platform requirements
         - http://code.google.com/p/maxtable/wiki/Platform
       • How to build
         - http://code.google.com/p/maxtable/wiki/03HowToInstall
         - http://code.google.com/p/maxtable/wiki/05HowToBuildWithKFSFacer
       • How to deploy
         - http://code.google.com/p/maxtable/wiki/04HowToDeploy
       • How to use the client API
         - http://code.google.com/p/maxtable/wiki/08ClientSampleCode
  • 13. Future Works
       • Implement master-slave in the Metaserver.
       • Support secondary indexes.
       • Support the join operation.
       • Compaction and compression.
  • 14. Contact Information
       yingfei.xue@gmail.com
       Thanks