Introducing TiDB/TiKV
Kevin Xu (@kevinsxu; kevin@pingcap.com)
Agenda
● History and Community
● Technical Walkthrough
● Use Case with Mobike
● Q&A
● (Time Permitting) TiDB on Google Kubernetes Engine
A little about me
● General Manager of Global Strategy and Operations
● Studied CS and Law at Stanford
● Program in JavaScript and Python; (more recently) learning Rust
A little about PingCAP
● Founded in April 2015 by 3 infrastructure engineers
● Offices throughout North America and China
PingCAP.com
Recent News
Our Product: the TiDB Platform
● TiDB Platform (Ti = Titanium)
○ TiDB (SQL Layer)
○ TiKV (Key-Value Storage)
○ TiSpark (Spark plugin to TiKV)
● Open source from Day 1
○ GA 1.0: October 2017
○ GA 2.0: April 2018
Community
Stars
● TiDB: 15,300+
● TiKV: 3,700+
Contributors
● TiDB: 200+
● TiKV: 100+
Common Use Cases
1. MySQL Scalability
2. Hybrid OLTP/OLAP Architecture
3. Unifying Data Storage/Management
TiDB Platform
Cloud-Native Architecture
[Diagram: applications connect to a set of TiDB servers via the MySQL protocol. TiDB reaches the TiKV nodes through the DistSQL API and the KV API; a Spark cluster (Spark driver, workers, SparkSQL) reaches TiKV through the DistSQL API. The PD cluster serves metadata, TSO, and data location to TiDB, and data location to Spark.]
TiKV (in CNCF): The Storage Foundation
[Diagram: a client talks over gRPC to four TiKV nodes (Store 1 to Store 4). Each store holds replicas of several Regions (Region 1 to Region 5); the replicas of one Region, for example Region 4, sit on different stores and form a Raft group. PD 1 to PD 3 make up the Placement Driver.]
TiDB: The (My)SQL Layer
● Clients: ODBC/JDBC, MySQL client, any ORM which supports MySQL
● TiDB: MySQL network protocol, SQL parser, cost-based optimizer, Coprocessor API
● TiKV: Node1, Node2, Node3, Node4
SQL -> Parser -> Coprocessor
Join Support
● Hash Join (fastest; if table <= 50 million rows)
● Sort Merge Join (join on indexed column or ordered data source)
● Index Lookup Join (join on indexed column; ideally after filter, result < 10,000 rows)
● Chosen by the cost-based optimizer
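As a rough illustration of the three join strategies listed above, here is a minimal Python sketch over in-memory lists of dicts. The function names and the sample `users`/`orders` tables are hypothetical, and this is not TiDB's implementation; it only shows the shape of each algorithm and that all three return the same rows.

```python
from collections import defaultdict

def hash_join(build_rows, probe_rows, key):
    """Build an in-memory hash table on one input, then probe it with the other."""
    table = defaultdict(list)
    for row in build_rows:
        table[row[key]].append(row)
    return [{**b, **p} for p in probe_rows for b in table.get(p[key], [])]

def sort_merge_join(left, right, key):
    """Both inputs must already be sorted on `key`; two cursors advance in step."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right rows sharing this key, then pair it
            # with every left row that has the same key.
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            while i < len(left) and left[i][key] == lk:
                out.extend({**left[i], **r} for r in right[j:j_end])
                i += 1
            j = j_end
    return out

def index_lookup_join(outer_rows, inner_index, key):
    """For each (ideally few) outer row, fetch matches through an index on the inner table."""
    return [{**o, **inner} for o in outer_rows for inner in inner_index.get(o[key], [])]

# Hypothetical sample data, sorted on "uid".
users = [{"uid": 1, "name": "Edward"}, {"uid": 2, "name": "Tom"}]
orders = [{"uid": 1, "item": "bike"}, {"uid": 1, "item": "lock"}, {"uid": 3, "item": "bell"}]
users_by_uid = {u["uid"]: [u] for u in users}  # plays the role of an index on users.uid

joined = hash_join(users, orders, "uid")
```

The hash join pays for building a table in memory, the merge join needs sorted inputs, and the index lookup join does one lookup per outer row, which is why it suits a small filtered outer side.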
TiSpark: Complex OLAP
[Diagram: a Spark driver and several Spark executors, each with TiSpark embedded. TiSpark retrieves data location from the Placement Driver (PD) over gRPC, then retrieves the real data over gRPC from the TiKV nodes that form the distributed storage layer.]
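The two-step read path (first ask for the data location, then fetch the data from the store that owns it) can be sketched as follows. This is a simplified, in-process model: the `PlacementDriver` class, the `read` function, and the sample regions are hypothetical stand-ins, not the TiSpark or PD client API.

```python
import bisect

class PlacementDriver:
    """Hypothetical stand-in for PD: maps contiguous key ranges (regions) to stores."""

    def __init__(self, regions):
        # regions: list of (start_key, store_id) sorted by start_key;
        # the first start_key is b"" so that every key falls in some region.
        self.starts = [start for start, _ in regions]
        self.stores = [store for _, store in regions]

    def locate(self, key):
        # The owning region is the last one whose start key is <= key.
        return self.stores[bisect.bisect_right(self.starts, key) - 1]

def read(pd, stores, key):
    store_id = pd.locate(key)         # step 1: retrieve data location
    return stores[store_id].get(key)  # step 2: retrieve real data from that store

# Two hypothetical regions: [b"", b"user/2") on store 1, [b"user/2", +inf) on store 2.
pd = PlacementDriver([(b"", 1), (b"user/2", 2)])
stores = {1: {b"user/1": b"Edward"}, 2: {b"user/2": b"Tom"}}
```

Because the location lookup is separate from the data read, each reader can go straight to the right store in parallel instead of funnelling through one coordinator.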
Placement Driver
● Provides a God’s view of the entire cluster
● Stores metadata, balances workload, issues timestamps
● Is itself a cluster, with embedded etcd
[Diagram: three Placement Driver nodes replicating state via Raft]
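To make "issues timestamps" concrete, here is a minimal sketch of a timestamp oracle that combines a physical clock with a logical counter so that every timestamp is strictly larger than the previous one. The class name, the injectable `clock` parameter, and the 18-bit logical width are assumptions made for this illustration; replication and failover of the oracle are left out.

```python
import time

LOGICAL_BITS = 18  # assumption: low bits hold a logical counter

class TimestampOracle:
    """Single-process sketch of an oracle handing out strictly increasing timestamps."""

    def __init__(self, clock=lambda: int(time.time() * 1000)):
        self._clock = clock  # returns physical time in milliseconds
        self._physical = 0
        self._logical = 0

    def next_ts(self):
        now = self._clock()
        if now > self._physical:
            # The clock moved forward: adopt it and reset the counter.
            self._physical, self._logical = now, 0
        else:
            # Same millisecond, or the clock went backwards: bump the counter.
            self._logical += 1
            if self._logical >= 1 << LOGICAL_BITS:
                # Counter exhausted: borrow the next millisecond.
                self._physical, self._logical = self._physical + 1, 0
        return (self._physical << LOGICAL_BITS) | self._logical

# A scripted clock (hypothetical values) that stalls and then steps backwards.
ticks = iter([5, 5, 4, 6])
oracle = TimestampOracle(clock=lambda: next(ticks))
stamps = [oracle.next_ts() for _ in range(4)]
```

Even when the scripted clock repeats or goes backwards, the returned values keep increasing, which is the property the transaction layer relies on.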
Transaction Model
● Timestamp Oracle Service (from Google’s Percolator)
● 2-Phase commit protocol (2PC)
● Problem: Single point of failure
● Solution: PD HA cluster
[Diagram: three Placement Driver nodes replicating state via Raft]
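The combination of a timestamp oracle and two-phase commit can be sketched in a few lines. This is a heavily simplified, single-process model in the spirit of Percolator: the `Store` class, the function names, and the counter-based `tso` are hypothetical, and rollback, lock cleanup after crashes, and distribution across nodes are all omitted.

```python
import itertools

_clock = itertools.count(1)

def tso():
    """Stand-in for the timestamp oracle: strictly increasing integers."""
    return next(_clock)

class Conflict(Exception):
    pass

class Store:
    def __init__(self):
        self.data = {}   # (key, start_ts) -> staged or committed value
        self.lock = {}   # key -> (primary_key, start_ts)
        self.write = {}  # key -> [(commit_ts, start_ts), ...] in commit order

def prewrite(store, mutations, primary, start_ts):
    # Phase 1: check every key, then lock it and stage the value.
    for key in mutations:
        if key in store.lock:
            raise Conflict(f"{key!r} is locked by another transaction")
        if any(c >= start_ts for c, _ in store.write.get(key, [])):
            raise Conflict(f"{key!r} was committed after this transaction started")
    for key, value in mutations.items():
        store.lock[key] = (primary, start_ts)
        store.data[(key, start_ts)] = value

def commit(store, keys, primary, start_ts, commit_ts):
    # Phase 2: committing the primary key is the commit point; secondaries follow.
    for key in sorted(keys, key=lambda k: k != primary):
        if store.lock.get(key) != (primary, start_ts):
            raise Conflict(f"lock on {key!r} is missing")
        store.write.setdefault(key, []).append((commit_ts, start_ts))
        del store.lock[key]

def get(store, key, read_ts):
    # Snapshot read: newest version committed at or before read_ts.
    lock = store.lock.get(key)
    if lock is not None and lock[1] <= read_ts:
        raise Conflict(f"{key!r} has a pending lock")
    for commit_ts, start_ts in reversed(store.write.get(key, [])):
        if commit_ts <= read_ts:
            return store.data[(key, start_ts)]
    return None

def transact(store, mutations):
    start_ts = tso()
    primary = next(iter(mutations))  # first key acts as the primary
    prewrite(store, mutations, primary, start_ts)
    commit_ts = tso()
    commit(store, list(mutations), primary, start_ts, commit_ts)
    return commit_ts
```

Every transaction takes two timestamps from the oracle, one at start and one at commit, which is why the oracle must be highly available.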
TiDB Operator
● Operator pattern inspired by CoreOS... (now Red Hat... (now IBM))
● Bootstraps a TiDB cluster and simplifies/automates:
○ Deployment
○ Scaling
○ Scheduling
○ Auto-Failover
○ Upgrade
● Open Sourced
○ https://github.com/pingcap/tidb-operator
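The core idea of the operator pattern is a control loop that compares the desired state of a cluster with its observed state and emits the actions needed to close the gap. The sketch below assumes a hypothetical `ComponentState` record and action names; it does not use the Kubernetes API or TiDB Operator's actual types.

```python
from dataclasses import dataclass

@dataclass
class ComponentState:
    replicas: int
    version: str

def reconcile(desired, actual):
    """One pass of the loop: return the actions that move `actual` toward `desired`."""
    actions = []
    for name, want in desired.items():
        have = actual.get(name)
        if have is None:
            actions.append(("deploy", name, want.replicas))
            continue
        if have.replicas < want.replicas:
            actions.append(("scale_out", name, want.replicas - have.replicas))
        elif have.replicas > want.replicas:
            actions.append(("scale_in", name, have.replicas - want.replicas))
        if have.version != want.version:
            actions.append(("upgrade", name, want.version))
    return actions

# Hypothetical cluster spec and observed state.
desired = {
    "pd": ComponentState(3, "v2"),
    "tikv": ComponentState(5, "v2"),
    "tidb": ComponentState(2, "v2"),
}
actual = {
    "pd": ComponentState(3, "v2"),
    "tikv": ComponentState(3, "v2"),
    "tidb": ComponentState(2, "v1"),
}
plan = reconcile(desired, actual)
```

Running the same function repeatedly until it returns an empty list is what lets deployment, scaling, failover, and upgrades share one mechanism.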
TiDB Operator
[Diagram: Kubernetes Core (API Server, Controller Manager, Scheduler); TiDB Controller Manager (TiDB Cluster Controller, PD Controller, TiKV Controller, TiDB Controller); TiDB Scheduler (Kube Scheduler, TiDB Scheduler); TiDB Cloud Manager (API Gateway, Control Plane, Cost Controller)]
Cloud Native Tools
● Prometheus
○ (maintains Rust implementation: https://github.com/pingcap/rust-prometheus)
● gRPC
○ (maintains Rust implementation: https://github.com/pingcap/grpc-rs)
● etcd
TiDB, Managed As A Cloud Service
Early Access: https://www.pingcap.com/tidb-cloud/
Who’s Using TiDB?
300+ Companies
Common Use Cases
1. MySQL Scalability
2. Hybrid OLTP/OLAP Architecture
3. Unifying Data Storage/Management
Mobike + TiDB
● 200 million users
● 200 cities
● 9 million smart bikes
● ~30 TB / day
Scenario #1: Locking/Unlocking
● Locking and unlocking of smart bikes generate massive amounts of data
● Smooth experience is the key to user retention
● TiDB supports this system by alerting administrators within minutes when the success rate of locking/unlocking drops
● Quickly find malfunctioning bikes
Scenario #2: Real-Time Analysis
● Synchronize TiDB with MySQL instances using Syncer (proprietary tool)
● TiDB + TiSpark empower real-time analysis with horizontal scalability
● No need for Hadoop + Hive
Scenario #3: Mobike Store
● An innovative loyalty program that must be on 24x7x365
● TiDB provides:
○ High concurrency for peak or promotional season
○ Permanent storage
○ Horizontal scalability
● No interruption as business evolves
Thank You!
20% OFF KubeCon: KCNA18SPR
kevin@pingcap.com
@kevinsxu; @pingcap
TiDB Cloud Early Access: https://www.pingcap.com/tidb-cloud/
TiDB Academy Sign-up: www.pingcap.com/tidb-academy/
Appendix
CBO 101
Imagine we have a logical plan: [plan diagram]
Its physical plan could be: [plan diagram]
or: [plan diagram]
CBO 101
Cost components: network cost, memory cost, CPU cost
In TiDB, the default memory factor is 5 and the CPU factor is 0.8.
For example, for the operator Sort(r), the cost would be: [formula image]
TiDB maintains histograms of the data.
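One way to read the three cost components and the two factors is as a weighted sum that the optimizer evaluates for each candidate physical plan before picking the cheapest. The sketch below works under that assumption: the network factor of 1.0, the function names, and the resource estimates for the two candidate plans are all hypothetical, and the specific cost formula for Sort(r) is not reproduced here.

```python
MEMORY_FACTOR = 5.0   # default quoted above
CPU_FACTOR = 0.8      # default quoted above
NETWORK_FACTOR = 1.0  # assumption: not stated in the source

def plan_cost(network, memory, cpu):
    """Weighted sum of the three estimated resource costs."""
    return NETWORK_FACTOR * network + MEMORY_FACTOR * memory + CPU_FACTOR * cpu

def cheapest(plans):
    """plans: {name: (network, memory, cpu)}; return the name with the lowest cost."""
    return min(plans, key=lambda name: plan_cost(*plans[name]))

# Hypothetical resource estimates for two physical plans of one logical plan.
candidates = {
    "hash_join": (100.0, 400.0, 1000.0),
    "index_lookup_join": (300.0, 10.0, 200.0),
}
best = cheapest(candidates)
```

With a memory factor well above the CPU factor, plans that hold large intermediate results in memory are penalized more heavily than plans that spend extra CPU.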
Relational -> KV
“User” Table
ID  Name    Email
1   Edward  h@pingcap.com
2   Tom     tom@pingcap.com
...
In TiKV: a sorted map over the key range (-∞, +∞), divided into regions
user/1  Edward,h@pingcap.com
user/2  Tom,tom@pingcap.com
...
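The mapping from table rows to a sorted key-value map, and the split of that map into contiguous regions, can be sketched as follows. The `user/<id>` keys and the comma-joined values follow the example above; the `scan` and `split_regions` helpers, and splitting by key count, are assumptions made for this illustration.

```python
import bisect

rows = [(1, "Edward", "h@pingcap.com"), (2, "Tom", "tom@pingcap.com")]

# One key-value pair per row, kept sorted by key.
kv = sorted((f"user/{row_id}", f"{name},{email}") for row_id, name, email in rows)
keys = [key for key, _ in kv]

def scan(start, end):
    """Return all pairs with start <= key < end, using binary search on the sorted keys."""
    return kv[bisect.bisect_left(keys, start):bisect.bisect_left(keys, end)]

def split_regions(sorted_keys, max_keys):
    """Cut (-inf, +inf) into contiguous ranges of at most max_keys keys.

    None stands for -inf as a start bound and +inf as an end bound.
    """
    bounds = [sorted_keys[i] for i in range(max_keys, len(sorted_keys), max_keys)]
    edges = [None] + bounds + [None]
    return list(zip(edges[:-1], edges[1:]))

regions = split_regions(keys, 1)
```

String keys such as `user/10` would sort before `user/2`, so a real encoding needs fixed-width, order-preserving integers rather than decimal text.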
Index Structure
Row:
Key: tablePrefix_rowPrefix_tableID_rowID (IDs are assigned by TiDB, all int64)
Value: [col1, col2, col3, col4]
Index:
Key: tablePrefix_idxPrefix_tableID_indexID_ColumnsValue_rowID
Value: [null]
Keys are ordered as byte arrays in TiKV, so range scans (SCAN) are supported
Every key has a timestamp appended, issued by the Placement Driver
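The row and index key layouts can be illustrated with a small encoder. The sketch follows the field order given above, but the prefix bytes, the 0x00 value terminator, and the helper names are assumptions for this example rather than TiDB's actual encoding, and the appended timestamp is left out.

```python
import bisect
import struct

# Hypothetical prefix bytes; the text only names them tablePrefix, rowPrefix, idxPrefix.
TABLE_PREFIX, ROW_PREFIX, IDX_PREFIX = b"t", b"_r", b"_i"
SIGN_BIT = 1 << 63

def enc_int(n):
    """Encode an int64 so that byte-wise order equals numeric order."""
    return struct.pack(">Q", n + SIGN_BIT)

def dec_int(raw):
    return struct.unpack(">Q", raw)[0] - SIGN_BIT

def row_key(table_id, row_id):
    return TABLE_PREFIX + ROW_PREFIX + enc_int(table_id) + enc_int(row_id)

def index_prefix(table_id, index_id, value):
    # Assumption: values contain no 0x00 byte, so 0x00 can terminate them.
    return TABLE_PREFIX + IDX_PREFIX + enc_int(table_id) + enc_int(index_id) + value + b"\x00"

def index_key(table_id, index_id, value, row_id):
    return index_prefix(table_id, index_id, value) + enc_int(row_id)

def index_lookup(sorted_keys, table_id, index_id, value):
    """Range-scan the sorted index keys for one value and return the matching row IDs."""
    prefix = index_prefix(table_id, index_id, value)
    pos = bisect.bisect_left(sorted_keys, prefix)
    row_ids = []
    while pos < len(sorted_keys) and sorted_keys[pos].startswith(prefix):
        row_ids.append(dec_int(sorted_keys[pos][-8:]))  # row ID is the last 8 bytes
        pos += 1
    return row_ids

# Hypothetical index on a name column of table 1 (index ID 1).
names = {1: b"Edward", 2: b"Tom", 3: b"Tom", 4: b"Tommy"}
index_keys = sorted(index_key(1, 1, name, rid) for rid, name in names.items())
```

Because the index value is empty and the row ID lives in the key, a lookup is a short scan over adjacent keys followed by point reads of the row keys it yields.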
Guaranteeing Correctness
● Formal proof using TLA+
○ a formal specification and verification language to reason about and prove aspects of complex systems
● Raft
● TSO/Percolator
● 2PC
● See details: https://github.com/pingcap/tla-plus
MySQL Compatibility - Summary
● Compatibility with MySQL 5.7
○ Joins, subqueries, DML, DDL, etc.
● On the roadmap:
○ Views, Window Functions
● Missing:
○ Stored Procedures, Triggers, Events, Fulltext
pingcap.com/docs/sql/mysql-compatibility/
MySQL Compatibility - Nuanced
● Some features work differently
○ Auto Increment
○ Optimistic Locking
● TiDB works better with smaller transactions
○ Recommended to batch updates, deletes, and inserts in groups of up to 5000 rows
pingcap.com/docs/sql/mysql-compatibility/
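The batching recommendation can be applied on the client side by chunking the work before sending it. Below is a minimal sketch; the helper names and the `rides` table are hypothetical, and the `%s` placeholders follow the parameter style commonly used by Python MySQL drivers.

```python
from itertools import islice

BATCH_SIZE = 5000  # the recommended upper bound quoted above

def batched(rows, size=BATCH_SIZE):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(rows)
    while chunk := list(islice(iterator, size)):
        yield chunk

def delete_statement(table, batch):
    """Build one parameterized DELETE per batch (table name is hypothetical)."""
    placeholders = ", ".join(["%s"] * len(batch))
    return f"DELETE FROM {table} WHERE id IN ({placeholders})"

# 12,001 hypothetical row IDs become three transactions instead of one.
batch_sizes = [len(batch) for batch in batched(range(12001))]
```

Each batch would then be executed and committed as its own transaction, keeping every transaction within the recommended size.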

Presentation at SF Kubernetes Meetup (10/30/18), Introducing TiDB/TiKV

  • 1. Introducing TiDB/TiKV Kevin Xu (@kevinsxu; kevin@pingcap.com)
  • 2. Agenda ● History and Community ● Technical Walkthrough ● Use Case with Mobike ● Q&A ● (Time Permitting) TiDB on Google Kubernetes Engine
  • 3. A little about me ● General Manager of Global Strategy and Operations ● Studied CS and Law at Stanford ● Program in Javascript, Python, and (more recently) learning Rust
  • 4. A little about PingCAP ● Founded in April 2015 by 3 infrastructure engineers ● Offices throughout North America and China
  • 6. PingCAP.com Our Product: the TiDB Platform ● TiDB Platform (Ti = Titanium) ○ TiDB (SQL Layer) ○ TiKV (Key-Value Storage) ○ TiSpark (Spark plugin to TiKV) ● Open source from Day 1 ○ GA 1.0: October 2017 ○ GA 2.0: April 2018
  • 7. Community Stars: ● TiDB: 15,300+ ● TiKV: 3,700+ Contributors: ● TiDB: 200+ ● TiKV: 100+
  • 8. Common Use Cases 1. MySQL Scalability 2. Hybrid OLTP/OLAP Architecture 3. Unifying Data Storage/Management
  • 10. Cloud-Native Architecture [Diagram: applications connect via the MySQL protocol to a tier of TiDB servers; the TiDB servers and a Spark cluster (Spark driver, workers, SparkSQL) both read from a tier of TiKV nodes; link labels: DistSQL API, KV API; a PD cluster of three PD nodes exchanges metadata, TSO, and data location with the other components]
  • 11. TiKV (in CNCF): The Storage Foundation [Diagram: four TiKV nodes (Store 1 to Store 4), each serving gRPC and holding replicas of several of Regions 1 to 5; the replicas of one Region on different stores form a Raft group (Region 4 is highlighted); a client and a three-node Placement Driver (PD 1 to PD 3) sit alongside]
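The Region layout on this slide can be read as range-based routing: the key space is cut into contiguous ranges, and a client needs to know which Region (and which stores) hold a given key. The following is a minimal sketch of that lookup, not TiKV's or PD's actual code. The region boundaries, the replica placement, and the `locate` helper are hypothetical, chosen only for illustration.

```python
import bisect

# Hypothetical region table: (start_key, region_id, stores holding a replica).
# Each region covers [start_key, next region's start_key); the first starts at b"".
REGIONS = [
    (b"", 1, [1, 2, 3]),
    (b"g", 2, [2, 3, 4]),
    (b"n", 3, [1, 3, 4]),
    (b"t", 4, [1, 2, 4]),
]
START_KEYS = [start for start, _, _ in REGIONS]


def locate(key: bytes):
    """Return (region_id, stores) for the region whose range contains key."""
    # bisect_right finds the first start key greater than key;
    # the region just before it is the one that covers key.
    idx = bisect.bisect_right(START_KEYS, key) - 1
    _, region_id, stores = REGIONS[idx]
    return region_id, stores


print(locate(b"mobike"))
```

In the real system this table lives in the Placement Driver and changes as Regions split and move; the sketch keeps it static.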
  • 12. TiDB: The (My)SQL Layer [Diagram: ODBC/JDBC drivers, the MySQL client, and any ORM which supports MySQL connect to TiDB; inside TiDB: MySQL Network Protocol, SQL Parser, Cost-based Optimizer, Coprocessor API; below TiDB: TiKV with Node1, Node2, Node3, Node4]
  • 13. SQL -> Parser -> Coprocessor
  • 14. Join Support ● Hash Join (fastest; if table <= 50 million rows) ● Sort Merge Join (join on indexed column or ordered data source) ● Index Lookup Join (join on indexed column; ideally after filter, result < 10,000 rows) ● Chosen based on Cost-based Optimizer
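As a reminder of what the first option on this slide does, here is a minimal, single-machine sketch of a hash join: build a hash table on one input, then probe it with the other. It is a textbook illustration, not TiDB's executor. The `hash_join` function and the `users`/`rides` sample rows are hypothetical.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Inner-join two lists of dict rows on build_key == probe_key."""
    # Build phase: index the (ideally smaller) input by its join key.
    table = defaultdict(list)
    for row in build_rows:
        table[row[build_key]].append(row)
    # Probe phase: stream the other input and emit one row per match.
    for row in probe_rows:
        for match in table.get(row[probe_key], []):
            yield {**match, **row}


users = [{"user_id": 1, "name": "Edward"}, {"user_id": 2, "name": "Tom"}]
rides = [
    {"ride_id": 10, "user_id": 2},
    {"ride_id": 11, "user_id": 1},
    {"ride_id": 12, "user_id": 3},  # no matching user, dropped by the inner join
]
joined = list(hash_join(users, rides, "user_id", "user_id"))
```

The build side has to fit in memory, which is one reason a row-count rule of thumb like the one on the slide exists.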
  • 15. TiSpark: Complex OLAP [Diagram: a Spark driver and Spark executors, each with TiSpark; TiSpark retrieves data locations from the Placement Driver (PD) over gRPC and retrieves the real data from the TiKV nodes of the distributed storage layer over gRPC]
  • 16. Placement Driver ● Provides a God’s view of the entire cluster ● Stores metadata, balances workload, issues timestamps ● Is itself a cluster, with embedded etcd [Diagram: three Placement Driver nodes connected by Raft]
  • 17. Transaction Model ● Timestamp Oracle Service (from Google’s Percolator) ● 2-Phase commit protocol (2PC) ● Problem: Single point of failure ● Solution: PD HA cluster [Diagram: three Placement Driver nodes connected by Raft]
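The timestamp oracle on this slide hands out strictly increasing timestamps, which transactions use as start and commit versions. Below is a minimal single-process sketch of such an allocator, not PD's implementation. The `TimestampOracle` class and the bit layout (milliseconds in the high bits, an 18-bit counter in the low bits) are assumptions made for illustration.

```python
import threading
import time

LOGICAL_BITS = 18  # assumed layout: high bits = milliseconds, low bits = counter


class TimestampOracle:
    """Hands out strictly increasing integer timestamps."""

    def __init__(self):
        self._lock = threading.Lock()
        self._physical = 0  # last wall-clock millisecond used
        self._logical = 0   # counter within that millisecond

    def get_ts(self) -> int:
        with self._lock:
            now_ms = int(time.time() * 1000)
            if now_ms > self._physical:
                # Clock moved forward: start a fresh counter.
                self._physical, self._logical = now_ms, 0
            else:
                # Same millisecond, or the clock went backwards: bump the counter.
                self._logical += 1
                if self._logical >= (1 << LOGICAL_BITS):
                    # Counter exhausted: borrow the next millisecond.
                    self._physical += 1
                    self._logical = 0
            return (self._physical << LOGICAL_BITS) | self._logical


tso = TimestampOracle()
start_ts = tso.get_ts()   # taken when a transaction begins
commit_ts = tso.get_ts()  # taken when it commits
```

A single allocator like this is exactly the single point of failure the slide mentions; the sketch leaves out the replication across PD nodes that addresses it.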
  • 18. TiDB Operator ● Operator pattern inspired by CoreOS...(now Red Hat...(now IBM)) ● Bootstraps TiDB cluster and simplifies/automates: ○ Deployment ○ Scaling ○ Scheduling ○ Auto-Failover ○ Upgrade ● Open Sourced ○ https://github.com/pingcap/tidb-operator
  • 19. TiDB Operator [Diagram labels: Kubernetes Core (API Server, Controller Manager, Scheduler); TiDB Controller Manager; TiDB Cluster Controller; PD Controller; TiKV Controller; TiDB Controller; TiDB Scheduler; TiDB Cloud Manager; API Gateway; Control Plane; Cost Controller; Kube Scheduler; TiDB Scheduler]
  • 20. Cloud Native Tools ● Prometheus ○ (maintains Rust implementation: https://github.com/pingcap/rust-prometheus) ● gRPC ○ (maintains Rust implementation: https://github.com/pingcap/grpc-rs) ● etcd
  • 21. TiDB, Managed As A Cloud Service Early Access: https://www.pingcap.com/tidb-cloud/
  • 24. Common Use Cases 1. MySQL Scalability 2. Hybrid OLTP/OLAP Architecture 3. Unifying Data Storage/Management
  • 25. Mobike + TiDB ● 200 million users ● 200 cities ● 9 million smart bikes ● ~30 TB / day
  • 26. Scenario #1: Locking/Unlocking ● Locking and unlocking of smart bikes generates massive amounts of data ● A smooth experience is the key to user retention ● TiDB supports this system by alerting administrators within minutes when the success rate of locking/unlocking drops ● Quickly find malfunctioning bikes
  • 27. Scenario #2: Real-Time Analysis ● Synchronize TiDB with MySQL instances using Syncer (proprietary tool) ● TiDB + TiSpark empower real-time analysis with horizontal scalability ● No need for Hadoop + Hive
  • 28. Scenario #3: Mobike Store ● An innovative loyalty program that must be on 24x7x365 ● TiDB provides: ○ High-concurrency for peak or promotional season ○ Permanent storage ○ Horizontal scalability ● No interruption as business evolves
  • 29. Thank You! 20% OFF KubeCon: KCNA18SPR kevin@pingcap.com @kevinsxu; @pingcap TiDB Cloud Early Access: https://www.pingcap.com/tidb-cloud/ TiDB Academy Sign-up: www.pingcap.com/tidb-academy/
  • 31. CBO 101 Imagine we have a logical plan: [figure] Its physical plan could be: [figure]
  • 32. CBO 101 ● Cost components: network cost, memory cost, CPU cost ● In TiDB, the default memory factor is 5 and the CPU factor is 0.8 ● For example, for the operator Sort(r), its cost would be: [formula] ● TiDB maintains histograms of the data
  • 33. Relational -> KV
      “User” table:
        ID | Name   | Email
        1  | Edward | h@pingcap.com
        2  | Tom    | tom@pingcap.com
        ...
      In TiKV (a sorted map over the key range (-∞, +∞), held in some region...):
        user/1 -> Edward,h@pingcap.com
        user/2 -> Tom,tom@pingcap.com
        ...
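The mapping on this slide can be sketched in a few lines: each row becomes one key-value pair, and because the pairs are kept sorted by key, all rows of a table sit next to each other and can be read with one range scan. This is an illustrative sketch using the slide's text keys, not TiDB's real encoding. The `row_to_kv` and `scan_prefix` helpers are hypothetical.

```python
import bisect

rows = [
    (1, "Edward", "h@pingcap.com"),
    (2, "Tom", "tom@pingcap.com"),
]


def row_to_kv(table, row):
    """Turn (id, col, col, ...) into ("table/id", "col,col,...")."""
    row_id, *columns = row
    return f"{table}/{row_id}", ",".join(columns)


# A sorted list of pairs stands in for the sorted map in TiKV.
kv = sorted(row_to_kv("user", r) for r in rows)


def scan_prefix(pairs, prefix):
    """Return every pair whose key starts with prefix, in key order."""
    keys = [k for k, _ in pairs]
    start = bisect.bisect_left(keys, prefix)  # jump to the first candidate
    result = []
    for key, value in pairs[start:]:
        if not key.startswith(prefix):
            break  # sorted order: nothing after this can match
        result.append((key, value))
    return result


all_users = scan_prefix(kv, "user/")
```

Text keys like these sort lexicographically, so `user/10` would land before `user/2`; a real encoding needs fixed-width, order-preserving IDs.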
  • 34. Index Structure
      Row:
        Key: tablePrefix_rowPrefix_tableID_rowID (IDs are assigned by TiDB, all int64)
        Value: [col1, col2, col3, col4]
      Index:
        Key: tablePrefix_idxPrefix_tableID_indexID_ColumnsValue_rowID
        Value: [null]
      Keys are ordered as byte arrays in TiKV, so SCAN is supported.
      A timestamp issued by the Placement Driver is appended to every key.
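To make the key layout on this slide concrete, here is a minimal sketch that builds row keys and index keys as byte strings and answers a lookup by scanning the index prefix. It follows the field order shown on the slide but is not TiDB's actual codec: the one-byte prefix markers, the zero-byte terminator after the column value, the index ID `1`, and all function names are assumptions, and the trailing timestamp is left out.

```python
import bisect
import struct

TABLE_PREFIX, ROW_PREFIX, INDEX_PREFIX = b"t", b"r", b"i"  # hypothetical markers
NAME_INDEX_ID = 1                                          # hypothetical index on Name


def enc_int(v: int) -> bytes:
    """Big-endian int64 with the sign bit flipped, so byte order equals numeric order."""
    return struct.pack(">Q", v + (1 << 63))


def dec_int(b: bytes) -> int:
    return struct.unpack(">Q", b)[0] - (1 << 63)


def row_key(table_id, row_id):
    return TABLE_PREFIX + ROW_PREFIX + enc_int(table_id) + enc_int(row_id)


def index_prefix(table_id, index_id, value: str):
    # Assumes the indexed value contains no zero byte; b"\x00" marks its end.
    return (TABLE_PREFIX + INDEX_PREFIX + enc_int(table_id)
            + enc_int(index_id) + value.encode() + b"\x00")


store = {}  # stand-in for TiKV; keys are sorted when scanned


def put_user(table_id, row_id, name, email):
    store[row_key(table_id, row_id)] = [name, email]
    # The index entry carries the row ID in its key and has no value.
    store[index_prefix(table_id, NAME_INDEX_ID, name) + enc_int(row_id)] = None


def lookup_by_name(table_id, name):
    prefix = index_prefix(table_id, NAME_INDEX_ID, name)
    keys = sorted(store)
    i = bisect.bisect_left(keys, prefix)
    found = []
    while i < len(keys) and keys[i].startswith(prefix):
        row_id = dec_int(keys[i][-8:])  # last 8 bytes of an index key
        found.append((row_id, store[row_key(table_id, row_id)]))
        i += 1
    return found


put_user(10, 1, "Edward", "h@pingcap.com")
put_user(10, 2, "Tom", "tom@pingcap.com")
print(lookup_by_name(10, "Tom"))
```

The lookup never touches rows with other names: the scan starts at the index prefix and stops at the first key that no longer matches it.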
  • 35. Guaranteeing Correctness ● Formal proof using TLA+ ○ a formal specification and verification language to reason about and prove aspects of complex systems ● Raft ● TSO/Percolator ● 2PC ● See details: https://github.com/pingcap/tla-plus
  • 36. MySQL Compatibility - Summary ● Compatibility with MySQL 5.7 ○ Joins, subqueries, DML, DDL etc. ● On the roadmap: ○ Views, Window Functions ● Missing: ○ Stored Procedures, Triggers, Events, Fulltext ● pingcap.com/docs/sql/mysql-compatibility/
  • 37. MySQL Compatibility - Nuanced ● Some features work differently ○ Auto Increment ○ Optimistic Locking ● TiDB works better with smaller transactions ○ Recommended: batch updates, deletes, and inserts in groups of 5000 rows ● pingcap.com/docs/sql/mysql-compatibility/
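The batching advice on this slide can be sketched as follows: split a large set of changes into chunks of 5000 rows and commit each chunk as its own transaction. This is a minimal sketch, assuming `conn` is a Python DB-API 2.0 connection from a MySQL driver (which uses `%s` placeholders). The `ride_events` table, its `id` column, and both function names are hypothetical.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")
BATCH_SIZE = 5000  # batch size recommended on the slide


def batches(items: Iterable[T], size: int = BATCH_SIZE) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items."""
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final, possibly shorter batch
        yield batch


def delete_in_batches(conn, ids: Iterable[int]) -> None:
    """Delete rows by ID, one transaction per batch."""
    for chunk in batches(ids):
        cur = conn.cursor()
        try:
            # ride_events is a hypothetical table name.
            cur.executemany(
                "DELETE FROM ride_events WHERE id = %s",
                [(row_id,) for row_id in chunk],
            )
        finally:
            cur.close()
        conn.commit()  # keep each transaction small


sizes = [len(b) for b in batches(range(12001))]
```

Here `sizes` comes out as two full batches followed by the 2001 leftover rows, so no single transaction exceeds the recommended size.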