This deck was presented at the SF Kubernetes Meetup held at Microsoft's downtown SF office, introducing the architecture of TiDB and TiKV (a CNCF project), key use cases, a user story with Mobike (one of the largest bikesharing platforms in the world), and how TiDB is deployed across different cloud environments using TiDB Operator.
Introducing TiDB [Delivered: 09/27/18 at NYC SQL Meetup] (Kevin Xu)
This presentation was delivered at the NYC SQL meetup on September 27, 2018. It provided a technical overview of the TiDB platform, a deep dive into TiDB's MySQL-compatible layer and MySQL ecosystem tools, the Mobike use case, and an appendix with detailed materials on the coprocessor and transaction model.
The Dark Side Of Go -- Go runtime related problems in TiDB in production (PingCAP)
Ed Huang, CTO of PingCAP, spoke at the Go System Conference about dealing with the typical and profound issues related to Go's runtime as your systems become more complex. Taking TiDB as an example, he demonstrated how these problems can be reproduced, located, and analyzed in production.
Introducing TiDB [Delivered: 09/25/18 at Portland Cloud Native Meetup] (Kevin Xu)
This deck introduces TiDB, an open source distributed NewSQL database, to the Portland Cloud Native meetup on September 25, 2018. It includes materials on the technical architecture, core features, using TiDB Operator to deploy in any cloud environment, and an appendix on the transaction model and join support.
This is the speech Shen Li gave at GopherChina 2017.
TiDB is an open source distributed database. Inspired by the design of Google F1/Spanner, TiDB features infinite horizontal scalability, strong consistency, and high availability. The goal of TiDB is to serve as a one-stop solution for data storage and analysis.
In this talk, we will mainly cover the following topics:
- What is TiDB
- TiDB Architecture
- SQL Layer Internal
- Golang in TiDB
- Next Step of TiDB
Shen Li, VP of Engineering at PingCAP, shares these slides on TiDB and the big data ecosystem. Enjoy~
TiDB is an open source distributed Hybrid Transactional/Analytical Processing (HTAP) database developed by PingCAP and inspired by Google Spanner/F1. TiDB features infinite horizontal scalability, strong consistency, and high availability. Its goal is to serve as a one-stop solution for online transactions and analysis.
This is the speech Shen Li gave at Cloud Connect Event Shanghai·China 2017.
TiDB is an open source distributed database. Inspired by the design of Google F1/Spanner, TiDB features infinite horizontal scalability, strong consistency, and high availability. The goal of TiDB is to serve as a one-stop solution for data storage and analysis. In this talk, we will mainly cover the following topics:
(1) The overall architecture of TiDB and implementation details
(2) How TiDB stores large volumes of data and empowers computation
(3) How TiDB embraces the big data ecosystem, reducing both the cost of big data analysis and the barrier to entry for users
[Paper Reading] Orca: A Modular Query Optimizer Architecture for Big Data (PingCAP)
The performance of analytical query processing in data management systems depends primarily on the capabilities of the system's query optimizer. Increased data volumes and heightened interest in processing complex analytical queries have prompted Pivotal to build a new query optimizer.
In this paper we present the architecture of Orca, the new query optimizer for all Pivotal data management products, including Pivotal Greenplum Database and Pivotal HAWQ. Orca is a comprehensive development uniting state-of-the-art query optimization technology with its own original research, resulting in a modular and portable optimizer architecture.
In addition to describing the overall architecture, we highlight several unique features and present performance comparisons against other systems.
Building a transactional key-value store that scales to 100+ nodes (percona l...) (PingCAP)
This slide deck from Siddon Tang, Chief Engineer at PingCAP, accompanied his talk at Percona Live 2018 on how to scale TiKV, an open source transactional key-value store, to 100+ nodes.
This is the speech Siddon Tang gave at the 1st Rust Meetup in Beijing on April 16, 2017.
Siddon Tang: Chief Architect of PingCAP
The slides covered the following topics:
- Why do we use Rust in TiKV
- TiKV architecture introduction
- Key technology
- Future plan
At TiDB DevCon 2020, Max Liu, CEO at PingCAP, gave a keynote speech. He believes that today’s database should be more real-time, more flexible, and easier to use, and TiDB, an elastic, cloud-native, real-time HTAP database, is exactly that kind of database.
This is the speech Max Liu gave at Percona Live Open Source Database Conference 2016.
Max Liu: Co-founder and CEO, a hacker with a free soul
The slides covered the following topics:
- Why another database?
- What kind of database do we want to build?
- How to design such a database, including the principles, the architecture, and design decisions?
- How to develop such a database, including the architecture and the core technologies for TiKV and TiDB?
- How to test the database to ensure the quality and stability?
Presto talk @ Global AI conference 2018 Boston (kbajda)
Presented at Global AI Conference in Boston 2018:
http://www.globalbigdataconference.com/boston/global-artificial-intelligence-conference-106/speaker-details/kamil-bajda-pawlikowski-62952.html
Presto, an open source distributed SQL engine, is widely recognized for its low-latency queries, high concurrency, and native ability to query multiple data sources. Proven at scale in a variety of use cases at Facebook, Airbnb, Netflix, Uber, Twitter, LinkedIn, Bloomberg, and FINRA, Presto has experienced unprecedented growth in popularity in both on-premises and cloud deployments over the last few years. Presto is truly a SQL-on-Anything engine: a single query can access data from Hadoop, S3-compatible object stores, RDBMSs, NoSQL stores, and custom data stores. This talk covers some of the best use cases for Presto, recent advancements in the project such as the Cost-Based Optimizer and geospatial functions, as well as the roadmap going forward.
Iceberg: a modern table format for big data (Ryan Blue & Parth Brahmbhatt, Netflix)
Presto Summit 2018 (https://www.starburstdata.com/technical-blog/presto-summit-2018-recap/)
[Paper Reading] Steering Query Optimizers: A Practical Take on Big Data Workl... (PingCAP)
Being one of the most complex components of a DBMS, query optimizers could benefit from adaptive policies that are learned systematically from the data and the query workload. This paper takes the approach used by Marcus et al. in Bao and adapts it to SCOPE, a big data system used internally at Microsoft. Along the way, multiple new challenges had to be solved. The paper also evaluates the efficacy of the approach on production workloads that include 150K daily jobs.
Paper:
https://dl.acm.org/doi/pdf/10.1145/3448016.3457568
[Paper reading] Interleaving with Coroutines: A Practical Approach for Robust... (PingCAP)
This paper proposes interleaving with coroutines for any type of index join. It showcases the proposal on SAP HANA by implementing binary search and CSB+-tree traversal for an instance of index join related to dictionary compression. Coroutine implementations not only perform similarly to prior interleaving techniques, but also closely resemble the original code, while supporting both interleaved and non-interleaved execution. Thus, the paper claims that coroutines make interleaving practical for use in real DBMS codebases.
Paper: http://www.vldb.org/pvldb/vol11/p230-psaropoulos.pdf
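The interleaving idea can be illustrated with a toy sketch using Python generators (a hypothetical, simplified stand-in for the paper's C++ coroutines; all names and data here are invented for illustration): each lookup suspends at the point where a cache miss would stall, and a simple round-robin scheduler resumes the suspended searches so one lookup's stall overlaps with another's work.

```python
def binary_search(sorted_keys, target):
    """Generator-based binary search; yields at each probe, where a real
    engine would issue a prefetch for sorted_keys[mid] and suspend."""
    lo, hi = 0, len(sorted_keys) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        yield  # suspension point (the memory stall would happen here)
        if sorted_keys[mid] == target:
            return mid
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

def interleave(searches):
    """Round-robin scheduler: resumes each suspended search in turn."""
    results = [None] * len(searches)
    pending = list(enumerate(searches))
    while pending:
        still_running = []
        for i, search in pending:
            try:
                next(search)
                still_running.append((i, search))
            except StopIteration as done:
                results[i] = done.value  # the generator's return value
        pending = still_running
    return results

keys = [2, 3, 5, 7, 11, 13, 17, 19]
print(interleave([binary_search(keys, t) for t in (5, 19, 4)]))  # [2, 7, -1]
```

Note how, as the paper argues for real coroutines, the search body reads almost exactly like ordinary non-interleaved binary search; only the `yield` marks the suspension point.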
Follow PingCAP on Twitter: https://twitter.com/PingCAP
Follow PingCAP on LinkedIn: https://www.linkedin.com/company/13205484/
Optimizing InfluxDB Performance in the Real World by Dean Sheehan, Senior Dir... (InfluxData)
Dean will provide practical tips and techniques learned from helping hundreds of customers deploy InfluxDB and InfluxDB Enterprise. This includes hardware and architecture choices, schema design, configuration setup, and running queries.
InfluxDB 2.0: Dashboarding 101 by David G. Simmons (InfluxData)
InfluxDB 2.0 has some new dashboarding and querying capabilities that will make using a time series database even easier. This InfluxDays NYC 2019 presentation, delivered by David G. Simmons (Senior Developer Evangelist at InfluxData), walks you through setting up your first dashboard.
Unifying Frontend and Backend Development with Scala - ScalaCon 2021 (Taro L. Saito)
Scala can be used for developing both frontend (Scala.js) and backend (Scala JVM) applications. A missing piece has been bridging these two worlds with Scala itself. We built Airframe RPC, a framework that uses Scala traits as a unified RPC interface between servers and clients. With Airframe RPC, you can build HTTP/1 (Finagle) and HTTP/2 (gRPC) services just by defining Scala traits and case classes. It simplifies web application design because you only need to care about Scala interfaces, without relying on existing web standards like REST, Protocol Buffers, or OpenAPI. Airframe's Scala.js support also enables building interactive web applications that dynamically render DOM elements while talking to Scala-based RPC servers. With Airframe RPC, Scala developers can deliver much more value across both frontend and backend work.
This is the slide deck used for introducing TiDB, an open source MySQL-compatible HTAP distributed database, at the SF DevOps meetup on August 20, 2018.
"Smooth Operator" [Bay Area NewSQL meetup]Kevin Xu
This slide deck was delivered at the Bay Area NewSQL meetup in California on how TiDB, an open source NewSQL distributed database, is deployed and managed on any Kubernetes-enabled cloud environment by applying the Operator pattern.
This slide deck was delivered at the Kubernetes/Docker meetup in Cologne, Germany, hosted by Giant Swarm, on how TiDB, an open source NewSQL distributed database, is deployed and managed on any Kubernetes-enabled cloud environment by applying the Operator pattern.
This slide deck was delivered at the Bay Area In-Memory Computing meetup in California on how TiDB, an open source NewSQL distributed database, is deployed and managed on any Kubernetes-enabled cloud environment by applying the Operator pattern.
When Apache Spark Meets TiDB with Xiaoyu Ma (Databricks)
During the past 10 years, big data storage layers have mainly focused on analytical use cases. For analytical workloads, users usually offload data onto a Hadoop cluster and perform queries on HDFS files. People struggle with modifications on append-only storage and with maintaining fragile ETL pipelines.
On the other hand, although Spark SQL has proven to be an effective parallel query processing engine, some tricks common in traditional databases are not available due to the characteristics of the underlying storage. TiSpark sits directly on top of a distributed database's (TiDB's) storage engine, expands Spark SQL's planning with its own extensions, and utilizes unique features of the database storage engine to achieve functions not possible for Spark SQL on HDFS. With TiSpark, users are able to perform queries directly on changing, fresh data in real time.
The takeaways from this talk are twofold:
— How to integrate Spark SQL with a distributed database engine and the benefits of doing so
— How to leverage Spark SQL's experimental methods to extend its capabilities.
This presentation provides an overview of the architecture and technology of TiDB, an open-source distributed NewSQL database, and how it helps Mobike, one of the largest dockless bikeshare platforms, scale its infrastructure to achieve hyper-growth.
How QBerg scaled to store data longer, query it faster (MariaDB plc)
The continuous increase in the services and countries to which QBerg delivers requires an ever-increasing load of resources. During the last year QBerg reached a critical point, storing so much transactional data that standard relational databases were unable to meet the SLAs, or support the features, required by customers. As an example, they had to cap web analytics to running on a maximum of four months of history. The introduction of MariaDB ColumnStore, flanked by existing MariaDB Server databases, not only allows them to store multiple years' worth of historical data for analytics – it also decreased overall processing time by an order of magnitude right off the bat. The move to a unified platform was incremental, using MariaDB MaxScale as both a router and a replicator. QBerg is now able to replicate full InnoDB schemas to MariaDB ColumnStore and incrementally update big tables without impacting the performance of ongoing transactions.
High Performance Computing (HPC), Storage, Networking, Supercomputing, Beowulf Clusters, Datacenters, IT Infrastructure, Linux, Open Source – William Wu, Director of Product Management at Penguin Computing, presented at the Penguin Computing booth theater at SC18 on how hyperscale infrastructure should be built and designed to handle the increasing demand for high-performance computing, storage, and networking.
This deck was the keynote speech delivered by Kevin Xu (GM of Global Strategy and Operations at PingCAP) and Shen Li (VP of Engineering at PingCAP) on TiDB architecture, tools and migration path, and the fully-managed TiDB Cloud offering, at Percona Live Europe 2018 in Frankfurt, Germany.
Free GitOps Workshop + Intro to Kubernetes & GitOps (Weaveworks)
Follow along in this free workshop and experience GitOps!
AGENDA:
Welcome - Tamao Nakahara, Head of DX (Weaveworks)
Introduction to Kubernetes & GitOps - Mark Emeis, Principal Engineer (Weaveworks)
Weave Gitops Overview - Tamao Nakahara
Free Gitops Workshop - David Harris, Product Manager (Weaveworks)
If you're new to Kubernetes and GitOps, we'll give you a brief introduction to both and how GitOps is the natural evolution of Kubernetes.
Weave GitOps Core is a continuous delivery product to run apps in any Kubernetes. It is free and open source, and you can get started today!
https://www.weave.works/product/gitops-core
If you’re stuck, also come talk to us at our Slack channel! #weave-gitops http://bit.ly/WeaveGitOpsSlack (If you need to invite yourself to the Slack, visit https://slack.weave.works/)
Building a Data Pipeline using Apache Airflow (on AWS / GCP) (Yohei Onishi)
This is the slide deck I presented at PyCon SG 2019. I talked about an overview of Airflow and how we can use Airflow and other data engineering services on AWS and GCP to build data pipelines.
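For readers new to Airflow, the core idea of such a pipeline is a DAG of tasks executed in dependency order. A minimal conceptual sketch in plain Python (this is not Airflow's actual API, which declares `DAG` objects and operators via the `airflow` package; the task names here are invented):

```python
# Conceptual sketch: an extract -> transform -> load pipeline expressed as a
# DAG and run in topological order, the scheduling idea Airflow implements.
from graphlib import TopologicalSorter

def extract():
    return ["2019-01-01,42", "2019-01-02,17"]

def transform(rows):
    return [row.split(",") for row in rows]

def load(rows):
    return f"loaded {len(rows)} rows"

# each task maps to the set of upstream tasks it depends on
dependencies = {"transform": {"extract"}, "load": {"transform"}}
order = list(TopologicalSorter(dependencies).static_order())

results = {}
for task in order:
    if task == "extract":
        results[task] = extract()
    elif task == "transform":
        results[task] = transform(results["extract"])
    else:
        results[task] = load(results["transform"])

print(order)            # ['extract', 'transform', 'load']
print(results["load"])  # 'loaded 2 rows'
```

A real Airflow deployment adds what this sketch omits: scheduling intervals, retries, backfills, and operators that call out to services such as S3, BigQuery, or GCS.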
Using druid for interactive count distinct queries at scale (Itai Yaffe)
At NMC (Nielsen Marketing Cloud) we need to present to our clients the number of unique users who meet a given criterion. The condition is typically a set-theoretic expression over a stream of events for a given time range. Historically, we used Elasticsearch to answer these types of questions; however, we encountered major scaling issues. In this presentation we detail the journey of researching, benchmarking, and productionizing a new technology, Druid, with DataSketches, to overcome the limitations we were facing.
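Count-distinct at this scale is typically answered with approximate sketches; Druid's DataSketches integration uses Theta/HLL sketches, but the underlying idea can be illustrated with a toy K-Minimum-Values estimator (illustration only, not the DataSketches algorithm; all names are invented):

```python
import hashlib
import heapq

def kmv_estimate(items, k=512):
    """K-Minimum-Values estimate of the number of distinct items.
    For clarity this keeps every distinct hash; a production sketch
    retains only the k smallest values in bounded memory."""
    hashes = set()
    for item in items:
        digest = hashlib.sha1(str(item).encode()).digest()
        hashes.add(int.from_bytes(digest[:8], "big") / 2**64)  # hash -> [0, 1)
    smallest = heapq.nsmallest(k, hashes)
    if len(smallest) < k:
        return len(smallest)            # fewer than k distinct values: exact
    return int((k - 1) / smallest[-1])  # classic KMV estimator

exact = kmv_estimate(["a", "b", "a"])  # small inputs are counted exactly
approx = kmv_estimate(f"user-{i}" for i in range(100_000))
print(exact, approx)  # 2, and a value within a few percent of 100000
```

Because such sketches are mergeable (the union of two KMV sketches is just the k smallest of the combined values), they also support the set-theoretic expressions over event streams described above.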
Dyn delivers exceptional Internet performance. Enabling high quality services requires data centers around the globe. In order to manage services, customers need timely insight collected from all over the world. Dyn uses DataStax Enterprise (DSE) to deploy complex clusters across multiple data centers to enable sub-50 ms query responses for hundreds of billions of data points. From granular DNS traffic data to aggregated counts for a variety of report dimensions, DSE at Dyn has been up since 2013 and has shined through upgrades, data center migrations, DDoS attacks, and hardware failures. In this webinar, Principal Engineers Tim Chadwick and Rick Bross cover the requirements which led them to choose DSE as their go-to big data solution, the path which led to Spark, and the lessons they've learned in the process.
[Study Guide] Google Professional Cloud Architect (GCP-PCA) Certification (Amaaira Johns)
Start Here---> https://bit.ly/3bGEd9l <---Get complete details on the GCP-PCA exam guide to crack the Professional Cloud Architect certification. You can collect all information on the GCP-PCA tutorial, practice tests, books, study material, exam questions, and syllabus. Firm up your knowledge of the Professional Cloud Architect role and get ready to crack the GCP-PCA certification. Explore all information on the GCP-PCA exam, including the number of questions, passing percentage, and time allotted to complete the test.
Accelerate Enterprise Software Engineering with Platformless (WSO2)
Key takeaways:
- Challenges of building platforms and the benefits of platformless.
- Key principles of platformless, including API-first, cloud-native middleware, platform engineering, and developer experience.
- How Choreo enables the platformless experience.
- How key concepts like application architecture, domain-driven design, zero trust, and cell-based architecture are inherently a part of Choreo.
- Demo of an end-to-end app built and deployed on Choreo.
SOCRadar Research Team: Latest Activities of IntelBroker (SOCRadar)
The European Union Agency for Law Enforcement Cooperation (Europol) has suffered an alleged data breach after a notorious threat actor claimed to have exfiltrated data from its systems. Infamous data leaker IntelBroker posted on the even more infamous BreachForums hacking forum, saying that Europol suffered a data breach this month.
The alleged breach affected Europol agencies CCSE, EC3, Europol Platform for Experts, Law Enforcement Forum, and SIRIUS. Infiltration of these entities can disrupt ongoing investigations and compromise sensitive intelligence shared among international law enforcement agencies.
However, this is neither the first nor the last activity of IntelBroker. We have compiled for you what happened in the last few days. To track such hacker activities on dark web sources like hacker forums, private Telegram channels, and other hidden platforms where cyber threats often originate, you can check SOCRadar’s Dark Web News.
Stay Informed on Threat Actors’ Activity on the Dark Web with SOCRadar!
Developing Distributed High-performance Computing Capabilities of an Open Sci... (Globus)
COVID-19 had an unprecedented impact on scientific collaboration. The pandemic and its broad response from the scientific community has forged new relationships among public health practitioners, mathematical modelers, and scientific computing specialists, while revealing critical gaps in exploiting advanced computing systems to support urgent decision making. Informed by our team’s work in applying high-performance computing in support of public health decision makers during the COVID-19 pandemic, we present how Globus technologies are enabling the development of an open science platform for robust epidemic analysis, with the goal of collaborative, secure, distributed, on-demand, and fast time-to-solution analyses to support public health.
First Steps with Globus Compute Multi-User Endpoints (Globus)
In this presentation we will share our experiences around getting started with the Globus Compute multi-user endpoint. Working with the Pharmacology group at the University of Auckland, we have previously written an application using Globus Compute that can offload computationally expensive steps in the researcher's workflows, which they wish to manage from their familiar Windows environments, onto the NeSI (New Zealand eScience Infrastructure) cluster. Some of the challenges we have encountered were that each researcher had to set up and manage their own single-user globus compute endpoint and that the workloads had varying resource requirements (CPUs, memory and wall time) between different runs. We hope that the multi-user endpoint will help to address these challenges and share an update on our progress here.
Understanding Globus Data Transfers with NetSage (Globus)
NetSage is an open, privacy-aware network measurement, analysis, and visualization service designed to help end users visualize and reason about large data transfers. NetSage traditionally has used a combination of passive measurements, including SNMP and flow data, as well as active measurements, mainly perfSONAR, to provide longitudinal network performance data visualization. It has been deployed by dozens of networks worldwide, and is supported domestically by the Engagement and Performance Operations Center (EPOC), NSF #2328479. We have recently expanded the NetSage data sources to include logs for Globus data transfers, following the same privacy-preserving approach as for flow data. Using the logs for the Texas Advanced Computing Center (TACC) as an example, this talk will walk through several different example use cases that NetSage can answer, including:
- Who is using Globus to share data with my institution, and what kind of performance are they able to achieve?
- How many transfers has Globus supported for us?
- Which sites are we sharing the most data with, and how is that changing over time?
- How is my site using Globus to move data internally, and what kind of performance do we see for those transfers?
- What percentage of data transfers at my institution used Globus, and how did the overall data transfer performance compare to that of Globus users?
Unleash Unlimited Potential with One-Time Purchase
BoxLang is more than just a language; it's a community. By choosing a Visionary License, you're not just investing in your success, you're actively contributing to the ongoing development and support of BoxLang.
How Recreation Management Software Can Streamline Your Operations.pptx (wottaspaceseo)
Recreation management software streamlines operations by automating key tasks such as scheduling, registration, and payment processing, reducing manual workload and errors. It provides centralized management of facilities, classes, and events, ensuring efficient resource allocation and facility usage. The software offers user-friendly online portals for easy access to bookings and program information, enhancing customer experience. Real-time reporting and data analytics deliver insights into attendance and preferences, aiding in strategic decision-making. Additionally, effective communication tools keep participants and staff informed with timely updates. Overall, recreation management software enhances efficiency, improves service delivery, and boosts customer satisfaction.
Advanced Flow Concepts Every Developer Should Know (Peter Caitens)
Tim Combridge from Sensible Giraffe and Salesforce Ben presents some important tips that all developers should know when dealing with Flows in Salesforce.
Software Engineering, Software Consulting, Tech Lead. Spring Boot, Spring Cloud, Spring Core, Spring JDBC, Spring Security, Spring Transaction, Spring MVC, Log4j, REST/SOAP web services.
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus
As part of the DOE Integrated Research Infrastructure (IRI) program, NERSC at Lawrence Berkeley National Lab and ALCF at Argonne National Lab are working closely with General Atomics on accelerating the computing requirements of the DIII-D experiment. As part of the work the team is investigating ways to speedup the time to solution for many different parts of the DIII-D workflow including how they run jobs on HPC systems. One of these routes is looking at Globus Compute as a way to replace the current method for managing tasks and we describe a brief proof of concept showing how Globus Compute could help to schedule jobs and be a tool to connect compute at different facilities.
Strategies for Successful Data Migration Tools.pptxvarshanayak241
Data migration is a complex but essential task for organizations aiming to modernize their IT infrastructure and leverage new technologies. By understanding common challenges and implementing these strategies, businesses can achieve a successful migration with minimal disruption. Data Migration Tool like Ask On Data play a pivotal role in this journey, offering features that streamline the process, ensure data integrity, and maintain security. With the right approach and tools, organizations can turn the challenge of data migration into an opportunity for growth and innovation.
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...Shahin Sheidaei
Games are powerful teaching tools, fostering hands-on engagement and fun. But they require careful consideration to succeed. Join me to explore factors in running and selecting games, ensuring they serve as effective teaching tools. Learn to maintain focus on learning objectives while playing, and how to measure the ROI of gaming in education. Discover strategies for pitching gaming to leadership. This session offers insights, tips, and examples for coaches, team leads, and enterprise leaders seeking to teach from simple to complex concepts.
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamtakuyayamamoto1800
In this slide, we show the simulation example and the way to compile this solver.
In this solver, the Helmholtz equation can be solved by helmholtzFoam. Also, the Helmholtz equation with uniformly dispersed bubbles can be simulated by helmholtzBubbleFoam.
Enhancing Research Orchestration Capabilities at ORNL.pdfGlobus
Cross-facility research orchestration comes with ever-changing constraints regarding the availability and suitability of various compute and data resources. In short, a flexible data and processing fabric is needed to enable the dynamic redirection of data and compute tasks throughout the lifecycle of an experiment. In this talk, we illustrate how we easily leveraged Globus services to instrument the ACE research testbed at the Oak Ridge Leadership Computing Facility with flexible data and task orchestration capabilities.
In software engineering, the right architecture is essential for robust, scalable platforms. Wix has undergone a pivotal shift from event sourcing to a CRUD-based model for its microservices. This talk will chart the course of this pivotal journey.
Event sourcing, which records state changes as immutable events, provided robust auditing and "time travel" debugging for Wix Stores' microservices. Despite its benefits, the complexity it introduced in state management slowed development. Wix responded by adopting a simpler, unified CRUD model. This talk will explore the challenges of event sourcing and the advantages of Wix's new "CRUD on steroids" approach, which streamlines API integration and domain event management while preserving data integrity and system resilience.
Participants will gain valuable insights into Wix's strategies for ensuring atomicity in database updates and event production, as well as caching, materialization, and performance optimization techniques within a distributed system.
Join us to discover how Wix has mastered the art of balancing simplicity and extensibility, and learn how the re-adoption of the modest CRUD has turbocharged their development velocity, resilience, and scalability in a high-growth environment.
We describe the deployment and use of Globus Compute for remote computation. This content is aimed at researchers who wish to compute on remote resources using a unified programming interface, as well as system administrators who will deploy and operate Globus Compute services on their research computing infrastructure.
Designing for Privacy in Amazon Web ServicesKrzysztofKkol1
Data privacy is one of the most critical issues that businesses face. This presentation shares insights on the principles and best practices for ensuring the resilience and security of your workload.
Drawing on a real-life project from the HR industry, the various challenges will be demonstrated: data protection, self-healing, business continuity, security, and transparency of data processing. This systematized approach allowed to create a secure AWS cloud infrastructure that not only met strict compliance rules but also exceeded the client's expectations.
Prosigns: Transforming Business with Tailored Technology SolutionsProsigns
Unlocking Business Potential: Tailored Technology Solutions by Prosigns
Discover how Prosigns, a leading technology solutions provider, partners with businesses to drive innovation and success. Our presentation showcases our comprehensive range of services, including custom software development, web and mobile app development, AI & ML solutions, blockchain integration, DevOps services, and Microsoft Dynamics 365 support.
Custom Software Development: Prosigns specializes in creating bespoke software solutions that cater to your unique business needs. Our team of experts works closely with you to understand your requirements and deliver tailor-made software that enhances efficiency and drives growth.
Web and Mobile App Development: From responsive websites to intuitive mobile applications, Prosigns develops cutting-edge solutions that engage users and deliver seamless experiences across devices.
AI & ML Solutions: Harnessing the power of Artificial Intelligence and Machine Learning, Prosigns provides smart solutions that automate processes, provide valuable insights, and drive informed decision-making.
Blockchain Integration: Prosigns offers comprehensive blockchain solutions, including development, integration, and consulting services, enabling businesses to leverage blockchain technology for enhanced security, transparency, and efficiency.
DevOps Services: Prosigns' DevOps services streamline development and operations processes, ensuring faster and more reliable software delivery through automation and continuous integration.
Microsoft Dynamics 365 Support: Prosigns provides comprehensive support and maintenance services for Microsoft Dynamics 365, ensuring your system is always up-to-date, secure, and running smoothly.
Learn how our collaborative approach and dedication to excellence help businesses achieve their goals and stay ahead in today's digital landscape. From concept to deployment, Prosigns is your trusted partner for transforming ideas into reality and unlocking the full potential of your business.
Join us on a journey of innovation and growth. Let's partner for success with Prosigns.
2. Agenda
● History and Community
● Technical Walkthrough
● Use Case with Mobike
● Q&A
● (Time Permitting) TiDB on Google Kubernetes Engine
3. A little about me
● General Manager of Global Strategy and
Operations
● Studied CS and Law at Stanford
● Programs in JavaScript and Python;
(more recently) learning Rust
4. A little about PingCAP
● Founded in April 2015 by 3
infrastructure engineers
● Offices throughout North America
and China
6. PingCAP.com
Our Product: the TiDB Platform
● TiDB Platform (Ti = Titanium)
○ TiDB (SQL Layer)
○ TiKV (Key-Value Storage)
○ TiSpark (Spark plugin to TiKV)
● Open source from Day 1
○ GA 1.0: October 2017
○ GA 2.0: April 2018
11. TiKV (in CNCF): The Storage Foundation
[Diagram: four TiKV nodes (Store 1 to Store 4), each holding several Regions (Region 1 to Region 5); replicas of the same Region across nodes form a Raft group. Clients talk to each node over gRPC, and a three-node Placement Driver cluster (PD 1, PD 2, PD 3) manages placement.]
12. TiDB: The (My)SQL Layer
[Diagram: clients (ODBC/JDBC, the MySQL client, or any ORM that supports MySQL) connect to TiDB nodes (Node1 through Node4) over the MySQL network protocol; each TiDB node contains a SQL parser and a cost-based optimizer, and pushes work down to TiKV through the Coprocessor API.]
14. Join Support
● Hash Join (fastest; used when the table has <= 50 million rows)
● Sort Merge Join (join on an indexed column or an ordered
data source)
● Index Lookup Join (join on an indexed column; ideally
after a filter, the result is < 10,000 rows)
● The join strategy is chosen by the cost-based optimizer
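To illustrate why an ordered input makes Sort Merge Join cheap, here is a minimal sketch of the algorithm (an illustrative toy, not TiDB's actual Go implementation; the row shapes and key names are invented for the example):

```python
def sort_merge_join(left, right, key_left, key_right):
    """Join two lists of row dicts that are already sorted on their
    join keys. Runs in O(n + m) once the inputs are ordered, which is
    why this strategy suits indexed columns and ordered data sources."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key_left], right[j][key_right]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Emit the full run of equal keys on the right side so
            # duplicate keys produce a proper cross product.
            j_start = j
            while j < len(right) and right[j][key_right] == lk:
                out.append({**left[i], **right[j]})
                j += 1
            i += 1
            j = j_start  # rewind in case the next left row has the same key
    return out
```

Because both cursors only move forward (apart from the bounded rewind over equal-key runs), neither side needs to be rehashed or buffered in full, unlike a hash join.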
15. TiSpark: Complex OLAP
[Diagram: a Spark Driver coordinates Spark executors, each running the TiSpark plugin; TiSpark retrieves data locations from the Placement Driver (PD) and then retrieves the real data from TiKV nodes over gRPC in the distributed storage layer.]
16. Placement Driver
● Provides a god’s-eye view of
the entire cluster
● Stores metadata, balances
workload, and issues
timestamps
● Itself a cluster, built on
embedded etcd
[Diagram: three Placement Driver nodes replicating state to one another via Raft]
17. Transaction Model
● Timestamp Oracle service (from
Google’s Percolator)
● 2-Phase Commit protocol (2PC)
● Problem: the timestamp oracle is a
single point of failure
● Solution: run PD as an HA cluster
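The shape of the two-phase commit protocol mentioned above can be sketched as follows (a toy single-process illustration; the Participant class and the write format are invented for the example, and TiDB's real implementation is the distributed Percolator model):

```python
class Participant:
    """A toy resource manager: votes in phase 1, applies in phase 2."""
    def __init__(self, name):
        self.name = name
        self.staged = None      # write held between prepare and commit
        self.committed = {}     # durable state

    def prepare(self, write):
        # Phase 1: stage the write and vote yes (a real participant
        # would vote no if it could not guarantee the write).
        self.staged = write
        return True

    def commit(self):
        # Phase 2a: make the staged write durable.
        key, value = self.staged
        self.committed[key] = value
        self.staged = None

    def abort(self):
        # Phase 2b: discard the staged write.
        self.staged = None


def two_phase_commit(participants, writes):
    """Apply all writes atomically: either every participant commits,
    or every participant aborts."""
    votes = [p.prepare(w) for p, w in zip(participants, writes)]
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False
```

The single point of failure the slide names is the coordinator role (and in TiDB's case the timestamp oracle), which is why PD runs as a Raft-replicated HA cluster.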
26. Scenario #1: Locking/Unlocking
● Locking and unlocking smart bikes
generates massive amounts of data
● A smooth experience is key to user retention
● TiDB supports this system by alerting
administrators within minutes when the
locking/unlocking success rate drops
● Malfunctioning bikes are found quickly
27. Scenario #2: Real-Time Analysis
● Synchronizes TiDB with MySQL
instances using Syncer
(PingCAP’s replication tool)
● TiDB + TiSpark empower
real-time analysis with
horizontal scalability
● No need for Hadoop + Hive
28. Scenario #3: Mobike Store
● An innovative loyalty program that
must be available 24x7x365
● TiDB provides:
○ High-concurrency for peak or
promotional season
○ Permanent storage
○ Horizontal scalability
● No interruption as business evolves
29. Thank You!
20% OFF KubeCon:
KCNA18SPR
kevin@pingcap.com
@kevinsxu; @pingcap
TiDB Cloud Early Access:
https://www.pingcap.com/tidb-cloud/
TiDB Academy Sign-up:
www.pingcap.com/tidb-academy/
32. CBO 101
● An operator’s total cost combines a network cost, a memory cost, and a CPU cost, each weighted by a factor
● In TiDB, the default memory factor is 5 and the default CPU factor is 0.8
● For example, the cost of an operator Sort(r) adds a CPU term for sorting and a memory term for buffering rows on top of the cost of r
● TiDB maintains histograms of the data to estimate row counts
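Such a weighted cost function can be sketched as follows. The factor values (memory = 5, CPU = 0.8) come from the slide, but the exact expression for Sort(r) is an assumption for illustration, not TiDB's real formula:

```python
import math

# Default weighting factors from the slide.
MEMORY_FACTOR = 5.0
CPU_FACTOR = 0.8

def sort_cost(child_cost, row_count):
    """Illustrative estimated cost of Sort(r): the cost of producing
    the child r, plus an O(n log n) CPU term for comparison sorting,
    plus a memory term for buffering all rows before output."""
    cpu = CPU_FACTOR * row_count * math.log2(max(row_count, 2))
    mem = MEMORY_FACTOR * row_count
    return child_cost + cpu + mem
```

The optimizer would compute an estimate like this for each candidate plan, using histogram-derived row counts, and keep the cheapest.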
33. Relational -> KV
The “User” Table:
ID  Name    Email
1   Edward  h@pingcap.com
2   Tom     tom@pingcap.com
...
In TiKV: one global sorted map spanning (-∞, +∞), with each contiguous key range stored as a Region:
user/1 → Edward,h@pingcap.com
user/2 → Tom,tom@pingcap.com
...
34. Index Structure
Row:
Key: tablePrefix_rowPrefix_tableID_rowID (IDs are assigned by TiDB, all int64)
Value: [col1, col2, col3, col4]
Index:
Key: tablePrefix_idxPrefix_tableID_indexID_ColumnsValue_rowID
Value: [null]
Keys are ordered as byte arrays in TiKV, so range SCANs are supported
Every key has a timestamp appended, issued by the Placement Driver
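The key layout above can be sketched as follows. This toy uses human-readable strings with "_" separators for clarity; TiKV actually encodes keys as order-preserving byte arrays (so, for instance, multi-digit integer IDs still sort numerically, which plain strings would not):

```python
def row_key(table_id, row_id):
    """tablePrefix_rowPrefix_tableID_rowID (prefixes are placeholders)."""
    return f"t_r_{table_id}_{row_id}"

def index_key(table_id, index_id, column_value, row_id):
    """tablePrefix_idxPrefix_tableID_indexID_ColumnsValue_rowID."""
    return f"t_i_{table_id}_{index_id}_{column_value}_{row_id}"

# Because keys sort lexicographically, all rows of one table and all
# entries of one index are stored adjacently, so a range SCAN over a
# shared key prefix retrieves them in order.
keys = sorted([row_key(10, 2), index_key(10, 1, "Edward", 1), row_key(10, 1)])
```

Scanning the `t_i_10_1_` prefix walks the index entries, each of which carries the rowID needed to look up the full row.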
35. Guaranteeing Correctness
● Formal proofs using TLA+
○ a formal specification and verification language for reasoning about
and proving properties of complex systems
● Raft
● TSO/Percolator
● 2PC
● See details: https://github.com/pingcap/tla-plus
36. MySQL Compatibility - Summary
● Compatible with MySQL 5.7
○ Joins, subqueries, DML, DDL,
etc.
● On the roadmap:
○ Views, Window Functions
● Missing:
○ Stored Procedures, Triggers,
Events, Fulltext
pingcap.com/docs/sql/mysql-compatibility/
37. MySQL Compatibility - Nuanced
● Some features work differently
○ Auto Increment
○ Optimistic Locking
● TiDB works better with smaller
transactions
○ Recommended to batch updates,
deletes, and inserts into chunks of
up to 5,000 rows
pingcap.com/docs/sql/mysql-compatibility/
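The batching advice above can be sketched as follows. Here `execute_batch` is a hypothetical stand-in for whatever MySQL-client call the application uses to run one transaction against TiDB:

```python
# Split a large write into chunks of at most 5,000 rows, the batch size
# recommended on the slide, so each chunk commits as its own small
# transaction instead of one huge one.
BATCH_SIZE = 5000

def chunked(rows, size=BATCH_SIZE):
    """Yield successive fixed-size slices of `rows`."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]

def batched_insert(rows, execute_batch):
    """Insert `rows` as a series of small transactions, one per chunk."""
    for batch in chunked(rows):
        execute_batch(batch)  # e.g. a multi-row INSERT inside one txn
```

Keeping each transaction small bounds its memory footprint and lock lifetime, which is why large bulk updates and deletes benefit from the same chunking.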