Charlie Yang
rizhao.ych@oceanbase.com
The Architecture Overview of OceanBase Database
About OceanBase
• Distributed SQL database, in development since 2010
• Has served all Alipay payment requests since 2017; peaked at 61 million QPS on 2019-11-11
• Adopted by 400+ customers in mission-critical scenarios
• TPC-C: 707 million tpmC (No. 1); TPC-H: 15 million qphH @ 30,000 GB (No. 2)
• Scalable OLTP: linear scalability with strong consistency and high availability
• HTAP: real-time operational analytics in one unified system
• Compatible with MySQL, with high performance and much lower cost
Design Goals
Monolithic Database
• Full SQL functionality
• High single-node performance
Distributed Storage System
• High scalability, high availability
• Key-value store or limited SQL functionality
OceanBase = a distributed SQL database with full SQL support and high single-node performance
Scalable OLTP
Unlimited Storage In One Cluster
• Largest cluster: 1,000+ servers
• 6 PB+ of stored data
• 320 billion+ records in a single table
Linear Scalability
• Fault Tolerance using Paxos
• Distributed Transaction using 2PC
• Data shuffle at partition granularity
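The fault-tolerance claim rests on majority quorums: a write counts as durable once a strict majority of replicas holds it, so a minority of replicas can fail without losing it. The sketch below only illustrates that arithmetic. The function names are hypothetical and it is not OceanBase's Paxos implementation.

```python
from itertools import combinations


def quorum_size(replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """Replicas that can fail while a majority is still reachable."""
    return replicas - quorum_size(replicas)


def majorities_always_overlap(replicas: int) -> bool:
    """Brute-force check that any two majority sets share a replica."""
    ids = range(replicas)
    groups = [set(c) for c in combinations(ids, quorum_size(replicas))]
    return all(a & b for a in groups for b in groups)


for n in (3, 5):
    # Columns: replicas, quorum size, tolerated failures, overlap property
    print(n, quorum_size(n), tolerated_failures(n), majorities_always_overlap(n))
```

Because any two majorities overlap, a newly elected leader's quorum always includes at least one replica that holds every acknowledged log entry.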
[Figure: partitions P1 to P8 replicated across OBServers in ZONE_1, ZONE_2, and ZONE_3. The replicas of each partition form a Paxos group of Leader and Follower replicas.]
Real-time Operational Analytics
HTAP Integration
Provides services for real-time operational analytics scenarios:
• Heavy OLAP workload: a dedicated replica serves OLAP
• Light OLAP workload: OLTP and OLAP run on the same replica (mixed row-columnar storage)
[Figure: six servers (Server1 to Server6) in IDC1, IDC2, and IDC3 hold replicas of partitions P1 to P8, grouped into Paxos groups of Leader and Follower replicas. Three Proxies sit in front of the servers; labels show OLTP business traffic and OLAP real-time data analytics.]
TPC Benchmark
• 2019 TPC-C 60.88 million tpmC
• 2020 TPC-C 707 million tpmC
• 2021 TPC-H 15.26 million qphH (30,000GB dataset)
Architecture
• Each cluster consists of several zones in one or more regions.
• OBProxy routes requests to OBServer.
• Each OBServer is similar to a classical RDBMS: it compiles SQL statements into SQL execution plans.
• One OBServer is elected to host the root service.
• Redo logs are replicated among the zones using Paxos.
• Transactions that touch only one partition are executed locally.
• Transactions that span multiple partitions are executed using 2PC.
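The split between local commit and 2PC can be sketched as follows. This is a minimal in-memory illustration, assuming a hypothetical Partition class and execute function. It omits logging, timeouts, and recovery, and it is not OceanBase's actual protocol code.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """Hypothetical stand-in for one partition leader; not an OceanBase API."""
    name: str
    healthy: bool = True
    data: dict = field(default_factory=dict)
    staged: dict = field(default_factory=dict)

    def prepare(self, writes: dict) -> bool:
        # Phase 1: stage the writes and vote yes only if this partition can commit.
        if not self.healthy:
            return False
        self.staged = dict(writes)
        return True

    def commit(self) -> None:
        # Phase 2 (commit): make the staged writes visible.
        self.data.update(self.staged)
        self.staged = {}

    def abort(self) -> None:
        # Phase 2 (abort): discard the staged writes.
        self.staged = {}


def execute(txn: list) -> str:
    """Run a transaction given as a list of (partition, writes) pairs."""
    if len(txn) == 1:
        # Single partition: no coordination round is needed.
        part, writes = txn[0]
        if not part.prepare(writes):
            return "local-abort"
        part.commit()
        return "local-commit"
    # Multiple partitions: collect every vote before deciding.
    votes = [part.prepare(writes) for part, writes in txn]
    if all(votes):
        for part, _ in txn:
            part.commit()
        return "2pc-commit"
    for part, _ in txn:
        part.abort()
    return "2pc-abort"
```

For example, `execute([(p1, {"a": 1})])` takes the local path, while a transaction that lists two partitions commits only if both vote yes in the prepare phase.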
Basic Concept
[Figure: concept hierarchy. Labels: Cluster, Zone, OBServer; Admin, APP; Tenant, Database, Table, Partition, Replica; Resource Pool.]
• Each cluster has multiple Zones
• Each Zone has multiple OBServers
• Each resource pool has multiple resource units
• Zone: Availability Zone, an IDC in most cases
• Multi-tenant architecture: each cluster is divided into multiple resource pools owned by tenants; resource isolation is done internally by the database
Transaction Engine
[Figure: partitions P1 to P8 replicated across OBServers in ZONE_1, ZONE_2, and ZONE_3. The replicas of each partition form a Paxos group of Leader and Follower replicas.]
• Paxos: quorum-based consensus (2 out of 3, or 3 out of 5 replicas)
• High availability: RPO = 0, RTO < 30 s
  • RPO: recovery point objective
  • RTO: recovery time objective
• Distributed transactions
  • Two-phase commit (2PC)
  • MVCC and snapshot isolation
  • Linearizability: uses GTS (Global Timestamp Service) to obtain a globally unique ID for each transaction
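How MVCC, snapshot isolation, and a global ID source fit together can be sketched with a toy store. The sketch assumes the IDs handed out by GTS are monotonically increasing, which the slide does not state. TimestampService and MVCCStore are hypothetical names, not OceanBase interfaces.

```python
import itertools


class TimestampService:
    """Hypothetical stand-in for GTS: hands out increasing integer IDs."""

    def __init__(self):
        self._counter = itertools.count(1)

    def next(self) -> int:
        return next(self._counter)


class MVCCStore:
    """Keeps every committed version of a key, tagged with its commit ID."""

    def __init__(self, ts: TimestampService):
        self.ts = ts
        self.versions = {}  # key -> list of (commit_ts, value), ascending

    def begin(self) -> int:
        # A reader takes a snapshot ID when it starts.
        return self.ts.next()

    def commit_write(self, key, value) -> int:
        # Each commit gets a fresh ID, so versions are appended in order.
        commit_ts = self.ts.next()
        self.versions.setdefault(key, []).append((commit_ts, value))
        return commit_ts

    def read(self, key, snapshot_ts: int):
        # Return the newest version committed at or before the snapshot.
        visible = None
        for commit_ts, value in self.versions.get(key, []):
            if commit_ts > snapshot_ts:
                break
            visible = value
        return visible
```

A reader that called `begin()` before a later `commit_write` keeps seeing the older version, which is the snapshot isolation behavior named above.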
Storage Engine
[Figure: storage engine. Memory: MemTable (WOS; row-level, in-memory redo/MVCC; in-memory hash and in-memory B+-trees), Row Cache, Block Cache. Disk: Minor SSTable, Major SSTable (ROS). Other labels: Logs, Update, Replicas, Get (small query), Scan (big query).]
• LSM tree: MemTable and SSTable
• Compaction: merges several SSTables and MemTables into a single SSTable
• MemTable: B-tree and hash index
• SSTable: divided into data blocks, ordered by primary key
• Macro block: mostly 2 MB; the write unit
• Micro block: mostly 8 KB to 512 KB; the read unit and the encoding and compression unit
• Cache: Row Cache (for single-row gets) and Block Cache (for scans)
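The MemTable, SSTable, and compaction roles above can be sketched with a toy LSM tree. This is a simplification under stated assumptions: a Python dict stands in for the B-tree and hash index, SSTables are sorted in-memory lists rather than macro and micro blocks on disk, and minor and major compaction are not distinguished. TinyLSM is a hypothetical name.

```python
import bisect


class TinyLSM:
    """Toy LSM tree: one mutable memtable plus immutable sorted runs."""

    def __init__(self, memtable_limit: int = 2):
        self.memtable = {}
        self.sstables = []  # newest first; each is a sorted list of (key, value)
        self.memtable_limit = memtable_limit

    def put(self, key, value) -> None:
        # Writes only touch memory; a full memtable is frozen into an SSTable.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self) -> None:
        if self.memtable:
            self.sstables.insert(0, sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Newest data wins: check the memtable, then SSTables newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for table in self.sstables:
            i = bisect.bisect_left(table, (key,))  # binary search on sorted keys
            if i < len(table) and table[i][0] == key:
                return table[i][1]
        return None

    def compact(self) -> None:
        # Merge the memtable and all SSTables into one sorted SSTable.
        self.flush()
        merged = {}
        for table in reversed(self.sstables):  # oldest first, newer overwrite
            merged.update(table)
        self.sstables = [sorted(merged.items())] if merged else []
```

Keeping each SSTable sorted by key is what makes both the binary-search lookup and the merge in `compact` cheap, mirroring the "ordered by primary key" property in the list above.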
SQL Engine
• Fast parser attempts to match an existing plan in the plan cache.
• Resolver translates the SQL request and generates a statement tree.
• Transformer analyzes and rewrites the user SQL.
• Optimizer (a System-R-like cost-based optimizer) performs query transformation and optimization.
• Code Generator performs code generation.
• Vectorized execution and parallel execution are used for large OLAP queries.
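The plan-cache lookup in the first step can be sketched as follows. The sketch assumes the cache key is the SQL text with literals replaced by placeholders; the slides do not say how OceanBase's fast parser builds its key, and `parameterize` and `PlanCache` are hypothetical names.

```python
import re

# Matches single-quoted strings and standalone numeric literals.
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def parameterize(sql: str):
    """Replace literals with '?' and return (cache_key, extracted_params)."""
    params = []

    def repl(match):
        params.append(match.group(0))
        return "?"

    key = _LITERAL.sub(repl, sql)
    return " ".join(key.split()), params  # also normalize whitespace


class PlanCache:
    """Caches one compiled plan per parameterized statement."""

    def __init__(self, compile_fn):
        self._plans = {}
        self._compile = compile_fn  # stands in for resolver, optimizer, codegen
        self.hits = 0
        self.misses = 0

    def get_plan(self, sql: str):
        key, params = parameterize(sql)
        plan = self._plans.get(key)
        if plan is None:
            # Cache miss: run the full (expensive) compilation path once.
            self.misses += 1
            plan = self._compile(key)
            self._plans[key] = plan
        else:
            self.hits += 1
        return plan, params


cache = PlanCache(compile_fn=lambda key: f"PLAN[{key}]")
cache.get_plan("SELECT * FROM t1 WHERE id = 42")
cache.get_plan("SELECT * FROM t1 WHERE id = 7")
print(cache.hits, cache.misses)
```

The two statements differ only in a literal, so the second one reuses the plan compiled for the first and skips the resolver, transformer, and optimizer stages.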
Comparison with other distributed SQL databases
Feature | OceanBase | Cloud Spanner | CockroachDB | TiDB
Multi-tenancy | YES | YES | YES | NO
SQL Compatibility | MySQL | PostgreSQL | PostgreSQL | MySQL
SQL Join | YES | YES | YES | YES
Foreign Key | YES | YES | YES | NO
XA, Stored Procedure | YES | NO | NO | NO
Multi-Model | JSON, GIS, KV API | JSON, KV API | JSON, GIS | JSON, TiKV
HTAP | YES | NO | NO | YES
Global Database | NO | YES | YES | NO
Replication | Paxos | Paxos | Raft | Raft
Global Time | GTS | TrueTime | HLC | GTS
Linearizability | YES | YES | NO | YES
Commit Wait | NO | YES (7~10 ms) | NO | NO
Sysbench (1 node) | ≈ MySQL | < 1/3 MySQL | < 1/3 MySQL | < 1/3 MySQL
Language | C++ | C++ | Go | Go, Rust
OceanBase @ Alipay
• Peak performance: 61 million queries per second
• Cluster size: > 200 nodes in one cluster
• Data size in one instance: > 6 PB
• Single table size: > 320 billion rows
• Disaster tolerance: RPO = 0, RTO < 30 seconds
[Figure labels: Trade, Payment, Accounting, CIF, Promotion, Real-time data]
* Statistics above are from real production in Alipay
Outside Validation on Mission-Critical Systems Across the Industry
Used by 400+ customers on mission-critical systems
• Mission-critical BOSS and CRM systems
• Debit and credit card transaction, billing, and accounting systems
• High-concurrency scenarios: payment, accounting, and customer info systems
Industries: FIs, high tech, telco
Charlie Yang
rizhao.ych@oceanbase.com
Thank you!
https://github.com/oceanbase/oceanbase
https://www.oceanbase.com/en/
