3. What is NewSQL?
“...NewSQL is a class of modern relational database management systems
that seek to provide the same scalable performance of NoSQL systems for
online transaction processing (read-write) workloads while still maintaining
the ACID guarantees of a traditional database system…”
- Wikipedia
4. OLTP (Online Transaction Processing)
Old OLTP New OLTP
● OldSQL for New OLTP
○ Too slow
○ Does not scale
● NoSQL for New OLTP
○ Cannot guarantee consistency
● NewSQL for New OLTP
○ Fast, scalable, and consistent
○ Supports SQL
5. State of the Database
● RDBMS (OldSQL)
○ Has: ACID transactions, SQL support, standardized
○ Lacks: horizontal scaling, high availability
● NoSQL
○ Has: horizontal scaling, high availability
○ Lacks: ACID transactions, SQL support, standardized
● NewSQL
○ Has: ACID transactions, horizontal scaling, high availability, SQL support, standardized
6. A more comprehensive look
● Traditional OldSQL
○ SQL
○ ACID compliant
○ Must be rewritten and re-architected to scale (sharding, denormalizing, distributed caching)
● NoSQL
○ Scalability and Availability
○ Schema-less (great for non-transactional systems)
○ Give up SQL
○ Give up ACID transactions (not fit for OLTP systems)
● NewSQL
○ SQL
○ Scalable, shared nothing architecture
○ ACID compliant
○ Schema
7. Why do we need NewSQL (Summary)?
● Provide the same scalable performance as NoSQL for OLTP workloads while
still maintaining ACID guarantees.
● Keep the relational model and SQL.
10. 1. Architecture: New architectures
● Provide concurrency control.
● Traditional relational database concurrency control
○ Two-phase locking (2PL)
● NewSQL database concurrency control
○ MVCC (Multi Version Concurrency Control)
○ Basic Timestamp Concurrency Control
○ Optimistic Concurrency Control
○ T/O with Partition-Level Locking
○ And others.
● e.g. Google Spanner, VoltDB, MemSQL
11. MVCC (Multi Version Concurrency Control)
● Reads proceed without blocking updates.
● Each transaction reads from its own snapshot of the database.
● Reading from the snapshot gives the transaction a consistent view of the database.
● Cost:
○ Old snapshots must be garbage collected.
Figure: snapshots over time
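The snapshot idea above can be sketched as a toy in-memory store. This is only an illustration under simplifying assumptions (single process, one write per commit, no write-write conflict handling); the class and method names (`MVCCStore`, `begin`, `commit_write`, `garbage_collect`) are hypothetical and do not come from any particular NewSQL product.

```python
class MVCCStore:
    """Toy multi-version store: every key keeps a list of committed versions."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), ascending by commit_ts
        self._clock = 0      # timestamp of the most recent commit

    def begin(self):
        """Start a transaction; its snapshot is the latest commit timestamp."""
        return self._clock

    def read(self, key, snapshot_ts):
        """Return the newest value committed at or before snapshot_ts."""
        for commit_ts, value in reversed(self._versions.get(key, [])):
            if commit_ts <= snapshot_ts:
                return value
        return None

    def commit_write(self, key, value):
        """Append a new version instead of overwriting, so readers are not blocked."""
        self._clock += 1
        self._versions.setdefault(key, []).append((self._clock, value))
        return self._clock

    def garbage_collect(self, oldest_active_snapshot):
        """Drop versions that no active snapshot can still see."""
        for key, versions in list(self._versions.items()):
            keep_from = 0
            for i, (commit_ts, _) in enumerate(versions):
                if commit_ts <= oldest_active_snapshot:
                    keep_from = i  # newest version visible to the oldest snapshot
            self._versions[key] = versions[keep_from:]


store = MVCCStore()
store.commit_write("x", "a")
snapshot = store.begin()          # a reader takes its snapshot here
store.commit_write("x", "b")      # a later update does not disturb the reader
old_view = store.read("x", snapshot)
new_view = store.read("x", store.begin())
```

Here `old_view` still sees the value committed before the snapshot was taken, while `new_view` sees the later update. The `garbage_collect` method is the cost mentioned above: old versions pile up until no active snapshot needs them.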
12. Basic Timestamp Concurrency Control
● Each tuple carries the timestamps of its last read and last write.
● For a read or a write:
○ Rejected if the transaction's timestamp is less than the timestamp of the last write to that tuple.
● For a write:
○ Also rejected if the transaction's timestamp is less than the timestamp of the last read of that tuple.
● Cost:
○ Each site maintains a logical clock, which needs to be accurate.
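The two rejection rules above can be written down as a small sketch. It assumes each transaction already has a unique timestamp and runs in a single process; the names `Row`, `Abort`, `to_read`, and `to_write` are hypothetical and chosen only for this illustration.

```python
class Abort(Exception):
    """Raised when an operation arrives too late in timestamp order."""


class Row:
    """A tuple plus the timestamps of its last read and last write."""

    def __init__(self, value=None):
        self.value = value
        self.read_ts = 0
        self.write_ts = 0


def to_read(row, txn_ts):
    # A read is rejected if a newer transaction has already written the tuple.
    if txn_ts < row.write_ts:
        raise Abort("read is older than the last write")
    row.read_ts = max(row.read_ts, txn_ts)
    return row.value


def to_write(row, txn_ts, value):
    # A write is rejected if a newer transaction has already written OR read the tuple.
    if txn_ts < row.write_ts or txn_ts < row.read_ts:
        raise Abort("write is older than the last read or write")
    row.value = value
    row.write_ts = txn_ts


row = Row("a")
to_write(row, 5, "b")        # accepted: nothing newer has touched the row
seen = to_read(row, 7)       # accepted: 7 is not older than write_ts 5
try:
    to_write(row, 6, "c")    # rejected: transaction 7 already read the row
    late_write_rejected = False
except Abort:
    late_write_rejected = True
```

A rejected transaction would normally be restarted with a fresh timestamp; that retry loop is left out here to keep the sketch short.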
13. Optimistic Concurrency Control
● Tracks each transaction's read and write sets; stores all writes in a private
workspace.
● At commit, the system checks whether the transaction's read set overlaps with
the write set of any concurrent transaction.
● Transactions write their updates to shared memory only at commit
time, so the contention period is short.
● Cost:
○ Rollback when a conflict is detected.
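A minimal sketch of the validate-then-write flow described above, assuming a single-threaded setting where validation and the final write happen as one step (a real system would need to make that step atomic). `OCCDatabase`, `Transaction`, and `ConflictError` are hypothetical names used only for illustration.

```python
class ConflictError(Exception):
    """Raised at commit when validation fails and the transaction must roll back."""


class OCCDatabase:
    def __init__(self):
        self.data = {}        # shared, committed state
        self.commit_no = 0    # number of commits so far
        self.commit_log = []  # list of (commit_no, set of keys written)

    def begin(self):
        return Transaction(self)


class Transaction:
    def __init__(self, db):
        self.db = db
        self.start_no = db.commit_no  # commits after this point are "concurrent"
        self.read_set = set()
        self.writes = {}              # private workspace

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        self.read_set.add(key)
        return self.db.data.get(key)

    def write(self, key, value):
        self.writes[key] = value      # not visible to others until commit

    def commit(self):
        # Validation: did a concurrent transaction write something we read?
        for commit_no, written in self.db.commit_log:
            if commit_no > self.start_no and written & self.read_set:
                raise ConflictError("read set overlaps a concurrent write set")
        # Write phase: publish the private workspace to shared state.
        self.db.data.update(self.writes)
        self.db.commit_no += 1
        self.db.commit_log.append((self.db.commit_no, set(self.writes)))


db = OCCDatabase()
t1, t2 = db.begin(), db.begin()
t1.read("x")
t2.write("x", 1)
t2.commit()                # commits first
t1.write("y", 2)
try:
    t1.commit()            # fails: t1 read "x", which t2 wrote concurrently
    t1_committed = True
except ConflictError:
    t1_committed = False
```

The failed `t1` leaves shared state untouched because its writes never left the private workspace, which is what makes rollback cheap to implement but potentially expensive in wasted work.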
14. T/O with Partition-Level Locking
● The database is divided into disjoint subsets called partitions.
● Each partition has:
○ A lock.
○ A single-threaded execution engine.
● Each transaction is assigned a timestamp and added to the partition queues.
● The transaction with the oldest timestamp in the queue is executed first.
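The queueing scheme above can be sketched as follows, under the simplifying assumption that every transaction touches exactly one partition (multi-partition transactions would need to hold several partition locks). The names `Partition`, `PartitionedStore`, `submit`, and `run_partition` are hypothetical.

```python
import heapq
import itertools
import threading


class Partition:
    def __init__(self):
        self.data = {}                # this partition's disjoint slice of the database
        self.lock = threading.Lock()  # partition-level lock
        self.queue = []               # min-heap of (timestamp, transaction)


class PartitionedStore:
    def __init__(self, num_partitions):
        self.partitions = [Partition() for _ in range(num_partitions)]
        self._next_ts = itertools.count(1)  # source of unique, increasing timestamps

    def submit(self, partition_id, txn):
        """Stamp the transaction and add it to the partition's queue."""
        ts = next(self._next_ts)
        heapq.heappush(self.partitions[partition_id].queue, (ts, txn))
        return ts

    def run_partition(self, partition_id):
        """Single-threaded engine: run queued transactions, oldest timestamp first."""
        part = self.partitions[partition_id]
        results = []
        with part.lock:  # held while this partition's transactions execute
            while part.queue:
                ts, txn = heapq.heappop(part.queue)
                results.append((ts, txn(part.data)))
        return results


def increment(data):
    """Example transaction: bump a counter stored in the partition."""
    data["n"] = data.get("n", 0) + 1
    return data["n"]


store = PartitionedStore(num_partitions=2)
store.submit(0, increment)
store.submit(1, increment)
store.submit(0, increment)
results_p0 = store.run_partition(0)
results_p1 = store.run_partition(1)
```

Because each partition runs its transactions one at a time in timestamp order, no finer-grained locking is needed inside a partition.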
15. 2. Architecture: SQL engines
● Provide highly optimized storage engines for SQL.
○ MySQL Cluster is used as the example here.
● Nodes are separated into 3 kinds:
○ Data node
■ Store the data
○ Management node
■ Configuration and monitoring of the cluster.
○ Application node or SQL node
■ Connects to all of the data nodes and performs data storage and retrieval.
● Consistency is controlled by the application nodes.
16. 3. Architecture: Transparent sharding
● Uses sharding middleware.
● All nodes connect to the middleware.
● The middleware coordinates every operation to
ensure consistency.
● e.g. dbShards and ScaleBase.
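To make the routing idea concrete, here is a rough sketch of a middleware layer that hides the shards from the application. It is not how dbShards or ScaleBase are implemented; the `ShardingMiddleware` class, its hash-based routing, and the use of plain dictionaries as stand-in shards are all assumptions made for illustration.

```python
import hashlib


class ShardingMiddleware:
    """Routes each key to one shard so callers never pick a shard themselves."""

    def __init__(self, shards):
        self.shards = shards  # stand-in backends; here, plain dicts

    def _shard_for(self, key):
        # A stable hash keeps the key-to-shard mapping the same across processes.
        digest = hashlib.sha256(str(key).encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.shards)
        return self.shards[index]

    def put(self, key, row):
        self._shard_for(key)[key] = row

    def get(self, key):
        return self._shard_for(key).get(key)

    def scan(self):
        """Cross-shard query: gather rows from every shard."""
        rows = []
        for shard in self.shards:
            rows.extend(shard.values())
        return rows


shards = [{}, {}, {}]
middleware = ShardingMiddleware(shards)
for i in range(100):
    middleware.put(i, {"id": i})
row_42 = middleware.get(42)
```

The application only talks to `middleware`; which shard holds a row is decided entirely inside `_shard_for`. Cross-shard transactions and consistency enforcement, which real products handle, are outside the scope of this sketch.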
17. Main drawback
● Write latency.
○ Concurrency control takes extra time to make sure the data is consistent.
● In-memory mechanisms can reduce latency, but they are restricted
by memory size.
Figure: Write latency for the Read/Write workload
Source: http://www.planetcassandra.org/nosql-performance-benchmarks/
18. Conclusion
● A database trend to watch
● NewSQL is an ACID-compliant, SQL-based, scalable, distributed, highly
available RDBMS
● NewSQL databases are in growing demand due to the rise of
data-oriented industries (e.g. IoT)
Something to think about: In fact, both NoSQL
and NewSQL databases can offer a degree of
consistency, availability, and partition
tolerance.