The End of a Myth: Ultra-Scalable Transactional Management
Presented by: Ricardo Jimenez-Peris, CEO & Co-founder @ LeanXcale
About the Speaker
Top researcher on scalable transactional management and distributed data management, with 100+ publications in top conferences and journals
Co-author of a book on Database Replication
Professor of distributed systems and data management for over 25 years
Co-inventor of two granted patents and 8 new patent applications
Invited speaker at top tech companies in Silicon Valley, such as Facebook, Twitter, Salesforce, Heroku, EMC-Pivotal (when it was EMC-Greenplum), HP, Microsoft
About LeanXcale
Vendor of a NewSQL ultra-scalable database: Full ACID, Full SQL
LeanXcale HTAP Database: blending Operational and Analytical capabilities, delivering real-time data
LeanXcale leverages an ultra-efficient storage engine, which is a relational key-value data store
Product Team
[Infographic. Figures: 45%, 30%, 15. Labels: Awards; Total number; PhD Holders; 10-25 years of industry expertise; Engineers from top technical universities]
The Myth
"Operational databases cannot scale"
WHY?
Nobody managed to scale them in
three decades.
Some (vendors that do not provide ACID properties) say that is due to the CAP Theorem.
C - Consistency
A - Availability
P - Partition tolerance
The CAP theorem states something very well
known in distributed systems, i.e. if you want to
tolerate partitions, choose:
Availability at all nodes and no consistency
OR
Consistency and no Availability at all nodes
The CAP Theorem
Q: Where is the S of Scalability?
A: Nowhere
Solved how to scale transactions to a large scale (i.e. 100 million update transactions per second) in a fully seamless way
Breakthrough result of 15+ years of research by a tenacious team
The End of the Myth: Ultra-Scalable Transactions
Evaluation without data manager/logging, to see how much throughput the transactional processing alone can attain:
2.35 million transactions per second
Scalability
[Diagram: two timelines comparing a traditional transactional DB with LeanXcale]
Traditional transactional DB: traditional systems have a single-node bottleneck
vs
LeanXcale: processes and commits transactions in parallel; provides a consistent view
Ultra-Scalable Transactions
Centralized Transaction Manager
[Diagram: a central TM handling Atomicity, Consistency, Isolation and Durability]
Traditional Approach
Single-Node Bottleneck
[Diagram: a central TM handling Atomicity, Isolation (Writes), Isolation (Reads) and Durability]
Centralized Transaction Manager
Traditional Approach
Single-Node Bottleneck
[Diagram: several Atomicity instances alongside Isolation (Reads), Isolation (Writes) and Durability]
Scaling ACID Properties
[Diagram: components and the ACID property each one provides]
Snapshot Server, Commit Sequencer: Isolation (Reads)
Conflict Managers: Isolation (Writes)
Loggers: Durability
Local TMs: Atomicity
Scaling ACID Properties
Separation of commit from the visibility of committed data
Proactive pre-assignment of commit timestamps to committing transactions
Transactions can commit in parallel because:
• They do not conflict
• They already have their commit timestamp assigned, which determines their serialization order
• Visibility is regulated separately to guarantee the reading of fully consistent states
Detection and resolution of conflicts before commit
Main Principles
[Diagram: the Local Txn Manager sends "Get start TS" to the Snapshot Server, which holds the current consistent snapshot]
The local transaction manager gets the “start TS” from the snapshot server.
Transactional Life Cycle: Start
[Diagram: Local Transaction Manager steps "Get start TS" and "Run on start TS snapshot"; Conflict Manager]
The transaction will read the state as of “start TS”.
Write-write conflicts are detected by conflict managers on the fly.
Transactional Life Cycle: Execution
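
The execution step can be illustrated with a minimal sketch. It assumes a multi-version store in which each value is tagged with its commit timestamp, and a conflict manager that applies a first-writer-wins rule. The class and method names (`VersionedStore`, `ConflictManager`, `try_write`, `release`) are hypothetical and are not taken from LeanXcale.

```python
class VersionedStore:
    """Hypothetical multi-version store: key -> list of (commit_ts, value)."""

    def __init__(self):
        self._versions = {}

    def install(self, key, commit_ts, value):
        versions = self._versions.setdefault(key, [])
        versions.append((commit_ts, value))
        versions.sort(key=lambda v: v[0])  # keep versions in timestamp order

    def read(self, key, start_ts):
        """Return the newest value whose commit TS is <= start_ts, else None."""
        result = None
        for commit_ts, value in self._versions.get(key, []):
            if commit_ts > start_ts:
                break
            result = value
        return result


class ConflictManager:
    """Hypothetical conflict manager: detects write-write conflicts on the fly."""

    def __init__(self):
        self._writers = {}         # key -> id of the in-flight writer
        self._last_commit_ts = {}  # key -> commit TS of the last committed write

    def try_write(self, txn_id, start_ts, key):
        holder = self._writers.get(key)
        if holder is not None and holder != txn_id:
            return False  # another in-flight transaction is writing this key
        if self._last_commit_ts.get(key, 0) > start_ts:
            return False  # key was updated after this transaction's snapshot
        self._writers[key] = txn_id
        return True

    def release(self, txn_id, keys, commit_ts=None):
        """Free the keys; pass commit_ts on commit, omit it on abort."""
        for key in keys:
            if self._writers.get(key) == txn_id:
                del self._writers[key]
                if commit_ts is not None:
                    self._last_commit_ts[key] = commit_ts
```

In this sketch a transaction with start TS 10 never sees a version committed at TS 12, and a second writer on the same key is rejected while the first is still in flight or if the key changed after its snapshot.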
[Diagram: Local Txn Manager steps "Get start TS", "Run on start TS snapshot" and "Commit"]
The local transaction manager orchestrates the commit.
Transactional Life Cycle: Commit
[Diagram: the Local Transaction Manager interacts with the Commit Sequencer, Logger, Data Store and Snapshot Server. Steps: "Get Commit TS", "Log", "Public Updates", "Report to Snapshot Server". Arrow labels: "Commit TS" (twice), "writeset" (twice)]
Transactional Life Cycle: Commit
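
The commit steps named on this slide can be sketched as follows. This is only an illustration under assumptions: the step order (get commit TS, log, apply updates, report) is read off the diagram labels, all components are in-process stubs, and every class and method name is hypothetical.

```python
import itertools


class CommitSequencer:
    """Stub: hands out increasing commit timestamps."""

    def __init__(self, first_ts=1):
        self._counter = itertools.count(first_ts)

    def next_commit_ts(self):
        return next(self._counter)


class Logger:
    """Stub: records (commit_ts, writeset) pairs for durability."""

    def __init__(self):
        self.records = []

    def log(self, commit_ts, writeset):
        self.records.append((commit_ts, dict(writeset)))


class DataStore:
    """Stub: keeps every written value under its commit timestamp."""

    def __init__(self):
        self.versions = {}  # key -> {commit_ts: value}

    def apply(self, commit_ts, writeset):
        for key, value in writeset.items():
            self.versions.setdefault(key, {})[commit_ts] = value


class SnapshotServerStub:
    """Stub: only remembers which commit timestamps were reported."""

    def __init__(self):
        self.reported = []

    def report(self, commit_ts):
        self.reported.append(commit_ts)


class LocalTransactionManager:
    """Orchestrates the commit of one transaction's writeset."""

    def __init__(self, sequencer, logger, store, snapshot_server):
        self.sequencer = sequencer
        self.logger = logger
        self.store = store
        self.snapshot_server = snapshot_server

    def commit(self, writeset):
        commit_ts = self.sequencer.next_commit_ts()  # 1. get commit TS
        self.logger.log(commit_ts, writeset)         # 2. make the writeset durable
        self.store.apply(commit_ts, writeset)        # 3. write the updates
        self.snapshot_server.report(commit_ts)       # 4. report the commit TS
        return commit_ts
```

Logging before applying the updates is a design choice of the sketch: the writeset is durable before it can become readable.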
Sequence of timestamps received by the Snapshot Server (over time): 11, 15, 12, 14, 13
Evolution of the current snapshot at the Snapshot Server: 11, 11, 12, 12, 15
Transactional Life Cycle: Commit
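
The timestamp sequence on this slide is consistent with a simple rule: the current snapshot advances to a timestamp only once every earlier timestamp has been received. The sketch below implements that rule. The class name `SnapshotTracker` and the initial snapshot value of 10 are assumptions made for the example.

```python
class SnapshotTracker:
    """Tracks the largest timestamp T such that every timestamp <= T was received."""

    def __init__(self, initial_snapshot):
        self.snapshot = initial_snapshot
        self._pending = set()  # received timestamps that still have a gap below them

    def report(self, ts):
        """Register a received timestamp and return the current snapshot."""
        self._pending.add(ts)
        # Advance while the next consecutive timestamp has already arrived.
        while self.snapshot + 1 in self._pending:
            self.snapshot += 1
            self._pending.remove(self.snapshot)
        return self.snapshot


tracker = SnapshotTracker(initial_snapshot=10)  # assumed starting point
received = [11, 15, 12, 14, 13]
evolution = [tracker.report(ts) for ts in received]
print(evolution)  # [11, 11, 12, 12, 15]
```

Timestamp 15 arrives second but only becomes visible after 12, 13 and 14 have all been received, so readers never observe a snapshot with gaps.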
The approach described so far is the original reactive approach.
It results in multiple messages per update transaction.
The adopted approach is proactive:
• The local transaction managers periodically report the number of committed update transactions per second
• The commit sequencer distributes batches of commit timestamps to the local transaction managers
• The snapshot server periodically gets batches of timestamps (both used and discarded) from the local transaction managers
• The snapshot server periodically reports the most current consistent snapshot to the local transaction managers
Increasing Efficiency
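
A minimal sketch of the batching idea follows. It simplifies the slide in one respect, stated here as an assumption: the local transaction manager pulls a fixed-size batch when it runs out, whereas the slide describes the commit sequencer sizing and distributing batches based on the reported commit rate. All names are hypothetical.

```python
class BatchingCommitSequencer:
    """Hands out disjoint ranges of commit timestamps instead of single values."""

    def __init__(self, first_ts=1):
        self._next = first_ts

    def allocate(self, size):
        batch = range(self._next, self._next + size)
        self._next += size
        return batch


class LocalTxnManager:
    """Assigns commit timestamps locally from its current batch."""

    def __init__(self, sequencer, batch_size):
        self._sequencer = sequencer
        self._batch_size = batch_size
        self._available = []  # timestamps of the current batch not yet used
        self._used = []       # timestamps used since the last report

    def next_commit_ts(self):
        if not self._available:
            # One message to the sequencer covers batch_size commits.
            self._available = list(self._sequencer.allocate(self._batch_size))
        ts = self._available.pop(0)
        self._used.append(ts)
        return ts

    def flush_report(self):
        """Return (used, discarded) timestamps for the periodic report."""
        used, discarded = self._used, self._available
        self._used, self._available = [], []
        return used, discarded
```

Reporting discarded timestamps matters because the snapshot can only advance past a timestamp once it is accounted for, whether it was used by a commit or given up.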
The transactional management provides ultra-scalability
Fully transparent:
• No sharding.
• No a priori knowledge required about the rows to be accessed.
• Syntactically: no changes required in the application.
• Semantically: equivalent behavior to a centralized system.
Provides Snapshot Isolation (the isolation level provided by Oracle when set to “Serializable” isolation).
Transactional Processing
[Diagram: three-layer stack]
SQL Engine: OLTP & OLAP Query Engine
Transaction Manager: Ultra-Scalable Transactions
Storage: KiVi Key-Value Data Store
Architecture
• Cutting costs of business analytics by 80%
• Real-time Analytical Queries
• No more ETLs
Analytical Queries on Operational Data
[Diagram labels: Operational Database (OLTP); Data Warehouse (OLAP); OLTP + OLAP]
Blending OLTP & OLAP: Making Decisions at the Right Time
Use Cases
LeanXcale is the first database technology that can substitute for the mainframe.
It can bear the operational workloads of a mainframe and, at the same time, provide real-time analytics over the operational data.
It can be deployed alongside the mainframe and loaded/updated in real time, so that applications can be offloaded from the mainframe one by one.
LeanXcale is partnering with Bull Atos to provide a database appliance that will serve as the substitute for the mainframe.
Offloading/Substituting Mainframe
Enables implementing Customer Experience Management (CEM) with half the number of nodes.
Leverages the computation of aggregates in real time as raw KPIs are inserted.
Analytical aggregation queries become simple single-row queries.
Elasticity substantially reduces operations personnel costs during non-working hours with low loads.
Reducing Cost of Ownership at Telcos
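
The idea of computing aggregates as raw KPIs are inserted can be sketched in a few lines. This is an in-memory illustration only, not LeanXcale's interface: the names `KpiStore`, `Aggregate` and `cell_id` are hypothetical, and in a transactional database the raw insert and the aggregate update would be expected to happen in the same transaction.

```python
from dataclasses import dataclass


@dataclass
class Aggregate:
    """Running aggregate for one grouping key."""
    count: int = 0
    total: float = 0.0

    @property
    def mean(self):
        return self.total / self.count if self.count else 0.0


class KpiStore:
    """Keeps raw KPI rows and one pre-computed aggregate row per cell."""

    def __init__(self):
        self.raw = []       # raw (cell_id, value) rows
        self.by_cell = {}   # cell_id -> Aggregate

    def insert(self, cell_id, value):
        # Store the raw KPI and update its aggregate row in the same step.
        self.raw.append((cell_id, value))
        agg = self.by_cell.setdefault(cell_id, Aggregate())
        agg.count += 1
        agg.total += value

    def aggregate(self, cell_id):
        # The "analytical" query is now a single-row lookup, not a scan.
        return self.by_cell.get(cell_id, Aggregate())
```

The cost of the aggregation is paid incrementally at insert time, so reading an aggregate no longer depends on the number of raw rows.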
Using the key-value interface for large-scale data ingestion in IoT applications, while the data remains accessible through SQL, reducing the infrastructure needed several times over.
Real-time analytics.
Computation of aggregates in real time to reduce the cost of analytical aggregation queries, e.g., for the smart grid.
Elasticity makes it possible to adjust the consumption of resources to the load received.
Large IoT Applications
Using the key-value interface to reduce the footprint needed to capture clicks
Real-time analytics for implementing availability checking
Elasticity makes it possible to adjust the consumption of resources to the load received
Full ACID properties to guarantee that recorded sales and actual availability stay consistent
Disrupting Travel Tech
Ricardo Jimenez-Peris
LeanXcale CEO & Co-Founder
info@leanxcale.com
www.LeanXcale.com
@LeanXcale

End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-Peris at Big Data Spain 2017
