the NewSQL database you’ll never outgrow




OldSQL vs. NoSQL vs. NewSQL
       on New OLTP


Michael Stonebraker, CTO
VoltDB, Inc.
Old OLTP

  Remember how we used to buy airplane tickets in the
   1980s
         + By telephone
         + Through an intermediary (professional terminal operator)


  Commerce at the speed of the intermediary


  In 1985, 1,000 transactions per second was considered
   an incredible stretch goal!!!!
         + HPTS (1985)

VoltDB                                                                2
How has OLTP Changed in 25 Years?

 The internet
         + Client is no longer a professional terminal operator
         + Instead Aunt Martha is using the web herself
         + Sends volume through the roof




VoltDB                                                            3
How has OLTP Changed in 25 Years?

 PDAs and sensors
         + Your cell phone is a transaction originator
         + Everything is being geo-positioned by sensors (marathon
           runners, your car, ….)
         + Sends volume through the roof




VoltDB                                                               4
How has OLTP Changed in 25 Years?

 The definitions
         + “Online” no longer exclusively means a human operator
            — The oncoming data tsunami is often device and system-generated
         + “Transaction” now transcends the traditional business
           transaction
            — High-throughput ACID write operations are a new requirement
         + “HA” and “durability” are now core database requirements




VoltDB                                                                      5
Example NewOLTP Use Cases
                      Data                    High-frequency               Lower-frequency
                     Source                     operations                   operations

                                        Write/index all trades, store Show consolidated risk
              Capital markets
                                        tick data                     across traders


              Call initiation request   Real-time authorization       Fraud detection/analysis


              Inbound HTTP              Visitor logging, analysis,
                                                                      Traffic pattern analytics
              requests                  alerting

                                        Rank scores:
              Online game               •Defined intervals            Leaderboard lookups
                                        •Player “bests”

              Real-time ad trading      Match form factor,            Report ad performance
              systems                   placement criteria, bid/ask   from exhaust stream

              Mobile device location Location updates, QoS,
                                                                      Analytics on transactions
              sensor                 transactions

VoltDB
     VoltDB                                                                                       6   6
                                                                                                      6
New OLTP Challenges

                                                 New OLTP and You
       You need to ingest the firehose in real
       time
       You need to process, validate, enrich
       and respond in real-time

       You often need real-time analytics




VoltDB
     VoltDB                                                    7    7
                                                                    7
Solution Choices

  OldSQL
         + Legacy RDBMS vendors

  NoSQL
         + Give up SQL and ACID for performance

  NewSQL
         + Preserve SQL and ACID
         + Get performance from a new architecture




VoltDB                                               8
OldSQL

 Traditional SQL vendors (the “elephants”)
         + Code lines dating from the 1980’s

         + “bloatware”

         + Not very good at anything
            — Can be beaten by at least an order of magnitude in every vertical
              market I know of

         + Mediocre performance on New OLTP
            — At low velocity it doesn’t matter

            — Otherwise you get to tear your hair out


VoltDB                                                                            9
Reality Check
  TPC-C CPU cycles
  On the Shore DBMS prototype
  Elephants should be similar




VoltDB                           10
The Elephants

  Are slow because they spend all of their time on
   overhead!!!
         + Not on useful work

  Would have to re-architect their legacy code to do
   better




VoltDB                                                  11
To Go a Lot Faster You Have to……

  Focus on overhead
         + Better B-trees affects only 4% of the path length


  Get rid of ALL major sources of overhead
         + Main memory deployment – gets rid of buffer pool
             — Leaving other 75% of overhead intact
             — i.e. win is 25%




VoltDB                                                         12
NoSQL

  Give up SQL
  Give up ACID




VoltDB            13
Give Up SQL?
 Compiler translates SQL at compile time into a
  sequence of low level operations
 Similar to what the NoSQL products make you
  program in your application
 30 years of RDBMS experience
         + Hard to beat the compiler
         + High level languages are good (data independence, less code, …)
         + Stored procedures are good!
            — One round trip from app to DBMS rather than one one round trip
              per record
            — Move the code to the data, not the other way around


VoltDB                                                                         14
Give Up ACID?

 If you have multi-record transactions
         + Need ACID or have to program it in user-level code
         + Rolling your own is messy, brittle, error-prone

 Compensating transactions = pain
         + They’re people intensive and expensive
         + They cause customer/user dissatisfaction
         + Non-commutative single-record
           transactions (e.g. buy an item) simply fail




VoltDB                                                          15
NoSQL Summary
  Appropriate for non-transactional systems
  Appropriate for single record transactions that are
   commutative
  Not a good fit for New OLTP
  Use the right tool for the job
                 Interesting …
         Two recently-proposed NoSQL
         language standards – CQL and
         UnQL – are amazingly similar to
         (you guessed it!) SQL


VoltDB                                                   16
NewSQL

  SQL
  ACID
  Performance and scalability through modern
   innovative software architecture




VoltDB                                          17
NewSQL

  Needs something other than traditional record level
   locking (1st big source of overhead)
         + timestamp order
         + MVCC
         + Your good idea goes here




VoltDB                                                   18
NewSQL

  Needs a solution to buffer pool overhead (2nd big
   source of overhead)
         + Main memory (at least for data that is not cold)
         + Some other way to reduce buffer pool cost




VoltDB                                                        19
NewSQL

  Needs a solution to latching for shared data
   structures (3rd big source of overhead)
         + Some innovative use of B-trees
         + Single-threading
         + Your good idea goes here




VoltDB                                            20
NewSQL

  Needs a solution to write-ahead logging (4th big
   source of overhead)
         + Obvious answer is built-in replication and failover
         + New OLTP views this as a requirement anyway

  Some details
         + On-line failover?
         + On-line failback?
         + LAN network partitioning?
         + WAN network partitioning?



VoltDB                                                           21
Node Speed Really Matters
            Application Requirement: 1MM operations/sec at scale

         Operations/sec/node     10,000   Operations/sec/node     100,000
         Nodes (without HA)      100      Nodes (without HA)      10
         HA Replication factor   2        HA Replication factor   2
         Nodes (including HA)    300      Nodes (including HA)    30




    Greater scale through reduced network traffic
    Reduced probability of hardware failures
    Lower monetary and environmental costs
  A high-scaling RDBMS must combine transparent
  distribution, built-in durability and blazing node speed
VoltDB                                                                      22
A NewSQL Example – VoltDB

  Main-memory storage

  Single threaded, run Xacts to completion
         + No locking

         + No latching

  Built-in HA and durability
         + No log



VoltDB                                        23
Yabut: What About Multicore?

  For A K-core CPU, divide memory into K (non
    overlapping) buckets

  i.e. convert multi-core to K single cores




VoltDB                                           24
Where all the time goes… revisited

         Before             VoltDB




VoltDB                                25
How Scalable is VoltDB?



“   VoltDB is very scalable; it should
    scale to 120 partitions, 39 servers,
    and 1.6 million complex transactions
    per second at over 300 CPU cores.      SGI Announces Support and Record
                                            Benchmarks for VoltDB Database
                                             High Performance SGI® Rackable™
    Baron Schwartz                           Cluster Delivers Over Three Million
    Chief Performance Architect                   Transactions per Second

    Percona



           Vanilla Hardware                      Pumped-up Hardware

VoltDB                                                                             26
Current VoltDB Status

  Runs a subset of SQL (which is getting larger)
  On VoltDB clusters (in memory on commodity gear)
  No WAN support yet
         + Discussions abound on exactly what to do here

  45X a popular OldSQL DBMS on TPC-C
  5-7X Cassandra on VoltDB K-V layer
  Clearly note this is an open source system!



VoltDB                                                     27
Summary
         Old OLTP              New OLTP




                          Too slow
   OldSQL for New OLTP    Does not scale
                          Lacks consistency guarantees
   NoSQL for New OLTP     Low-level interface
                          Fast, scalable and consistent
   NewSQL for New OLTP    Supports SQL

VoltDB                                                     28
the NewSQL database you’ll never outgrow




                Thank You

NewSQL vs NoSQL for New OLTP

  • 1.
    the NewSQL databaseyou’ll never outgrow OldSQL vs. NoSQL vs. NewSQL on New OLTP Michael Stonebraker, CTO VoltDB, Inc.
  • 2.
    Old OLTP Remember how we used to buy airplane tickets in the 1980s + By telephone + Through an intermediary (professional terminal operator)  Commerce at the speed of the intermediary  In 1985, 1,000 transactions per second was considered an incredible stretch goal!!!! + HPTS (1985) VoltDB 2
  • 3.
    How has OLTPChanged in 25 Years? The internet + Client is no longer a professional terminal operator + Instead Aunt Martha is using the web herself + Sends volume through the roof VoltDB 3
  • 4.
    How has OLTPChanged in 25 Years? PDAs and sensors + Your cell phone is a transaction originator + Everything is being geo-positioned by sensors (marathon runners, your car, ….) + Sends volume through the roof VoltDB 4
  • 5.
    How has OLTPChanged in 25 Years? The definitions + “Online” no longer exclusively means a human operator — The oncoming data tsunami is often device and system-generated + “Transaction” now transcends the traditional business transaction — High-throughput ACID write operations are a new requirement + “HA” and “durability” are now core database requirements VoltDB 5
  • 6.
    Example NewOLTP UseCases Data High-frequency Lower-frequency Source operations operations Write/index all trades, store Show consolidated risk Capital markets tick data across traders Call initiation request Real-time authorization Fraud detection/analysis Inbound HTTP Visitor logging, analysis, Traffic pattern analytics requests alerting Rank scores: Online game •Defined intervals Leaderboard lookups •Player “bests” Real-time ad trading Match form factor, Report ad performance systems placement criteria, bid/ask from exhaust stream Mobile device location Location updates, QoS, Analytics on transactions sensor transactions VoltDB VoltDB 6 6 6
  • 7.
    New OLTP Challenges New OLTP and You You need to ingest the firehose in real time You need to process, validate, enrich and respond in real-time You often need real-time analytics VoltDB VoltDB 7 7 7
  • 8.
    Solution Choices OldSQL + Legacy RDBMS vendors  NoSQL + Give up SQL and ACID for performance  NewSQL + Preserve SQL and ACID + Get performance from a new architecture VoltDB 8
  • 9.
    OldSQL Traditional SQLvendors (the “elephants”) + Code lines dating from the 1980’s + “bloatware” + Not very good at anything — Can be beaten by at least an order of magnitude in every vertical market I know of + Mediocre performance on New OLTP — At low velocity it doesn’t matter — Otherwise you get to tear your hair out VoltDB 9
  • 10.
    Reality Check TPC-C CPU cycles  On the Shore DBMS prototype  Elephants should be similar VoltDB 10
  • 11.
    The Elephants Are slow because they spend all of their time on overhead!!! + Not on useful work  Would have to re-architect their legacy code to do better VoltDB 11
  • 12.
    To Go aLot Faster You Have to……  Focus on overhead + Better B-trees affects only 4% of the path length  Get rid of ALL major sources of overhead + Main memory deployment – gets rid of buffer pool — Leaving other 75% of overhead intact — i.e. win is 25% VoltDB 12
  • 13.
    NoSQL  Giveup SQL  Give up ACID VoltDB 13
  • 14.
    Give Up SQL? Compiler translates SQL at compile time into a sequence of low level operations  Similar to what the NoSQL products make you program in your application  30 years of RDBMS experience + Hard to beat the compiler + High level languages are good (data independence, less code, …) + Stored procedures are good! — One round trip from app to DBMS rather than one one round trip per record — Move the code to the data, not the other way around VoltDB 14
  • 15.
    Give Up ACID? If you have multi-record transactions + Need ACID or have to program it in user-level code + Rolling your own is messy, brittle, error-prone  Compensating transactions = pain + They’re people intensive and expensive + They cause customer/user dissatisfaction + Non-commutative single-record transactions (e.g. buy an item) simply fail VoltDB 15
  • 16.
    NoSQL Summary Appropriate for non-transactional systems  Appropriate for single record transactions that are commutative  Not a good fit for New OLTP  Use the right tool for the job Interesting … Two recently-proposed NoSQL language standards – CQL and UnQL – are amazingly similar to (you guessed it!) SQL VoltDB 16
  • 17.
    NewSQL  SQL  ACID  Performance and scalability through modern innovative software architecture VoltDB 17
  • 18.
    NewSQL  Needssomething other than traditional record level locking (1st big source of overhead) + timestamp order + MVCC + Your good idea goes here VoltDB 18
  • 19.
    NewSQL  Needsa solution to buffer pool overhead (2nd big source of overhead) + Main memory (at least for data that is not cold) + Some other way to reduce buffer pool cost VoltDB 19
  • 20.
    NewSQL  Needsa solution to latching for shared data structures (3rd big source of overhead) + Some innovative use of B-trees + Single-threading + Your good idea goes here VoltDB 20
  • 21.
    NewSQL  Needsa solution to write-ahead logging (4th big source of overhead) + Obvious answer is built-in replication and failover + New OLTP views this as a requirement anyway  Some details + On-line failover? + On-line failback? + LAN network partitioning? + WAN network partitioning? VoltDB 21
  • 22.
    Node Speed ReallyMatters Application Requirement: 1MM operations/sec at scale Operations/sec/node 10,000 Operations/sec/node 100,000 Nodes (without HA) 100 Nodes (without HA) 10 HA Replication factor 2 HA Replication factor 2 Nodes (including HA) 300 Nodes (including HA) 30  Greater scale through reduced network traffic  Reduced probability of hardware failures  Lower monetary and environmental costs A high-scaling RDBMS must combine transparent distribution, built-in durability and blazing node speed VoltDB 22
  • 23.
    A NewSQL Example– VoltDB  Main-memory storage  Single threaded, run Xacts to completion + No locking + No latching  Built-in HA and durability + No log VoltDB 23
  • 24.
    Yabut: What AboutMulticore?  For A K-core CPU, divide memory into K (non overlapping) buckets  i.e. convert multi-core to K single cores VoltDB 24
  • 25.
    Where all thetime goes… revisited Before VoltDB VoltDB 25
  • 26.
    How Scalable isVoltDB? “ VoltDB is very scalable; it should scale to 120 partitions, 39 servers, and 1.6 million complex transactions per second at over 300 CPU cores. SGI Announces Support and Record Benchmarks for VoltDB Database High Performance SGI® Rackable™ Baron Schwartz Cluster Delivers Over Three Million Chief Performance Architect Transactions per Second Percona Vanilla Hardware Pumped-up Hardware VoltDB 26
  • 27.
    Current VoltDB Status  Runs a subset of SQL (which is getting larger)  On VoltDB clusters (in memory on commodity gear)  No WAN support yet + Discussions abound on exactly what to do here  45X a popular OldSQL DBMS on TPC-C  5-7X Cassandra on VoltDB K-V layer  Clearly note this is an open source system! VoltDB 27
  • 28.
    Summary Old OLTP New OLTP  Too slow OldSQL for New OLTP  Does not scale  Lacks consistency guarantees NoSQL for New OLTP  Low-level interface  Fast, scalable and consistent NewSQL for New OLTP  Supports SQL VoltDB 28
  • 29.
    the NewSQL databaseyou’ll never outgrow Thank You