A Paradigm Shift: The Increasing Dominance of Memory-Oriented Solutions for High Performance Data Access

Ben Stopford : RBS
How fast is a HashMap lookup?




~20 ns
That’s how long it takes light to travel across a room
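To see where a figure like ~20 ns comes from, here is a minimal, naive timing sketch in Java. The class name and constants are illustrative; a proper harness such as JMH would give more trustworthy numbers, and the result varies with map size and cache residency.

```java
import java.util.HashMap;
import java.util.Map;

public class HashMapLookupTiming {
    public static void main(String[] args) {
        Map<Integer, String> map = new HashMap<>();
        for (int i = 0; i < 1_000_000; i++) {
            map.put(i, "value-" + i);
        }

        // Warm up so the JIT has compiled the lookup path before we measure.
        long blackhole = 0;
        for (int i = 0; i < 5_000_000; i++) {
            blackhole += map.get(i % 1_000_000).length();
        }

        // Time a large batch of lookups and report the average cost per get().
        int iterations = 10_000_000;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            blackhole += map.get(i % 1_000_000).length();
        }
        long elapsed = System.nanoTime() - start;

        System.out.println("avg ns per lookup ~ " + (double) elapsed / iterations
                + " (blackhole=" + blackhole + ")");
    }
}
```

The blackhole variable simply stops the JIT from optimising the lookups away.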
How fast is a database lookup?




~20 ms
That’s how long it takes light to go to Australia and back 3 times
Computers really are very fast!
The problem is we’re quite good at
writing software that slows them down
Desktop Virtualization
We love abstraction

There are many reasons why abstraction is a good idea…
…performance just isn’t one of them
Question: is it fair to compare a
Database with a HashMap?
Not really…
Key Point

On one end of the scale sits the HashMap… on the other sits the database…
…but it’s a very, very long scale that sits between them.
Times are changing
Database Architecture is
Aging
The Traditional Architecture
[Diagram: Traditional architecture alongside Shared Disk, Shared Nothing, In Memory and Distributed In Memory, moving towards a simpler contract]
Simplifying the
   Contract
How big is the internet?



     5 exabytes
              
(which is 5,000 petabytes or 5,000,000 terabytes)
How big is an average enterprise database?


   80% < 1TB
           (in 2009)
Simplifying the Contract
Databases have huge operational overheads

[Chart taken from “OLTP Through the Looking Glass, and What We Found There”, Harizopoulos et al.]
Avoid that overhead with a simpler contract and by avoiding IO
Improving Database Performance (1)
Shared Disk Architecture

[Diagram: Shared Disk architecture]
Improving Database Performance (2)
Shared Nothing Architecture
Each machine is responsible for a subset of the records. Each record exists on only one machine.

[Diagram: a client and a set of nodes, each node owning a distinct subset of record IDs]
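A minimal sketch of that routing rule, assuming a hypothetical list of node names and a simple hash-modulo placement (real systems usually use consistent hashing or a partition table so that adding a node does not remap every key):

```java
import java.util.List;

// Shared-nothing routing: every record key maps to exactly one owning node.
// The node names are hypothetical placeholders.
public class ShardRouter {
    private final List<String> nodes;   // e.g. ["node-A", "node-B", "node-C"]

    public ShardRouter(List<String> nodes) {
        this.nodes = nodes;
    }

    /** Return the single node responsible for this key. */
    public String ownerOf(Object key) {
        int bucket = Math.floorMod(key.hashCode(), nodes.size());
        return nodes.get(bucket);
    }
}
```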
Improving Database Performance (3)
In Memory Databases (single address-space)
Databases must cache subsets of the data in memory

[Diagram: a cache holding a subset of the data]

Not knowing what you don’t know

[Diagram: 90% of the data in cache, the rest on disk]

If you can fit it ALL in memory you know everything!
The architecture of an in memory
            database
Memory is at least 100x faster than disk
[Chart: access times from ms down to ps, covering a cross-continental round trip, a cross-network round trip, reading 1MB from disk/network, reading 1MB from main memory, a main memory reference, an L2 cache reference and an L1 cache reference]
* An L1 ref is about 2 clock cycles, or 0.7 ns. This is the time it takes light to travel 20 cm.
Memory allows random access. Disk only works well for sequential reads.
This makes in-memory databases very fast!
The proof is in the stats. TPC-H
Benchmarks on a 1TB data set
So why haven’t in memory
  databases taken off?
Address-Spaces are relatively small
     and of a finite, fixed size
Durability
One solution is
 distribution
Distributed In Memory (Shared Nothing)
Again we spread our data, but this time using only RAM.

[Diagram: a client and a set of nodes, each node holding its own subset of record IDs entirely in memory]
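As an illustration only, here is a toy version of that layout in Java: each “node” keeps its share of the records in an ordinary in-memory map, and a router sends every read and write to the partition that owns the key, so the data path never touches disk. The class names and the modulo placement are assumptions for the sketch, not a description of any particular product.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// One node's share of the records, held purely in RAM.
class InMemoryPartition {
    private final Map<Long, String> records = new ConcurrentHashMap<>();

    String get(long key)             { return records.get(key); }
    void   put(long key, String val) { records.put(key, val); }
}

// Stand-in for the cluster: an array of partitions in one process,
// where a real deployment would have one partition per machine.
class InMemoryCluster {
    private final InMemoryPartition[] partitions;

    InMemoryCluster(int partitionCount) {
        partitions = new InMemoryPartition[partitionCount];
        for (int i = 0; i < partitionCount; i++) {
            partitions[i] = new InMemoryPartition();
        }
    }

    private InMemoryPartition ownerOf(long key) {
        // Each key lives on exactly one partition (shared nothing).
        return partitions[(int) Math.floorMod(key, (long) partitions.length)];
    }

    String get(long key)             { return ownerOf(key).get(key); }
    void   put(long key, String val) { ownerOf(key).put(key, val); }
}
```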
Distribution solves our two
          problems
We get massive amounts of parallel
          processing
But at the cost of losing the single address space
[Diagram repeated: Traditional architecture alongside Shared Disk, Shared Nothing, In Memory and Distributed In Memory, moving towards a simpler contract]
There are three key themes here:

Distribution: gain scalability through a distributed architecture.
Simplify the contract: improve scalability by picking appropriate ACID properties.
No disk: all data is held in RAM.
ODC
ODC – Distributed, Shared Nothing, In Memory, Semi-Normalised, Graph DB

450 processes
2TB of RAM

Messaging (Topic Based) as a system of record (persistence)
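A rough sketch of what “messaging as the system of record” can look like on the write path, assuming a hypothetical TopicPublisher interface that stands in for whatever messaging API is actually used: the write is published to a durable topic first, and only then applied to the in-memory copy that readers see. On restart a node can rebuild its RAM state by replaying the topic.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical messaging interface; assumed durable once publish() returns.
interface TopicPublisher {
    void publish(String topic, byte[] payload);
}

class WriteThroughStore {
    private final Map<String, byte[]> ram = new ConcurrentHashMap<>();
    private final TopicPublisher publisher;

    WriteThroughStore(TopicPublisher publisher) {
        this.publisher = publisher;
    }

    void put(String key, byte[] value) {
        // 1. Persist by publishing to the topic: the topic is the system of record.
        publisher.publish("records", value);
        // 2. Only then update the in-memory copy that serves reads.
        ram.put(key, value);
    }

    byte[] get(String key) {
        return ram.get(key);   // reads are served from RAM, never from disk
    }
}
```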
ODC represents a
balance between
 throughput and
     latency
What is Latency?
What is Throughput?
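One way to keep the two terms straight: latency is the time a single operation takes, throughput is how many operations complete per unit time. The sketch below, with a made-up doWork() standing in for a lookup, measures both for one serial stream of work, where average latency is simply the inverse of throughput; once work runs in parallel the two can move independently.

```java
public class LatencyVsThroughput {
    public static void main(String[] args) {
        int operations = 1_000_000;

        long start = System.nanoTime();
        long sink = 0;
        for (int i = 0; i < operations; i++) {
            sink += doWork(i);                 // the operation being measured
        }
        long elapsedNs = System.nanoTime() - start;

        double avgLatencyNs = (double) elapsedNs / operations;      // time per operation
        double throughputPerSec = operations / (elapsedNs / 1e9);   // operations per second

        System.out.printf("latency ~%.1f ns/op, throughput ~%.0f ops/s (sink=%d)%n",
                avgLatencyNs, throughputPerSec, sink);
    }

    // Hypothetical stand-in for a single lookup or message being processed.
    private static long doWork(int i) {
        return Integer.toBinaryString(i).length();
    }
}
```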
Which is best for latency?

[Diagram: latency comparison of a Traditional Database, a Shared Nothing (Distributed) architecture and an In-Memory Database]
Which is best for throughput?

[Diagram: throughput comparison of a Traditional Database, a Shared Nothing (Distributed) architecture and an In-Memory Database]
So why do we use distributed in memory?

[Diagram: In Memory, plentiful hardware, Latency, Throughput]
This is the technology of
the now. 
So what is the technology
of the future?
Terabyte Memory Architectures
Fast Persistent Storage
New Innovations on the Horizon
These factors are remoulding the hardware landscape into one where memory is both vast and durable
This is changing the way we write
             software
Huge servers in the
commodity space are
driving us towards single
process architectures that
utilise many cores and
large address spaces
We can attain hundreds of
thousands of executions
per second from a single
process if it is well
optimised.
“All computers wait at the
same speed” 
We need to optimise for our CPU architecture

[Chart repeated: the access-time scale from ms down to ps shown earlier, from cross-continental round trips down to L1 cache references]
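One way to make that hierarchy visible from plain Java is to walk the same array sequentially and then in a random order: the sequential pass stays in cache thanks to the hardware prefetcher, while the random pass pays main-memory latency on most reads. A minimal sketch (array sizes and exact speed-ups will vary by machine):

```java
import java.util.Random;

public class AccessPatternDemo {
    public static void main(String[] args) {
        int size = 1 << 24;                    // 16M ints, far larger than the CPU caches
        int[] data = new int[size];
        int[] randomIndex = new int[size];
        Random rnd = new Random(42);
        for (int i = 0; i < size; i++) {
            data[i] = i;
            randomIndex[i] = rnd.nextInt(size);
        }

        // Sequential walk: the prefetcher streams the data into cache ahead of us.
        long t0 = System.nanoTime();
        long sum = 0;
        for (int i = 0; i < size; i++) sum += data[i];
        long sequentialNs = System.nanoTime() - t0;

        // Random walk over the same data: most reads miss cache and stall on main memory.
        t0 = System.nanoTime();
        for (int i = 0; i < size; i++) sum += data[randomIndex[i]];
        long randomNs = System.nanoTime() - t0;

        System.out.println("sequential: " + sequentialNs / 1_000_000 + " ms, "
                + "random: " + randomNs / 1_000_000 + " ms (sum=" + sum + ")");
    }
}
```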
Tools like VTune allow us to optimise software to truly leverage our hardware
So what does this all mean?
Further Reading
