DON’T USE
MY
DATABASE
 PRESENTATION FOR NOSQL ROADSHOW
       ON NOSQL USE CASES
@antonyfalco

ANTONY FALCO
COO and Basho Co-Founder
CAN WE PLEASE NOT TALK ABOUT

BIG DATA?
LET’S TALK ABOUT

CRITICAL DATA
Is your data
  really that
critical, dude?
ON OFFER

• what… is NoSQL? … Riak?

• who … uses Riak?

• why … should I NOT use Riak?
           (Or other NoSQL for that matter?)
TRUTH

               Enter the
                    NoSQL
                       Jungle
YOU ARE HERE
what are you prepared to trade?
Would you trade….


Your current, familiar consistency model
                       for
    a somewhat alien, but perfectly safe, consistency model and
                   better availability?
Would you trade….


Storage space
                 for

                low latency?
Would you trade….


late night heroics
                     for

                                  high
                      availability?
Would you trade….


35 years of RDBMS success
                 for

     technology used by a few companies of which
                you may have heard?
Would you trade….


Two million developers worldwide
                    for

         “It’s a piece of plumbing; it has never been
         a root cause of any of our problems.”

         Coda Hale, Yammer
BEFORE YOU ANSWER,
    CONSIDER….
Can anyone
call RDBMS
a failure?
IF ANYONE
MAKES THESE
TRADE-OFFS
SEEM EASY…..
THEY ARE


LYING!!!
Lie # 1:
          Distributed Systems:
          Desirable Properties
               Highly Available      Fault Tolerant
               Low Latency           Ops-Friendly
               Scalable              Predictable



 Any new app must use
this newfangled NoSQL.
 Wednesday, November 16, 11
Lie # 2:
            Distributed Systems:
            Desirable Properties
                 Highly Available      Fault Tolerant
                 Low Latency           Ops-Friendly
                 Scalable              Predictable



Non-distributed databases
provide these properties.
Lie # 3:

MongoDB              Cassandra

CouchDB                Riak

 Neo4J               Voldemort

 Words like “distributed”
 have “relative” meaning.
Lie # 4: The Big One
I say I want:           When really I want:
 1. safety               1. speed
 2. scale out            2. speed
 3. speed                3. speed


     The lie developers tell
          themselves.
REMEMBER


       At small scale everything works.




 At large scale things start to break.
RIAK – THE TECHNOLOGY
An open-source, globally distributed database with trade-offs
DISTRIBUTE

                     Replicate client data
                     to your Riak clusters
                    anywhere in the world




 Clients run Riak
    clusters in
Your data centers
RIAK DETAILS
Dynamo-inspired
modular – core plus storage (or…)
     1) Bitcask
     2) LevelDB

Document-oriented AND key-value

Tunable CAP values

Open-source and commercial
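
One common reading of "tunable CAP values" in Dynamo-inspired stores is a per-request choice of how many replicas must take part in a read or write. The sketch below illustrates that idea only. The class and the names `n`, `r`, and `w` are illustrative assumptions, not the Riak client API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuorumConfig:
    """Hypothetical replica settings for a Dynamo-style store."""

    n: int  # replicas stored per key
    r: int  # replicas that must answer a read
    w: int  # replicas that must acknowledge a write

    def __post_init__(self) -> None:
        if not (1 <= self.r <= self.n and 1 <= self.w <= self.n):
            raise ValueError("r and w must be between 1 and n")

    def reads_overlap_writes(self) -> bool:
        # If r + w > n, any read set shares at least one replica
        # with any write set, so a read sees the latest acknowledged write.
        return self.r + self.w > self.n

    def write_failures_tolerated(self) -> int:
        # Replicas that can be down while writes still succeed.
        return self.n - self.w

    def read_failures_tolerated(self) -> int:
        # Replicas that can be down while reads still succeed.
        return self.n - self.r


strict = QuorumConfig(n=3, r=2, w=2)   # overlapping quorums
relaxed = QuorumConfig(n=3, r=1, w=1)  # favors availability and latency
```

Under these assumptions, `strict` trades some availability for overlapping reads and writes, while `relaxed` keeps answering with two of three replicas down but may return stale data. That is the trade the earlier "Would you trade…" slides ask about.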
BITCASK AND THE LATENCY TRADEOFF

Low latency: all reads = hash lookup + 1 seek
Tradeoff: all keys (the index) must fit in memory

When a write occurs, the keydir is atomically updated with the location of the newest data. The old data is still present on disk, but any new reads will use the latest version available in the keydir. The merge process will eventually remove the old value.

Reading a value is simple, and doesn't ever require more than a single disk seek. We look up the key in our keydir, and from there we read the data using the file id, position, and size that are returned from that lookup. In many cases, the operating system's filesystem read-ahead cache makes this a much faster operation.
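
The Bitcask slide's claim (a read is a hash lookup plus one seek, at the cost of keeping every key in memory) can be sketched as a toy append-only store. This is a minimal illustration, not Bitcask's actual on-disk format: the record layout, the class name `TinyKeydirStore`, and the helper `demo` are assumptions. It leaves out checksums, deletes, file rotation, and the merge process.

```python
import os
import struct
import tempfile
from typing import Dict, NamedTuple, Optional


class Entry(NamedTuple):
    file_id: int   # which data file holds the value
    position: int  # byte offset of the value in that file
    size: int      # length of the value in bytes


class TinyKeydirStore:
    """Toy append-only store with an in-memory index (the "keydir")."""

    def __init__(self, directory: str) -> None:
        self.directory = directory
        self.file_id = 0
        self.keydir: Dict[bytes, Entry] = {}  # every key lives in memory
        self.active = open(self._path(self.file_id), "ab")

    def _path(self, file_id: int) -> str:
        return os.path.join(self.directory, f"{file_id}.data")

    def put(self, key: bytes, value: bytes) -> None:
        # Append the record; older versions of the key stay on disk.
        header = struct.pack(">II", len(key), len(value))
        self.active.seek(0, os.SEEK_END)
        start = self.active.tell()
        self.active.write(header + key + value)
        self.active.flush()
        value_pos = start + len(header) + len(key)
        # Point the keydir at the newest value.
        self.keydir[key] = Entry(self.file_id, value_pos, len(value))

    def get(self, key: bytes) -> Optional[bytes]:
        entry = self.keydir.get(key)  # hash lookup, no disk access
        if entry is None:
            return None
        with open(self._path(entry.file_id), "rb") as f:
            f.seek(entry.position)    # the single seek
            return f.read(entry.size)

    def close(self) -> None:
        self.active.close()


def demo() -> Optional[bytes]:
    with tempfile.TemporaryDirectory() as d:
        store = TinyKeydirStore(d)
        store.put(b"user:1", b"alice")
        store.put(b"user:1", b"bob")  # keydir now points at "bob"
        value = store.get(b"user:1")
        store.close()
        return value
```

The tradeoff shows up in `self.keydir`: read latency stays predictable because the index never touches disk, but the number of keys a node can hold is bounded by its RAM.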
Who cares about latency?

                      SOA
Who cares about latency?




Sometimes high latency looks like an outage to the end user.
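
One way to see why latency matters in an SOA: a single user request often fans out to many backend calls, so a rare slow call becomes a common slow page. The sketch below assumes the calls are independent and that each has the same chance of being slow. The function name and the figures are illustrative, not measurements from the talk.

```python
def prob_any_slow(fanout: int, p_slow: float) -> float:
    """Chance that at least one of `fanout` independent calls is slow."""
    if fanout < 0 or not 0.0 <= p_slow <= 1.0:
        raise ValueError("fanout must be >= 0 and p_slow within [0, 1]")
    # P(no call is slow) = (1 - p_slow) ** fanout
    return 1.0 - (1.0 - p_slow) ** fanout


# A request touching 100 services, each slow 1% of the time:
print(f"{prob_any_slow(100, 0.01):.0%}")  # roughly 63%
```

Under those assumptions, a backend that is slow only 1% of the time makes most fan-out requests slow, which is how high latency comes to look like an outage to the end user.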
Who cares about latency?

"I do" (repeated across the slide)
USE CASES
“We would not have been able to build this application
without node.js and Riak Search.”
Clipboard

1. Cloud Storage (S3)
2. Low-latency interactive apps
3. Smartphone apps
4. CDN
5. Medical
6. Session Store
7. Gaming
8. Mobile Infrastructure
STUFF CO.
Real-time Search
US Retailer Online Inventory
Mochi – casual interactive gaming
Social Media Networks
“REAL-TIME” APPS
Advantages: fast, reliable, scalable. Distribute to a global audience.
DANISH HEALTH SERVICES
Advantages: doctors manage prescriptions from highly available servers and, soon,
smartphones; 95% cheaper than Oracle; easier to run
Telco distributes content to 10 data centers




MULTI-TENANT OBJECT STORAGE
Advantages: build services that compete with Amazon, reliable storage, easy to operate
Write to Riak
                         Next…sync with each other




SMARTPHONE APPS
Advantages: better quality of service for write-intensive smartphone users; reduce
infrastructure costs
A QUICK LITMUS TEST:
ARE THE TRADEOFFS WORTH IT?
USE RIAK IF YOU
NEED…

Predictable latency
Predictable scaling profile
Predictable operations


and if data unavailability costs
           thousands of dollars per minute,
     or the data is even more important than that.
DON’T USE RIAK IF
YOU NEED…


Familiar query patterns
Familiar hiring patterns
Familiar operational problems


and if you
     don’t have a problem right now.
THANK YOU

Basho and Riak at GOTO Stockholm: "Don't Use My Database."
