Beware of your Hype – Value store!

Ignite Velocity 2009

Jérémie BORDIER - @ahfeel / jeremie.bordier@exalead.com

Get the Hype!

•  Relational DBMS are so 1990…
•  So simple, so fast, looks powerful!

[Figure: word cloud of stores: BerkeleyDB, Persevere, Cassandra, Hypertable, Project Voldemort, Redis, Dynomite, SimpleDB, MemcacheDB, Dynamo, Scalaris, Tokyo Tyrant, CouchDB, MongoDB]

10 000+ writes / sec!

•  RDBMS only performs ±300 w / sec… WTF?

Let’s do some simple math!

State of the art mathematics

•  Average SATA II disk seek time: ±6ms
•  Very good SCSI disk: ±3ms

1 second / 3ms = ±300 REAL writes / sec

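The same back-of-the-envelope math, as a quick check. This is only a sketch: it assumes every durable write pays exactly one full disk seek and ignores rotational latency, transfer time and write caches. The function name `seek_bound_writes_per_sec` is made up for illustration.

```python
def seek_bound_writes_per_sec(seek_ms: float) -> float:
    """Upper bound on writes per second if each write costs one disk seek."""
    return 1000.0 / seek_ms

# Seek times taken from the slide above.
for label, seek_ms in [("Average SATA II", 6.0), ("Very good SCSI", 3.0)]:
    print(f"{label}: ~{seek_bound_writes_per_sec(seek_ms):.0f} real writes / sec")
```

With 3ms seeks this gives about 333, which the slide rounds to ±300.
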
Eventual Persistency

•  Doesn’t sync the file system
•  Keep that in mind!

Your servers WILL crash, you WILL lose data

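To make "doesn't sync the file system" concrete, here is a minimal sketch of the difference between a write that stays in the OS cache and one that is explicitly synced. It is not taken from any of the stores named in this talk; `write_record` and the file name are hypothetical, and even `os.fsync` only asks the OS to flush, so a disk with its own volatile cache can still lose data.

```python
import os
import tempfile

def write_record(path: str, payload: bytes, durable: bool) -> None:
    """Append payload; only force it towards stable storage when durable=True."""
    with open(path, "ab") as f:
        f.write(payload)
        if durable:
            f.flush()             # push Python's buffer to the OS
            os.fsync(f.fileno())  # ask the OS to push its cache to the disk

path = os.path.join(tempfile.mkdtemp(), "store.log")
write_record(path, b"fast: may be lost if the server crashes\n", durable=False)
write_record(path, b"slow: synced before returning\n", durable=True)

with open(path, "rb") as f:
    print(f.read().decode(), end="")
```

The non-durable path is what makes 10 000+ writes / sec possible: the call returns before the disk has done any work.
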
Don’t care about writes…

•  If it can do 30 000+ lookups / sec!

Not for long!

Eventual memory

•  Relies on B+Trees
•  Map all the data structures in memory

What if your data grows too big?

Lookup…

•  get(“elevator”)
  –  Visit 8 nodes
  –  Read the data from disk

[Figure: tree of single-letter nodes (E, L, E, V, A, T, O, R plus side branches A, S, O, S, M) leading to “Data: 42”]

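A small sketch of the lookup above, counting node visits. The talk mentions B+Trees, but the figure shows one node per letter, so this uses a plain character trie as a stand-in. `CountingTrie` and the extra keys are made up for illustration; no particular store is claimed to work this way.

```python
class Node:
    def __init__(self):
        self.children = {}  # one child per next character
        self.value = None   # stands in for the pointer to the on-disk data

class CountingTrie:
    def __init__(self):
        self.root = Node()

    def put(self, key, value):
        node = self.root
        for ch in key:
            node = node.children.setdefault(ch, Node())
        node.value = value

    def get(self, key):
        """Return (value, nodes_visited). The root itself is not counted."""
        node, visited = self.root, 0
        for ch in key:
            node = node.children.get(ch)
            if node is None:
                return None, visited
            visited += 1
        return node.value, visited

trie = CountingTrie()
for key, value in [("elevator", 42), ("east", 1), ("also", 2)]:
    trie.put(key, value)

value, visited = trie.get("elevator")
print(value, visited)  # 42 8
```

If none of those nodes are in memory, each visit is a random disk read, and fetching the data itself is one more.
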
Why elevator?

•  American elevators are wayyyy too fast!
•  Feels like a NASA training

Makes me want to throw up :(

Hardware limit!

Up to 9 RANDOM I/Os per lookup (8 node visits + 1 data read)!

1 second / (9 * 3ms) = 37 get / sec

•  On very good hardware!

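To see how quickly the number of random I/Os per lookup eats throughput, here is a rough sketch using the 3ms seek time from earlier. It assumes nothing is cached and that each I/O costs one full seek; `lookups_per_sec` is a hypothetical helper.

```python
SEEK_MS = 3.0  # "very good" disk, as quoted in the slides

def lookups_per_sec(random_ios: int, seek_ms: float = SEEK_MS) -> float:
    """Upper bound on get() calls per second when every I/O is a full seek."""
    return 1000.0 / (random_ios * seek_ms)

for ios in (1, 3, 9):
    print(f"{ios} random I/O(s) per lookup -> ~{lookups_per_sec(ios):.0f} get / sec")
```
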
We did that too :)

•  We had similar algorithms
•  Encountered horrible perf decreases…

REDUCE I/O!

Ensure only 1 I/O per lookup.

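One common way to get down to a single data read per lookup is to keep only a small key-to-offset index in memory and the values in an append-only file. The sketch below illustrates that general idea under stated assumptions; it is not the design described in this talk. `OneSeekStore` is a made-up name, the index must fit in RAM, and it is not rebuilt after a restart here.

```python
import os
import tempfile

class OneSeekStore:
    """Append-only data file plus an in-memory dict of key -> (offset, length)."""

    def __init__(self, path: str):
        self.path = path
        self.index = {}
        open(path, "ab").close()  # make sure the data file exists

    def put(self, key: str, value: bytes) -> None:
        with open(self.path, "ab") as f:
            offset = f.seek(0, os.SEEK_END)  # where this value will start
            f.write(value)
        self.index[key] = (offset, len(value))

    def get(self, key: str):
        entry = self.index.get(key)  # memory only, no disk access
        if entry is None:
            return None
        offset, length = entry
        with open(self.path, "rb") as f:
            f.seek(offset)           # the single random read of data
            return f.read(length)

store = OneSeekStore(os.path.join(tempfile.mkdtemp(), "data.bin"))
store.put("elevator", b"42")
store.put("stairs", b"7")
print(store.get("elevator"))  # b'42'
```

The trade-off is memory: the index grows with the number of keys, even though the values stay on disk.
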
Don’t trust benchmarks

•  A few million entries isn’t enough
•  You’re benchmarking your Disk / OS cache :)

(Come on, DON’T BENCHMARK IN RUBY…)

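A quick sanity check before trusting a benchmark: does the whole dataset fit in RAM? The entry size and cache size below are assumptions picked for illustration, not measurements, and `fits_in_cache` is a hypothetical helper.

```python
GIB = 1024 ** 3

def fits_in_cache(n_entries: int, avg_entry_bytes: int, cache_bytes: int) -> bool:
    """True if the dataset is small enough to live entirely in the OS cache."""
    return n_entries * avg_entry_bytes <= cache_bytes

# Assumed: ~200 bytes per entry, 8 GiB of RAM available as disk cache.
print(fits_in_cache(5_000_000, 200, 8 * GIB))      # True
print(fits_in_cache(5_000_000_000, 200, 8 * GIB))  # False
```

When the first case applies, the benchmark mostly measures memory speed, not the store's on-disk behavior.
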
Compare what’s comparable

•  Distributed column stores
  –  BigTable like systems
•  Key value stores
  –  Tokyo, Dynamo like systems

Distributed column stores

[Figure: diagram placing stores against properties. Stores: Google Megastore, Hypertable, SimpleDB, Cassandra, HBase. Properties: Persistent, Eventually Persistent, Replicated, Mature, Distributed.]

Key – value stores

[Figure: diagram placing stores against properties. Stores: Persevere, BerkeleyDB, MongoDB, Dynomite, Tokyo Cabinet, Scalaris, MemcacheDB, Redis, CouchDB, Voldemort. Properties: Mutable, Immutable, Persistent, Eventually Persistent, Replicated, Embedded, Distributed, Mature.]

How to choose?

•  Maturity is priceless
•  Most suitable stores:
  –  Persistent: BerkeleyDB, MySQL
  –  Ev. Persistent: Tokyo Cabinet
  –  Ev. Persistent + Distributed + …: Voldemort
  –  Distributed column store: Cassandra

You are not Google (Well, not all of you)

•  Build something
•  Make it work
•  Think about scaling
•  Think about being hyped

Why?

Key value pain…

•  You will end up crossing data.
•  Doing joins with KV stores?

Only for ninjas!

Query time joins

•  Coding what RDBMS are made for? …
•  Slow!

(you shouldn’t do this…)

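What a query time join looks like when done by hand on top of get() calls. Plain dicts stand in for the key-value store, and the `users` / `orders` data is invented for the example.

```python
# Two "tables" stored as key -> value, as a KV store would hold them.
users = {"u1": {"name": "Ann"}, "u2": {"name": "Bob"}}
orders = {
    "o1": {"user_id": "u1", "item": "disk"},
    "o2": {"user_id": "u2", "item": "ram"},
    "o3": {"user_id": "u1", "item": "cpu"},
}

def orders_with_user_names(order_ids):
    """Join orders to users in application code: one extra get() per order."""
    lookups = 0
    rows = []
    for oid in order_ids:
        order = orders[oid]
        lookups += 1
        user = users[order["user_id"]]
        lookups += 1
        rows.append((oid, user["name"], order["item"]))
    return rows, lookups

rows, lookups = orders_with_user_names(["o1", "o2", "o3"])
print(rows)
print(lookups)  # 6
```

Three result rows cost six lookups here. If each lookup is a random disk I/O, the earlier arithmetic applies to every one of them.
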
Build time schema flattening

•  Not flexible!
•  Needs Map Reduce (Hadoop…) to scale

Think before doing this :)

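The build time alternative, sketched with the same kind of invented data: precompute one flattened record per key so that reads need a single get(). `flatten` and the record layout are hypothetical.

```python
users = {"u1": {"name": "Ann"}, "u2": {"name": "Bob"}}
orders = {
    "o1": {"user_id": "u1", "item": "disk"},
    "o2": {"user_id": "u2", "item": "ram"},
}

def flatten(users, orders):
    """Build one denormalized record per order, keyed by order id."""
    return {
        oid: {"user_name": users[order["user_id"]]["name"], "item": order["item"]}
        for oid, order in orders.items()
    }

flat = flatten(users, orders)
print(flat["o1"])  # {'user_name': 'Ann', 'item': 'disk'}
```

Reads become one lookup, but any change to a user or to the record shape means rebuilding every record that embeds it. On a large dataset that rebuild is a batch job, which is where the slide's Map Reduce comes in.
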
And finally…

•  HRD hard drive: 160 000 random IO/sec!!!
•  Yahoo announced open sourcing Sherpa!
•  Resources
  –  http://developer.yahoo.net/blog/archives/2009/06/nosql_meetup.html
  –  http://metabrew.com/article/anti-rdbms-a-list-of-distributed-key-value-stores/
  –  http://www.ryanpark.org/2008/04/top-10-avoid-the-simpledb-hype.html
  –  http://project-voldemort.com/blog/2009/06/building-a-1-tb-data-cycle-at-linkedin-with-hadoop-and-project-voldemort/

Thanks!
Contact: @ahfeel :)
