NoSQL Solutions

  Byeongweon Moon
   tasyblue@gmail.com
     2012.01.26
Redis, Couchbase, MongoDB, Membase
Memory-Based Cache vs. Database
NoSQL
• Collections (versus tables)
• Documents (versus rows)
• Loosely defined fields (versus columns)
• Scale out (versus scale up)
• Denormalization (versus normalization)
Data Model
•   Relational
•   Key-value
•   Column-oriented
•   Document-oriented
Relational
Key-value
Column-oriented
Document-oriented

KEY → FirstName="Jonathan",
      Address="15 Wanamassa Point Road",
      Children=[
         {Name:"Michael", Age:10},
         {Name:"Jennifer", Age:8},
         {Name:"Samantha", Age:5},
         {Name:"Elena", Age:2}
      ]
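The record above can be sketched as a nested structure stored under a single key. This is only an illustration: it assumes JSON as the serialization, uses a plain dict in place of a real document store, and the `doc_key` value is a hypothetical identifier that does not appear on the slide.

```python
import json

# Hypothetical key; the slide only labels this side "KEY".
doc_key = "person:1"

# The whole record, including the list of children, lives under one key.
document = {
    "FirstName": "Jonathan",
    "Address": "15 Wanamassa Point Road",
    "Children": [
        {"Name": "Michael", "Age": 10},
        {"Name": "Jennifer", "Age": 8},
        {"Name": "Samantha", "Age": 5},
        {"Name": "Elena", "Age": 2},
    ],
}

# A dict stands in for the document store: key -> serialized document.
store = {doc_key: json.dumps(document)}

# Reading the document back needs no join to reach the children.
loaded = json.loads(store[doc_key])
child_names = [child["Name"] for child in loaded["Children"]]
```

Because the children are embedded, one lookup returns the full record; a relational layout would typically need a second table and a join.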
Memory-Based Cache
•   Weak Persistence
•   Weak Consistency
•   Strong Performance
•   Low Latency
Membase
• Based on Memcached
• Written in C/C++ (Memcached-based data layer)
  and Erlang (Membase cluster layer)
• Distributed, in-memory key-value
  database management system
• Optimized for storing data behind
  web applications
Membase (cont'd)
• Persistence
  – Asynchronously writes data to disk after
    acknowledging write to client
  – Guarantees data consistency
• Replication and failover (server
  failures recoverable in under 100ms)
• Scalability and performance
  – Distributed object store
  – Dynamic cluster resizing and rebalancing
  – Guaranteed data consistency
  – High sustained throughput
  – Low, predictable latency
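The persistence bullet (acknowledge first, write to disk later) can be sketched as a write-behind store. This is a toy model, not Membase's implementation: the class name `WriteBehindStore`, the `disk` dict, and the explicit `flush()` call are all assumptions made for illustration; a real server would flush in the background.

```python
from collections import deque


class WriteBehindStore:
    """In-memory store that acknowledges writes before persisting them."""

    def __init__(self):
        self.memory = {}        # serves all reads
        self.disk = {}          # stand-in for the on-disk copy
        self.pending = deque()  # keys waiting to be flushed

    def set(self, key, value):
        self.memory[key] = value
        self.pending.append(key)
        return True  # acknowledged; nothing is on "disk" yet

    def get(self, key):
        return self.memory.get(key)

    def flush(self):
        """Persist queued keys and return how many were written."""
        flushed = 0
        while self.pending:
            key = self.pending.popleft()
            self.disk[key] = self.memory[key]
            flushed += 1
        return flushed


store = WriteBehindStore()
acked = store.set("user:1", "Jonathan")
on_disk_before_flush = "user:1" in store.disk
store.flush()
on_disk_after_flush = "user:1" in store.disk
```

The gap between the acknowledgement and the flush is the window in which a crash can lose data, which is the trade-off made for low write latency.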
Redis
• In-memory, key-value data store
• Written in ANSI C; client libraries exist
  for many languages
Redis (cont'd)
• Various data types
  – List, Set, Sorted Set, Hash
  – Atomic operations on each data type
• Persistence
  – Data is held in memory and written to disk
    asynchronously
• Replication
  – Master-slave replication
• Performance
  – Non-blocking I/O, single-threaded
• Publish/Subscribe
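To give a rough feel for the four data types listed above, the sketch below mimics each one with a Python built-in. It does not talk to Redis or use a Redis client; the Redis command names in the comments are only pointers to the operations being imitated, and the sample values are made up.

```python
from collections import deque

# List: push on one end, pop from the other (queue-like use of LPUSH / RPOP).
jobs = deque()
jobs.appendleft("job-1")
jobs.appendleft("job-2")
oldest_job = jobs.pop()

# Set: unique members; adding a duplicate has no effect (SADD).
tags = set()
tags.add("nosql")
tags.add("nosql")

# Sorted set: members ordered by a numeric score (ZADD / ZRANGE).
scores = {"alice": 30, "bob": 10, "carol": 20}
ranking = sorted(scores, key=scores.get)

# Hash: field -> value map stored under a single key (HSET / HGET).
user = {}
user["name"] = "Jonathan"
```

In Redis these operations run on the server and are atomic per command, so clients do not need to fetch, modify, and write back the whole value.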
Membase vs. Redis
               Membase                Redis
Data types     String                 Set, List, Sorted Set, Hash, …
Replication    Master-Master          Master-Slave
API            Storing, inc/dec API   Various operations, including
                                      pop, push, extract …
Management     Web management UI      Console management tool
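How to use …
• Typical use: cache SQL query results

// Key the cache entry on a hash of the SQL text.
$key = md5('SELECT * FROM rest_of_sql_statement_goes_here');
// Fetch once; Memcache::get() returns FALSE on a miss.
$result = $memcache->get($key);
if ($result !== false) {
 return $result;
}
// Miss: run the query, then cache the rows for 24 hours (86400 s).
$result = $query_results_mangled_into_most_likely_an_array;
$memcache->set($key, $result, MEMCACHE_COMPRESSED, 86400);
return $result;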
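The same cache-aside pattern can be sketched in a self-contained form. Everything here is an assumption for illustration: `TTLCache` is a dict-backed stand-in rather than a real memcached client, and `fake_query` replaces the database call so the example runs without any server.

```python
import hashlib
import time


class TTLCache:
    """Dict-backed stand-in for a memcached client (hypothetical, not a real API)."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.time() >= expires_at:
            del self._items[key]  # expired entries count as misses
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._items[key] = (value, time.time() + ttl_seconds)


def cached_query(cache, sql, run_query, ttl_seconds=86400):
    """Return cached rows for `sql`, running `run_query` only on a miss."""
    key = hashlib.md5(sql.encode("utf-8")).hexdigest()
    rows = cache.get(key)
    if rows is None:
        rows = run_query(sql)
        cache.set(key, rows, ttl_seconds)
    return rows


calls = []


def fake_query(sql):
    # Stand-in for a database call; records how often it runs.
    calls.append(sql)
    return [("Jonathan", 4)]


cache = TTLCache()
first = cached_query(cache, "SELECT name, children FROM people", fake_query)
second = cached_query(cache, "SELECT name, children FROM people", fake_query)
```

The second call is served from the cache, so the query function runs only once until the entry expires.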
How to use … (cont'd)
• Structured Data (array, struct…)
  – Serialize (all fields in one value)
       KEY                   VALUE
       user:$user_id         name:Byeongweon Moon|call:Häagen-Dazs|…

  – Normalization (one key per field)
       KEY                   VALUE
       user:$user_id:name    Byeongweon Moon
       user:$user_id:call    Häagen-Dazs
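The two key layouts can be compared side by side. This sketch assumes a plain dict as the key-value store, a hypothetical `user_id` of 42, and illustrative field values; the `|` and `:` separators follow the layout shown in the table.

```python
cache = {}    # stand-in for the key-value store
user_id = 42  # hypothetical id
profile = {"name": "Byeongweon Moon", "call": "Häagen-Dazs"}

# Layout 1: serialize all fields into one value under one key.
cache[f"user:{user_id}"] = "|".join(f"{k}:{v}" for k, v in profile.items())

# Layout 2: one key per field.
for field, value in profile.items():
    cache[f"user:{user_id}:{field}"] = value

# Reading one field: layout 1 must fetch and parse the whole value...
fields = dict(part.split(":", 1) for part in cache[f"user:{user_id}"].split("|"))
name_from_serialized = fields["name"]

# ...while layout 2 reads exactly the key it needs.
name_from_field_key = cache[f"user:{user_id}:name"]
```

Serializing keeps the record in one round trip but forces a rewrite of the whole value on every change; per-field keys allow small updates at the cost of more keys and more requests.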
Application Design using Membase
• Cache results other than SQL data, too
• Use a cache hierarchy
• Update Membase whenever your data changes
• Watch for race conditions and stale data
• Pre-warm your cache
• Store lists under keys
• Batch your requests with get_multi
                          (From the memcached FAQ)
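The last bullet, batching with get_multi, is easiest to see by counting round trips. The `CountingCache` class below is a hypothetical dict-backed stand-in that only simulates network requests; its `get_multi` mirrors the idea of the memcached call of the same name, not any specific client's signature.

```python
class CountingCache:
    """Dict-backed stand-in that counts simulated network round trips."""

    def __init__(self, items):
        self._items = dict(items)
        self.round_trips = 0

    def get(self, key):
        self.round_trips += 1
        return self._items.get(key)

    def get_multi(self, keys):
        self.round_trips += 1  # one request carries every key
        return {k: self._items[k] for k in keys if k in self._items}


cache = CountingCache({"user:1:name": "Michael", "user:2:name": "Jennifer"})
keys = ["user:1:name", "user:2:name", "user:3:name"]

# One request per key.
one_by_one = {k: cache.get(k) for k in keys}
trips_one_by_one = cache.round_trips

# One request for all keys; missing keys are simply absent from the result.
cache.round_trips = 0
batched = cache.get_multi(keys)
trips_batched = cache.round_trips
```

With per-field keys, a page that needs many small values benefits most from batching, since latency is paid once per request rather than once per key.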
Database
•   Persistence
•   Reliable
•   Scalable
•   Distributed
•   Clustered
MongoDB
•   Document-oriented Storage
•   High Write Performance
•   Full index support
•   Master/Slave Replication
•   Map/Reduce Support
•   Auto-Sharding
•   Querying
•   GridFS
•   Written in C++
CouchDB
• Document-oriented Storage
• High Read Performance
• ACID Semantics
• Map/Reduce Views and Indexes
• Distributed Architecture with
  Replication
• REST API
• Eventual Consistency
• Written in Erlang
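The map/reduce view idea can be sketched in a few lines. This is a simplified model written in Python for consistency with the other examples (CouchDB views themselves are JavaScript functions); the function names, the `build_view` helper, and the second sample document are hypothetical.

```python
from collections import defaultdict

docs = [
    {"FirstName": "Jonathan", "Children": [{"Name": "Michael", "Age": 10},
                                           {"Name": "Elena", "Age": 2}]},
    {"FirstName": "Maria", "Children": [{"Name": "Samantha", "Age": 5}]},
]


def map_children(doc):
    """Emit one (key, value) pair per child, keyed by the parent's name."""
    for _ in doc.get("Children", []):
        yield doc["FirstName"], 1


def reduce_count(values):
    """Combine all values emitted for one key."""
    return sum(values)


def build_view(documents, map_fn, reduce_fn):
    """Group emitted pairs by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    # Keys are kept sorted, as a view index would be.
    return {key: reduce_fn(grouped[key]) for key in sorted(grouped)}


children_per_parent = build_view(docs, map_children, reduce_count)
```

The map step decides which keys a document contributes to the index, and the reduce step aggregates per key, so queries read the prebuilt index instead of scanning documents.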
CASE STUDY
Twitter
Facebook Timeline
APPENDIX
                         CouchDB                      MongoDB                      MySQL
Data Model               Document-oriented (JSON)     Document-oriented (BSON)     Relational
Data Types               string, number, boolean,     string, int, double,         Various types
                         array, object                boolean, date, byte array,
                                                      object, array, others
Large Objects (Files)    Yes (attachments)            Yes (GridFS)                 Blobs
Horizontal Partitioning  CouchDB Lounge               Auto-sharding                Partitioning
Scheme
Replication              Master-master (with          Master-slave and replica     Master-slave, multi-master,
                         developer-supplied           sets                         and circular replication
                         conflict resolution)
Object (Row) Storage     One large repository         Collection-based             Table-based
Query Method             Map/reduce of JavaScript     Dynamic; object-based        Dynamic; SQL
                         functions to lazily build    query language
                         an index per query
Secondary Indexes        Yes                          Yes                          Yes
Atomicity                Single document              Single document              Yes - advanced
Interface                REST                         Native drivers; REST add-on  Native drivers
Server-side Batch Data   Map/Reduce                   Map/Reduce, server-side      Yes (SQL)
Manipulation                                          JavaScript
Written in               Erlang                       C++                          C++
Distributed Consistency  Eventually consistent        Strong consistency.          Strong consistency.
Model                    (master-master replication   Eventually consistent reads  Eventually consistent reads
                         with versioning and version  from secondaries are         from secondaries are
                         reconciliation)              available.                   available.
References
•   NoSQL solutions: Membase, Redis, CouchDB and MongoDB:
    http://blog.fedecarg.com/2011/01/25/nosql-solutions-membase-redis-couchdb-and-mongodb/
•   Visual Guide to NoSQL Systems:
    http://blog.nahurst.com/visual-guide-to-nosql-systems
•   MongoDB, CouchDB, MySQL Compare Grid:
    http://www.mongodb.org/display/DOCS/MongoDB,+CouchDB,+MySQL+Compare+Grid
•   SQL to Mongo Mapping Chart:
    http://www.mongodb.org/display/DOCS/SQL+to+Mongo+Mapping+Chart
•   Memcached FAQ:
    http://code.google.com/p/memcached/wiki/FAQ#Simple_query_result_caching
•   Couchbase 2.0 Manual:
    http://docs.couchbase.org/couchbase-manual-2.0.pdf
•   Building Timeline (Facebook):
    http://www.facebook.com/notes/facebook-engineering/building-timeline-scaling-up-to-hold-your-life-story/10150468255628920
