Introduction to NoSQL
A run down on the available NoSQL options and practical examples of using Redis to solve real-world web use cases.

Published in: Technology

  1. Yan Cui, @theburningmonk
  2. Server-side Developer @
  3. iwi by numbers
     • 400k+ DAU
     • ~100m requests/day
     • 25k+ concurrent users
     • 1500+ requests/s
     • 7000+ cache ops/s
     • 100+ commodity servers (EC2 small instance)
     • 75ms average latency
  4. Sign Posts
     • Why NOSQL?
     • Types of NOSQL DBs
     • NOSQL In Practice
     • Q&A
  5. A look at the… CURRENT TRENDS
  6. Digital Universe
     [Chart: years 2006 to 2011 on the horizontal axis, 0 to 2000 on the vertical axis; labelled values: 161 ExaBytes and 1.8 ZettaBytes!!]
  7. Big Data
     “…data sets whose size is beyond the ability of commonly used software tools to capture, manage and process within a tolerable elapsed time…”
  8. Big Data
     Unit        Symbol   Bytes
     Kilobyte    KB       1024
     Megabyte    MB       1048576
     Gigabyte    GB       1073741824
     Terabyte    TB       1099511627776
     Petabyte    PB       1125899906842624
     Exabyte     EB       1152921504606846976
     Zettabyte   ZB       1180591620717411303424
     Yottabyte   YB       1208925819614629174706176
     [Graphic beside the table: PAIN-O-Meter]
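The byte counts in the table are successive powers of 1024. As a small sketch of that arithmetic (the function name `unit_bytes` is made up for this example):

```python
# Binary unit sizes as listed on the slide: each unit is 1024 times the previous one.
UNITS = ["KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def unit_bytes(symbol: str) -> int:
    """Return the number of bytes in one unit, as a power of 1024."""
    return 1024 ** (UNITS.index(symbol) + 1)


for symbol in UNITS:
    print(f"{symbol}: {unit_bytes(symbol)}")
```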
  9. Vertical Scaling
     Server                                 Cost                                  Spec
     PowerEdge T110 II (basic)              $1,350                                8 GB, 3.1 GHz Quad 4T
     PowerEdge T110 II (basic)              $12,103                               32 GB, 3.4 GHz Quad 8T
     PowerEdge C2100                        $19,960                               192 GB, 2 x 3 GHz
     IBM System x3850 X5                    $646,605                              2048 GB, 8 x 2.4 GHz
     Blue Gene/P                            $1,300,000                            14 teraflops, 4096 CPUs
     K Computer (fastest super computer)    $10,000,000 annual operating cost     10 petaflops, 705,024 cores, 1,377 TB
  10. Horizontal Scaling
      • Incremental scaling
      • Cost grows incrementally
      • Easy to scale down
      • Linear gains
  11. Hardware Vendor
  12. Here’s an alternative… INTRODUCING NOSQL
  13. NOSQL is …
      • No SQL
      • Not Only SQL
      • A movement away from relational model
      • Consists of 4 main types of DBs
  14. NOSQL is …
      • Hard
      • A new dimension of trade-offs
      • CAP theorem
  15. CAP Theorem
      • Consistency (C): All clients have the same view of data
      • Availability (A): Each client can always read and write data
      • Partition Tolerance (P): System works despite network partitions
  16. NOSQL DBs are …
      • Specialized for particular use cases
      • Non-relational
      • Semi-structured
      • Horizontally scalable (usually)
  17. Motivations
      • Horizontal Scalability
      • Low Latency
      • Cost
      • Minimize Downtime
  18. Motivations
      Use the right tool for the right job!
  19. RDBMS
      • CAN scale horizontally (via sharding)
      • Manual client side hashing
      • Cross-server queries are difficult
      • Loses ACIDity
      • Schema update = PAIN
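To make "manual client side hashing" concrete, here is a minimal sketch of how a client might choose a shard for a key. The shard names and the `shard_for` helper are hypothetical; the slides do not describe a specific scheme.

```python
import hashlib

# Hypothetical shard names; a real client would hold connection details instead.
SHARDS = ["db-0", "db-1", "db-2", "db-3"]


def shard_for(key: str, shards=SHARDS) -> str:
    """Pick a shard by hashing the key on the client side."""
    # Use a stable digest (not Python's per-process salted hash()) so that
    # every client maps the same key to the same shard.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


print(shard_for("user:42"))
```

With plain modulo hashing like this, changing the number of shards remaps most keys, and any query that spans keys on different shards has to be assembled by the client, which is part of why the slide calls cross-server queries difficult.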
  20. TYPES OF NOSQL DBS
  21. Types Of NOSQL DBs
      • Key-Value Store
      • Document Store
      • Column Database
      • Graph Database
  22. Key-Value Store
      “key”: morpheus
      “value”: opaque binary data (101110100110101001100 110100100100010101011 …)
  23. Key-Value Store
      • It’s a Hash
      • Basic get/put/delete ops
      • Crazy fast!
      • Easy to scale horizontally
      • Membase, Redis, ORACLE…
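The "it's a hash" idea with basic get/put/delete operations can be sketched as a toy in-memory class. This is an illustration only; `KeyValueStore` is a made-up name and does not reflect the API of Membase, Redis or any other product.

```python
class KeyValueStore:
    """A toy in-memory key-value store: one hash table, opaque values."""

    def __init__(self):
        self._data = {}

    def put(self, key: str, value: bytes) -> None:
        # The store never inspects the value; it is just a blob.
        self._data[key] = value

    def get(self, key: str):
        # Returns None when the key is absent.
        return self._data.get(key)

    def delete(self, key: str) -> bool:
        # True if a key was actually removed.
        if key in self._data:
            del self._data[key]
            return True
        return False


store = KeyValueStore()
store.put("morpheus", b"\x5d\x35\x4c")
print(store.get("morpheus"))
```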
  24. Document Store
      “key”: morpheus
      “document”: { name : “Morpheus”, rank : “Captain”, occupation : “Total badass” }
  25. Document Store
      • Document = self-contained piece of data
      • Semi-structured data
      • Querying
      • MongoDB, RavenDB…
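Unlike a key-value store, a document store can look inside the value and query by field. A rough sketch of that idea using plain dictionaries follows; the `find` helper and the "neo" key are invented for illustration and are not the query API of MongoDB or RavenDB.

```python
# Documents keyed by id; each document may have a different set of fields.
documents = {
    "morpheus": {"name": "Morpheus", "rank": "Captain", "occupation": "Total badass"},
    "neo": {"name": "Thomas Anderson", "age": 29},  # hypothetical key
}


def find(docs, **criteria):
    """Return the keys of documents whose fields match every criterion."""
    return [
        key
        for key, doc in docs.items()
        if all(field in doc and doc[field] == value for field, value in criteria.items())
    ]


print(find(documents, rank="Captain"))
```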
  26. Column Database
      Name            Last Name   Age   Rank      Occupation     Version   Language
      Thomas          Anderson    29
      Morpheus                          Captain   Total badass
      Cypher          Reagan
      Agent Smith                                                1.0b      C++
      The Architect
  27. Column Database
      • Data stored by column
      • Semi-structured data
      • Cassandra, HBase, …
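As a loose illustration of "data stored by column", the sketch below pivots sparse rows into per-column maps so that empty cells cost nothing. It is a simplification for intuition only and is not how Cassandra or HBase lay out data on disk; the `to_columns` helper is made up.

```python
# Row-oriented view: one dict per row, with absent fields simply omitted.
rows = [
    {"Name": "Thomas", "Last Name": "Anderson", "Age": 29},
    {"Name": "Morpheus", "Rank": "Captain", "Occupation": "Total badass"},
    {"Name": "Agent Smith", "Version": "1.0b", "Language": "C++"},
]


def to_columns(rows):
    """Pivot rows into {column: {row_index: value}}, keeping only present cells."""
    columns = {}
    for index, row in enumerate(rows):
        for column, value in row.items():
            columns.setdefault(column, {})[index] = value
    return columns


columns = to_columns(rows)
print(columns["Age"])
```

Reading a single column (for example, every Age) then touches only that column's map rather than every row.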
  28. Graph Database
      [Diagram: a property graph of six numbered nodes joined by labelled edges]
      Nodes:
      • name = “Thomas Anderson”, age = 29
      • name = “Morpheus”, rank = “Captain”, occupation = “Total badass”
      • name = “Cypher”, last name = “Reagan”
      • name = “The Architect”
      • name = “Trinity”
      • name = “Agent Smith”, version = 1.0b, language = C++
      Edge labels and properties shown: CODED_BY; disclosure = public; disclosure = secret; age = 3 days; age = 6 months
  29. Graph Database
      • Nodes, properties, edges
      • Based on graph theory
      • Node adjacency instead of indices
      • Neo4j, VertexDB, …
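"Node adjacency instead of indices" means each node holds direct references to its neighbours, so a traversal follows those references rather than looking anything up in an index. Here is a minimal sketch of that idea. The node ids, the direction of the CODED_BY edge and the KNOWS edge are assumptions made for the example, not details recovered from the slide.

```python
# Nodes carry properties. Names follow the slide; the ids are arbitrary.
nodes = {
    1: {"name": "Agent Smith", "version": "1.0b", "language": "C++"},
    2: {"name": "The Architect"},
    3: {"name": "Morpheus", "rank": "Captain"},
}

# Adjacency lists: node id -> list of (edge label, target node id).
# Assumed for illustration: CODED_BY points from Agent Smith to The Architect,
# and the KNOWS edge is hypothetical.
edges = {
    1: [("CODED_BY", 2)],
    3: [("KNOWS", 1)],
}


def neighbours(node_id, label=None):
    """Follow outgoing edges straight from the node, optionally filtered by label."""
    return [
        nodes[target]["name"]
        for edge_label, target in edges.get(node_id, [])
        if label is None or edge_label == label
    ]


print(neighbours(1, "CODED_BY"))
```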
  30. Real-world use cases for NoSQL DBs... NOSQL IN PRACTICE
  31. Redis
      • Remote dictionary server
      • Key-Value store
      • In-memory, persistent
      • Data structures
  32. Redis
      • Lists
      • Sets
      • Sorted Sets
      • Hashes
  33. Redis
  34. Redis in Practice #1: COUNTERS
  35. Counters
      • Potentially massive numbers of ops
      • Valuable data, but not mission critical
  36. Counters
      • Lots of row contention in SQL
      • Requires lots of transactions
  37. Counters
      • Redis has atomic incr/decr
        INCR     Increments value by 1
        INCRBY   Increments value by given amount
        DECR     Decrements value by 1
        DECRBY   Decrements value by given amount
  38. Counters (image by Mike Rohde)
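The counter commands above can be modelled with a small in-memory stand-in, which runs without a Redis server. The `Counters` class is hypothetical; it only mimics two documented behaviours of INCR/INCRBY/DECR/DECRBY: a missing key counts as 0, and each call returns the new value. A lock stands in for the atomicity that Redis provides on the server.

```python
import threading


class Counters:
    """In-memory stand-in for Redis-style atomic counters (not a Redis client)."""

    def __init__(self):
        self._values = {}
        self._lock = threading.Lock()

    def incrby(self, key: str, amount: int) -> int:
        # Like INCRBY: a missing key is treated as 0; returns the new value.
        with self._lock:
            self._values[key] = self._values.get(key, 0) + amount
            return self._values[key]

    def incr(self, key: str) -> int:
        return self.incrby(key, 1)

    def decrby(self, key: str, amount: int) -> int:
        return self.incrby(key, -amount)

    def decr(self, key: str) -> int:
        return self.incrby(key, -1)


counters = Counters()
counters.incr("article:1:views")  # hypothetical key name
counters.incrby("article:1:views", 10)
```

Because the read-modify-write happens in one step, many writers can bump the same counter without the row contention and explicit transactions that the SQL approach needs.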
  39. Redis in Practice #2: RANDOM ITEMS
  40. Random Items
      • Give user a random article
      • SQL implementation
        – select count(*) from TABLE
        – var n = random.Next(0, (count - 1))
        – select * from TABLE where primary_key = n
        – inefficient, complex
  41. Random Items
      • Redis has built-in randomize operation
        SRANDMEMBER   Gets a random member from a set
  42. Random Items
      • About sets:
        – 0 to N unique elements
        – Unordered
        – Atomic add
  43. Random Items (image by Mike Rohde)
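To show what a built-in random-member operation saves compared with the count-then-select approach, here is a sketch of a set that supports constant-time add and random pick. `RandomSet` is a made-up in-memory model, not Redis itself; its method names borrow from SADD and SRANDMEMBER, and returning a boolean from `sadd` is a simplification.

```python
import random


class RandomSet:
    """Set of unique members with O(1) add and O(1) random pick."""

    def __init__(self):
        self._items = []   # members in insertion order, for random.choice
        self._index = {}   # member -> position, for uniqueness checks

    def sadd(self, member) -> bool:
        # Returns True only if the member was not already present.
        if member in self._index:
            return False
        self._index[member] = len(self._items)
        self._items.append(member)
        return True

    def srandmember(self):
        # Returns None for an empty set, otherwise a uniformly random member.
        return random.choice(self._items) if self._items else None


articles = RandomSet()
for article_id in ("a1", "a2", "a3"):  # hypothetical article ids
    articles.sadd(article_id)
print(articles.srandmember())
```

One call replaces the two queries on the slide, and it does not rely on primary keys being contiguous.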
  44. Redis in Practice #3: PRESENCE
  45. Presence
      • Who’s online?
      • Needs to be scalable
      • Pseudo-real time
  46. Presence
      • Each user ‘checks-in’ once every 3 mins
      [Diagram: check-ins by users A, B, C, D and E in one-minute slots from 00:22am to 00:25am; the 00:26am slot is marked “?”]
      A, C, D & E are online at 00:26am
  47. Presence
      • Redis natively supports set operations
        SADD          Add item(s) to a set
        SREM          Remove item(s) from a set
        SINTER        Intersect multiple sets
        SUNION        Union multiple sets
        SRANDMEMBER   Gets a random member from a set
        ...           ...
  48. Presence (image by Mike Rohde)
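The presence scheme can be sketched as one set of users per minute, with "who is online" answered by a union of the most recent sets. The code below models this in memory with plain Python sets. The key naming in the comments, the sample check-ins, and the choice to count the current minute plus the two before it are all assumptions for illustration.

```python
from collections import defaultdict

CHECK_IN_WINDOW = 3  # minutes, matching the slide's check-in interval

# minute number -> users who checked in during that minute
checkins = defaultdict(set)


def check_in(user: str, minute: int) -> None:
    # Comparable to: SADD online:<minute> <user>   (key name is hypothetical)
    checkins[minute].add(user)


def online_at(minute: int) -> set:
    """Union the per-minute sets inside the window (comparable to SUNION)."""
    recent = range(minute - CHECK_IN_WINDOW + 1, minute + 1)
    return set().union(*(checkins.get(m, set()) for m in recent))


# Hypothetical check-ins, with minutes given as plain integers.
check_in("A", 22)
check_in("B", 22)
check_in("C", 24)
check_in("A", 25)
print(online_at(25))
```

A user who stops checking in simply falls out of the window, so no explicit "went offline" event is needed.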
  49. Redis in Practice #4: LEADERBOARDS
  50. Leaderboards
      • Gamification
      • Users ranked by some score
  51. Leaderboards
      • About sorted sets:
        – Similar to a set
        – Every member is associated with a score
        – Elements are taken in order
  52. Leaderboards
      • Redis has ‘Sorted Sets’
        ZADD        Add/update item(s) to a sorted set
        ZRANK       Get item’s rank in a sorted set (low -> high)
        ZREVRANK    Get item’s rank in a sorted set (high -> low)
        ZRANGE      Get range of items, by rank (low -> high)
        ZREVRANGE   Get range of items, by rank (high -> low)
        ...         ...
  53. Leaderboards (image by Mike Rohde)
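A leaderboard on a sorted set boils down to "member with a score, read back in rank order". The sketch below models that in memory. `Leaderboard` is a hypothetical class that re-sorts on every read, which is fine for illustration but is not how Redis implements sorted sets, and it only handles non-negative indexes.

```python
class Leaderboard:
    """In-memory model of a sorted set used as a leaderboard."""

    def __init__(self):
        self._scores = {}

    def zadd(self, member: str, score: float) -> None:
        # Adds the member, or updates its score if already present.
        self._scores[member] = score

    def _high_to_low(self):
        # Order by score, ties broken by member name, then reverse.
        ordered = sorted(self._scores, key=lambda m: (self._scores[m], m))
        return ordered[::-1]

    def zrevrank(self, member: str):
        # 0-based rank counting from the highest score; None if absent.
        if member not in self._scores:
            return None
        return self._high_to_low().index(member)

    def zrevrange(self, start: int, stop: int):
        # Members from rank start to rank stop inclusive, highest score first.
        return self._high_to_low()[start:stop + 1]


board = Leaderboard()
board.zadd("alice", 120)  # hypothetical players and scores
board.zadd("bob", 300)
board.zadd("carol", 210)
print(board.zrevrange(0, 1))
```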
  54. Redis in Practice #5: QUEUES
  55. Queues
      • Redis has push/pop support for lists
        LPOP    Remove and get the 1st item in a list
        LPUSH   Prepend item(s) to a list
        RPOP    Remove and get the last item in a list
        RPUSH   Append item(s) to a list
      • Allows you to use list as queue/stack
  56. Queues
      • Redis supports ‘blocking’ pop
        BLPOP   Remove and get the 1st item in a list, or block until one is available
        BRPOP   Remove and get the last item in a list, or block until one is available
      • Message queues without polling!
  57. Queues (image by Mike Rohde)
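The difference between a plain pop and a blocking pop can be sketched with a small in-memory list guarded by a condition variable. `BlockingList` is a made-up model, not a Redis client: its `blpop` takes a timeout in seconds (None waits indefinitely) and returns just the item, which differs from the real command's arguments and reply.

```python
import threading
from collections import deque


class BlockingList:
    """In-memory model of a list with push, pop and blocking pop."""

    def __init__(self):
        self._items = deque()
        self._cond = threading.Condition()

    def rpush(self, item) -> None:
        # Append to the tail and wake one waiting consumer.
        with self._cond:
            self._items.append(item)
            self._cond.notify()

    def lpop(self):
        # Non-blocking: returns None immediately if the list is empty.
        with self._cond:
            return self._items.popleft() if self._items else None

    def blpop(self, timeout=None):
        # Blocks until an item is available or the timeout expires.
        with self._cond:
            if not self._cond.wait_for(lambda: len(self._items) > 0, timeout):
                return None
            return self._items.popleft()


jobs = BlockingList()
jobs.rpush("job-1")  # hypothetical payloads
jobs.rpush("job-2")
print(jobs.lpop())
```

Pushing at one end and popping at the other gives a queue; pushing and popping at the same end gives a stack. With the blocking variant a consumer sleeps until work arrives instead of polling in a loop.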
  58. Redis
      • Supports data structures
      • No built-in clustering
      • Master-slave replication
      • Redis Cluster is on the way...
  59. Before we go... SUMMARIES
  60. Considerations
      • In memory?
      • Disk-backed persistence?
      • Managed? Database As A Service?
      • Cluster support?
  61. SQL or NoSQL?
      • Wrong question
      • What’s your problem?
        – Transactions
        – Amount of data
        – Data structure
  62. http://blog.nahurst.com/visual-guide-to-nosql-systems
  63. DynamoDB
      • Fully managed
      • Provisioned throughput
      • Predictable cost & performance
      • SSD-backed
      • Auto-replicated
  64. Google BigQuery
      • Game changer for Analytics industry
      • Analyze billions of rows in seconds
      • SQL-like query syntax
      • Prediction API
      • NOT a database system
  65. Scalability
      • Success can come unexpectedly and quickly
      • Not just about the DB
  66. Thank You! @theburningmonk
