
NoSQL for SQL Professionals

Published in: Software, Technology

  1. 1. NoSQL for SQL Professionals. Don Pinto, Product Manager
  2. 2. Macro Trends Driving NoSQL Technology: More Data + More Users + Interactive Apps
  3. 3. Lacking Solutions, Users Forced to Invent: Bigtable (November 2006), Dynamo (October 2007), Cassandra (August 2008), Voldemort (February 2009). Very few organizations can build and maintain database software technology. But every organization building interactive web applications needs this technology.
  4. 4. What Is the Biggest Data Management Problem Driving Use of NoSQL in the Coming Year? Lack of flexibility/rigid schemas: 49%. Inability to scale out data: 35%. Performance challenges: 29%. Cost: 16%. All of these: 12%. Other: 11%. Source: Couchbase Survey, December 2011, n = 1351.
  5. 5. Relational vs. NoSQL
  6. 6. Key Differences
  7. 7. Relational Technology Scales Up. Application scales out: just add more commodity web servers (Web/App Server Tier). RDBMS scales up: get a bigger, more complex server (Relational Database). Sharding is expensive and disruptive, and doesn't perform at web scale. [Charts: System Cost and Application Performance vs. Users for each tier; the relational database won't scale beyond a certain point.]
  8. 8. NoSQL Database Scales Out Like App Tier. Application scales out: just add more commodity web servers (Web/App Server Tier). NoSQL database scales out: cost and performance mirror the app tier (Couchbase Distributed Data Store). Scaling out flattens the cost and performance curves. [Charts: System Cost and Application Performance vs. Users for each tier.]
  9. 9. Relational vs Document Data Model. Relational data model: highly structured table organization with rigidly defined data formats and record structure (columns C1, C2, C3, C4). Document data model: collection of complex JSON documents with arbitrary, nested data formats and varying "record" format.
  10. 10. RDBMS Example: User Profile. User Info (KEY, First, Last, ZIP_id): (1, Dipti, Borkar, 2); (2, Joe, Smith, 2); (3, Ali, Dodson, 2); (4, John, Doe, 3). Address Info (ZIP_id, CITY, STATE, ZIP): (1, DEN, CO, 30303); (2, MV, CA, 94040); (3, CHI, IL, 60609); (4, NY, NY, 10010). To get information about a specific user, you perform a join across the two tables; for example, user 1 has ZIP_id 2, which resolves to MV, CA, 94040.
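The join described on this slide can be sketched with Python's built-in `sqlite3` module. This is only an illustration under assumptions: the table and column names (`user_info`, `address_info`, `user_key`) are adapted from the slide labels, and SQLite stands in for whichever RDBMS the presenter had in mind.

```python
import sqlite3

# In-memory database holding the two tables shown on the slide.
# Table and column names are illustrative adaptations of the slide labels.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE address_info (
        zip_id INTEGER PRIMARY KEY, city TEXT, state TEXT, zip TEXT);
    CREATE TABLE user_info (
        user_key INTEGER PRIMARY KEY, first TEXT, last TEXT,
        zip_id INTEGER REFERENCES address_info(zip_id));
    INSERT INTO address_info VALUES
        (1, 'DEN', 'CO', '30303'), (2, 'MV', 'CA', '94040'),
        (3, 'CHI', 'IL', '60609'), (4, 'NY', 'NY', '10010');
    INSERT INTO user_info VALUES
        (1, 'Dipti', 'Borkar', 2), (2, 'Joe', 'Smith', 2),
        (3, 'Ali', 'Dodson', 2), (4, 'John', 'Doe', 3);
""")


def get_user_profile(user_key):
    """Fetch one user's full profile by joining the two tables."""
    return conn.execute(
        """
        SELECT u.user_key, u.first, u.last, a.city, a.state, a.zip
        FROM user_info AS u
        JOIN address_info AS a ON a.zip_id = u.zip_id
        WHERE u.user_key = ?
        """,
        (user_key,),
    ).fetchone()


profile = get_user_profile(1)
print(profile)
```

The address lives in a separate table, so every profile read pays for the join; the document example on the next slide stores the same fields together.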
  11. 11. Document Example: User Profile. All data in a single JSON document: { "ID": 1, "FIRST": "Dipti", "LAST": "Borkar", "ZIP": "94040", "CITY": "MV", "STATE": "CA" }
  12. 12. Making a Change Using RDBMS. Adding a Country ID column touches every table. Country Table (Country ID, Country name): 001 USA; 002 UK; 003 Argentina; 004 Australia; 005 Aruba; 006 Austria; 007 Brazil; 008 Canada; 009 Chile; ... 130 Portugal; 131 Romania; 132 Russia; 133 Spain; 134 Sweden. User Table (User ID, First, Last, Zip, Country ID): 1 Dipti Borkar 94040; 2 Joe Smith 94040; 3 Ali Dodson 94040; 4 Sarah Gorin NW1; 5 Bob Young 30303; 6 Nancy Baker 10010; 7 Ray Jones 31311; 8 Lee Chen V5V3M; ... 50000 Doug Moore 04252; 50001 Mary White SW195; 50002 Lisa Clark 12425 (Country ID values in slide order: 001, 001, 002, 001, 001, 001, 008, 001, 002, 001). Photo Table (User ID, Photo ID, Comment, Country ID): 2 d043 NYC 001; 2 b054 Bday 007; 5 c036 Miami 001; 7 d072 Sunset 133; 5002 e086 Spain 133. Status Table (User ID, Status ID, Text, Country ID): 1 a42 At conf 134; 4 b26 excited 007; 5 c32 hockey 008; 12 d83 Go A's 001; 5000 e34 sailing 005. Affiliations Table (User ID, Affl ID, Affl Name, Country ID): 2 a42 Cal 001; 4 b96 USC 001; 7 c14 UW 001; 8 e22 Oxford 002.
  13. 13. Making the Same Change With a Document DB. Just add information to a document: { "ID": 1, "FIRST": "Don", "LAST": "Pinto", "ZIP": "94040", "CITY": "MV", "STATE": "CA", "STATUS": { "TEXT": "At Conf", "GEO_LOC": "134" }, "COUNTRY": "USA" }
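A minimal sketch of the same change in application code, treating the document as a plain Python dict. The helper name `add_fields` is hypothetical and no database client is involved; the point is only that new fields are attached to one document without altering any table definition.

```python
import json

# The user profile from the slide before the change, held as one document.
user_doc = {
    "ID": 1,
    "FIRST": "Don",
    "LAST": "Pinto",
    "ZIP": "94040",
    "CITY": "MV",
    "STATE": "CA",
}


def add_fields(doc, **new_fields):
    """Return a copy of doc with extra fields (hypothetical helper).

    No schema migration is needed: other documents are unaffected.
    """
    updated = dict(doc)
    updated.update(new_fields)
    return updated


updated_doc = add_fields(
    user_doc,
    STATUS={"TEXT": "At Conf", "GEO_LOC": "134"},
    COUNTRY="USA",
)
print(json.dumps(updated_doc, indent=2))
```

In a real deployment the updated dict would be written back under the same document key; that write call depends on the client library and is left out here.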
  14. 14. Relational vs Document Performance. Relational: one user's data is spread across four tables. User Table (User ID, First, Last, Zip): 1 Frank Weigel 94040; 2 Joe Smith 94040; 3 Ali Dodson 94040; 4 Sarah Gorin NW1; 5 Bob Young 30303; 6 Nancy Baker 10010; 7 Ray Jones 31311; 8 Lee Chen V5V3; ... 5000 Doug Moore 04252; 5001 Mary White 41694; 5002 Lisa Clark 12425. Photo Table (User ID, Photo ID, Comment): 2 d043 NYC; 2 b054 Bday; 5 c036 Miami; 7 d072 Sunset; 5002 e086 Spain. Status Table (User ID, Status ID, Text): 1 a42 At conf; 4 b26 excited; 5 c32 hockey; 12 d83 Go A's; 5000 e34 sailing. Affiliations Table (User ID, Affiliations ID, Affiliations Name): 2 a42 Cal; 4 b96 USC; 7 c14 UW; 8 e22 Oxford. Document: each user's data sits together in one JSON document, for example: 1 Frank Weigel 94040 with status a42 "At conf"; 4 Sarah Gorin NW1 with status b26 "hockey"; 5 Bob Young 30303 with photo c036 "Miami" and status c32 "excited"; 8 Lee Chen V5V3 with affiliation e22 "Oxford"; 5002 Lisa Clark 12425 with photo e086 "Spain". Result: faster response times and higher throughput.
  15. 15. Document Databases Easily Accommodate Unstructured Data. Hotels: { "ID": 1, "NAME": "Fairmont San Francisco", "DESCRIPTION": "Historic grandeur…", "AVG_REVIEWER_SCORE": "4.3", "AMENITY": [ { "TYPE": "gym", "DESCRIPTION": "fitness center" }, { "TYPE": "wifi", "DESCRIPTION": "free wifi" } ], "RATE_TYPE": "nightly", "PRICE": "$199", "REVIEWS": ["review_1", "review_2"], "ATTRACTIONS": "Chinatown" } and { "ID": 2, "NAME": "W San Francisco", "DESCRIPTION": "Chic, hip accommodations..", "AVG_REVIEWER_SCORE": "4.0", "AMENITY": [ { "TYPE": "spa", "DESCRIPTION": "Bliss Spa" }, { "TYPE": "wifi", "DESCRIPTION": "free wifi" }, { "TYPE": "dining", "DESCRIPTION": "bar/lounge" } ], "RATE_TYPE": "nightly", "PRICE": "$194", "REVIEWS": ["review_1", "review_2"] }
  16. 16. Document Databases Easily Accommodate Unstructured Data. Hotels: { "ID": 1, "NAME": "Fairmont San Francisco", … } Reviews: { "REVIEW_ID": 1, "REVIEW": "Loved Hotel & Location", "WOULD RECOMMEND": "yes", "AVG_REVIEWER_SCORE": "5", "REVIEW_DATE": "May 29, 2013", "USER_PROFILE_ID": "271" } and { "REVIEW_ID": 2, "REVIEW": "Nice, but a few kinks", "WOULD RECOMMEND": "yes", "AVG_REVIEWER_SCORE": "4", "REVIEW_DATE": "May 22, 2013", "USER_PROFILE_ID": "923" }
  17. 17. Document Databases Easily Accommodate Unstructured Data. Hotel Descriptions: { "ID": 1, "NAME": "Fairmont San Francisco", … } Reviews: { "REVIEW_ID": 1, "REVIEW": "Loved Hotel…", … } and { "REVIEW_ID": 2, "REVIEW": "Nice, but …", … } User Profiles: { "USER_ID": 1, "DISPLAY_NAME": "Ted's Trip Experience", "CITY": "Saratoga", "STATE": "California", "NUM_OF_REVIEWS": "8" } and { "USER_ID": 2, "DISPLAY_NAME": "WhatWhat567", "CITY": "Kansas City", "STATE": "MO", "NUM_OF_REVIEWS": "3" }
  18. 18. Document Databases Easily Accommodate Unstructured Data. Hotel Descriptions: { "ID": 1, "NAME": "Fairmont San Francisco", … } Reviews: { "REVIEW_ID": 1, "REVIEW": "Loved Hotel…", … } and { "REVIEW_ID": 2, "REVIEW": "Nice, but …", … } User Profiles: { "USER_ID": 1, "DISPLAY": "Ted's Trip…", … } and { "USER_ID": 2, "DISPLAY": "WhatWhat …", … } Document IDs associate related objects: hotels point to reviews, and reviews point to users.
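Following document IDs from a hotel to its reviews and on to the review authors can be sketched as below. A plain dict stands in for the document store. The key naming scheme (`hotel_1`, `review_1`, `user_271`) and the pairing of profile IDs 271 and 923 with the two display names are assumptions made for illustration; the slides do not state them.

```python
# A dict stands in for the document store: key -> JSON-like document.
# Key names and the user/profile pairing are hypothetical.
store = {
    "hotel_1": {
        "ID": 1,
        "NAME": "Fairmont San Francisco",
        "REVIEWS": ["review_1", "review_2"],
    },
    "review_1": {
        "REVIEW_ID": 1,
        "REVIEW": "Loved Hotel & Location",
        "USER_PROFILE_ID": "271",
    },
    "review_2": {
        "REVIEW_ID": 2,
        "REVIEW": "Nice, but a few kinks",
        "USER_PROFILE_ID": "923",
    },
    "user_271": {"DISPLAY_NAME": "Ted's Trip Experience"},
    "user_923": {"DISPLAY_NAME": "WhatWhat567"},
}


def reviews_with_authors(hotel_key):
    """Follow hotel -> review -> user references by document ID."""
    hotel = store[hotel_key]
    result = []
    for review_key in hotel["REVIEWS"]:
        review = store[review_key]
        user = store["user_" + review["USER_PROFILE_ID"]]
        result.append((review["REVIEW"], user["DISPLAY_NAME"]))
    return result


hotel_reviews = reviews_with_authors("hotel_1")
for text, author in hotel_reviews:
    print(author, ":", text)
```

Each hop is a direct key lookup, so the application does the "join" itself, one document at a time.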
  19. 19. Indexing with Document Databases Index on AVG_REVIEWER_SCORE
  20. 20. Indexing with Document Databases. Index on AVG_REVIEWER_SCORE, stored as sorted (score, doc_id) entries: … (4.0, doc_id), (4.0, doc_id), (4.1, doc_id), (4.3, doc_id), (5.0, doc_id) …
  21. 21. Querying with Document Databases. Query on AVG_REVIEWER_SCORE. The index holds sorted (score, doc_id) entries with scores … 3.4, 3.4, 3.5, 3.6, 3.7, 3.8, 4.0, 4.1, 4.3, 4.5, 4.7, 4.9, 5.0 … 5.0; the query selects a contiguous range of those entries as its matching results.
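The index and range query on these two slides can be approximated with a sorted list and binary search. This is a sketch of the general idea, not how any particular document database builds its indexes. The `hotel_3` and `hotel_4` documents are made up for the example, and scores are stored as floats here, whereas the slides show them as strings.

```python
from bisect import bisect_left, bisect_right

# Sample documents; hotel_3 and hotel_4 are hypothetical additions.
docs = {
    "hotel_1": {"NAME": "Fairmont San Francisco", "AVG_REVIEWER_SCORE": 4.3},
    "hotel_2": {"NAME": "W San Francisco", "AVG_REVIEWER_SCORE": 4.0},
    "hotel_3": {"NAME": "Example Inn", "AVG_REVIEWER_SCORE": 3.6},
    "hotel_4": {"NAME": "Example Suites", "AVG_REVIEWER_SCORE": 5.0},
}


def build_index(documents, field):
    """Build a sorted list of (field value, doc_id) entries.

    Documents lacking the field are skipped, since there is no fixed schema.
    """
    return sorted(
        (doc[field], doc_id) for doc_id, doc in documents.items() if field in doc
    )


def range_query(index, low, high):
    """Return doc IDs whose indexed value lies in [low, high]."""
    keys = [value for value, _ in index]
    start = bisect_left(keys, low)
    end = bisect_right(keys, high)
    return [doc_id for _, doc_id in index[start:end]]


score_index = build_index(docs, "AVG_REVIEWER_SCORE")
matches = range_query(score_index, 4.0, 5.0)
print(matches)
```

Because the entries are sorted, a range query is a contiguous slice of the index, which matches the "matching results" band drawn on the slide.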
  22. 22. Flavors of NoSQL
  23. 23. NoSQL Catalog. Categories: Key-Value (memcached), Data Structure (redis), Document (mongoDB, couchbase), Column (cassandra), Graph (Neo4j). Deployment types: cache (memory only) and database (memory/disk).
  24. 24. Couchbase Open Source Project • Leading NoSQL database project focused on distributed database technology and surrounding ecosystem • Supports both key-value and document-oriented use cases • All components are available under the Apache License 2.0 • Available as packaged software in both enterprise and community editions
  25. 25. Couchbase Server. Easy Scalability: grow cluster without application changes, without downtime, with a single click. Consistent High Performance: consistent sub-millisecond read and write response times with consistent high throughput. Always On 24x365: no downtime for software upgrades, hardware maintenance, etc. Flexible Data Model: JSON document model with no fixed schema.
  26. 26. Couchbase Server Architecture. Cluster Manager (Erlang/OTP). On each node: heartbeat, process monitor, global singleton supervisor, configuration manager. One per cluster: rebalance orchestrator, node health monitor, vBucket state and replication manager. Ports: HTTP REST management API/Web UI (8091), Erlang port mapper (4369), Distributed Erlang (21100 - 21199). Data Manager: Query API and query engine (8092); Moxi (11211, Memcapable 1.0); Memcached with Couchbase EP Engine (11210, Memcapable 2.0); storage interface; new persistence layer.
  27. 27. Couchbase Server Architecture. Cluster Manager (Erlang/OTP): replication, rebalance, and shard state manager; REST management API/Web UI and admin console over HTTP (8091). Data Manager: Query API and query engine (8092); data access ports (11210 / 11211); object-managed cache; multi-threaded persistence engine.
  28. 28. Where is NoSQL a good fit?
  29. 29. Market Adoption. Internet Companies: • Social Gaming • Ad Networks • Social Networks • Online Business Services • E-Commerce • Online Media • Content Management • Cloud Services. Enterprises: • Communications • Retail • Financial Services • Health Care • Automotive/Airline • Agriculture • Consumer Electronics • Business Systems
  30. 30. Application Characteristics - Data driven. NoSQL is a good fit for: • 3rd-party or user-defined structure (Twitter feeds) • Support for unlimited data growth (viral apps) • Data with non-homogeneous structure • Need to change data structure quickly and often • Variable-length documents • Sparse data records • Hierarchical data
  31. 31. Application Characteristics - Performance driven. NoSQL is a good fit for: • Low latency critical (e.g., 1 millisecond) • High throughput (e.g., 200,000 ops/sec) • Large number of users • Unknown demand with sudden growth of users/data • Predominantly direct document access • Read / Mixed / Write heavy workloads
  32. 32. Q & A
  33. 33. Thank you! don@couchbase.com @nosqldon www.linkedin.com/in/donpinto/
  34. 34. Extra - Couchbase Operations
  35. 35. Single node - Couchbase Write Operation. The app server writes Doc 1 to the Couchbase Server node. The doc lands in the managed cache, then goes onto the replication queue (to other node) and the disk queue (to disk).
  36. 36. Single node - Couchbase Update Operation. The app server sends Doc 1' to replace Doc 1. The managed cache is updated to Doc 1', which is then queued for replication to the other node and for persistence to disk.
  37. 37. Single node - Couchbase Read Operation. The app server issues GET Doc1 and the Couchbase Server node returns Doc 1 from the managed cache.
  38. 38. Single node - Couchbase Cache Miss. The managed cache holds other docs (Doc 2 through Doc 6) when the app server issues GET Doc1. Doc 1 is fetched from disk into the managed cache and returned to the app server.
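The four single-node slides (write, update, read, cache miss) can be tied together with a toy model. This is a simplified sketch, not Couchbase's implementation: the class name `Node`, the method names, and the eviction rule (drop the oldest entries that are already persisted) are assumptions for illustration only.

```python
from collections import OrderedDict, deque


class Node:
    """Toy single-node model: managed cache, disk queue, replication queue."""

    def __init__(self, cache_capacity):
        self.cache = OrderedDict()        # managed cache, oldest entry first
        self.cache_capacity = cache_capacity
        self.dirty = set()                # keys not yet persisted to disk
        self.disk = {}                    # stands in for on-disk storage
        self.disk_queue = deque()         # pending writes to disk
        self.replication_queue = deque()  # pending writes to another node

    def set(self, key, value):
        """Write or update: cache first, then queue for disk and replication."""
        self.cache[key] = value
        self.cache.move_to_end(key)
        self.dirty.add(key)
        self.disk_queue.append((key, value))
        self.replication_queue.append((key, value))
        self._evict_if_needed()

    def flush(self):
        """Drain the disk queue, persisting every pending write."""
        while self.disk_queue:
            key, value = self.disk_queue.popleft()
            self.disk[key] = value
            self.dirty.discard(key)
        self._evict_if_needed()

    def get(self, key):
        """Read from cache; on a miss, load from disk back into the cache."""
        if key in self.cache:
            self.cache.move_to_end(key)
            return self.cache[key], "cache hit"
        value = self.disk[key]
        self.cache[key] = value
        self._evict_if_needed()
        return value, "cache miss"

    def _evict_if_needed(self):
        """Evict oldest persisted entries while the cache is over capacity."""
        for key in list(self.cache):
            if len(self.cache) <= self.cache_capacity:
                break
            if key not in self.dirty:
                del self.cache[key]


node = Node(cache_capacity=2)
for name in ("doc1", "doc2", "doc3"):
    node.set(name, name + "-v1")
node.flush()                      # doc1 is persisted, then evicted from cache
miss = node.get("doc1")           # served from disk
hit = node.get("doc3")            # served from the managed cache
print(miss, hit)
```

Unpersisted entries are never evicted in this model, so a read after a cache miss can always be satisfied from disk.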
  39. 39. Basic Operation (Couchbase Server Cluster, user-configured replica count = 1) • Docs distributed evenly across servers • Each server stores both active and replica docs; only one server is active for a given doc at a time • Client library provides app with simple interface to database • Cluster map records which server each doc is on; the app never needs to know • App reads, writes, updates docs • Multiple app servers can access the same document at the same time [Diagram: App Servers 1 and 2, each with a Couchbase client library and cluster map, send READ/WRITE/UPDATE requests to Servers 1 to 3, each holding active and replica docs.]
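How a client library might use a cluster map to find the server for a document can be sketched as follows. This is an illustration under assumptions, not the actual Couchbase client logic: the partition count of 64, the plain CRC32-modulo hash, the round-robin assignment, and all function names are hypothetical.

```python
import zlib

NUM_PARTITIONS = 64  # illustrative partition count, chosen for the example


def partition_for(key):
    """Hash a document key to a partition number (simplified hashing)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def build_cluster_map(servers):
    """Give each partition an active server and a different replica server.

    Round-robin assignment keeps the spread across servers even.
    """
    n = len(servers)
    return {
        p: {"active": servers[p % n], "replica": servers[(p + 1) % n]}
        for p in range(NUM_PARTITIONS)
    }


def server_for(cluster_map, key):
    """Look up the active server for a key; the app never hard-codes it."""
    return cluster_map[partition_for(key)]["active"]


servers = ["server1", "server2", "server3"]
cluster_map = build_cluster_map(servers)
for key in ("doc1", "doc2", "doc3"):
    print(key, "->", server_for(cluster_map, key))
```

Since every app server computes the same hash and holds the same map, multiple app servers reach the same active copy of a document without coordinating with each other.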
  40. 40. Add Nodes to Cluster (Couchbase Server Cluster, user-configured replica count = 1) • Two servers added with a one-click operation • Docs automatically rebalanced across cluster, with even distribution of docs and minimum doc movement • Cluster map updated • App database calls now distributed over larger number of servers [Diagram: App Servers 1 and 2, each with a Couchbase client library and cluster map, send READ/WRITE/UPDATE requests to Servers 1 to 5, each holding active and replica docs.]
  41. 41. Fail Over Node (Couchbase Server Cluster, user-configured replica count = 1) • App servers accessing docs • Requests to Server 3 fail • Cluster detects server failed, promotes replicas of docs to active, and updates cluster map • Requests for docs now go to appropriate server • Typically rebalance would follow [Diagram: App Servers 1 and 2, each with a Couchbase client library and cluster map, and Servers 1 to 5 holding active and replica docs; Server 3 has failed.]
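The failover steps listed on this slide (detect the failed server, promote replicas to active, update the cluster map) can be sketched with a small map of partitions. The map layout and the `fail_over` function are hypothetical; a real cluster would also recreate the lost replicas during the rebalance that typically follows.

```python
def fail_over(cluster_map, failed_server):
    """Return a new cluster map with the failed server removed.

    Partitions whose active copy was on the failed server get their
    replica promoted; partitions whose replica was there lose it until
    a later rebalance.
    """
    new_map = {}
    for partition, owners in cluster_map.items():
        active, replica = owners["active"], owners["replica"]
        if active == failed_server:
            active, replica = replica, None   # promote replica to active
        elif replica == failed_server:
            replica = None                    # replica copy is gone for now
        new_map[partition] = {"active": active, "replica": replica}
    return new_map


# Three partitions spread over three servers, replica count = 1.
before = {
    0: {"active": "server1", "replica": "server2"},
    1: {"active": "server2", "replica": "server3"},
    2: {"active": "server3", "replica": "server1"},
}
after = fail_over(before, "server3")
for partition, owners in after.items():
    print(partition, owners)
```

Once the updated map reaches the client libraries, requests for partition 2 go to server1 instead of failing against server3.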
