Successfully reported this slideshow.

Webinar - Making the Shift from Relational to NoSQL

2,775 views

Published on

NoSQL database technology is increasingly chosen as viable alternative to relational databases, particularly for interactive web applications. Developers accustomed to the RDBMS structure and data models need to change their approach when transitioning to NoSQL.

In this webinar you'll learn:

The main characteristics of document-oriented databases
Differences between a distributed document-oriented and relational database
Implications for application development when moving from RDBMS to NoSQL

Published in: Technology
  • Be the first to comment

Webinar - Making the Shift from Relational to NoSQL

  1. 1. Making the Shift from Relational to NoSQL Tugdual “Tug” Grall Technical Evangelist
  2. 2. Agenda • Why moving to NoSQL? • Key differences • How to move? • Modeling the data • Accessing the data • Migrating the data You can download Couchbase at www.couchbase.com/download
  3. 3. Why transition to NoSQL?
  4. 4. You are here because.. You want to build applications
  5. 5. and currently have a relational database
  6. 6. It’s probably gotten a little bit big.. http://www.seoclerks.com/imagedb/2005/BIG-CAT-FOUND-Spoh
  7. 7. Or maybe huge http://www.2pep.com/extreme-funny-stuff/
  8. 8. And complicated
  9. 9. You’re not alone Top two big drivers for NoSQL adoption Lack of flexibility/ rigid schemas Inability to scale out data Performance challenges Cost All of these Other 49% 35% 29% 16% 12% 11% Source: Couchbase Survey, December 2011, n = 1351.
  10. 10. NoSQL catalog Key-Value memcached redis Data Structure Document Column Graph mongoDB couchbase cassandra Cache (memoryonly) Database (memory/disk) Neo4j
  11. 11. Key Differences
  12. 12. Relational vs Document data model Relational data model Document data model Collection of complex documents with arbitrary, nested data formats and varying “record” format. Highly-structured table organization with rigidly-defined data formats and record structure. JSON JSON JSON C1 C2 C3 C4 { }
  13. 13. Document Databases • Each record in the database is a self- describing document • Each document has an independent structure • Documents can be complex • All databases require a unique key • Documents are stored using JSON or XML or their derivatives • Content can be indexed and queried • Offer auto-sharding for scaling and replication for high-availability { “UUID”: “21f7f8de-8051-5b89-86 “Time”: “2011-04-01T13:01:02.42 “Server”: “A2223E”, “Calling Server”: “A2213W”, “Type”: “E100”, “Initiating User”: “dsallings@spy.net”, “Details”: { “IP”: “10.1.1.22”, “API”: “InsertDVDQueueItem”, “Trace”: “cleansed”, “Tags”: [ “SERVER”, “US-West”, “API” ] } }
  14. 14. Example: User Profile Address Info 1 DEN 30303CO 2 MV 94040CA 3 CHI 60609IL User Info KEY First ZIP_idLast 4 NY 10010NY 1 Dipti 2Borkar 2 Joe 2Smith 3 Ali 2Dodson 4 John 3Doe ZIP_id CITY ZIPSTATE 1 2 2 MV 94040CA To get information about specific user, you perform a join across two tables
  15. 15. All data in a single document Document Example: User Profile { “ID”: 1, “FIRST”: “Tug”, “LAST”: “Grall”, “ZIP”: “94040”, “CITY”: “MV”, “STATE”: “CA” } JSON = +
  16. 16. User ID First Last Zip 1 Dipti Borkar 94040 2 Joe Smith 94040 3 Ali Dodson 94040 4 Sarah Gorin NW1 5 Bob Young 30303 6 Nancy Baker 10010 7 Ray Jones 31311 8 Lee Chen V5V3M • • • 50000 Doug Moore 04252 50001 Mary White SW195 50002 Lisa Clark 12425 Country ID TEL 3 001 Country ID Country name 001 USA 002 UK 003 Argentina 004 Australia 005 Aruba 006 Austria 007 Brazil 008 Canada 009 Chile • • • 130 Portugal 131 Romania 132 Russia 133 Spain 134 Sweden User ID Photo ID Comment 2 d043 NYC 2 b054 Bday 5 c036 Miami 7 d072 Sunset 5002 e086 Spain Photo Table 001 007 001 133 133 User ID Status ID Text 1 a42 At conf 4 b26 excited 5 c32 hockey 12 d83 Go A’s 5000 e34 sailing Status Table 134 007 008 001 005 Country Table User ID Affl ID Affl Name 2 a42 Cal 4 b96 USC 7 c14 UW 8 e22 Oxford Affiliations Table Country ID 001 001 001 002 Country ID Country ID 001 001 002 001 001 001 008 001 002 001 User Table . . . Making a Change Using RDBMS
  17. 17. Making the Same Change with a Document Database { “ID”: 1, “FIRST”: “Tug”, “LAST”: “Grall”, “ZIP”: “94040”, “CITY”: “MV”, “STATE”: “CA”, “STATUS”: { “TEXT”: “At Conf” } } “GEO_LOC”: “134” }, “COUNTRY”: ”USA” Just add information to a document JSON ,}
  18. 18. Where is NoSQL a good fit?
  19. 19. Market Adoption Internet Companies Enterprises • Social Gaming • Ad Networks • Social Networks • Online Business Services • E-Commerce • Online Media • Content Management • Cloud Services • Communications • Retail • Financial Services • Health Care • Automotive/Airline • Agriculture • Consumer Electronics • Business Systems
  20. 20. Market Adoption – Customers Internet Companies Enterprises More than 300 customers -- 5,000 production deployments worldwide
  21. 21. Application Characteristics - Data driven • 3rd party or user defined structure (Twitter feeds) • Support for unlimited data growth (Viral apps) • Data with non-homogenous structure • Need to quickly and often change data structure • Variable length documents • Sparse data records • Hierarchical data Couchbase is a good fit
  22. 22. Application Characteristics - Performance driven • Low latency critical (ex. 1millisecond) • High throughput (ex. 200000 ops / sec) • Large number of users • Unknown demand with sudden growth of users/data • Predominantly direct document access • Read / Mixed / Write heavy workloads Couchbase is a good fit
  23. 23. Common Use Cases Social Gaming • Couchbase stores player and game data • Examples customers include: Zynga • Tapjoy, Ubisoft, Tencent Mobile Apps • Couchbase stores user info and app content • Examples customers include: Kobo, Playtika Ad Targeting • Couchbase stores user information for fast access • Examples customers include: AOL, Mediamind, Convertro Session store • Couchbase Server as a key- value store • Examples customers include: Concur, Sabre User Profile Store • Couchbase Server as a key-value store • Examples customers include: Tunewiki High availability cache • Couchbase Server used as a cache tier replacement • Examples customers include: Orbitz Content & Metadata Store • Couchbase document store with Elastic Search • Examples customers include: McGraw Hill 3rd party data aggregation • Couchbase stores social media and data feeds • Examples customers include: Sambacloud
  24. 24. From SQL to NoSQL
  25. 25. Document Design
  26. 26. Document Design • One document that contains all related data - Data is de-normalized - Better performance and scale - Eliminate client-side joins - But if the document grows continuously or has write contention, it should be split • Separate documents for different object types with cross references - Data duplication is reduced - Objects may not be co-located - Transactions supported only on a document boundary - Most document databases do not support joins
  27. 27. Example: Entities for a Blog • User profile The main pointer into the user data • Blog entries • Badge settings, like a twitter badge • Blog posts Contains the blogs themselves • Blog comments • Comments from other users
  28. 28. Single vs. Multiple Documents • We need to decide the best way to store comments { } Single Multiple vs. Document Comment Comment Comment
  29. 29. { “UUID”: “21f7f8de-8051-5b89-86 “Time”: “2011-04-01T13:01:02.42 “Server”: “A2223E”, “Calling Server”: “A2213W”, “Type”: “E100”, “Initiating User”: “dsallings@spy.net”, “Details”: { “IP”: “10.1.1.22”, “API”: “InsertDVDQueueItem”, “Trace”: “cleansed”, “Tags”: [ “SERVER”, “US-West”, “API” ] } } Blog Document – Option 1 – Single document { “_id”: “Couchbase_Hello_World”, “author”: “dborkar”, “type”: “post” “title”: “Hello World”, “format”: “markdown”, “body”: “Hello from [Couchbase](http://couchbase.com).”, “html”: “<p>Hello from <a href=“http: … “comments”:[ [“format”: “markdown”, “body”:”Awesome post!”], [“format”: “markdown”, “body”:”Like it.” ] ] }
  30. 30. Blog Document – Option 2 - Split into multiple docs { “UUID”: “21f7f8de-8051-5b89-86 “Time”: “2011-04-01T13:01:02.42 “Server”: “A2223E”, “Calling Server”: “A2213W”, “Type”: “E100”, “Initiating User”: “dsallings@spy.net”, “Details”: { “IP”: “10.1.1.22”, “API”: “InsertDVDQueueItem”, “Trace”: “cleansed”, “Tags”: [ “SERVER”, “US-West”, “API” ] } } { “_id”: “Coucbase_Hello_World”, “author”: “tgrall”, “type”: “post” “title”: “Hello World”, “format”: “markdown”, “body”: “Hello from [Couchbase](http://couchbase.com).”, “html”: “<p>Hello from <a href=“http: … “comments”:[ “comment1_Couchbase_Hello_world” ] } { “UUID ”: “2 1 f7 f8 de-8 0 5 1 -5 b89 -8 6 “Time”: “2 0 1 1 -0 4 -0 1 T1 3 :0 1 :0 2.4 2 “Server”: “A2 2 2 3 E”, “Calling Server”: “A2 2 1 3 W”, “Type”: “E1 0 0 ”, “Initiating Us er”: “ds allings @s py.net”, “D etails ”: { “IP”: “1 0 .1 .1 .2 2 ”, “API”: “Ins ertD VD QueueItem”, “Trace”: “cleans ed”, “Tags ”: [ “SERVER”, “US-Wes t”, “API” ] } } { “_id”: “comment1_Couchbase_Hello_World”, “format”: “markdown”, “body”:”Awesome post!” } BLOG DOC COMMENT
  31. 31. Threaded Comments • You can imagine how to take this to a threaded list Blog First comment Reply to comment More Comments List List Advantages • Only fetch the data when you need it • For example, rendering part of a web page • Spread the data and load across the entire cluster
  32. 32. Which one is the best? Case by Case: • How data are created by the application? • How data are read by the application? • Focus on the most common use case in the application Note: changing document “schema” is really easy !
  33. 33. Write your application
  34. 34. Mental Adjustments • In SQL we tend to want to avoid hitting the database as much as possible • In Couchbase, gets and sets on documents are in memory. This is fast. • The key to finding data is the key of the document. • Many newcomers see views as a replacement for key design, because it seems more “SQL” –like.
  35. 35. Code Change • Replace your database calls by Couchbase calls: - Find/Select by ID : couchbase.get(key); - Save/Update : couchbase.set(key, document); - Delete : couchbase.delete(key); • If you need locking for concurrent access use CAS - Check and Swap : Optimistic Locking • All these operations are atomic
  36. 36. Query the data • Create Couchbase views to index your documents • Use views to: - Index documents on primary index - Index documents on secondary index - Query these indexes using range, aggregate, … CRUD Operations MAP() emit() (processed)
  37. 37. Couchbase Views
  38. 38. Data Migration
  39. 39. Data Migration Move your data: • Using ETL or other tools • Own code
  40. 40. Moving Data using an ETL
  41. 41. DEMO Moving Data Using Talend Moving Data Using Java Importer
  42. 42. Conclusion • Move to NoSQL to gain scalability and flexibility • Design your document based on the application logic • Migrate the data easily with tools • Resources: • Java Petstore Migration: http://goo.gl/mzRNqt • Talend Connector : http://couchbase.com/couchbase-server/connectors/talend • Couchbase SQL Importer : https://github.com/tgrall/couchbase-sql-importer Download Couchbase: www.couchbase.com/download
  43. 43. Questions?
  44. 44. Thank you Couchbase NoSQL Document Database tug@couchbase.com

×