Database migrations: from legacy silo to Couchbase scalability – Connect Silicon Valley 2017

Speaker: Matthew Groves

Couchbase Server provides flexibility, scaling, and performance benefits. Do you want these benefits for part or all of your current application, but are stuck with a relational database? In this session, we will discuss how to identify whether or not your app is ready to migrate over to Couchbase. We’ll look at how a document model differs from a relational model. Modeling and access patterns are intertwined, so we’ll look at how to structure your keys and data to best serve your application. Finally, we’ll look at how to migrate data between systems.

Published in: Technology

  1. Confidential and Proprietary. Do not distribute without Couchbase consent. © Couchbase 2017. All rights reserved. DATABASE MIGRATIONS: FROM LEGACY SILO TO COUCHBASE SCALABILITY. Matthew D. Groves (David Segleau), Developer Advocate, Couchbase
  2. About the speaker: Matthew D. Groves, Developer Advocate, Couchbase (since April 2016). Experience: web developer (ASP.NET, C#, SQL Server); speaker, blogger, author
  3. 01/ Identifying the right application 02/ Modeling your data 03/ Accessing your data 04/ Migrating your data 05/ Q & A
  4. 1 IDENTIFYING THE RIGHT APPLICATION
  5. What is Couchbase? • Products: Couchbase Server & Couchbase Mobile • Open source NoSQL, JSON document database • Founded 2010 • 500+ enterprise customers, including 20+ Fortune 100
  6. Why are they using NoSQL?
  7. Why are they using NoSQL?
  8. NoSQL vs. RDBMS
      • Replace or complement? It depends
      • Replace: NoSQL can be the operational database of record, or is better as an engagement database
      • Complement: NoSQL adds performance, scale, and availability to a legacy RDBMS
      • Most customers use both RDBMS and NoSQL
      • NoSQL is adding RDBMS features: security, query language, analytics
      • RDBMS is adding NoSQL features: sharding, JSON, distributed processing
  9. Identifying the right application. Examples:
      • High performance, high availability caching service
      • Independent application with a narrow scope
      • Logical or physical service within a large application
      • Global service that powers multiple applications
      [Diagram labels: Application, Service, RDBMS, NoSQL]
  10. Identifying the right application
  11. 2 MODELING YOUR DATA
  12. Demystifying Terminology: Relational → NoSQL (Couchbase)
      • Failover Cluster → Cluster
      • Availability Group → Cluster
      • Database → Bucket
      • Table → Bucket
      • Row (Tuple) → Document (JSON)
      • Primary Key → Object ID
      • IDENTITY or Sequence → Counter
      • Indexed View → View
      • SQL → N1QL
  13. Data Modeling Approaches
      Relational: required normalization; schema enforced by the DB; same fields in all records
      • Minimizes data inconsistencies (one item = one location)
      • Reduces duplicated data
      • Preserves storage resources
      NoSQL: relaxed normalization; schema implied by structure; fields may be empty, duplicate, or missing
      • Optimized based on access patterns
      • Flexible, based on application requirements
      • Supports clustered architecture
      • Reduced server overhead
  14. What and Why JSON?
      What is JSON?
      • Schema flexibility
      • Lightweight data interchange format
      • Based on JavaScript
      • Programming language independent
      • Field names must be unique
      Why JSON?
      • Less verbose
      • Can represent objects and arrays (including nested documents)
      • No impedance mismatch between a JSON document and a C#/Java object
  15. Modeling your data: Fixed vs. self-describing schema
  16. Modeling your data: The flexibility of JSON. Same document type, different fields: • Different types • Optional • On demand
      Tip: Add a version field to track changes.
      {"doctype": "user", "docVersion": "1", …}
      {"doctype": "user", "docVersion": "2", …}
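The version-field tip on slide 16 could be applied by upgrading older documents as they are read. The following is a minimal sketch, not the presenter's code: the `UserDocUpgrader` class and the field changes between versions (a single `name` field in version 1, split into `firstName` and `lastName` in version 2) are assumptions made up for illustration, since the slide elides the document contents.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: brings "user" documents up to the latest shape on read.
final class UserDocUpgrader {
    static final String LATEST_VERSION = "2";

    // Returns an upgraded copy; the input map is left untouched.
    // Assumed change (illustration only): v1 has "name", v2 has "firstName" and "lastName".
    static Map<String, Object> upgrade(Map<String, Object> doc) {
        Map<String, Object> result = new HashMap<>(doc);
        // Documents written before the version field existed are treated as version 1.
        Object version = result.getOrDefault("docVersion", "1");
        if ("1".equals(version)) {
            Object raw = result.remove("name");
            String name = raw == null ? "" : raw.toString().trim();
            int space = name.indexOf(' ');
            result.put("firstName", space < 0 ? name : name.substring(0, space));
            result.put("lastName", space < 0 ? "" : name.substring(space + 1).trim());
            result.put("docVersion", LATEST_VERSION);
        }
        return result;
    }
}
```

Because every document carries its own version, old and new shapes can coexist in the same bucket, and the application can rewrite a document in the new shape the next time it saves it.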
  17. Modeling your data: Changing the data model
      Document database: • Modify the interface (e.g., HTML5/JS)
      Relational database: • Modify the database schema • Modify the application code (e.g., C#) • Modify the interface (e.g., HTML5/JS)
  18. Modeling your data: Object IDs
      Examples: • author::shane • author::shane::blogs • blog::nosql_fueled_hadoop • blog::nosql_fueled_hadoop::comments
      Recommendations: • Natural keys • Human readable • Deterministic • Semantic
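Keys like the examples on slide 18 stay deterministic when they are built in one place. Here is a minimal sketch of such a helper; the `DocKeys` class and its `slug` rule (lower-case, runs of non-alphanumerics collapsed to underscores) are assumptions for illustration, not part of any Couchbase SDK.

```java
import java.util.Locale;

// Hypothetical helper that builds "type::id::suffix" style document keys.
final class DocKeys {
    private static final String SEPARATOR = "::";

    // Joins the parts with "::" and rejects parts that would make the key ambiguous.
    static String key(String... parts) {
        if (parts.length == 0) {
            throw new IllegalArgumentException("at least one key part is required");
        }
        for (String part : parts) {
            if (part == null || part.isEmpty() || part.contains(SEPARATOR)) {
                throw new IllegalArgumentException("invalid key part: " + part);
            }
        }
        return String.join(SEPARATOR, parts);
    }

    // Turns free text such as a blog title into a readable key segment.
    static String slug(String text) {
        return text.trim()
                .toLowerCase(Locale.ROOT)
                .replaceAll("[^a-z0-9]+", "_")
                .replaceAll("^_+|_+$", "");
    }
}
```

For example, `DocKeys.key("blog", DocKeys.slug("NoSQL fueled Hadoop"), "comments")` yields `blog::nosql_fueled_hadoop::comments`, so the comments document can be fetched by key without a query.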
  19. Modeling your data: Object IDs. What about identity columns?
      Document<Long> nextAuthorIdDoc = bucket.counter("authorIdCounter", 1);
      Long nextAuthorId = nextAuthorIdDoc.content();
      String authDocId = "author::" + nextAuthorId; // author::101
      Tip: Increment the counter by 10, 20, etc. to reserve a block of IDs, instead of incrementing it for every insert.
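The tip on slide 19 about incrementing the counter by 10 or 20 can be read as reserving a block of IDs per round trip. The sketch below shows that idea under stated assumptions: `BlockIdAllocator` is a hypothetical class, and the shared counter is abstracted as a function so the example runs without a cluster. With the Java SDK call shown on the slide, that function might be `delta -> bucket.counter("authorIdCounter", delta).content()`.

```java
import java.util.function.LongUnaryOperator;

// Hypothetical allocator: one counter call reserves a whole block of IDs.
final class BlockIdAllocator {
    // Adds the given delta to the shared counter and returns the new counter value.
    private final LongUnaryOperator incrementBy;
    private final long blockSize;
    private long next = 1;      // next ID to hand out
    private long blockEnd = 0;  // last reserved ID (inclusive); empty block at start

    BlockIdAllocator(LongUnaryOperator incrementBy, long blockSize) {
        if (blockSize < 1) {
            throw new IllegalArgumentException("blockSize must be at least 1");
        }
        this.incrementBy = incrementBy;
        this.blockSize = blockSize;
    }

    // Hands out IDs locally and touches the shared counter only when the block is used up.
    synchronized long nextId() {
        if (next > blockEnd) {
            blockEnd = incrementBy.applyAsLong(blockSize);
            next = blockEnd - blockSize + 1;
        }
        return next++;
    }
}
```

The trade-off is that IDs left unused in a block are lost if the process restarts, so the resulting sequence can have gaps.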
  20. Modeling your data: Relationships
      [Diagram] Bottom up/"Belongs To": Author; Blog (FK), Blog (FK); Comment (FK), Comment (FK)
      [Diagram] Top down/"Has": Author (FK x2); Blog, Blog (FK x2); Comment, Comment
  21. Modeling your data: Relationship – Related or Nested
  22. Modeling your data: Strategies (If … Then …)
      • Relationship is one-to-one or one-to-many: store related data as nested objects
      • Relationship is many-to-one or many-to-many: store related data as separate documents
      • Data reads are mostly parent fields: store children as separate documents
      • Data reads are mostly parent + child fields: store children as nested objects
      • Data writes are mostly parent or child (not both): store children as separate documents
      • Data writes are mostly parent and child (both): store children as nested objects
  23. Modeling your data: Strategies
      Thread:
      { "docType": "thread",
        "comments": [
          { "visitor": "Laura Czajkowski",
            "text": "This blog is amazing!",
            "replies": [ { "user": "Nic Raboy", "text": "No, it is not." } ] } ] }
      Blog:
      { "docType": "blog",
        "author": "author::matt",
        "title": "Couchbase Wins",
        "threads": [ "blog::couchbase_wins::threads::001", "blog::couchbase_wins::threads::002" ] }
  24. Some JSON Design Choices
      • Couchbase Server neither enforces nor validates any particular document structure
      • Choices that impact JSON document design: single root attributes vs. document type; objects vs. arrays; array element types; timestamp formats; property names; empty and null property values vs. missing properties; JSON Schema options
      • See "Agile document modeling and data structures" from Couchbase Connect16 On-Demand Recordings
  25. 3 ACCESS YOUR DATA
  26. Accessing your data: Options
      • Key-Value (CRUD): Documents
      • N1QL (Query): Indexes
      • Views (Query): MapReduce
      • Full Text (Search): Indexes
      • Geospatial (Search): MapReduce
  27. Accessing your data – N1QL queries: Capabilities (Feature: SQL, N1QL)
      • JOIN: ✔ ✔
      • TRANSFORM: ✔ ✔
      • FILTER: ✔ ✔
      • AGGREGATE: ✔ ✔
      • SORT: ✔ ✔
      • SUBQUERIES: ✔ ✔
      • PAGINATION: ✔ ✔
      • OPERATORS: ✔ ✔
      • FUNCTIONS: ✔ ✔
  28. Accessing your data: N1QL queries – referenced data
  29. Accessing your data: N1QL queries – nested data
  30. Accessing your data: N1QL queries – indexes: Simple, Compound, Functional, Partial
  31. Couchbase Index Options (Index Type: Description)
      1 Primary Index: index on the document key on the whole bucket
      2 Simple Index: index on the key-value or document-key
      3 Composite Index: index on more than one key-value
      4 Functional Index: index on a function or expression on key-values
      5 Partial Index: index on a subset of items in the bucket; uses a WHERE clause
      6 Array Index: index on individual elements of arrays
      7 Memory Optimized Index: index that is pinned in memory; defined when the cluster is configured
      8 Covering Index: query can be resolved 100% within the index
      9 Duplicate Index: ability to create a copy of the index on specific nodes within the cluster, thereby providing load balancing and failover
  32. Accessing your data: Indexing Considerations (Relational vs. Couchbase)
      • Relational: indexes are synchronous; index & data are in sync. Couchbase: indexes are asynchronous; index updates lag behind the data, and the application specifies read consistency
      • Relational: indexes slow down write operations. Couchbase: indexes do not affect write throughput
      • Relational: index load balancing for queries can only be implemented in the application. Couchbase: index load balancing for queries is automatic, based on index signature
      • Relational: indexes contend with other memory usage. Couchbase: Memory Optimized indexes are pinned in memory and provide low latency and high mutation throughput
  33. Understanding your Query Plan: Explain
  34. Understanding your Query Plan: Visual
  35. Accessing your data: Strategies (Concept: Strategies & Best Practices)
      Key-Value operations provide the best possible performance
      • Create an effective key naming strategy
      • Create an optimized data model
      Incremental MapReduce (Views) is well suited to aggregation
      • Ideal for large data sets
      • Data set can be used to create complex view indexes
      N1QL queries provide the most flexibility – everything else
      • Query data regardless of how it is modeled
      • Remember to create secondary indexes; leverage covering indexes where possible
  36. 4 MIGRATE YOUR DATA
  37. So many options! Remember the KISS principle. Identify the requirements: • ETL vs. data cleanse vs. data enrichment • Duration vs. resources • Data governance
  38. So many options! Remember the KISS principle. Pick your strategy: • Batch vs. incremental • Single-threaded vs. multi-threaded
  39. So many options! Remember the KISS principle. Pick your tools: • Data migration tools (Informatica, Looker, Talend) • BYO-tool (PHP & Python scripts, Hadoop, Spark) • SQSL (Structured Query Scripting Language)
  40. So many options! Remember the KISS principle. KISS with Couchbase:
      A. Export to CSV; import as documents (cbimport); use N1QL to transform & insert into a new bucket
      B. Use SQL to transform & export; insert into Couchbase
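Option A on slide 40 turns exported CSV rows into documents. As a rough illustration of what that conversion involves (the slide itself points to cbimport for this step), here is a hand-rolled sketch. `CsvToDocument` is a hypothetical class, and it assumes plain CSV with no quoted fields or embedded commas, stores every value as a string, and escapes only backslashes and double quotes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical converter from one CSV row to a flat JSON document.
final class CsvToDocument {
    // Pairs each header name with the value in the same column.
    // Assumption: no quoted fields, so a plain split on commas is enough.
    static Map<String, String> toFields(String headerLine, String rowLine) {
        String[] names = headerLine.split(",", -1);
        String[] values = rowLine.split(",", -1);
        if (names.length != values.length) {
            throw new IllegalArgumentException("row has " + values.length
                    + " columns, header has " + names.length);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            fields.put(names[i].trim(), values[i].trim());
        }
        return fields;
    }

    // Serializes the fields as a flat JSON object with a leading "docType" field.
    static String toJson(String docType, Map<String, String> fields) {
        StringBuilder json = new StringBuilder("{\"docType\":\"")
                .append(escape(docType)).append('"');
        for (Map.Entry<String, String> e : fields.entrySet()) {
            json.append(",\"").append(escape(e.getKey()))
                .append("\":\"").append(escape(e.getValue())).append('"');
        }
        return json.append('}').toString();
    }

    // Minimal escaping: backslash and double quote only (control characters not handled).
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

A row's document key could then be derived from its old primary key, for example `"author::" + fields.get("id")`, which keeps the key scheme aligned with the data model.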
  41. So many options! Remember the KISS principle. Recommendations: • Align with your data model • Plan for failure (bad source data, hardware failure, resource limitations) • Ensure interruptible, restartable, logged, predictable
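The "interruptible, restartable, logged" recommendation on slide 41 could be met with a checkpoint that records how far the migration has progressed. The following is a minimal sketch under stated assumptions: `RestartableMigration`, its `Checkpoint` interface, and `InMemoryCheckpoint` are hypothetical names, the checkpoint lives in memory here (in practice it would be a file or a document), and the writer stands in for whatever inserts a row into Couchbase.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical migration loop that can resume where a previous run stopped.
final class RestartableMigration {
    // Stores the index of the next row to process.
    interface Checkpoint {
        int load();
        void save(int nextIndex);
    }

    // Processes rows from the checkpoint onward; bad rows are logged and skipped.
    // Returns the number of rows that failed.
    static int run(List<String> rows, Consumer<String> writer, Checkpoint checkpoint) {
        int failures = 0;
        for (int i = checkpoint.load(); i < rows.size(); i++) {
            try {
                writer.accept(rows.get(i));
            } catch (RuntimeException e) {
                failures++;
                System.err.println("row " + i + " skipped: " + e.getMessage());
            }
            // Saved after every row, so an interrupted run resumes at the next row.
            checkpoint.save(i + 1);
        }
        return failures;
    }
}

// In-memory stand-in for a durable checkpoint, for illustration only.
final class InMemoryCheckpoint implements RestartableMigration.Checkpoint {
    private int nextIndex;

    InMemoryCheckpoint(int start) { this.nextIndex = start; }

    @Override public int load() { return nextIndex; }

    @Override public void save(int nextIndex) { this.nextIndex = nextIndex; }
}
```

Saving after every row favors simplicity over speed; a real run might checkpoint once per batch and rely on idempotent upserts so that replaying a partial batch is harmless.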
  42. How can you sync NoSQL and relational?
      • 1. Application code (manual)
      • 2. Replication (automatic): from NoSQL to relational, and from relational to NoSQL
      [Diagram labels, NoSQL to relational: Couchbase, DCP Stream, Producer, Kafka Queue, Consumer, RDBMS]
      [Diagram labels, relational to NoSQL: RDBMS, GoldenGate, Handler, Couchbase]
      https://github.com/mahurtado/CouchbaseGoldenGateAdapter
      https://www.cdata.com/drivers/couchbase
  43. Tips from the field
      1. Consider adding index(es) after migrating data
      2. Use SQSL: https://dzone.com/articles/migrate-databases-from-any-relational-engine-to-co
      3. If importing JSON, make sure it's valid JSON
  44. Data Modeling Recap
      • Pick the right application: focus on SOA, application/use case specific
      • Drive the data model from data access patterns
      • Use document type and version ID fields
      • Create optimized, understandable keys
      • Weigh nested, referenced, or mixed designs
      • Add indexes: Simple, Compound, Functional, Partial, Array, Covering, Memory Optimized
      • Match the data access method to requirements: N1QL, key-value, views
      • Proof of concept: focus, success criteria, review architecture
  45. QUESTIONS?
  46. Want to learn more?
      • Getting Started guide: http://www.couchbase.com/get-started-developing-nosql
      • Download Couchbase software: http://www.couchbase.com/downloads
      • Free online training: http://training.couchbase.com/online
      • “Why NoSQL” white paper: http://www.couchbase.com/nosql-resources/why-nosql
  47. Additional Resources • General Docs: http://docs.couchbase.com • Developer Portal: http://developer.couchbase.com • Couchbase Labs: https://github.com/couchbaselabs • Query Portal: https://www.couchbase.com/products/n1ql • Sample Applications: https://github.com/couchbaselabs?utf8=✓&q=try- • Blog: http://blog.couchbase.com • Forum: http://forums.couchbase.com
  48. Additional Resources – Data Modeling
      Presentation: Data Modeling with Couchbase Server
      Connect16 On Demand Recordings: • Agile document modeling and data structures • Migrating from relational – Data modeling and access • LINQing to data: Easing the transition from SQL • Tuning for Performance: Indexes and Queries
      Documentation: Data Modeling with JSON
      Training class: CD210 Couchbase NoSQL Data Modeling, Querying, and Tuning Using N1QL
  49. THANK YOU
