Enterprise Architect's Perspective of Couchbase with N1QL: Couchbase Connect 2015

Enterprise architects have to decide on a database platform that meets several requirements: performance and scalability on one side; ease of data modeling and agile development on the other; elasticity and flexibility to handle change easily; and good integration with tools and the wider ecosystem. This presentation highlights these challenges and approaches to solving them using Couchbase with N1QL.

Published in: Technology

  1. 1. ENTERPRISE ARCHITECT'S PERSPECTIVE OF COUCHBASE WITH N1QL Keshav Murthy Couchbase Engineering keshav@couchbase.com @N1QL @rkeshavmurthy
  2. 2. ©2015 Couchbase Inc. 2
  3. 3. Agenda
     • Application requirements
     • Data requirements
     • Couchbase with N1QL
  4. 4. Application Requirements
  5. 5. Application Requirements: common application requirements
     • Rapid application development: changing market needs; changing data needs
     • Scalability: unknown user demand; constantly growing throughput
     • Consistent performance: low response time for better user experience; high throughput to handle viral growth
     • Reliability: always online
  6. 6. Database Requirements
  7. 7. Database Requirements
     • Development environment: data modeling; APIs; query language
     • Performance, performance, performance
     • Availability
     • Consistency
     • Flexibility
     • Manageability
  8. 8. Data Management Landscape [Diagram: data platforms arranged by how structured the data is, from bytes & blocks with no structure (disk & storage), through semi-structured & self-describing data, to highly structured data (SQL, R, etc.). Hadoop (analytical): processing in files, MapReduce, generic file formats, rows/columns in files (tables), Hive, Pig, etc. Query: Impala, Hive. NoSQL (operational big data): MongoDB, Couchbase, HBase, Cassandra. Also shown: Drill, OLTP, EDW. Cost labels: $1K/TB, $10K/TB, $10K-$20K/TB, $100K-$200K/TB.]
  9. 9. Couchbase 3.0
  10. 10. Couchbase Server 3.0 Cluster Architecture [Diagram: six nodes, Couchbase Server 1 to Couchbase Server 6, each running a Cluster Manager and a Data Service with a managed cache and storage; shards are spread across the nodes.]
  11. 11. Multi-Node Operations [Diagram: two app servers, each with a Couchbase client library and a cluster map, send read/write/update operations to three servers; each server holds a set of active shards and a set of replica shards.]
      • Docs distributed evenly across servers
      • Each server stores both active and replica docs; only one "copy" is master at a time
      • Client library provides app with simple interface to database
      • Cluster map provides map to which server a doc is on; app never needs to know
      • App reads, writes, updates docs
      • Multiple app servers can access same document at same time
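The role of the cluster map can be sketched in a few lines of Python. This is a simplified illustration, not the algorithm the Couchbase client libraries use: the hash function, the shard count of 9 (taken from the diagram; a real cluster uses many more), the round-robin placement and the names `CLUSTER_MAP`, `shard_for` and `active_server_for` are all assumptions.

```python
import zlib

NUM_SHARDS = 9  # assumption: the diagram shows shards 1 to 9
SERVERS = ["SERVER 1", "SERVER 2", "SERVER 3"]

# Cluster map: shard number -> (active server, replica server).
# The round-robin layout is an assumption made for illustration only.
CLUSTER_MAP = {
    shard: (SERVERS[shard % len(SERVERS)], SERVERS[(shard + 1) % len(SERVERS)])
    for shard in range(NUM_SHARDS)
}


def shard_for(key):
    """Hash the document key to a shard; the same key always maps to the same shard."""
    return zlib.crc32(key.encode("utf-8")) % NUM_SHARDS


def active_server_for(key):
    """Return the server holding the active copy, as a client library would look it up."""
    return CLUSTER_MAP[shard_for(key)][0]


print(active_server_for("1.10.1938"))
```

Because the lookup happens inside the client library, application code only ever passes a document key; it never names a server.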
  12. 12. Why N1QL?
  13. 13. Properties of Real-World Data [Diagram: a Person entity with Name, DOB, Billing, Connections, Purchases.]
      • Rich structure: attributes, sub-structure
      • Relationships: to other data
      • Value evolution: data is updated
      • Structure evolution: data is reshaped
  14. 14. Models for Representing Data
      Data Concern | Relational Model | JSON Document Model (NoSQL)
      Rich structure | Multiple flat tables; constant assembly / disassembly | Documents; no assembly required!
      Relationships | Represented; queried (SQL) | Represented; queried? Not until now…
      Value evolution | Data can be updated | Data can be updated
      Structure evolution | Uniform and rigid; manual change (disruptive) | Flexible; dynamic change
  15. 15. What is N1QL?
  16. 16. SELECT Statement
      SELECT [ DISTINCT ] …
      FROM … JOIN …
      WHERE …
      GROUP BY … HAVING …
      ORDER BY …
      LIMIT … OFFSET …
      ( UNION | INTERSECT | EXCEPT ) [ ALL ] …
  17. 17. SELECT Statement Highlights
      • Querying across relationships: JOINs; subqueries
      • Aggregation: MIN, MAX; ( SUM, COUNT, AVG, ARRAY_AGG ) [ DISTINCT ]
      • Combining result sets using set operators: ( UNION, INTERSECT, EXCEPT ) [ ALL ]
  18. 18. Data Modification Statements
      • UPDATE … SET … WHERE …
      • DELETE FROM … WHERE …
      • INSERT INTO … ( KEY, VALUE ) VALUES …
      • INSERT INTO … ( KEY …, VALUE … ) SELECT …
      • MERGE INTO … USING … ON … WHEN [ NOT ] MATCHED THEN …
      Note: Couchbase Server provides per-document atomicity.
  19. 19. Query Execution: Join
      Document key: "1.10.1938"
      "CUSTOMER": { "C_D_ID": 10, "C_ID": 1938, "C_W_ID": 1, "C_BALANCE": -10, "C_CITY": "San Jose", "C_CREDIT": "GC", "C_DELIVERY_CNT": 0, "C_DISCOUNT": 0.3866, "C_FIRST": "Jay", "C_LAST": "Smith", "C_MIDDLE": "OE", "C_PAYMENT_CNT": 1, "C_PHONE": "555-123-1234", "C_SINCE": "2015-03-22 00:50:42.822518", "C_STATE": "CA", "C_STREET_1": "555, Tideway Drive", "C_STREET_2": "Alameda", "C_YTD_PAYMENT": 10, "C_ZIP": "94501" }
      Document key: "1.10.143"
      "ORDERS": { "O_CUSTOMER_KEY": "1.10.1938", "O_D_ID": 10, "O_ALL_LOCAL": 1, "O_CARRIER_ID": 2, "O_C_ID": 1938, "O_ENTRY_D": "2015-05-19 16:22:08.544472", "O_ID": 143, "O_OL_CNT": 10, "O_W_ID": 1 }
      Document key: "1.10.1355"
      "ORDERS": { "O_CUSTOMER_KEY": "1.10.1938", "O_ALL_LOCAL": 1, "O_CARRIER_ID": 2, "O_C_ID": 1938, "O_D_ID": 10, "O_ENTRY_D": "2015-05-19 16:22:08.544472", "O_ID": 1355, "O_OL_CNT": 10, "O_W_ID": 3 }
  20. 20. Query Execution: Join
      SELECT COUNT(o.O_ORDER_CNT) AS CNT_O_OL_CNT
      FROM ORDERS o INNER JOIN CUSTOMER c ON KEYS (o.O_CUSTOMER_KEY)
      WHERE o.O_CARRIER_NAME = "Penske" AND c.C_STATE = "CA";
      Callouts: two keyspace joins; ON clause for the join.
      [Diagram: query execution pipeline; stages shown: Parse, Plan, Scan, Fetch, Join, Filter, Aggregate, Sort, Offset, Limit, Project.]
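What an `ON KEYS` join does can be approximated in plain Python with dictionaries standing in for the two keyspaces. This is only a sketch of the idea, not how the Query Service is implemented: the documents are trimmed versions of the samples from the previous slide, the third order is hypothetical, and the filter uses `O_CARRIER_ID` because the sample documents carry no carrier name.

```python
# In-memory stand-ins for two keyspaces: document key -> document body.
CUSTOMER = {
    "1.10.1938": {"C_ID": 1938, "C_STATE": "CA", "C_LAST": "Smith"},
}
ORDERS = {
    "1.10.143": {"O_ID": 143, "O_CUSTOMER_KEY": "1.10.1938", "O_CARRIER_ID": 2},
    "1.10.1355": {"O_ID": 1355, "O_CUSTOMER_KEY": "1.10.1938", "O_CARRIER_ID": 2},
    # Hypothetical order whose customer key matches no document.
    "1.10.9999": {"O_ID": 9999, "O_CUSTOMER_KEY": "1.10.0000", "O_CARRIER_ID": 2},
}


def inner_join_on_keys(left, right, key_field):
    """Yield (left_doc, right_doc) pairs, fetching each right-hand document by its key."""
    for left_doc in left.values():
        right_doc = right.get(left_doc.get(key_field))  # direct key lookup, no scan of `right`
        if right_doc is not None:  # an inner join drops left rows without a match
            yield left_doc, right_doc


def count_orders(carrier_id, state):
    """Count joined rows that pass the filter on both sides of the join."""
    return sum(
        1
        for order, customer in inner_join_on_keys(ORDERS, CUSTOMER, "O_CUSTOMER_KEY")
        if order["O_CARRIER_ID"] == carrier_id and customer["C_STATE"] == state
    )


print(count_orders(2, "CA"))  # prints 2
```

The right-hand side is never scanned: each order already carries the key of its customer, which is why the join only needs key fetches.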
  21. 21. Couchbase 4.0
  22. 22. Couchbase Server Cluster Architecture [Diagram: six nodes, Couchbase Server 1 to Couchbase Server 6, each running a Cluster Manager plus the Data Service, Index Service and Query Service, with a managed cache and storage; shards are spread across the nodes.]
  23. 23. Couchbase Server Cluster Service Deployment: Multi Dimensional Scaling [Diagram: every node runs a Cluster Manager, but services are placed on separate nodes: Data Service on Couchbase Servers 1 to 3, Query Service on Couchbase Servers 4 and 5, Index Service on the remaining nodes.]
  24. 24. Index Service: Global Secondary Index [Diagram: the Query Service, with an index client, a metadata cache (Email1, Email2) and connection pools, talks to the scan ports of two Index Service nodes; one hosts index Email1 (snapshots at T1 and T2), the other hosts index Email2 (snapshots at T3 and T4).]
      CREATE INDEX Email1 ON Customer(Email) USING GSI;
      CREATE INDEX Email2 ON Customer(Email) USING GSI;
  25. 25. Index Service: Global Secondary Index [Diagram: in the Data Service, a projector & router sends a DCP stream for Bucket #1 and Bucket #2 to the Index Service; the Index Service contains a supervisor and an index maintenance & scan coordinator, and stores Index #1 to Index #4 in the ForestDB storage engine; the Query Service reads from the Index Service.]
  26. 26. Query Service: Parallelized for Performance [Diagram: a client sends a query to the Query Service, which works with the Index Service and the Data Service; stages shown: Parse, Plan, Scan, Fetch, Join, Filter, Pre-Aggregate, Aggregate, Sort, Offset, Limit, Project.]
  27. 27. Application Development: SDKs for N1QL
  28. 28. Native N1QL Support: Usage in the SDKs (labels shown: C / C++, REST API)
  29. 29. Client to Query Service: REST API
      • Communication protocol is REST on top of HTTP
      • The database protocol structure is embedded within the REST API
      • Query Service is stateless: all query information is embedded within the REST request
      • REST is open: all REST clients work with N1QL
      • All N1QL clients, including the JDBC and ODBC drivers, use REST
      Python example:
          import requests

          url = "http://localhost:8093/query"
          s1 = "SELECT * FROM CUSTOMER WHERE C_ID = 1284"
          # Send the query text as the "statement" form parameter.
          r = requests.post(url, data={"statement": s1}, auth=("Administrator", "abc"))
          print(r.json())
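Because the Query Service is stateless, a request has to carry everything the query needs. The sketch below builds such a request with only the Python standard library and does not send it. The helper name `build_query_request` is hypothetical; the URL is the one used in the slide; `statement` and `args` (a JSON array of values for positional parameters such as `$1`) are the request parameters as I understand the N1QL REST API, so check them against the documentation for your server version.

```python
import json
import urllib.parse
import urllib.request

QUERY_URL = "http://localhost:8093/query"  # same endpoint as in the slide


def build_query_request(statement, args=None, url=QUERY_URL):
    """Build (but do not send) a POST request that carries the whole query."""
    params = {"statement": statement}
    if args is not None:
        params["args"] = json.dumps(args)  # positional parameters as a JSON array
    body = urllib.parse.urlencode(params).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


req = build_query_request("SELECT * FROM CUSTOMER WHERE C_ID = $1", args=[1284])
print(req.get_method(), req.full_url)
print(urllib.parse.parse_qs(req.data.decode("utf-8")))
```

Sending the request would be a call to `urllib.request.urlopen(req)` against a running server, with credentials added as the deployment requires.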
  30. 30. N1QL API: NodeJS
          // Instantiate the Query API
          var couchbase = require('couchbase');
          var myCluster = new couchbase.Cluster('localhost:8091');
          var myBucket = myCluster.openBucket('travel-sample');
          var myQuery = couchbase.N1qlQuery;
  31. 31. N1QL API: NodeJS
          function query(sql, done) {
            var queryToRun = myQuery.fromString(sql)
              .consistency(myQuery.Consistency.REQUEST_PLUS);
            myBucket.query(queryToRun, function (err, result) {
              if (err) {
                console.log("ERR:", err);
                done(err, null);
                return;
              }
              done(null, result);
              return;
            });
          }
  32. 32. Performance
  33. 33. Performance, Performance, Performance: Business Demands Highly Responsive Apps
      Application layer with cache and RDBMS:
      • Architecture based on "speed of disk"
      • Requires joins across many tables
      • High throughput requires very expensive hardware
      Application layer with Couchbase:
      • Architecture based on "speed to memory"
      • Faster access to aggregated, de-normalized objects
      • High throughput at low TCO with cluster of commodity servers
  34. 34. Availability - Revisited
  35. 35. Availability: Cross Cluster Availability (XDCR) [Diagram: a master cluster in San Francisco (local replica, index, Map/Reduce) replicates over XDCR to a cluster in New York (remote replica, index, Map/Reduce); also shown: client/application, Hadoop integration, backup/export tooling.]
      Fast streaming replication:
      • Copies the complete data of one cluster into another cluster
      • Can be used both for availability and master-master replication
      • Used for both online-recovery
  36. 36. Manageability
  37. 37. Manageability [Diagram: machine 1, machine 2 and machine 3, connected by Ethernet, each run one Couchbase node.]
  38. 38. Anatomy of a Node [Diagram: processes on machine 1: babysitter, ns-server, memcached, query, indexer, xdcr, view-engine, other…] The Cluster Manager consists of babysitter and ns-server.
  39. 39. Security
  40. 40. Couchbase security journey
      Timeline columns: Previously…, In 2.2, In 2.5, In 3.0, New in 4.0
      Features shown: SASL AuthN with bucket passwords; admin user; secure build platform; read-only user; easy admin password reset; non-root user deployments; secure communication for XDCR; encrypted client-server communication; encrypted admin access; access log; data-at-rest encryption
      • Simplified compliance with admin auditing
      • External identity management for admins using LDAP
  41. 41. Application Development
  42. 42. Flexibility: Agile Development
      Relational model [Diagram: Hotel Descriptions, Reviews and User Profiles tables; hotels point to reviews, reviews point to users.]
      • Hundreds or thousands of inter-related tables
      • Handles structured data well, unstructured data poorly
      • Rigid schema requires migrations that can take weeks or months
      • Impedance mismatch with developers
      Document model
      • Aggregates & denormalizes data into documents
      • Handles structured & unstructured data equally well
      • Inferred schema requires no migration
      • JSON rapidly being adopted
      Example documents:
      { "ID": 1, "NAME": "Fairmont San Francisco", … }
      { "REVIEW_ID": 1, "REVIEW": "Loved Hotel…", … }
      { "REVIEW_ID": 2, "REVIEW": "Nice, but …", … }
      { "USER_ID": 1, "DISPLAY": "Ted's Trip…", … }
      { "USER_ID": 2, "DISPLAY": "WhatWhat…", … }
  43. 43. Application Development: Data Modeling with N1QL
  44. 44. Development: Goals of Data Modeling for N1QL
      1. Define document boundaries
      2. Define relationships
      3. Express relationships to facilitate and optimize your desired access patterns
  45. 45. Elements of ER Model
      Element | Description | Examples
      Entity | Represents a noun, object, or "thing" in the domain | Employee, product, blog, episode, profile, session
      Relationship | Represents a dependency or interaction between two entities | Manager supervises employee, blog has comments, user owns session
      Cardinality | Specifies how many instances of an entity can occur on each side of a relationship; a combination of 0, 1, or N for each side | 0 to 1, exactly 1, 0 to N, 1 to N
  46. 46. Expressing Relationships
      3 ways to express relationships in Couchbase:
      • Parent contains keys of children (outbound)
      • Children contain key of parent (inbound)
      • Both of the above (dual)
      High cardinality affects outbound relationships:
      • Makes parent document bigger and slower
      • Makes it expensive to load a subset of relationships (e.g. paging through blog comments)
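The outbound and inbound forms, and the paging cost mentioned for high-cardinality outbound relationships, can be illustrated with plain Python dictionaries. The document shapes, the field names `comment_keys` and `blog_key`, and both helper functions are assumptions made for this sketch; `page_inbound` filters an in-memory dictionary as a stand-in for an index on the parent key.

```python
# Outbound: the parent document lists the keys of all its children.
outbound_blog = {
    "title": "Hello N1QL",
    "comment_keys": [f"comment::{i}" for i in range(1000)],
}

# Inbound: each child document carries the key of its parent.
inbound_comments = {
    f"comment::{i}": {"blog_key": "blog::1", "text": f"comment {i}"}
    for i in range(1000)
}

# Dual: both of the above are stored, so either direction can be followed.


def page_outbound(blog, offset, limit):
    """The whole parent, with all 1000 keys, must be loaded before a page is sliced out."""
    return blog["comment_keys"][offset:offset + limit]


def page_inbound(comments, blog_key, offset, limit):
    """Stand-in for an index on blog_key: select matching child keys, then take a page."""
    matching = sorted(
        (key for key, doc in comments.items() if doc["blog_key"] == blog_key),
        key=lambda key: int(key.split("::")[1]),
    )
    return matching[offset:offset + limit]


print(page_outbound(outbound_blog, 10, 3))
print(page_inbound(inbound_comments, "blog::1", 10, 3))
```

Both calls return the same three comment keys, but the outbound form pays for the full key list on every read of the parent, which is the cost the slide warns about.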
  47. 47. N1QL Access Methods and Performance (fastest to slowest, 1 to 4)
      # | Method | Description
      1 | USE KEYS | Single fetch, no index scan
      2 | JOIN | Fetch of left-hand side, then fetches of right-hand side
      3 | Index scan | Partial index scan, then fetches
      4 | Primary scan | Full bucket scan, then fetches
  48. 48. Child Representation and Access Method
      # | Child Representation | Access Method | Notes
      1 | Embedded | USE KEYS | Parent with children loaded via USE KEYS; child can be surfaced via UNNEST
      2 | Outbound relationship | JOIN | Parent contains child keys; children loaded via JOIN
      3 | Inbound relationship | Index scan | Children contain parent key; child.parent_key is indexed; index is scanned to load children
      4 | Not modeled | Primary scan | Relationship not explicitly modeled
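The four rows of the table can be mimicked over an in-memory dictionary to show why the work grows from row 1 to row 4. This is a rough model only: the bucket contents, the field names and the `parent_key_index` dictionary (a stand-in for a secondary index) are hypothetical, and real N1QL execution involves far more than dictionary lookups.

```python
# Hypothetical bucket: document key -> document body.
bucket = {
    "blog::1": {
        "type": "blog",
        "comments": [{"text": "embedded"}],  # row 1: children embedded in the parent
        "comment_keys": ["c::1", "c::2"],    # row 2: outbound keys of children
    },
    "c::1": {"type": "comment", "parent_key": "blog::1", "text": "first"},
    "c::2": {"type": "comment", "parent_key": "blog::1", "text": "second"},
    "c::3": {"type": "comment", "parent_key": "blog::2", "text": "other blog"},
}

# Stand-in for a secondary index on parent_key: parent key -> child document keys.
parent_key_index = {}
for key, doc in bucket.items():
    if "parent_key" in doc:
        parent_key_index.setdefault(doc["parent_key"], []).append(key)


def embedded(parent_key):
    """1. USE KEYS: one fetch; the children are already inside the parent."""
    return bucket[parent_key]["comments"]


def outbound(parent_key):
    """2. JOIN: fetch the parent, then fetch each child by its key."""
    return [bucket[k] for k in bucket[parent_key]["comment_keys"]]


def inbound(parent_key):
    """3. Index scan: find child keys in the index, then fetch them."""
    return [bucket[k] for k in parent_key_index.get(parent_key, [])]


def not_modeled(predicate):
    """4. Primary scan: examine every document in the bucket."""
    return [doc for doc in bucket.values() if predicate(doc)]


print([c["text"] for c in outbound("blog::1")])
print([c["text"] for c in inbound("blog::1")])
```

Rows 2 and 3 return the same children here; they differ in where the relationship is stored and therefore in what has to be read first.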
  49. 49. Maintenance of Relationships
      • Couchbase does not provide cascading deletes
      • Dangling references are possible
        - INNER JOINs and INNER NESTs omit dangling references
        - LEFT OUTER JOINs and LEFT OUTER NESTs safely include dangling references
      • Application or background task may need to clean up
        - Identify and remove dangling references
        - How to identify? Use N1QL's LEFT OUTER JOINs!
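The clean-up idea can be sketched in Python: a left outer join keeps every left-hand document, so the rows whose right-hand side is absent are exactly the dangling references. The data and helper names below are hypothetical, and `None` plays the role of the missing right-hand document.

```python
customers = {"1.10.1938": {"C_LAST": "Smith"}}
orders = {
    "1.10.143": {"O_CUSTOMER_KEY": "1.10.1938"},
    # Hypothetical order whose customer document has been deleted.
    "1.10.777": {"O_CUSTOMER_KEY": "1.10.4242"},
}


def left_outer_join_on_keys(left, right, key_field):
    """Yield (left_key, left_doc, right_doc); right_doc is None when the reference dangles."""
    for left_key, left_doc in left.items():
        yield left_key, left_doc, right.get(left_doc[key_field])


def find_dangling(left, right, key_field):
    """Return the keys of left-hand documents that point at a missing document."""
    return [
        left_key
        for left_key, _, right_doc in left_outer_join_on_keys(left, right, key_field)
        if right_doc is None
    ]


dangling = find_dangling(orders, customers, "O_CUSTOMER_KEY")
print(dangling)  # prints ['1.10.777']

# Clean-up step; an application could instead clear or repair the reference.
for key in dangling:
    del orders[key]
```

An inner join over the same data would silently skip the second order, which is why it cannot be used to find the problem.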
  50. 50. Summary
  51. 51. Couchbase Server Cluster Service Deployment: Multi Dimensional Scaling [Diagram: every node runs a Cluster Manager, but services are placed on separate nodes: Data Service on Couchbase Servers 1 to 3, Query Service on Couchbase Servers 4 and 5, Index Service on the remaining nodes.]
  52. 52. Couchbase: Multiple Dimensions
      • Data Service: scalable key-value cluster
      • Index + aggregation: Views
      • Index: View indexing for N1QL
      • Index: Global Secondary Index
      • Index: spatial index
      • Index: full-text search
      • N1QL = SQL + JSON
      • XDCR: inter-data-center replication
      • Couchbase SDKs in every language
  53. 53. Data Management Landscape [Diagram: the same landscape as on slide 8, arranged from bytes & blocks with no structure, through semi-structured & self-describing data, to highly structured data (SQL, R, etc.), with Hadoop (analytical), query engines, NoSQL (operational big data), OLTP and EDW and their cost labels per TB; this version adds a "Couchbase N1QL" label.]
  54. 54. query.couchbase.com @N1QL Keshav Murthy keshav@couchbase.com @rkeshavmurthy
