NoETL for NoSQL – Real-Time Analytics With Couchbase – Connect New York 2018

Speaker: Till Westmann, Senior Director for Analytics, Couchbase

Couchbase Analytics is a new service in the Couchbase Data Platform that enables parallel evaluation of analytical queries without impacting the operational performance of the Couchbase cluster. With Couchbase Analytics, the operational data in Couchbase Server is available for analytical processing in real time. The MPP-based (massively parallel processing) query processor enables analytical queries to run quickly and efficiently. Join us for an architectural overview of the new service and the SQL++ based analytical workbench, and learn about our roadmap for Couchbase Analytics.

Published in: Technology

  1. NoETL for NoSQL – Real-Time Analytics With Couchbase. May 20, 2018. Till Westmann | Senior Director Engineering. Confidential and Proprietary. Do not distribute without Couchbase consent. © Couchbase 2018. All rights reserved.
  2. AGENDA: 01/ What? Why? 02/ How to use it 03/ Inside out 04/ Developer Preview
  3. What is Couchbase Analytics? Extend Couchbase Platform to power real-time analytics.
     • Common programming model & data model
     • Unified management
     • Fast data synchronization
     • Ad-hoc queries (“Ask me anything!”)
     • Workload isolation
     • Independent scaling
     [Diagram labels: Scale out architecture; Memory-first architecture; Unified Programming; Query; Search; Mobile & IoT; Analytics (Preview); Core Database Engine]
  4. Why Couchbase Analytics?
     • Support OLTP and OLAP processing in a single platform
     • Eliminate the need for a separate OLAP system
       • Eliminate ETL
       • Reduces latency
       • Reduces complexity
     • Enable data exploration and ad hoc analytics
  5. Traditional Analytics Solutions. [Diagram labels: Business Application; Operations Data; Ops DB (three instances); ETL; Batch; Analytical DB; Analytics Tool]
  6. Couchbase Analytics – Bringing NoETL to NoSQL. [Diagram labels: Business Application; Couchbase Data Platform; Ops Data Node; Analytics Node; Analytics Tool]
  7. HOW TO USE IT
  8. Data: Beer Sample

     { "name": "Commonwealth Brewing #1", "city": "Boston", "state": "Massachusetts", "code": "",
       "country": "United States", "phone": "", "website": "", "type": "brewery",
       "updated": "2010-07-22 20:00:20", "description": "", "address": [ ],
       "geo": { "accuracy": "APPROXIMATE", "lat": 42.3584, "lng": -71.0598 } }

     { "name": "Piranha Pale Ale", "abv": 5.7, "ibu": 0, "srm": 0, "upc": 0, "type": "beer",
       "brewery_id": "110f04166d", "updated": "2010-07-22 20:00:20", "description": "",
       "style": "American-Style Pale Ale", "category": "North American Ale" }
  9. Simple Join: "Get 3 beers with their breweries"

     SELECT bw.name AS brewer, br.name AS beer
     FROM breweries bw, beers br
     WHERE br.brewery_id = meta(bw).id
     ORDER BY bw.name, br.name
     LIMIT 3;

     Result:
     [{ "brewer": "(512) Brewing Company", "beer": "(512) ALT" },
      { "brewer": "(512) Brewing Company", "beer": "(512) Bruin" },
      { "brewer": "(512) Brewing Company", "beer": "(512) IPA" }]
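
The join on this slide can be mirrored in plain Python to make its semantics concrete. This is only a sketch of what the query computes, not of how Couchbase Analytics evaluates it; the dictionary keys standing in for `meta(bw).id` and the "Orphan Ale" record are hypothetical.

```python
# Illustrative in-memory stand-ins for the breweries and beers datasets.
# The dictionary keys play the role of meta(bw).id; they are hypothetical,
# as is the "Orphan Ale" record.
breweries = {
    "512_brewing_company": {"name": "(512) Brewing Company"},
    "21st_amendment_brewery_cafe": {"name": "21st Amendment Brewery Cafe"},
}
beers = [
    {"name": "(512) IPA", "brewery_id": "512_brewing_company"},
    {"name": "21A IPA", "brewery_id": "21st_amendment_brewery_cafe"},
    {"name": "(512) ALT", "brewery_id": "512_brewing_company"},
    {"name": "(512) Bruin", "brewery_id": "512_brewing_company"},
    {"name": "Orphan Ale", "brewery_id": "no_such_brewery"},
]


def simple_join(breweries, beers, limit):
    """Inner join on brewery_id, ordered by (brewer, beer), cut to `limit` rows."""
    rows = [
        {"brewer": breweries[br["brewery_id"]]["name"], "beer": br["name"]}
        for br in beers
        if br["brewery_id"] in breweries  # beers without a matching brewery drop out
    ]
    rows.sort(key=lambda row: (row["brewer"], row["beer"]))
    return rows[:limit]


for row in simple_join(breweries, beers, limit=3):
    print(row)
```

Because this is an inner join, the beer whose `brewery_id` matches no brewery never appears in the output, whatever the limit.
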
  10. Nested Outer Join: "Get 2 breweries and the list of their beers"

      SELECT bw.name AS brewer,
             ( SELECT br.name, br.abv
               FROM beers br
               WHERE br.brewery_id = meta(bw).id ) AS beers
      FROM breweries bw
      ORDER BY bw.name
      LIMIT 2;

      Result:
      [{ "beers": [ { "abv": 8.2, "name": "(512) Pecan Porter" },
                    { "abv": 5.8, "name": "(512) Pale" }, ... ],
         "brewer": "(512) Brewing Company" },
       { "beers": [ { "abv": 7.2, "name": "21A IPA" },
                    { "abv": 5.8, "name": "North Star Red" }, ... ],
         "brewer": "21st Amendment Brewery Cafe" }]
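
A rough Python equivalent of the nested query shows why it behaves like an outer join: every brewery produces one result object, and the subquery simply yields an empty list when no beer matches. The document keys and the "Empty Kettle Brewing" entry are hypothetical, added only to show the empty case.

```python
# Illustrative data; keys stand in for meta(bw).id and are hypothetical.
breweries = {
    "512_brewing_company": {"name": "(512) Brewing Company"},
    "21st_amendment_brewery_cafe": {"name": "21st Amendment Brewery Cafe"},
    "empty_kettle": {"name": "Empty Kettle Brewing"},  # hypothetical, has no beers
}
beers = [
    {"name": "(512) Pecan Porter", "abv": 8.2, "brewery_id": "512_brewing_company"},
    {"name": "(512) Pale", "abv": 5.8, "brewery_id": "512_brewing_company"},
    {"name": "21A IPA", "abv": 7.2, "brewery_id": "21st_amendment_brewery_cafe"},
    {"name": "North Star Red", "abv": 5.8, "brewery_id": "21st_amendment_brewery_cafe"},
]


def breweries_with_beers(breweries, beers, limit):
    """One object per brewery with a nested list of its beers (possibly empty)."""
    result = []
    for key, bw in breweries.items():
        # Correlated subquery: collect the beers that point at this brewery.
        nested = [
            {"name": br["name"], "abv": br["abv"]}
            for br in beers
            if br["brewery_id"] == key
        ]
        result.append({"brewer": bw["name"], "beers": nested})
    result.sort(key=lambda row: row["brewer"])
    return result[:limit]


for row in breweries_with_beers(breweries, beers, limit=3):
    print(row["brewer"], len(row["beers"]))
```
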
  11. Grouping and Aggregation: "Get all breweries that produce more than 37 beers"

      SELECT br.brewery_id, COUNT(*) AS num_beers
      FROM beers br
      GROUP BY br.brewery_id
      HAVING num_beers > 37
      ORDER BY num_beers DESC;

      Result:
      [{ "num_beers": 57, "brewery_id": "midnight_sun_brewing_co" },
       { "num_beers": 49, "brewery_id": "rogue_ales" },
       { "num_beers": 38, "brewery_id": "anheuser_busch" }]
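
The GROUP BY / HAVING / ORDER BY pipeline maps onto a counting step, a filter, and a sort. The sketch below uses a tiny made-up dataset (three, two, and one beer per brewery) and a threshold of 1 instead of the slide's 37, so the counts here are illustrative and not taken from the beer-sample bucket.

```python
from collections import Counter

# Made-up beer records; only brewery_id matters for this aggregation.
beers = (
    [{"brewery_id": "midnight_sun_brewing_co"}] * 3
    + [{"brewery_id": "rogue_ales"}] * 2
    + [{"brewery_id": "anheuser_busch"}]
)


def breweries_with_more_than(beers, threshold):
    """GROUP BY brewery_id, keep groups with COUNT(*) > threshold, sort descending."""
    counts = Counter(br["brewery_id"] for br in beers)  # GROUP BY + COUNT(*)
    rows = [
        {"brewery_id": brewery_id, "num_beers": n}
        for brewery_id, n in counts.items()
        if n > threshold  # HAVING
    ]
    rows.sort(key=lambda row: row["num_beers"], reverse=True)  # ORDER BY ... DESC
    return rows


for row in breweries_with_more_than(beers, threshold=1):
    print(row)
```
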
  12. Putting It All Together: "Explore beer characteristics by city"

      SELECT bw.city, COUNT(*) AS num_beers, AVG(br.abv) AS beer_strength
      FROM beers br, breweries bw
      WHERE br.brewery_id = meta(bw).id
      GROUP BY bw.city
      HAVING COUNT(*) > 1
      ORDER BY beer_strength DESC
      LIMIT 3;

      Result:
      [{ "num_beers": 5, "beer_strength": 12.02, "city": "Vorchdorf" },
       { "num_beers": 8, "beer_strength": 10.3125, "city": "Buggenhout" },
       { "num_beers": 11, "beer_strength": 10.045454545454545, "city": "Fraserburgh" }]
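
Combining the join with grouping and averaging can be sketched in Python as follows. The brewery keys, cities, and ABV values are invented for illustration, and the sketch assumes every joined beer has a numeric `abv`, so it sidesteps how the real service treats missing values.

```python
from collections import defaultdict

# Invented data; keys stand in for meta(bw).id.
breweries = {
    "bw_boston": {"city": "Boston"},
    "bw_austin": {"city": "Austin"},
    "bw_portland": {"city": "Portland"},
}
beers = [
    {"abv": 5.0, "brewery_id": "bw_boston"},
    {"abv": 7.0, "brewery_id": "bw_boston"},
    {"abv": 8.0, "brewery_id": "bw_austin"},
    {"abv": 6.0, "brewery_id": "bw_austin"},
    {"abv": 7.0, "brewery_id": "bw_austin"},
    {"abv": 9.0, "brewery_id": "bw_portland"},   # single beer: removed by HAVING
    {"abv": 12.0, "brewery_id": "bw_unknown"},   # no matching brewery: removed by the join
]


def strength_by_city(beers, breweries, limit):
    """Join beers to breweries, average abv per city, keep cities with > 1 beer."""
    abvs_by_city = defaultdict(list)
    for br in beers:
        bw = breweries.get(br["brewery_id"])
        if bw is not None:
            abvs_by_city[bw["city"]].append(br["abv"])
    rows = [
        {"city": city, "num_beers": len(abvs), "beer_strength": sum(abvs) / len(abvs)}
        for city, abvs in abvs_by_city.items()
        if len(abvs) > 1
    ]
    rows.sort(key=lambda row: row["beer_strength"], reverse=True)
    return rows[:limit]


for row in strength_by_city(beers, breweries, limit=3):
    print(row)
```
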
  13. Couchbase Analytics DDL: Lifecycle. "Look, No ETL!"

      CREATE BUCKET `beer-sample`;
      CREATE DATASET beers ON `beer-sample` WHERE `type` = "beer";
      CREATE DATASET breweries ON `beer-sample` WHERE `type` = "brewery";
      CONNECT BUCKET `beer-sample`;

      SELECT * FROM beers ORDER BY abv DESC LIMIT 12;

      DISCONNECT BUCKET `beer-sample`;
      DROP DATASET breweries;
      DROP DATASET beers;
      DROP BUCKET `beer-sample`;
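
Conceptually, each `CREATE DATASET ... WHERE` statement defines a filtered view over the documents of one bucket. The Python sketch below models only that filtering step as a one-off snapshot; the continuous synchronization that happens once the bucket is connected is not modeled, and the document keys are hypothetical.

```python
# A bucket modeled as key -> document. Keys are hypothetical.
bucket = {
    "commonwealth_brewing_1": {"type": "brewery", "name": "Commonwealth Brewing #1", "city": "Boston"},
    "piranha_pale_ale": {"type": "beer", "name": "Piranha Pale Ale", "abv": 5.7},
    "512_pecan_porter": {"type": "beer", "name": "(512) Pecan Porter", "abv": 8.2},
}


def create_dataset(bucket, doc_type):
    """Snapshot of the documents whose `type` field equals doc_type."""
    return {key: doc for key, doc in bucket.items() if doc.get("type") == doc_type}


def strongest(dataset, limit):
    """Rough analogue of SELECT * FROM beers ORDER BY abv DESC LIMIT n."""
    return sorted(dataset.values(), key=lambda doc: doc["abv"], reverse=True)[:limit]


beers = create_dataset(bucket, "beer")
breweries = create_dataset(bucket, "brewery")

for doc in strongest(beers, limit=12):
    print(doc["name"], doc["abv"])
```
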
  14. INSIDE OUT
  15. What is Couchbase Analytics? Extend Couchbase Platform to power real-time analytics.
      • Common programming model & data model
      • Unified management
      • Fast data synchronization
      • Ad-hoc queries (“Ask me anything!”)
      • Workload isolation
      • Independent scaling
      [Diagram labels: Scale out architecture; Memory-first architecture; Unified Programming; Query; Search; Mobile & IoT; Analytics (Preview); Core Database Engine]
  16. Query and Analytics Services
      • Couchbase Query: Optimized for Operations (OLTP). Many queries; each touches a little data.
      • Couchbase Analytics: Optimized for Analytics (OLAP). Fewer queries; each touches a lot of data.
  17. Query and Analytics Services - Examples
      QUERY SERVICE: Online search and booking, reviews and ratings
      • Property and room detail pages
      • Cross-sell links, up-sell links
      • Stars & likes & associated reviews
      • Their booking history
      Query Service behind every page display and click/navigation
      ANALYTICS SERVICE: Reporting, Trend Analysis, Data Exploration
      • Daily discount availability report
      • Cities with highest room occupancy rates
      • Hotels with biggest single day drops
      • How many searches turn into bookings grouped by property rating? grouped by family size?
      Business Analysts ask these questions without knowing in advance every aspect of the question
  18. Example: Join, Grouping, and Aggregation. "Get the 10 chattiest users in a timeframe"

      SELECT user.id, COUNT(message) AS count
      FROM gbook_messages AS message, gbook_users AS user
      WHERE message.author_id = user.id
        AND message.send_time BETWEEN "2001-11-28T09:57:13" AND "2001-11-29T09:57:13"
      GROUP BY user.id
      ORDER BY count DESC
      LIMIT 10;
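
The "chattiest users" query combines a join, an inclusive time-window filter, grouping, and a top-N cut. A minimal Python sketch of the same logic follows; the user and message records are invented, and timestamps are compared as ISO 8601 strings, which sort chronologically when they share one format.

```python
from collections import Counter

# Invented sample records.
users = [{"id": 1}, {"id": 2}, {"id": 3}]
messages = [
    {"author_id": 1, "send_time": "2001-11-28T10:00:00"},
    {"author_id": 2, "send_time": "2001-11-28T12:00:00"},
    {"author_id": 1, "send_time": "2001-11-29T09:00:00"},
    {"author_id": 1, "send_time": "2001-12-01T00:00:00"},  # outside the window
    {"author_id": 9, "send_time": "2001-11-28T11:00:00"},  # no such user
]


def chattiest_users(messages, users, start, end, limit=10):
    """Count messages per known user inside [start, end], most active first."""
    known_ids = {user["id"] for user in users}
    counts = Counter(
        msg["author_id"]
        for msg in messages
        if msg["author_id"] in known_ids       # join condition
        and start <= msg["send_time"] <= end   # BETWEEN is inclusive
    )
    return [{"id": uid, "count": n} for uid, n in counts.most_common(limit)]


print(chattiest_users(messages, users, "2001-11-28T09:57:13", "2001-11-29T09:57:13"))
```
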
  19. Couchbase Query and Analytics – Performance Tradeoff. [Charts: Join/GBy query on CBA vs. Join/GBy query on N1QL with GSI, by interval (# records). First chart: 1m (<10), 1h (<500), 1d (<5000). Second chart: 1w (<25K), 1mo (<100K), 3mo (<300K), 6mo (<600K).]
  20. What is Couchbase Analytics?
      • Fast ingest
      • Shadow data for processing
      • Complex queries on large datasets
      • Real-time insights for business teams
      MPP architecture: parallelization among cores and servers. [Diagram: three DATA nodes and four ANALYTICS nodes]
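
The slide only states that the MPP architecture parallelizes work among cores and servers. As a generic illustration of that idea, and not a description of Couchbase's actual engine, the sketch below hash-partitions records by grouping key, aggregates each partition independently, and merges the partial results. All function names and the sample records are assumptions made for this example.

```python
import zlib
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(records, key, num_partitions):
    """Hash-partition records so equal keys always land in the same partition."""
    parts = [[] for _ in range(num_partitions)]
    for rec in records:
        index = zlib.crc32(rec[key].encode("utf-8")) % num_partitions
        parts[index].append(rec)
    return parts


def local_count(part, key):
    """Aggregation performed independently on one partition."""
    return Counter(rec[key] for rec in part)


def parallel_group_count(records, key, num_partitions=4):
    """Partition, aggregate each partition in its own worker, then merge."""
    parts = partition(records, key, num_partitions)
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partials = list(pool.map(lambda part: local_count(part, key), parts))
    total = Counter()
    for partial in partials:
        total.update(partial)  # adds the partial counts together
    return dict(total)


records = (
    [{"brewery_id": "rogue_ales"}] * 3
    + [{"brewery_id": "anheuser_busch"}] * 2
    + [{"brewery_id": "midnight_sun_brewing_co"}]
)
print(parallel_group_count(records, "brewery_id"))
```

Partitioning on the grouping key means no group is split across workers, so the merge step only has to combine disjoint partial results.
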
  21. INTEGRATED DEVELOPER PREVIEW
  22. What is Couchbase Analytics? Extend Couchbase Platform to power real-time analytics.
      ✔ Common programming model & data model
      ✔ Unified management
      ✔ Fast data synchronization
      ✔ Ad-hoc queries (“Ask me anything!”)
      ✔ Workload isolation
      ✔ Independent scaling
      [Diagram labels: Scale out architecture; Memory-first architecture; Unified Programming; Query; Search; Mobile & IoT; Analytics (Preview); Core Database Engine]
  23. Workbench
  24. Get it at https://www.couchbase.com/downloads
  25. THANK YOU
  26. EVENT HASHTAG: #CBConnect. WRITE A COUCHBASE REVIEW: trustradius.com or Gartner.com/reviews. VOTE FOR COUCHBASE TODAY! DBTA Readers’ Choice Awards: dbta.com/ReadersChoice/2018. Voting closes tomorrow!
  27. APPENDIX
  28. Couchbase Analytics and friends. [Diagram: systems placed on a spectrum from Operations to Analytics and from Online to Batch: Key Value, CB Query, CB Analytics, Spark, Hadoop. Latency scale: µs, ms, 30s, Minutes+. Data scale: 1 record to Trillions of records. Annotations: Start up overhead; Job-based; Parallel query; ETL.]
