Data Streaming with Apache Kafka & MongoDB - EMEA

A new generation of technologies is needed to consume and exploit today's real-time, fast-moving data sources. Apache Kafka, originally developed at LinkedIn, has emerged as one of these key new technologies.

This webinar explores the use cases and architecture for Kafka, and how it integrates with MongoDB to build sophisticated data-driven applications that exploit new sources of data.

Published in: Software

  1. 1. Data Streaming with Apache Kafka & MongoDB. Andrew Morgan – MongoDB Product Marketing; David Tucker – Director, Partner Engineering and Alliances at Confluent. 8th November 2016
  2. 2. Agenda: Target Audience; Apache Kafka; MongoDB; Integrating MongoDB and Kafka; Kafka – What’s Next; Next Steps
  3. 3. Target Audience
  9. 9. Apache Kafka / Confluent Platform
  10. 10. What does Kafka do? Producers Consumers Kafka Connect Kafka Connect Topic Your interfaces to the world Connected to your systems in real time
  11. 11. What is Streaming Data Synchronous Req/Response 0 – 100s ms Near Real Time > 100s ms Offline Batch > 1 hour KAFKA Stream Data Platform Search RDBMS Apps Monitoring Real-time Analytics NoSQL Stream Processing HADOOP Data Lake Impala DWH Hive Spark Map-Reduce
  12. 12. Confluent’s Offerings. Kafka: Core, Connect, Streams, Java Client. Confluent Platform: More Clients, REST Proxy, Schema Registry, Pre-Built Connectors. Confluent Platform Enterprise: Multi-data-center Replication, Advanced Data Balancing, Stream Monitoring, Connector Management
  13. 13. Confluent Platform: It’s Kafka ++ (Feature and Benefit, compared across Apache Kafka, Confluent Open Source and Confluent Enterprise)
      • Apache Kafka: High throughput, low latency, high availability, secure distributed message system
      • Kafka Connect: Advanced framework for connecting external sources/destinations into Kafka
      • Kafka Streams: Simple library that enables streaming application development within the Kafka framework
      • Additional Clients: Supports non-Java clients; C, C++, Python, etc.
      • REST Proxy: Provides universal access to Kafka from any network connected device via HTTP
      • Schema Registry: Central registry for the format of Kafka data – guarantees all data is always consumable
      • Pre-Built Connectors: HDFS, JDBC, elasticsearch and other connectors fully certified and fully supported by Confluent
      • Confluent Control Center: Enables easy connector management and stream monitoring
      • Data Center & Cloud: MDC Replication, auto-data balancing
      • Support: Enterprise class support to keep your Kafka environment running at top performance
      • Apache Kafka / Confluent Open Source / Confluent Enterprise: Community / Community / 24x7x365
      • Apache Kafka / Confluent Open Source / Confluent Enterprise: Free / Free / Subscription
  14. 14. Common Kafka Use Cases Data transport and integration • Log data • Database changes • Sensors and device data • Monitoring streams • Call data records • Stock ticker data Real-time stream processing • Monitoring • Asynchronous applications • Fraud and security
  15. 15. Kafka Adoption in Large Enterprises 6 of the top 10 travel companies 8 of the top 10 insurance companies 7 of the top 10 global banks 9 of the top 10 telecom companies
  16. 16. People Using Kafka Today Financial Services Entertainment & Media Consumer Tech Travel & Leisure Enterprise Tech Telecom Retail
  17. 17. MongoDB
  18. 18. Relational Expressive Query Language & Secondary Indexes Strong Consistency Enterprise Management & Integrations
  19. 19. The World Has Changed Data Risk Time Cost
  20. 20. NoSQL Scalability & Performance Always On, Global Deployments Flexibility Expressive Query Language & Secondary Indexes Strong Consistency Enterprise Management & Integrations
  21. 21. Nexus Architecture Scalability & Performance Always On, Global Deployments Flexibility Expressive Query Language & Secondary Indexes Strong Consistency Enterprise Management & Integrations
  22. 22. Integrating MongoDB and Kafka
  23. 23. Where MongoDB Fits Prod 3 2 4 123 ... Topic A Prod 9 6 7 123 ... Topic B Filter Filter Merge 5 3 4 123 ... Topic C Analyze 4 9 6 123 ... Topic D Take Action Take Action
  24. 24. Where MongoDB Fits Prod 3 2 4 123 ... Topic A Prod 9 6 7 123 ... Topic B Filter Filter Merge 5 3 4 123 ... Topic C Analyze 4 9 6 123 ... Topic D Take Action Store Results Operational Database
  25. 25. Where MongoDB Fits Prod 3 2 4 123 ... Topic A Prod 9 6 7 123 ... Topic B Filter Filter Merge 5 3 4 123 ... Topic C Analyze 4 9 6 123 ... Topic D Take Action Store Results Key Events Operational Database
  26. 26. Where MongoDB Fits Prod 3 2 4 123 ... Topic A Prod 9 6 7 123 ... Topic B Filter Filter Merge 5 3 4 123 ... Topic C Analyze 4 9 6 123 ... Topic D Take Action Store Results Key Events Operational Database
  27. 27. Where MongoDB Fits Prod 3 2 4 123 ... Topic A Prod 9 6 7 123 ... Topic B Filter Filter Merge 5 3 4 123 ... Topic C Analyze 4 9 6 123 ... Topic D Take Action Store Results Key Events Operational Database Reference Data
  28. 28. Where K-Streams Fits Prod 3 2 4 123 ... Topic A Prod 9 6 7 123 ... Topic B 5 3 4 123 ... Topic C Analyze 4 9 6 123 ... Topic D Take Action Store Results Key Events Operational Database Reference Data Kafka Streams
  29. 29. MongoDB As a Kafka Producer
  30. 30. Message Queue Customer Data Mgmt Mobile App IoT App Live Dashboards Raw Data Processed Events Distributed Processing Frameworks Millisecond latency. Expressive querying & flexible indexing against subsets of data. Updates in place. In-database aggregations & transformations Multi-minute latency with scans across TB/PB of data. No indexes. Data stored in 128MB blocks. Write-once-read-many & append-only storage model Sensors User Data Clickstreams Logs Churn Analysis Enriched Customer Profiles Risk Modeling Predictive Analytics Real-Time Access Batch Processing, Batch Views Design Pattern: Operationalized Data Lake Kafka Streams
  31. 31. Message Queue Customer Data Mgmt Mobile App IoT App Live Dashboards Raw Data Processed Events Millisecond latency. Expressive querying & flexible indexing against subsets of data. Updates in place. In-database aggregations & transformations Multi-minute latency with scans across TB/PB of data. No indexes. Data stored in 128MB blocks. Write-once-read-many & append-only storage model Sensors User Data Clickstreams Logs Churn Analysis Enriched Customer Profiles Risk Modeling Predictive Analytics Real-Time Access Batch Processing, Batch Views Design Pattern: Operationalized Data Lake Configure where to land incoming data Distributed Processing Frameworks Kafka Streams
  32. 32. Message Queue Customer Data Mgmt Mobile App IoT App Live Dashboards Raw Data Processed Events Millisecond latency. Expressive querying & flexible indexing against subsets of data. Updates in place. In-database aggregations & transformations Multi-minute latency with scans across TB/PB of data. No indexes. Data stored in 128MB blocks. Write-once-read-many & append-only storage model Sensors User Data Clickstreams Logs Churn Analysis Enriched Customer Profiles Risk Modeling Predictive Analytics Real-Time Access Batch Processing, Batch Views Design Pattern: Operationalized Data Lake Raw data processed to generate analytics models Distributed Processing Frameworks Kafka Streams
  33. 33. Message Queue Customer Data Mgmt Mobile App IoT App Live Dashboards Raw Data Processed Events Millisecond latency. Expressive querying & flexible indexing against subsets of data. Updates in place. In-database aggregations & transformations Multi-minute latency with scans across TB/PB of data. No indexes. Data stored in 128MB blocks. Write-once-read-many & append-only storage model Sensors User Data Clickstreams Logs Churn Analysis Enriched Customer Profiles Risk Modeling Predictive Analytics Real-Time Access Batch Processing, Batch Views Design Pattern: Operationalized Data Lake MongoDB exposes analytics models to operational apps. Handles real time updates Distributed Processing Frameworks Kafka Streams
  34. 34. Message Queue Customer Data Mgmt Mobile App IoT App Live Dashboards Raw Data Processed Events Millisecond latency. Expressive querying & flexible indexing against subsets of data. Updates in place. In-database aggregations & transformations Multi-minute latency with scans across TB/PB of data. No indexes. Data stored in 128MB blocks. Write-once-read-many & append-only storage model Sensors User Data Clickstreams Logs Churn Analysis Enriched Customer Profiles Risk Modeling Predictive Analytics Real-Time Access Batch Processing, Batch Views Design Pattern: Operationalized Data Lake Compute new models against MongoDB & HDFS Distributed Processing Frameworks Kafka Streams
  35. 35. https://www.mongodb.com/presentations/replacing-traditional-technologies-mongodb-single-platform-all-financial-data-ahl
  36. 36. http://www.slideshare.net/danharvey/change-data-capture-with-mongodb-and-kafka
  37. 37. Kafka – What’s Next
  38. 38. Kafka Connectors • Confluent-supported connectors (included in CP) • Community-written connectors (just a sampling) JDBC
  39. 39. Kafka Futures • Apache Core • Admin API (KIP-4) • Exactly-once delivery semantics • Time-based topic indexing • Kafka Streams • Exactly-once processing semantics • Interactive Queries: enable real-time sharing of application state with other applications • Confluent Platform Enterprise • Multi-cluster views and expanded alerting added to Control Center
  40. 40. Next Steps
  41. 41. MongoDB Atlas Database as a service for MongoDB MongoDB Atlas is… • Automated: The easiest way to build, launch, and scale apps on MongoDB • Flexible: The only database as a service with all you need for modern applications • Secured: Multiple levels of security available to give you peace of mind • Scalable: Deliver massive scalability with zero downtime as you grow • Highly available: Your deployments are fault-tolerant and self-healing by default • High performance: The performance you need for your most demanding workloads
  42. 42. MongoDB Atlas Features • Spin up a cluster in minutes • Replicated & always-on deployments • Fully elastic: scale out or up in a few clicks with zero downtime • Automatic patches & simplified upgrades for the newest MongoDB features • Authenticated & encrypted • Continuous backup with point-in-time recovery • Fine-grained monitoring & custom alerts Safe & Secure / Run for You • On-demand pricing model; billed by the hour • Multi-cloud support (AWS available with others coming soon) • Part of a suite of products & services designed for all phases of your app; migrate easily to different environments (private cloud, on-prem, etc.) when needed No Lock-In Database as a service for MongoDB
  43. 43. MongoDB Enterprise Advanced • MongoDB Ops Manager or MongoDB Cloud Manager Premium • MongoDB Compass • MongoDB Connector for BI • Cloud Foundry Integration • Encrypted Storage Engine • LDAP / Kerberos Integration • DDL & DML Auditing • FIPS 140-2 Support Security / Tooling • 24 x 7 Support • 1 hr SLA • Emergency Patches • Customer Success Program • On-Demand Training Support License • Commercial License
  44. 44. Resources • Data Streaming with Apache Kafka & MongoDB • https://www.mongodb.com/collateral/data-streaming-with-apache-kafka-and-mongodb • Implementing a Kafka Consumer for MongoDB • https://www.mongodb.com/blog/post/mongodb-and-data-streaming-implementing-a-mongodb-kafka-consumer • Tailing the Oplog on a sharded MongoDB Cluster • https://www.mongodb.com/blog/post/tailing-mongodb-oplog-sharded-clusters
  45. 45. Old Billingsgate, London 15th November mongodb.com/europe Use my discount code for 20% off: andrewmorgan20
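
The "Where MongoDB Fits" slides (23 to 27) show producers writing to Topic A and Topic B, filters merging into Topic C, an analysis step writing to Topic D, and the results being stored in an operational database. The following is a minimal sketch of that flow, not the presenters' implementation. The `Topic` class is an in-memory stand-in for a Kafka topic and a plain dict stands in for a MongoDB collection. The field names (`sensor`, `reading`) and the threshold are assumptions made for illustration, and the sample readings simply reuse the numbers drawn in the slide diagram.

```python
from collections import defaultdict


class Topic:
    """In-memory stand-in for a Kafka topic: an append-only, offset-addressed log."""

    def __init__(self, name):
        self.name = name
        self.log = []

    def produce(self, message):
        """Append a message and return its offset."""
        self.log.append(message)
        return len(self.log) - 1

    def consume(self, from_offset=0):
        """Return every message at or after from_offset, in order."""
        return list(self.log[from_offset:])


def filter_and_merge(sources, target, min_reading):
    """Filter each source topic, then merge the surviving messages into target."""
    for source in sources:
        for message in source.consume():
            if message["reading"] >= min_reading:
                target.produce(message)


def analyze(source, target):
    """Publish one average-reading summary per sensor to the target topic."""
    readings = defaultdict(list)
    for message in source.consume():
        readings[message["sensor"]].append(message["reading"])
    for sensor, values in sorted(readings.items()):
        target.produce(
            {"sensor": sensor, "avg": sum(values) / len(values), "count": len(values)}
        )


def store_results(source, collection):
    """Upsert each summary into a dict that stands in for a MongoDB collection."""
    for message in source.consume():
        collection[message["sensor"]] = message


# Hypothetical sample data: two producers, one per input topic.
topic_a, topic_b = Topic("A"), Topic("B")
topic_c, topic_d = Topic("C"), Topic("D")
for reading in (3, 2, 4):
    topic_a.produce({"sensor": "s1", "reading": reading})
for reading in (9, 6, 7):
    topic_b.produce({"sensor": "s2", "reading": reading})

filter_and_merge([topic_a, topic_b], topic_c, min_reading=3)
analyze(topic_c, topic_d)
operational_db = {}
store_results(topic_d, operational_db)
```

In a real deployment the filter, merge and analyze steps would more likely run as separate stream-processing applications (slide 28 suggests Kafka Streams), and the final step would write through a MongoDB driver or a Kafka Connect sink rather than into a dict.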

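The Resources slide points to a post on implementing a Kafka consumer for MongoDB. As a rough sketch of the idea only, and not the code from that post, the loop below decodes JSON messages from a log, stores each one as a document, and returns the next offset so that a restarted consumer can resume where it left off. The list-based log, the list-based collection, the `_offset` field and the sample messages are all hypothetical stand-ins; a real consumer would use a Kafka client library and a MongoDB driver.

```python
import json


def consume_into_collection(log, collection, committed_offset=0):
    """Decode JSON messages from `log`, starting at committed_offset, append each
    one to `collection` as a document, and return the next offset to commit."""
    offset = committed_offset
    for raw in log[committed_offset:]:
        document = json.loads(raw)
        document["_offset"] = offset  # hypothetical field, kept for traceability
        collection.append(document)
        offset += 1
    return offset


# Hypothetical messages already sitting on a topic.
topic_log = [
    json.dumps({"item": "salmon", "price": 12.5}),
    json.dumps({"item": "cod", "price": 8.0}),
]
collection = []

# First poll reads everything available so far.
next_offset = consume_into_collection(topic_log, collection)

# A new message arrives; the second poll picks up only the unread one.
topic_log.append(json.dumps({"item": "haddock", "price": 9.25}))
next_offset = consume_into_collection(topic_log, collection, next_offset)
```

Tracking the committed offset outside the function keeps the sketch restartable: calling it again with the returned offset never re-inserts documents that were already stored.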