A new generation of technologies is needed to consume and exploit today's real-time, fast-moving data sources. Apache Kafka, originally developed at LinkedIn, has emerged as one of these key new technologies.
This webinar explores the use cases and architecture for Kafka, and how it integrates with MongoDB to build sophisticated data-driven applications that exploit new sources of data.
11. What is Streaming Data?
• Synchronous request/response: 0 to 100s of ms
• Near real time: > 100s of ms
• Offline batch: > 1 hour
[Diagram] Kafka as a stream data platform, feeding search, RDBMS, app monitoring, real-time analytics, NoSQL, and stream processing systems, alongside a Hadoop data lake (Impala, DWH, Hive, Spark, Map-Reduce).
13. Confluent Platform: It’s Kafka ++

| Feature | Benefit |
| --- | --- |
| Apache Kafka | High-throughput, low-latency, highly available, secure distributed message system |
| Kafka Connect | Advanced framework for connecting external sources/destinations into Kafka |
| Kafka Streams | Simple library that enables streaming application development within the Kafka framework |
| Additional Clients | Supports non-Java clients: C, C++, Python, etc. |
| REST Proxy | Provides universal access to Kafka from any network-connected device via HTTP |
| Schema Registry | Central registry for the format of Kafka data – guarantees all data is always consumable |
| Pre-Built Connectors | HDFS, JDBC, Elasticsearch and other connectors, fully certified and fully supported by Confluent |
| Confluent Control Center | Enables easy connector management and stream monitoring |
| Data Center & Cloud | MDC replication, auto-data balancing |
| Support | Enterprise-class support to keep your Kafka environment running at top performance |

Editions: Apache Kafka (community support, free), Confluent Open Source (community support, free), Confluent Enterprise (24x7x365 support, subscription).
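To give a feel for the REST Proxy, here is a minimal sketch (assuming a proxy on localhost:8082 and an illustrative topic named `demo`) that publishes a JSON record over plain HTTP from Python:

```python
# Sketch: produce a JSON record to Kafka through the Confluent REST Proxy.
import requests

resp = requests.post(
    "http://localhost:8082/topics/demo",  # illustrative topic name
    headers={"Content-Type": "application/vnd.kafka.json.v2+json"},
    json={"records": [{"value": {"sensor": "s1", "temp": 21.5}}]},
)
resp.raise_for_status()
print(resp.json())  # partitions/offsets assigned to the produced records
```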
14. Common Kafka Use Cases
Data transport and integration
• Log data
• Database changes
• Sensors and device data
• Monitoring streams
• Call data records
• Stock ticker data
Real-time stream processing
• Monitoring
• Asynchronous applications
• Fraud and security
15. Kafka Adoption in Large Enterprises
• 6 of the top 10 travel companies
• 8 of the top 10 insurance companies
• 7 of the top 10 global banks
• 9 of the top 10 telecom companies
16. People Using Kafka Today
• Financial Services
• Entertainment & Media
• Consumer Tech
• Travel & Leisure
• Enterprise Tech
• Telecom
• Retail
30. Design Pattern: Operationalized Data Lake
[Diagram] Sensors, user data, clickstreams, and logs feed a message queue (Kafka). Raw data lands in HDFS while processed events land in MongoDB. Distributed processing frameworks and Kafka Streams generate analytics models from the raw data: churn analysis, enriched customer profiles, risk modeling, and predictive analytics. MongoDB provides real-time access for customer data management, mobile apps, IoT apps, and live dashboards; HDFS provides batch processing and batch views.
• MongoDB: millisecond latency; expressive querying & flexible indexing against subsets of data; updates in place; in-database aggregations & transformations.
• HDFS: multi-minute latency with scans across TB/PB of data; no indexes; data stored in 128MB blocks; write-once-read-many & append-only storage model.
31. Design Pattern: Operationalized Data Lake (diagram as slide 30)
Callout: configure where to land incoming data.
32. Design Pattern: Operationalized Data Lake (diagram as slide 30)
Callout: raw data is processed to generate analytics models.
33. Design Pattern: Operationalized Data Lake (diagram as slide 30)
Callout: MongoDB exposes analytics models to operational apps and handles real-time updates.
34. Design Pattern: Operationalized Data Lake (diagram as slide 30)
Callout: compute new models against MongoDB & HDFS.
42. MongoDB Atlas
Database as a service for MongoDB
MongoDB Atlas is…
• Automated: The easiest way to build, launch, and scale apps on MongoDB
• Flexible: The only database as a service with all you need for modern applications
• Secured: Multiple levels of security available to give you peace of mind
• Scalable: Deliver massive scalability with zero downtime as you grow
• Highly available: Your deployments are fault-tolerant and self-healing by default
• High performance: The performance you need for your most demanding workloads
43. MongoDB Atlas Features
Database as a service for MongoDB

Run for You
• Spin up a cluster in minutes
• Replicated & always-on deployments
• Fully elastic: scale out or up in a few clicks with zero downtime
• Automatic patches & simplified upgrades for the newest MongoDB features

Safe & Secure
• Authenticated & encrypted
• Continuous backup with point-in-time recovery
• Fine-grained monitoring & custom alerts

No Lock-In
• On-demand pricing model; billed by the hour
• Multi-cloud support (AWS available, with others coming soon)
• Part of a suite of products & services designed for all phases of your app; migrate easily to different environments (private cloud, on-prem, etc.) when needed
44. MongoDB Enterprise Advanced
Tooling
• MongoDB Ops Manager or MongoDB Cloud Manager Premium
• MongoDB Compass
• MongoDB Connector for BI
• Cloud Foundry Integration

Security
• Encrypted Storage Engine
• LDAP / Kerberos Integration
• DDL & DML Auditing
• FIPS 140-2 Support

Support
• 24 x 7 Support
• 1 hr SLA
• Emergency Patches
• Customer Success Program
• On-Demand Training

License
• Commercial License
45. Resources
• Data Streaming with Apache Kafka & MongoDB
https://www.mongodb.com/collateral/data-streaming-with-apache-kafka-and-mongodb
• Implementing a Kafka Consumer for MongoDB
https://www.mongodb.com/blog/post/mongodb-and-data-streaming-implementing-a-mongodb-kafka-consumer
• Tailing the Oplog on a sharded MongoDB Cluster
https://www.mongodb.com/blog/post/tailing-mongodb-oplog-sharded-clusters
A lot of people expect us to come in and bash relational databases or say we don't think they're good. And that's simply not true.
Relational databases have laid the foundation for what you'd want out of a database, and we absolutely think there are capabilities that remain critical today.
Expressive queries. Users should be able to access and manipulate their data in sophisticated ways – and you need a query language that lets you do all that out of the box. Indexes are a critical part of providing efficient access to data. We believe these are table stakes for a database.
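A minimal sketch of what that looks like in practice, using MongoDB's Python driver (the `shop` database, `orders` collection, and field names are all illustrative):

```python
# Illustrative only: an expressive query backed by a secondary index in PyMongo.
from pymongo import ASCENDING, DESCENDING, MongoClient

db = MongoClient("mongodb://localhost:27017")["shop"]

# A compound secondary index so the query below doesn't scan the collection.
db.orders.create_index([("customer_id", ASCENDING), ("created_at", DESCENDING)])

# Filter, project, sort, and limit in one expressive query – out of the box.
recent = (
    db.orders.find(
        {"customer_id": 42, "total": {"$gte": 100}},
        {"_id": 0, "total": 1, "created_at": 1},
    )
    .sort("created_at", DESCENDING)
    .limit(10)
)

for order in recent:
    print(order)
```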
Strong consistency. Strong consistency has become second nature for how we think about building applications, and for good reason. The database should always provide access to the most up-to-date copy of the data. Strong consistency is the right way to design a database.
Enterprise Management and Integrations. Finally, databases are just one piece of the puzzle, and they need to fit into the enterprise IT stack. Organizations need a database that can be secured, monitored, automated, and integrated with their existing IT infrastructure and staff, such as operations teams, DBAs, and data analysts.
But of course the world has changed a lot since the 1980s when the relational database first came about.
First of all, data and risk are significantly up.
In terms of data:
- 90% of all data was created in the last 2 years - think about that for a moment: of all the data ever created, 90% of it was in the last 2 years
- 80% of enterprise data is unstructured - this is data that doesn't fit into the neat tables of a relational database
- Unstructured data is growing at 2X the rate of structured data
At the same time, risks of running a database are higher than ever before. You are now faced with:
More users - Apps have shifted from small internal departmental systems with thousands of users to large external audiences with millions of users
No downtime - It’s no longer the case that apps only need to be available during standard business hours. They must be up 24/7.
All across the globe - your users are everywhere, in multiple timezones and they are always connected
On the other hand, time and costs are way down.
There’s less time to build apps than ever before. You’re being asked to:
Ship apps in a few months not years - Development methods have shifted from a waterfall process to an iterative process that ships new functionality in weeks and in some cases multiple times per day at companies like Facebook and Amazon.
And costs are way down too. Companies want to:
- Pay for value over time - Companies have shifted to open-source business and SaaS models that allow them to pay for value over time
- They use cloud and commodity resources - to reduce the time to provision their infrastructure, and to lower their total cost of ownership
Because the relational database was not designed for modern applications, starting about 10 years ago a number of companies began to build their own databases that are fundamentally different. The market calls these NoSQL.
NoSQL databases were designed for this new world…
Flexibility. All of them have some kind of flexible data model to allow for faster iteration and to accommodate the data we see dominating modern applications. While they all have different approaches, what they have in common is they want to be more flexible.
Scalability + Performance. Similarly, they were all built with a focus on scalability, so they all include some form of sharding or partitioning. And they're all designed to deliver great performance. Some are better at reads, some are better at writes, but more or less they all strive to have better performance and scalability than a relational database.
Always-On Global Deployments. Lastly, NoSQL databases are designed for highly available systems that provide a consistent, high quality experience for users all over the world. They are designed to run on many computers, and they include replication to automatically synchronize the data across servers, racks, and data centers.
However, when you take a closer look at these NoSQL systems, it turns out they have thrown out the baby with the bathwater. They have sacrificed the core database capabilities you’ve come to expect and rely on in order to build fully functional apps, like rich querying and secondary indexes, strong consistency, and enterprise management.
MongoDB was built to address the way the world has changed while preserving the core database capabilities required to build modern applications.
Our vision is to leverage the work that Oracle and others have done over the last 40 years to make relational databases what they are today, and to take the reins from here. We pick up where they left off, incorporating the work that internet pioneers like Google and Amazon did to address the requirements of modern applications.
MongoDB is the only database that harnesses the innovations of NoSQL and maintains the foundation of relational databases – and we call this our Nexus Architecture.
When using any database as a producer, it's necessary to capture any database changes so that they can be written to Kafka. With MongoDB this can be achieved by monitoring its oplog.
The oplog (operations log) is a special capped collection that keeps a rolling record of all operations that modify the data stored in your database. Tailable cursors are then used on that collection.
Tailable cursors have many uses, such as real-time notifications of all the changes to your database. A tailable cursor is conceptually similar to the Unix `tail -f` command: once you've reached the end of the result set, the cursor is not closed; rather, it continues to wait for new data and, as it arrives, returns that too.
MongoDB replication is implemented using the oplog and tailable cursors; the primary node records all write operations in its oplog. The secondary members then asynchronously fetch and then apply those operations.
By using a tailable cursor on the oplog, an application receives all changes that are made to the database in near real-time.
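A minimal sketch of that idea with PyMongo, assuming a replica set on localhost (the connection details are illustrative):

```python
# Sketch: watch the oplog with a tailable cursor and print each change.
from pymongo import CursorType, MongoClient

oplog = MongoClient("mongodb://localhost:27017").local.oplog.rs

# Start after the most recent entry already in the oplog.
ts = next(oplog.find().sort("$natural", -1).limit(1))["ts"]

while True:
    # TAILABLE_AWAIT keeps the cursor open at the end of the collection,
    # blocking until new entries arrive – much like `tail -f`.
    cursor = oplog.find({"ts": {"$gt": ts}}, cursor_type=CursorType.TAILABLE_AWAIT)
    for entry in cursor:
        ts = entry["ts"]
        print(entry["op"], entry["ns"])  # operation type and namespace
```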
A producer can be written to propagate all MongoDB writes to Kafka by tailing the oplog in the same way. The logic is more complex when using a sharded cluster:
- The oplog for each shard must be tailed.
- The MongoDB shard balancer occasionally moves documents from one shard to another, causing *deletes* to be written to the originating shard's oplog and *inserts* to that of the receiving shard; those internal operations must be filtered out (see the sketch below).
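Extending the sketch above, a hedged version of such a producer might forward each entry to Kafka with the kafka-python client, filtering out the balancer's internal operations via the `fromMigrate` flag that MongoDB stamps on migration-generated oplog entries (the topic name and connection details are illustrative):

```python
# Sketch: tail one shard's oplog and forward entries to Kafka,
# skipping internal chunk-migration operations.
from bson import json_util
from kafka import KafkaProducer
from pymongo import CursorType, MongoClient

oplog = MongoClient("mongodb://localhost:27017").local.oplog.rs
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    # json_util serializes BSON types (Timestamps, ObjectIds) that plain json cannot.
    value_serializer=lambda doc: json_util.dumps(doc).encode("utf-8"),
)

ts = next(oplog.find().sort("$natural", -1).limit(1))["ts"]
while True:
    cursor = oplog.find(
        {"ts": {"$gt": ts}, "fromMigrate": {"$exists": False}},
        cursor_type=CursorType.TAILABLE_AWAIT,
    )
    for entry in cursor:
        ts = entry["ts"]
        producer.send("mongodb.oplog", entry)  # illustrative topic name
```

A real producer would also persist the last-seen `ts` so it can resume after a restart, and would run one such loop per shard.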
This is a design pattern for the data lake – multiple components that collectively ingest, store, process and analyse data, then serve it to consuming operational apps.
Stepping through the diagram:
Data ingestion: Data streams are ingested to a pub/sub message queue, which routes all raw data into HDFS.
You often also have event processing running against the queue to find interesting events that need to be consumed by the operational apps immediately – an offer displayed to a user browsing a product page, or an alarm generated against vehicle telemetry from an IoT app. These events are routed to MongoDB for immediate consumption by operational applications.
Raw data is loaded into the HDFS data lake where we can use Hadoop jobs or Spark to generate analytics models from the raw data – see examples in the layer above HDFS
MongoDB exposes these models to the operational processes, serving indexed queries and updates against them with real-time latency
The distributed processing frameworks can re-compute analytics models, against data stored in either HDFS or MongoDB, continuously flowing updates from the operational database to analytics models.
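To make the MongoDB-facing steps concrete, here is a minimal sketch of an event consumer, assuming the kafka-python client (the topic, database, collection, and `event_id` field are all illustrative), that lands processed events in MongoDB where the operational apps can query them:

```python
# Sketch: consume processed events from Kafka and land them in MongoDB
# for millisecond-latency access by the operational applications.
import json

from kafka import KafkaConsumer
from pymongo import MongoClient

events = MongoClient("mongodb://localhost:27017")["lake"]["processed_events"]

consumer = KafkaConsumer(
    "processed.events",  # illustrative topic name
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)

for message in consumer:
    # Upsert keyed on the event id, so replayed messages don't create duplicates.
    events.replace_one({"_id": message.value["event_id"]}, message.value, upsert=True)
```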
**Comparethemarket.com**: One of the UK’s leading price comparison providers, and one of the country’s best known household brands. Comparethemarket.com uses MongoDB as the default operational database across its microservices architecture. Its online comparison systems need to collect customer details efficiently and then securely send them to a number of different providers. Once the insurers' systems respond, Comparethemarket.com can aggregate and display prices for consumers. At the same time, MongoDB generates real-time analytics to personalize the customer experience across the company's web and mobile properties.
As Comparethemarket.com transitioned to microservices, the data warehousing and analytics stack was also modernized. While each microservice uses its own MongoDB database, the company needs to maintain synchronization between services, so every application event is written to a Kafka topic. Event processing runs against the topic to identify relevant events that can then trigger specific actions – for example customizing customer questions, firing off emails, presenting new offers and more. Relevant events are written to MongoDB, enabling the user experience to be personalized in real time as customers interact with the service.
**Man AHL**: Man is one of the largest hedge fund managers in the world; AHL is a subsidiary focused on systematic trading, and it has been moving all of its data to MongoDB.
AHL needs to analyse data from a large number of disparate data sources, and retrieving that data is 100x faster than when it used flat files and an RDBMS.
MongoDB is used for futures and single-stock forecasting. The Kafka use case is "tick data" – every change in the price of a stock – around 400,000,000 messages per day.
The source of the data is 3rd-party commercial feeds into a 3rd-party message bus. 150K ticks/sec are written to Kafka, which buffers, batches, and can replay ticks in the event of a problem; the data is then written to MongoDB. Each database holds a year's worth of ticks.
The result: 25x greater tick throughput with just 2 machines – 250M per second – and a 40x cost saving.
Quant = Quantitative analyst
**State**: State is an intelligent opinion network, connecting people with similar beliefs who want to join forces and make waves. User and opinion data is written to MongoDB and then the oplog is tailed so that all changes are written to user and opinion topics in Kafka, where they are consumed by the user recommendation engine.
Details on State's use of MongoDB and Kafka can be found in this [presentation](http://www.slideshare.net/danharvey/change-data-capture-with-mongodb-and-kafka "Use of MongoDB and Kafka in the State social network").
Built and managed by the same team that builds the database, MongoDB Atlas provides the features of MongoDB without the operational heavy lifting, enabling you to focus on what you do best.
MongoDB Enterprise Advanced provides everything you need to [insert relevant value driver. Draw from relevant bullets below to support this claim]
MongoDB Ops Manager or Cloud Manager Premium – a full management platform to de-risk MongoDB in production:
- Monitor the health of your system
- Visual query profiler to identify slow-running queries
- Index suggestions and automated index rollouts
- Automate deployment, configuration, maintenance, upgrades and scaling
- Back up and restore to any point in time (standard network-mountable filesystems supported)
- APM integration with enhanced drivers
- Ops Manager runs behind your firewall
MongoDB Compass – schema and data visualization: understand the data stored in your database with no knowledge of the MongoDB query language, and run ad hoc queries with a few clicks of your mouse
BI Connector – Visualize and analyze the multi-structured data stored in MongoDB using SQL-based BI tools such as Tableau, Qlikview, Spotfire and more
Enterprise-grade, follow-the-sun support with a 1-hour SLA:
- Not just break/fix support
- Direct access to industry best practices
On-Demand Training – access to our online courses at your own pace to get team members up to speed
Advanced Security:
- Encrypted Storage Engine for end-to-end database encryption
- LDAP and Kerberos integration with your existing authentication and authorization infrastructure
- Auditing of all database operations for compliance
Commercial license – meets the needs of organizations that have policies against using open source (AGPL) software
Platform Certification – tested and certified for stability and performance on Windows, Red Hat/CentOS, Ubuntu, and Amazon Linux, plus IBM Power & zSeries
Beta access to the In-Memory storage engine for your most demanding, ultra-high-throughput apps: in-memory computing without sacrificing data durability