This talk introduces Confluent and the Confluent Platform, with a focus on the role of the Kafka Connect API and Confluent's Elasticsearch Connector. It explains why Kafka and the Confluent connector for Elasticsearch are an excellent, simple solution for aggregating data from a variety of sources and for managing the ingestion and indexing of documents or data in Elasticsearch.
7. 7
Which Streaming Platform is Right for You?
Register your interest and we will keep you informed!
https://www.confluent.io/confluent-operator/
Also in partnership with Mesosphere (and others):
https://mesosphere.com/blog/dcos-confluent-kubernetes/
8. 8
Leading Businesses Adopting a Streaming Platform
Funding Circle
https://www.youtube.com/watch?v=ks9caob2s6w
HomeAway
https://www.youtube.com/watch?v=IXJUmYvbTLw
Royal Bank of Canada
https://www.youtube.com/watch?v=WTxmHHJcHRc
AUDI
https://www.youtube.com/watch?v=yGLKi3TMJv8
9. 9
Elastic and Confluent Share Similar Values
• Distributed and fault tolerant
• Horizontally scalable
• Low latency
• Open source at their core
• Enterprise grade solutions
10. 10
Common Kafka Use Cases
Data Transport and Integration
• Log data
• Sensors and device data
• Monitoring streams
• Call data records
• Stock ticker data
• Customer 360 / Single View
Real-Time Stream Processing
• Monitoring
• Asynchronous applications
• Fraud and security
• Cybersecurity
• Instant payments
• Automotive, connected cars
• IoT, sensors and manufacturing
• Microservices architectures
11. 11
Confluent Platform
[Diagram: Confluent Platform architecture. Legend: Open Source, Enterprise, External.]
• Data sources: Database Changes, Log Events, IoT Data, Web Events, …
• External systems (data integration): CRM, Data Warehouse, Database, Hadoop, …
• External systems (real-time applications): Monitoring, Analytics, Custom Apps, Transformations, …
• Apache Kafka: Kafka Core, Kafka Connect, Kafka Streams
• Platform components: Clients, Supported Connectors, Schema Registry, REST Proxy, KSQL, JMS Client, Control Center, Auto Data Balancing, Multi-Datacenter Replication, 24/7 Support
12. 12
Kafka Connect APIs and Elasticsearch Connector
14. 14
The Streams API of Apache Kafka®
• No separate processing cluster required
• Develop on Mac, Linux, Windows
• Deploy to containers, VMs, bare metal, cloud
• Powered by Kafka: elastic, scalable, distributed, battle-tested
• Perfect for small, medium, large use cases
• Fully integrated with Kafka security
• Exactly-once processing semantics
• Part of Apache Kafka, included in Confluent Open Source
Write standard Java applications and microservices to process your data in real-time
KStream<User, PageViewEvent> pageViews = builder.stream("pageviews-topic");
KTable<Windowed<User>, Long> viewsPerUserSession = pageViews
    .groupByKey()
    .count(SessionWindows.with(TimeUnit.MINUTES.toMillis(5)), "session-views");
https://docs.confluent.io/current/streams/
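To make the session-window count on this slide more concrete, here is a minimal Python sketch of the idea, not of the Kafka Streams implementation. It assumes events arrive as `(key, timestamp_ms)` pairs and that two events of the same key belong to one session when they are at most five minutes apart. The function name `session_counts` and the sample data are hypothetical. Unlike Kafka Streams, the sketch sorts a finished batch and does not merge sessions for late, out-of-order records.

```python
from collections import defaultdict

GAP_MS = 5 * 60 * 1000  # inactivity gap: the 5 minutes used on the slide


def session_counts(events, gap_ms=GAP_MS):
    """Count events per key and session.

    `events` is an iterable of (key, timestamp_ms) pairs.
    Returns a dict mapping key -> list of [session_start, session_end, count].
    """
    sessions = defaultdict(list)
    # Process events in timestamp order so each key's sessions grow forward.
    for key, ts in sorted(events, key=lambda e: e[1]):
        key_sessions = sessions[key]
        if key_sessions and ts - key_sessions[-1][1] <= gap_ms:
            # Within the inactivity gap: extend the current session.
            key_sessions[-1][1] = ts
            key_sessions[-1][2] += 1
        else:
            # Gap exceeded (or first event for this key): open a new session.
            key_sessions.append([ts, ts, 1])
    return dict(sessions)


# Hypothetical page views: alice is active twice, bob once.
views = [("alice", 0), ("alice", 60_000), ("bob", 90_000), ("alice", 600_000)]
result = session_counts(views)
```

In this sample, alice's first two views fall into one session and her third view, nine minutes later, opens a second one.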
15. 15
KSQL: a Streaming SQL Engine for Apache Kafka® from Confluent
• No coding required, all you need is SQL
• No separate processing cluster required
• Powered by Kafka: elastic, scalable, distributed, battle-tested
CREATE TABLE possible_fraud AS
    SELECT card_number, count(*)
    FROM authorization_attempts
    WINDOW TUMBLING (SIZE 5 SECONDS)
    GROUP BY card_number
    HAVING count(*) > 3;

CREATE STREAM vip_actions AS
    SELECT userid, page, action
    FROM clickstream c
    LEFT JOIN users u
    ON c.userid = u.userid
    WHERE u.level = 'Platinum';
KSQL is the simplest way to process streams of data in real-time
• Perfect for streaming ETL, anomaly detection, event monitoring, and more
• Part of Confluent Open Source
https://github.com/confluentinc/ksql
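As an illustration of what the `possible_fraud` query computes, the following Python sketch groups authorization attempts into fixed 5-second tumbling windows and keeps the card and window pairs with more than three attempts. It is a batch approximation under stated assumptions, not how KSQL executes the query: the input shape `(card_number, timestamp_ms)`, the function name `possible_fraud`, and the sample data are all hypothetical.

```python
from collections import Counter

WINDOW_MS = 5_000  # WINDOW TUMBLING (SIZE 5 SECONDS)
THRESHOLD = 3      # HAVING count(*) > 3


def possible_fraud(attempts, window_ms=WINDOW_MS, threshold=THRESHOLD):
    """Return {(card_number, window_start_ms): count} for suspicious windows."""
    counts = Counter()
    for card_number, ts in attempts:
        # A tumbling window is identified by the start of its fixed interval.
        window_start = ts - (ts % window_ms)
        counts[(card_number, window_start)] += 1
    # Keep only the groups that exceed the threshold.
    return {group: n for group, n in counts.items() if n > threshold}


# Hypothetical attempts: four for card 4111 inside the first window.
attempts = [
    ("4111", 100), ("4111", 1_200), ("4111", 2_500), ("4111", 4_900),
    ("4111", 5_100), ("5500", 300),
]
flagged = possible_fraud(attempts)
```

Only card 4111 in the window starting at 0 ms is flagged; its fifth attempt lands in the next window and is counted separately.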
16. 16
KSQL in less than 5 minutes
https://www.youtube.com/watch?v=A45uRzJiv7I
18. 18
The Connect API of Apache Kafka®
• Centralized management and configuration
• Support for hundreds of technologies including RDBMS, Elasticsearch, HDFS, S3
• Supports CDC ingest of events from RDBMS
• Preserves data schema
• Fault tolerant and automatically load balanced
• Extensible API
• Single Message Transforms
• Part of Apache Kafka, included in Confluent Open Source
Reliable and scalable integration of Kafka with other systems; no coding required.
{
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:mysql://localhost:3306/demo?user=rmoff&password=foo",
    "table.whitelist": "sales,orders,customers"
}
https://docs.confluent.io/current/connect/
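A connector configuration like the one on this slide is typically registered with a Kafka Connect worker through its REST API, by sending a `POST` to the `/connectors` endpoint with a JSON body holding a `name` and a `config` object. The Python sketch below only builds that request. The worker address `http://localhost:8083`, the connector name `jdbc-source-demo`, and the helper `build_create_request` are assumptions for illustration. The slide's configuration is abbreviated, so a real deployment would likely need further settings (for example the JDBC source connector's `mode` and `topic.prefix` options).

```python
import json
import urllib.request

# Assumption: a Connect worker listening on its usual REST port.
CONNECT_URL = "http://localhost:8083/connectors"


def build_create_request(name, config, url=CONNECT_URL):
    """Build (but do not send) the HTTP request that registers a connector."""
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Configuration copied from the slide.
jdbc_config = {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:mysql://localhost:3306/demo?user=rmoff&password=foo",
    "table.whitelist": "sales,orders,customers",
}

request = build_create_request("jdbc-source-demo", jdbc_config)
# To submit it against a running worker: urllib.request.urlopen(request)
```

Keeping the request construction separate from the network call makes the payload easy to inspect before anything is sent to a worker.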
19. 19
Benefits of Kafka Connect
[Diagram: sources and sinks (JDBC, Oracle, MySQL, Elastic, Couchbase, HDFS), each attached through a connector to the Kafka Connect API and the Kafka pipeline.]
• Fault tolerant
• Manage hundreds of data sources and sinks
• Preserves data schema
• Part of Apache Kafka project
• Integrated within Confluent Platform’s Control Center
20. 20
Confluent Elasticsearch Connector
• Easily move data from Kafka to Elasticsearch
• Open Source, ASL licensed
• Key Features:
    • Exactly Once Delivery
    • Mapping Inference
    • Schema Evolution
[Diagram: JDBC, Oracle, MySQL and Elastic attached to the Kafka Connect API; Elastic is reached through the Elasticsearch Connector.]
Documentation:
http://docs.confluent.io/current/connect/connect-elasticsearch/docs/elasticsearch_connector.html
Source code:
https://github.com/confluentinc/kafka-connect-elasticsearch
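The sketch below illustrates two things under stated assumptions. First, it shows what a minimal sink configuration for this connector might look like; the Elasticsearch address, topic name, and `type.name` value are placeholders. Second, it shows the idea behind exactly-once delivery: when record keys are ignored, the connector derives the document ID from the record's topic, partition, and offset, so a redelivered record overwrites the same document instead of creating a duplicate. The in-memory `index` dict and the helpers `document_id` and `write` are hypothetical stand-ins for Elasticsearch, not the connector's code.

```python
# Assumed example configuration; host, topic and type name are placeholders.
es_sink_config = {
    "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
    "connection.url": "http://localhost:9200",
    "topics": "orders",
    "type.name": "kafka-connect",
    "key.ignore": "true",
}


def document_id(topic, partition, offset):
    """Deterministic ID built from the record's coordinates in Kafka."""
    return f"{topic}+{partition}+{offset}"


def write(index, topic, partition, offset, value):
    """Idempotent write: the same record always lands on the same ID."""
    index[document_id(topic, partition, offset)] = value


index = {}  # stand-in for an Elasticsearch index
records = [("orders", 0, 0, {"id": 1}), ("orders", 0, 1, {"id": 2})]

# Simulate a retry: the last record is delivered twice.
for record in records + records[-1:]:
    write(index, *record)
```

After the simulated retry the index still holds two documents, because the duplicate delivery maps to an ID that already exists.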
22. 22
Cross Data Center Replication
[Diagram: a Kafka cluster in Data Center A replicated to a Kafka cluster in Data Center B by Confluent Replicator, which runs on the Kafka Connect API.]
Low latency, real-time data replication
23. 23
Confluent Replicator: Logical Architecture
[Diagram: two data centers with identical layouts. Legend: "Available only with Confluent Enterprise" and "Apache Kafka and Confluent Open Source".]
• Data Center in USA: Kafka Cluster (USA) with Kafka Brokers 1-3 and ZooKeeper 1-3; Control Center; Kafka Connect Cluster running Replicator 1 and Replicator 2
• Data Center in EMEA: Kafka Cluster (EU) with Kafka Brokers 1-3 and ZooKeeper 1-3; Control Center; Kafka Connect Cluster running Replicator 1 and Replicator 2
https://docs.confluent.io/current/connect/connect-replicator/docs/
24. 24
Some of the Connectors Available
Categories shown: Databases, Datastore/File Store, Analytics, Applications / Other
https://www.confluent.io/product/connectors/
25. 25
The Easiest Way to get You Started!
https://www.confluent.io/download/
27. 27
Books (freely available in PDF from Confluent website)
https://www.confluent.io/apache-kafka-stream-processing-book-bundle
https://www.confluent.io/designing-event-driven-systems
28. 28
“Confluent + Elastic: a simple, scalable and flexible solution
that delivers data and actionable insights in real-time.”
Focusing on Customer Success
• Enterprise grade distribution of Kafka
• Easy stream processing at scale
• Simple, reliable, secure and auditable
• Fast and scalable
• Easy to operate
• Enterprise grade security