Connecting Kafka Message
Systems with Scylla
Maheedhar Gunturu, Solutions Architect
Presenter
Maheedhar Gunturu, Solutions Architect
Maheedhar has held senior roles in both engineering and sales
organizations. He has over a decade of experience designing and
developing server-side applications in the cloud and working on
big data and ETL frameworks at companies such as Samsung,
MapR, Apple, VoltDB, Zscaler and Qualcomm. He holds a
master's degree in Electrical and Computer Engineering from
the University of Texas at San Antonio.
Agenda
1. Benefits of Message Queues
2. Kafka Connect Framework
Benefits of Message Queues
■ Centralized Infrastructure.
■ Intermediate layer for buffering.
■ Export and Import capabilities.
■ Publish CDC streams.
■ Integrate with various applications.
■ Streaming Data Transformations.
■ Impedance mismatch between applications.
■ Ability to recreate state.
Centralized Infrastructure
[Diagram: producers and consumers connected through streams of real-time events. Labels: microservices, apps, operational applications, data warehouse, databases, database changes, microservice events, SaaS data, operational alerts, stream processing apps.]
Intermediate layer for buffering.
■ Provides flexibility for downstream Consumers.
● Buffer data during upgrades, migrations, or troubleshooting.
■ Downstream systems don't have to be provisioned for peak traffic
● Save hardware costs.
■ Dynamically scalable layer to handle bursty loads
● Add more partitions/brokers to increase parallelism and throughput.
● Use a Kafka operator to scale the cluster dynamically based on ingress traffic.
■ Provides resiliency and fault tolerance
● Each topic is split into multiple partitions, with replicas spread across multiple brokers.
● Set TTLs at the topic level to determine retention.
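The buffering argument above can be sketched with a small simulation. This is a minimal illustration, not Kafka code: the function name, the tick-based model, and the traffic numbers are all assumptions made up for the example. It shows a consumer provisioned for the average rate (4 messages per tick) absorbing a burst that peaks at 10 messages per tick, with the backlog held in the intermediate layer.

```python
def simulate_buffer(arrivals_per_tick, consumer_rate):
    """Return the backlog held in the buffer after each tick.

    arrivals_per_tick: messages produced in each time step (hypothetical numbers).
    consumer_rate: maximum messages the downstream system handles per tick.
    """
    backlog = 0
    history = []
    for arrivals in arrivals_per_tick:
        backlog += arrivals                    # producer writes into the buffer
        backlog -= min(consumer_rate, backlog) # consumer drains at its own pace
        history.append(backlog)
    return history


# Bursty producer: peak of 10 messages per tick, average of 4 per tick.
traffic = [4, 10, 10, 0, 0, 0]
backlog_sizes = simulate_buffer(traffic, consumer_rate=4)
print(backlog_sizes)  # [0, 6, 12, 8, 4, 0]
```

The backlog grows during the burst and drains afterwards, so the consumer never needs capacity for the peak. The retention setting only has to be long enough to cover the time the backlog takes to drain.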
Export and Import capabilities.
Publish CDC streams
■ Publish record level changes to the corresponding Topics.
● The level of detail in the change records is usually configurable.
■ Upstream changes to watched rows are emitted as change records.
● The format of these records is configurable (JSON, Avro, etc.).
■ Downstream processing for reporting, caching, or full-text indexing.
● Subscribe with the corresponding consumer (e.g. Scylla, Elasticsearch, Spark).
■ Changefeeds are emitted with at-least-once delivery guarantees.
● In most cases, each version of a row will be emitted once. However, some infrequent
conditions (e.g., node failures, network partitions) will cause them to be repeated.
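Because delivery is at-least-once, a consumer has to tolerate repeated change records. One common way to do that is to make the write idempotent. The sketch below assumes each record carries a per-key `version` that only increases; that field, the record layout, and the function name are hypothetical and not taken from any specific CDC format.

```python
def apply_changes(table, change_records):
    """Apply change records idempotently and return how many were applied.

    A record is skipped when the table already holds the same or a newer
    version for that key, so a redelivered record has no effect.
    """
    applied = 0
    for record in change_records:
        key, version = record["key"], record["version"]
        current = table.get(key)
        if current is not None and current["version"] >= version:
            continue  # duplicate or stale delivery
        table[key] = {"version": version, "value": record["value"]}
        applied += 1
    return applied


feed = [
    {"key": "sensor-1", "version": 1, "value": 20.5},
    {"key": "sensor-1", "version": 2, "value": 21.0},
    {"key": "sensor-1", "version": 2, "value": 21.0},  # repeated after a retry
    {"key": "sensor-2", "version": 1, "value": 18.2},
]
table = {}
applied = apply_changes(table, feed)
print(applied)                     # 3
print(table["sensor-1"]["value"])  # 21.0
```

Replaying the whole feed a second time applies nothing, which is the property a sink needs when the changefeed repeats records after a node failure or network partition.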
Integrations
[Diagram: App 1, App 2, and App 3 write to a Kafka topic through serializers backed by a Schema Registry. Example consumers: Scylla, Mongo, Elastic, HBase.]
Schema Registry
● Define the expected fields for each Kafka topic
● Automatically handle schema changes (e.g. new fields)
● Prevent backwards-incompatible changes
● Support multi-data center environments
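To make "prevent backwards-incompatible changes" concrete, here is a deliberately simplified compatibility check. It is a sketch under stated assumptions, not the Schema Registry's actual algorithm: schemas are modelled as plain dictionaries, the rule set covers only added fields and changed types, and the function and field names are hypothetical. Real Avro schema resolution has more rules, such as type promotion.

```python
def is_backward_compatible(old_schema, new_schema):
    """Return True if a reader using new_schema can read old_schema data.

    Simplified rules (assumptions for this sketch):
    - a field added in new_schema must declare a default;
    - a field present in both schemas must keep its type;
    - a field removed in new_schema is ignored by the reader.
    """
    for name, spec in new_schema.items():
        if name not in old_schema:
            if "default" not in spec:
                return False
        elif old_schema[name]["type"] != spec["type"]:
            return False
    return True


v1 = {"id": {"type": "string"}, "temperature": {"type": "double"}}
v2_ok = {**v1, "unit": {"type": "string", "default": "C"}}
v2_bad = {**v1, "unit": {"type": "string"}}  # new field without a default

print(is_backward_compatible(v1, v2_ok))   # True
print(is_backward_compatible(v1, v2_bad))  # False
```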
Streaming Data Transformations.
[Diagram: a Streams API application sits between a producer and consumers, reading from and writing to topics in a Kafka cluster.]
Overview
• Write standard Java applications
• No separate processing cluster required
• Exactly-once processing semantics
• Elastic, highly scalable, fault-tolerant
• Fully integrated with Kafka security
Example Use Cases
• Event-Driven Microservices
• Continuous queries
• Continuous transformations
Impedance mismatch between applications.
■ Applications produce and consume data at different rates.
● Provides flexibility for the downstream applications to scale based on their SLAs.
■ Downstream applications can be independently scaled.
● Dynamically move partitions to optimize resource utilization and reliability.
● Enable elastic scaling by easily adding and removing nodes from your Kafka cluster.
■ Tuning a topic's configuration helps use consumers efficiently.
● Determine the ratio between the number of partitions in a topic and the number of consumers.
● ADB traffic is throttled during data transfers to preserve network bandwidth.
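The partition-to-consumer ratio matters because, within a consumer group, each partition is read by at most one consumer. The sketch below uses a plain round-robin assignment to show the effect; it is an illustration only, and Kafka's real assignors (range, round-robin, sticky) differ in detail. The function and consumer names are hypothetical.

```python
def assign_round_robin(num_partitions, consumers):
    """Assign partition numbers to consumers in round-robin order."""
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


group = ["consumer-a", "consumer-b", "consumer-c"]

balanced = assign_round_robin(6, group)
print(balanced)
# {'consumer-a': [0, 3], 'consumer-b': [1, 4], 'consumer-c': [2, 5]}

starved = assign_round_robin(2, group)
print(starved)
# {'consumer-a': [0], 'consumer-b': [1], 'consumer-c': []}
```

With fewer partitions than consumers, the extra consumer sits idle, so adding consumers beyond the partition count does not add throughput.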
Event Sourcing
■ Every change to the state of an application is captured in an event
object.
● Order of the events needs to be maintained.
■ Ability to recreate state in your application and the supporting
database.
● CQRS provides the benefit of event sourcing, with the read model analogous to a materialized view.
● Need to keep track of lineage and the transformations that were run on the data.
■ Newer versions of ML algorithms can operate on the raw Event data
to recreate the state in the database.
● Better model serving/benchmarking.
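Recreating state from events can be sketched as a fold over the ordered event log. This is a minimal illustration assuming a hypothetical event shape with a `seq` number and two event types (`set` and `delete`); a real system would read the events from a Kafka topic instead of a list.

```python
def replay(events):
    """Rebuild state by applying events in sequence order."""
    state = {}
    for event in sorted(events, key=lambda e: e["seq"]):
        if event["type"] == "set":
            state[event["key"]] = event["value"]
        elif event["type"] == "delete":
            state.pop(event["key"], None)
    return state


# Events listed out of order on purpose: replay sorts them by seq first.
events = [
    {"seq": 1, "type": "set", "key": "temp", "value": 20},
    {"seq": 3, "type": "delete", "key": "humidity"},
    {"seq": 2, "type": "set", "key": "humidity", "value": 40},
    {"seq": 4, "type": "set", "key": "temp", "value": 22},
]
state = replay(events)
print(state)  # {'temp': 22}
```

Sorting by `seq` is what keeps the result deterministic. Applying the `delete` before the `set` for `humidity` would leave a different final state, which is why event order has to be maintained.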
Kafka Connect Framework
Scylla & Confluent
Kafka Connect Features
1. A standard framework for Kafka connectors.
2. Distributed and standalone modes.
3. REST interface for configuration (port 8083).
4. Distributed and scalable by default.
5. Automatic offset management.
6. Streaming/batch integration.
Kafka Connect API
[Diagram: source Connect workers feed a Kafka pipeline and sink Connect workers read from it, all through the Kafka Connect API. Systems shown: CDC, Database, Mongo, Cassandra, Elastic, Scylla, HDFS.]
■ Auto-recovery and fault tolerance
■ Manage hundreds of data sources and sinks
■ Preserves data schema
■ Integrated within Confluent Control Center
■ Simple parallelism
Configuring Kafka Connect (sink)
# Sample cassandra-sink.properties file
name=sink
topics=temperature
tasks.max=1
connector.class=io.confluent.connect.cassandra.CassandraSinkConnector
cassandra.contact.points=<PUBLIC IPs of your SCYLLA Cluster (IP1,IP2,IP3)>
cassandra.keyspace=demo
cassandra.compression=SNAPPY
cassandra.consistency.level=LOCAL_QUORUM
transforms=prune
transforms.prune.type=org.apache.kafka.connect.transforms.ReplaceField$Value
transforms.prune.whitelist=CreatedAt,Id,Text,Source,Truncated
1. Update the sink.properties file.
2. Update the connect-distributed.properties file.
3. Start the Connect framework using the Cassandra connector in distributed mode.
ref: https://www.scylladb.com/2018/12/19/scylla-and-confluent-for-iot/
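In distributed mode, connectors are typically created through the Connect REST interface rather than from a properties file on the command line. The sketch below converts a shortened copy of the sample properties into the JSON body for a `POST /connectors` request. It only builds the request and does not send it. The host `localhost`, the helper names, and the trimmed property list are assumptions for illustration.

```python
import json
import urllib.request


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip()
    return config


def build_create_request(base_url, properties_text):
    """Build (but do not send) a POST /connectors request."""
    config = parse_properties(properties_text)
    name = config.pop("name")
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/connectors",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Shortened version of the sample file; contact points omitted on purpose.
SINK_PROPERTIES = """\
# sample cassandra-sink.properties
name=sink
topics=temperature
tasks.max=1
connector.class=io.confluent.connect.cassandra.CassandraSinkConnector
cassandra.keyspace=demo
"""

# Port 8083 is the Connect REST port; localhost is an assumed worker address.
request = build_create_request("http://localhost:8083", SINK_PROPERTIES)
payload = json.loads(request.data)
print(request.get_method(), request.full_url)
print(payload["name"], payload["config"]["topics"])
```

Sending the request would be a matter of passing it to `urllib.request.urlopen` against a running Connect worker, ideally through the secured gateway discussed on the security slide.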
Kafka Connect Security
Encryption
■ Kafka Connect also works with SSL-encrypted connections to the Kafka brokers.
Authentication
■ Kafka Connect works with SASL – e.g. Kerberos, Active Directory
Authorization
■ Restrict who can create, write to, read from topics, and more
■ The REST API for Kafka Connect nodes is not secure.
● An external proxy (e.g. Apache HTTP Server) is required to act as a secure gateway to the REST services when configuring a secure cluster.
Confluent Hub
■ Discover and share connectors.
■ Cassandra (OSS) and DynamoDB source/sink connectors available.
■ Scylla shard-aware connector to be published soon!
https://www.confluent.io/hub/confluentinc/kafka-connect-cassandra
Takeaways
■ Message queues are useful for a variety of reasons.
■ The Scylla Kafka connector (sink and CDC source) will be coming out soon!
■ Event streaming and event-driven microservices are useful - try them out!
Thank you Stay in touch
Any questions?
Maheedhar Gunturu
maheedhar@scylladb.com
@vanguard_space
Some useful links
Here are some useful links for further reading/watching.
1. Useful video explaining most things for a low-level understanding – https://www.confluent.io/kafka-summit-sf18/so-you-want-to-write-a-connector
2. Confluent's developer guide to connectors, which covers most basics – https://docs.confluent.io/current/connect/devguide.html
3. The source for the above developer guide is available through Maven here – https://mvnrepository.com/artifact/org.apache.kafka/connect-file/2.1.1
4. Useful guide providing additional best practices (now deprecated, though still useful) – https://docs.google.com/document/d/1jEn_G-KDsrhdecPTGIWIcke1I4gw4fR0G8OVj8e3iAI/edit#
5. Verification guide, though a little generic as it covers both connectors and consumers/producers – https://www.confluent.io/wp-content/uploads/Verification-Guide-Confluent-Platform-Connectors-Integrations.pdf
6. https://opencredo.com/blogs/kafka-connect-source-connectors-a-detailed-guide-to-connecting-to-what-you-love/
7. https://www.confluent.io/blog/simplest-useful-kafka-connect-data-pipeline-world-thereabouts-part-1/
8. https://www.confluent.io/blog/the-simplest-useful-kafka-connect-data-pipeline-in-the-world-or-thereabouts-part-2/
9. https://www.confluent.io/blog/simplest-useful-kafka-connect-data-pipeline-world-thereabouts-part-3/

More Related Content

What's hot

Cassandra - Tips And Techniques
Cassandra - Tips And TechniquesCassandra - Tips And Techniques
Cassandra - Tips And Techniques
Knoldus Inc.
 
Scylla Summit 2022: Operating at Monstrous Scales: Benchmarking Petabyte Work...
Scylla Summit 2022: Operating at Monstrous Scales: Benchmarking Petabyte Work...Scylla Summit 2022: Operating at Monstrous Scales: Benchmarking Petabyte Work...
Scylla Summit 2022: Operating at Monstrous Scales: Benchmarking Petabyte Work...
ScyllaDB
 
Deploying Kafka at Dropbox, Mark Smith, Sean Fellows
Deploying Kafka at Dropbox, Mark Smith, Sean FellowsDeploying Kafka at Dropbox, Mark Smith, Sean Fellows
Deploying Kafka at Dropbox, Mark Smith, Sean Fellows
confluent
 
Seastar Summit 2019 Keynote
Seastar Summit 2019 KeynoteSeastar Summit 2019 Keynote
Seastar Summit 2019 Keynote
ScyllaDB
 
Scylla Summit 2019 Keynote - Dor Laor - Beyond Cassandra
Scylla Summit 2019 Keynote - Dor Laor - Beyond CassandraScylla Summit 2019 Keynote - Dor Laor - Beyond Cassandra
Scylla Summit 2019 Keynote - Dor Laor - Beyond Cassandra
ScyllaDB
 
Apache Kafka at LinkedIn
Apache Kafka at LinkedInApache Kafka at LinkedIn
Apache Kafka at LinkedIn
Discover Pinterest
 
From Three Nines to Five Nines - A Kafka Journey
From Three Nines to Five Nines - A Kafka JourneyFrom Three Nines to Five Nines - A Kafka Journey
From Three Nines to Five Nines - A Kafka Journey
Allen (Xiaozhong) Wang
 
Kafka Summit NYC 2017 Introduction to Kafka Streams with a Real-life Example
Kafka Summit NYC 2017 Introduction to Kafka Streams with a Real-life ExampleKafka Summit NYC 2017 Introduction to Kafka Streams with a Real-life Example
Kafka Summit NYC 2017 Introduction to Kafka Streams with a Real-life Example
confluent
 
Administrative techniques to reduce Kafka costs | Anna Kepler, Viasat
Administrative techniques to reduce Kafka costs | Anna Kepler, ViasatAdministrative techniques to reduce Kafka costs | Anna Kepler, Viasat
Administrative techniques to reduce Kafka costs | Anna Kepler, Viasat
HostedbyConfluent
 
Scylla Summit 2016: Scylla at Samsung SDS
Scylla Summit 2016: Scylla at Samsung SDSScylla Summit 2016: Scylla at Samsung SDS
Scylla Summit 2016: Scylla at Samsung SDS
ScyllaDB
 
Scylla Summit 2018: Scylla Feature Talks - Scylla Streaming and Repair Updates
Scylla Summit 2018: Scylla Feature Talks - Scylla Streaming and Repair UpdatesScylla Summit 2018: Scylla Feature Talks - Scylla Streaming and Repair Updates
Scylla Summit 2018: Scylla Feature Talks - Scylla Streaming and Repair Updates
ScyllaDB
 
How netflix manages petabyte scale apache cassandra in the cloud
How netflix manages petabyte scale apache cassandra in the cloudHow netflix manages petabyte scale apache cassandra in the cloud
How netflix manages petabyte scale apache cassandra in the cloud
Vinay Kumar Chella
 
DataEngConf SF16 - BYOMQ: Why We [re]Built IronMQ
DataEngConf SF16 - BYOMQ: Why We [re]Built IronMQDataEngConf SF16 - BYOMQ: Why We [re]Built IronMQ
DataEngConf SF16 - BYOMQ: Why We [re]Built IronMQ
Hakka Labs
 
hbaseconasia2017: Building online HBase cluster of Zhihu based on Kubernetes
hbaseconasia2017: Building online HBase cluster of Zhihu based on Kuberneteshbaseconasia2017: Building online HBase cluster of Zhihu based on Kubernetes
hbaseconasia2017: Building online HBase cluster of Zhihu based on Kubernetes
HBaseCon
 
Performance Monitoring: Understanding Your Scylla Cluster
Performance Monitoring: Understanding Your Scylla ClusterPerformance Monitoring: Understanding Your Scylla Cluster
Performance Monitoring: Understanding Your Scylla Cluster
ScyllaDB
 
Kickstart your Kafka with Faker Data | Francesco Tisiot, Aiven.io
Kickstart your Kafka with Faker Data | Francesco Tisiot, Aiven.ioKickstart your Kafka with Faker Data | Francesco Tisiot, Aiven.io
Kickstart your Kafka with Faker Data | Francesco Tisiot, Aiven.io
HostedbyConfluent
 
Scylla Summit 2016: Why Kenshoo is about to displace Cassandra with Scylla
Scylla Summit 2016: Why Kenshoo is about to displace Cassandra with ScyllaScylla Summit 2016: Why Kenshoo is about to displace Cassandra with Scylla
Scylla Summit 2016: Why Kenshoo is about to displace Cassandra with Scylla
ScyllaDB
 
MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...
MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...
MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...
ScyllaDB
 
Lookout on Scaling Security to 100 Million Devices
Lookout on Scaling Security to 100 Million DevicesLookout on Scaling Security to 100 Million Devices
Lookout on Scaling Security to 100 Million Devices
ScyllaDB
 
Cassandra Summit 2014: Active-Active Cassandra Behind the Scenes
Cassandra Summit 2014: Active-Active Cassandra Behind the ScenesCassandra Summit 2014: Active-Active Cassandra Behind the Scenes
Cassandra Summit 2014: Active-Active Cassandra Behind the Scenes
DataStax Academy
 

What's hot (20)

Cassandra - Tips And Techniques
Cassandra - Tips And TechniquesCassandra - Tips And Techniques
Cassandra - Tips And Techniques
 
Scylla Summit 2022: Operating at Monstrous Scales: Benchmarking Petabyte Work...
Scylla Summit 2022: Operating at Monstrous Scales: Benchmarking Petabyte Work...Scylla Summit 2022: Operating at Monstrous Scales: Benchmarking Petabyte Work...
Scylla Summit 2022: Operating at Monstrous Scales: Benchmarking Petabyte Work...
 
Deploying Kafka at Dropbox, Mark Smith, Sean Fellows
Deploying Kafka at Dropbox, Mark Smith, Sean FellowsDeploying Kafka at Dropbox, Mark Smith, Sean Fellows
Deploying Kafka at Dropbox, Mark Smith, Sean Fellows
 
Seastar Summit 2019 Keynote
Seastar Summit 2019 KeynoteSeastar Summit 2019 Keynote
Seastar Summit 2019 Keynote
 
Scylla Summit 2019 Keynote - Dor Laor - Beyond Cassandra
Scylla Summit 2019 Keynote - Dor Laor - Beyond CassandraScylla Summit 2019 Keynote - Dor Laor - Beyond Cassandra
Scylla Summit 2019 Keynote - Dor Laor - Beyond Cassandra
 
Apache Kafka at LinkedIn
Apache Kafka at LinkedInApache Kafka at LinkedIn
Apache Kafka at LinkedIn
 
From Three Nines to Five Nines - A Kafka Journey
From Three Nines to Five Nines - A Kafka JourneyFrom Three Nines to Five Nines - A Kafka Journey
From Three Nines to Five Nines - A Kafka Journey
 
Kafka Summit NYC 2017 Introduction to Kafka Streams with a Real-life Example
Kafka Summit NYC 2017 Introduction to Kafka Streams with a Real-life ExampleKafka Summit NYC 2017 Introduction to Kafka Streams with a Real-life Example
Kafka Summit NYC 2017 Introduction to Kafka Streams with a Real-life Example
 
Administrative techniques to reduce Kafka costs | Anna Kepler, Viasat
Administrative techniques to reduce Kafka costs | Anna Kepler, ViasatAdministrative techniques to reduce Kafka costs | Anna Kepler, Viasat
Administrative techniques to reduce Kafka costs | Anna Kepler, Viasat
 
Scylla Summit 2016: Scylla at Samsung SDS
Scylla Summit 2016: Scylla at Samsung SDSScylla Summit 2016: Scylla at Samsung SDS
Scylla Summit 2016: Scylla at Samsung SDS
 
Scylla Summit 2018: Scylla Feature Talks - Scylla Streaming and Repair Updates
Scylla Summit 2018: Scylla Feature Talks - Scylla Streaming and Repair UpdatesScylla Summit 2018: Scylla Feature Talks - Scylla Streaming and Repair Updates
Scylla Summit 2018: Scylla Feature Talks - Scylla Streaming and Repair Updates
 
How netflix manages petabyte scale apache cassandra in the cloud
How netflix manages petabyte scale apache cassandra in the cloudHow netflix manages petabyte scale apache cassandra in the cloud
How netflix manages petabyte scale apache cassandra in the cloud
 
DataEngConf SF16 - BYOMQ: Why We [re]Built IronMQ
DataEngConf SF16 - BYOMQ: Why We [re]Built IronMQDataEngConf SF16 - BYOMQ: Why We [re]Built IronMQ
DataEngConf SF16 - BYOMQ: Why We [re]Built IronMQ
 
hbaseconasia2017: Building online HBase cluster of Zhihu based on Kubernetes
hbaseconasia2017: Building online HBase cluster of Zhihu based on Kuberneteshbaseconasia2017: Building online HBase cluster of Zhihu based on Kubernetes
hbaseconasia2017: Building online HBase cluster of Zhihu based on Kubernetes
 
Performance Monitoring: Understanding Your Scylla Cluster
Performance Monitoring: Understanding Your Scylla ClusterPerformance Monitoring: Understanding Your Scylla Cluster
Performance Monitoring: Understanding Your Scylla Cluster
 
Kickstart your Kafka with Faker Data | Francesco Tisiot, Aiven.io
Kickstart your Kafka with Faker Data | Francesco Tisiot, Aiven.ioKickstart your Kafka with Faker Data | Francesco Tisiot, Aiven.io
Kickstart your Kafka with Faker Data | Francesco Tisiot, Aiven.io
 
Scylla Summit 2016: Why Kenshoo is about to displace Cassandra with Scylla
Scylla Summit 2016: Why Kenshoo is about to displace Cassandra with ScyllaScylla Summit 2016: Why Kenshoo is about to displace Cassandra with Scylla
Scylla Summit 2016: Why Kenshoo is about to displace Cassandra with Scylla
 
MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...
MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...
MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...
 
Lookout on Scaling Security to 100 Million Devices
Lookout on Scaling Security to 100 Million DevicesLookout on Scaling Security to 100 Million Devices
Lookout on Scaling Security to 100 Million Devices
 
Cassandra Summit 2014: Active-Active Cassandra Behind the Scenes
Cassandra Summit 2014: Active-Active Cassandra Behind the ScenesCassandra Summit 2014: Active-Active Cassandra Behind the Scenes
Cassandra Summit 2014: Active-Active Cassandra Behind the Scenes
 

Similar to Connecting kafka message systems with scylla

Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using Kafka
Knoldus Inc.
 
BBL KAPPA Lesfurets.com
BBL KAPPA Lesfurets.comBBL KAPPA Lesfurets.com
BBL KAPPA Lesfurets.com
Cedric Vidal
 
Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...
Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...
Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...
DataStax Academy
 
A prototype of utilizing Apache Kafka and Lightweight M2M protocol as the bac...
A prototype of utilizing Apache Kafka and Lightweight M2M protocol as the bac...A prototype of utilizing Apache Kafka and Lightweight M2M protocol as the bac...
A prototype of utilizing Apache Kafka and Lightweight M2M protocol as the bac...
Christos Vasilakis
 
Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?
Anton Nazaruk
 
Event Streaming Architectures with Confluent and ScyllaDB
Event Streaming Architectures with Confluent and ScyllaDBEvent Streaming Architectures with Confluent and ScyllaDB
Event Streaming Architectures with Confluent and ScyllaDB
ScyllaDB
 
Captial One: Why Stream Data as Part of Data Transformation?
Captial One: Why Stream Data as Part of Data Transformation?Captial One: Why Stream Data as Part of Data Transformation?
Captial One: Why Stream Data as Part of Data Transformation?
ScyllaDB
 
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
Helena Edelson
 
Redpanda and ClickHouse
Redpanda and ClickHouseRedpanda and ClickHouse
Redpanda and ClickHouse
Altinity Ltd
 
SMACK Stack - Fast Data Done Right by Stefan Siprell at Codemotion Dubai
SMACK Stack - Fast Data Done Right by Stefan Siprell at Codemotion DubaiSMACK Stack - Fast Data Done Right by Stefan Siprell at Codemotion Dubai
SMACK Stack - Fast Data Done Right by Stefan Siprell at Codemotion Dubai
Codemotion Dubai
 
Implementing Domain Events with Kafka
Implementing Domain Events with KafkaImplementing Domain Events with Kafka
Implementing Domain Events with Kafka
Andrei Rugina
 
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Guido Schmutz
 
Building Streaming Data Applications Using Apache Kafka
Building Streaming Data Applications Using Apache KafkaBuilding Streaming Data Applications Using Apache Kafka
Building Streaming Data Applications Using Apache Kafka
Slim Baltagi
 
Event Driven Architectures with Apache Kafka
Event Driven Architectures with Apache KafkaEvent Driven Architectures with Apache Kafka
Event Driven Architectures with Apache Kafka
Matt Masuda
 
Timothy Spann: Apache Pulsar for ML
Timothy Spann: Apache Pulsar for MLTimothy Spann: Apache Pulsar for ML
Timothy Spann: Apache Pulsar for ML
Edunomica
 
How Kafka and MemSQL Became the Dynamic Duo (Sarung Tripathi, MemSQL) Kafka S...
How Kafka and MemSQL Became the Dynamic Duo (Sarung Tripathi, MemSQL) Kafka S...How Kafka and MemSQL Became the Dynamic Duo (Sarung Tripathi, MemSQL) Kafka S...
How Kafka and MemSQL Became the Dynamic Duo (Sarung Tripathi, MemSQL) Kafka S...
HostedbyConfluent
 
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
Timothy Spann
 
Microservices Integration Patterns with Kafka
Microservices Integration Patterns with KafkaMicroservices Integration Patterns with Kafka
Microservices Integration Patterns with Kafka
Kasun Indrasiri
 
Building streaming data applications using Kafka*[Connect + Core + Streams] b...
Building streaming data applications using Kafka*[Connect + Core + Streams] b...Building streaming data applications using Kafka*[Connect + Core + Streams] b...
Building streaming data applications using Kafka*[Connect + Core + Streams] b...
Data Con LA
 
Netflix Keystone Pipeline at Big Data Bootcamp, Santa Clara, Nov 2015
Netflix Keystone Pipeline at Big Data Bootcamp, Santa Clara, Nov 2015Netflix Keystone Pipeline at Big Data Bootcamp, Santa Clara, Nov 2015
Netflix Keystone Pipeline at Big Data Bootcamp, Santa Clara, Nov 2015
Monal Daxini
 

Similar to Connecting kafka message systems with scylla (20)

Stream processing using Kafka
Stream processing using KafkaStream processing using Kafka
Stream processing using Kafka
 
BBL KAPPA Lesfurets.com
BBL KAPPA Lesfurets.comBBL KAPPA Lesfurets.com
BBL KAPPA Lesfurets.com
 
Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...
Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...
Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data...
 
A prototype of utilizing Apache Kafka and Lightweight M2M protocol as the bac...
A prototype of utilizing Apache Kafka and Lightweight M2M protocol as the bac...A prototype of utilizing Apache Kafka and Lightweight M2M protocol as the bac...
A prototype of utilizing Apache Kafka and Lightweight M2M protocol as the bac...
 
Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?
 
Event Streaming Architectures with Confluent and ScyllaDB
Event Streaming Architectures with Confluent and ScyllaDBEvent Streaming Architectures with Confluent and ScyllaDB
Event Streaming Architectures with Confluent and ScyllaDB
 
Captial One: Why Stream Data as Part of Data Transformation?
Captial One: Why Stream Data as Part of Data Transformation?Captial One: Why Stream Data as Part of Data Transformation?
Captial One: Why Stream Data as Part of Data Transformation?
 
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
Fast and Simplified Streaming, Ad-Hoc and Batch Analytics with FiloDB and Spa...
 
Redpanda and ClickHouse
Redpanda and ClickHouseRedpanda and ClickHouse
Redpanda and ClickHouse
 
SMACK Stack - Fast Data Done Right by Stefan Siprell at Codemotion Dubai
SMACK Stack - Fast Data Done Right by Stefan Siprell at Codemotion DubaiSMACK Stack - Fast Data Done Right by Stefan Siprell at Codemotion Dubai
SMACK Stack - Fast Data Done Right by Stefan Siprell at Codemotion Dubai
 
Implementing Domain Events with Kafka
Implementing Domain Events with KafkaImplementing Domain Events with Kafka
Implementing Domain Events with Kafka
 
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
 
Building Streaming Data Applications Using Apache Kafka
Building Streaming Data Applications Using Apache KafkaBuilding Streaming Data Applications Using Apache Kafka
Building Streaming Data Applications Using Apache Kafka
 
Event Driven Architectures with Apache Kafka
Event Driven Architectures with Apache KafkaEvent Driven Architectures with Apache Kafka
Event Driven Architectures with Apache Kafka
 
Timothy Spann: Apache Pulsar for ML
Timothy Spann: Apache Pulsar for MLTimothy Spann: Apache Pulsar for ML
Timothy Spann: Apache Pulsar for ML
 
How Kafka and MemSQL Became the Dynamic Duo (Sarung Tripathi, MemSQL) Kafka S...
How Kafka and MemSQL Became the Dynamic Duo (Sarung Tripathi, MemSQL) Kafka S...How Kafka and MemSQL Became the Dynamic Duo (Sarung Tripathi, MemSQL) Kafka S...
How Kafka and MemSQL Became the Dynamic Duo (Sarung Tripathi, MemSQL) Kafka S...
 
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
 
Microservices Integration Patterns with Kafka
Microservices Integration Patterns with KafkaMicroservices Integration Patterns with Kafka
Microservices Integration Patterns with Kafka
 
Building streaming data applications using Kafka*[Connect + Core + Streams] b...
Building streaming data applications using Kafka*[Connect + Core + Streams] b...Building streaming data applications using Kafka*[Connect + Core + Streams] b...
Building streaming data applications using Kafka*[Connect + Core + Streams] b...
 
Netflix Keystone Pipeline at Big Data Bootcamp, Santa Clara, Nov 2015
Netflix Keystone Pipeline at Big Data Bootcamp, Santa Clara, Nov 2015Netflix Keystone Pipeline at Big Data Bootcamp, Santa Clara, Nov 2015
Netflix Keystone Pipeline at Big Data Bootcamp, Santa Clara, Nov 2015
 

Recently uploaded

Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
GraphSummit Singapore | Neo4j Product Vision & Roadmap - Q2 2024
Neo4j
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
Kumud Singh
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
Zilliz
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Speck&Tech
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems S.M.S.A.
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
DianaGray10
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 

Connecting Kafka Message Systems with Scylla

  • 1. Connecting Kafka Message Systems with Scylla Maheedhar Gunturu, Solutions Architect
  • 2. Presenter Maheedhar Gunturu, Solutions Architect Maheedhar held senior roles both in engineering and sales organizations. He has over a decade of experience designing & developing server-side applications in the cloud and working on big data and ETL frameworks in companies such as Samsung, MapR, Apple, VoltDB, Zscaler and Qualcomm. He holds a masters degree in Electrical and Computer engineering from the University of Texas at San Antonio.
  • 3. Agenda 1. Benefits of Message Queues 2. Kafka Connect Framework
  • 5. Benefits of messaging queues ■ Centralized Infrastructure. ■ Intermediate layer for buffering. ■ Export and Import capabilities. ■ Publish CDC streams. ■ Integrate with various applications. ■ Streaming Data Transformations. ■ Impedance mismatch between applications. ■ Ability to recreate state.
  • 7. Intermediate layer for buffering. ■ Provides flexibility for downstream consumers. ● Buffer data during upgrades, migrations, or troubleshooting. ■ Downstream systems don't have to be provisioned for peak traffic. ● Save hardware costs. ■ Dynamically scalable layer to handle bursty loads. ● Add more partitions/brokers to increase parallelism and throughput. ● Use a Kafka operator to dynamically scale the cluster based on ingress traffic. ■ Provides resiliency and fault tolerance. ● Each topic has replicas available, with multiple partitions spread across multiple brokers. ● Set TTLs at the topic level to determine retention.
  • 8. Export and Import capabilities.
  • 9. Publish CDC streams ■ Publish record-level changes to the corresponding topics. ● The level of detail in the change records is usually configurable. ■ Upstream changes from watched rows are emitted as change records. ● The record format is configurable (JSON, Avro, etc.). ■ Downstream processing for reporting, caching, or full-text indexing. ● Subscribe with the corresponding consumer (e.g. Scylla, Elasticsearch, Spark). ■ Changefeeds are emitted with at-least-once delivery guarantees. ● In most cases, each version of a row will be emitted once. However, some infrequent conditions (e.g., node failures, network partitions) will cause them to be repeated.
  • 10. Integrations [Diagram: Serializer Apps 1-3, Schema Registry, Kafka Topic; example consumers: Scylla, Mongo, Elastic, HBase] Schema Registry: ● Define the expected fields for each Kafka topic ● Automatically handle schema changes (e.g. new fields) ● Prevent backwards-incompatible changes ● Support multi-data center environments
  • 11. Streaming Data Transformations. [Diagram: Streams API with a producer, topics, and consumers in a Kafka cluster] Overview: • Write standard Java applications • No separate processing cluster required • Exactly-once processing semantics • Elastic, highly scalable, fault-tolerant • Fully integrated with Kafka security. Example Use Cases: • Event-driven microservices • Continuous queries • Continuous transformations
  • 12. Impedance mismatch between applications. ■ Applications produce and consume data at different rates. ● Provides flexibility for the downstream applications to scale based on their SLAs. ■ Downstream applications can be independently scaled. ● Dynamically move partitions to optimize resource utilization and reliability. ● Enable elastic scaling by easily adding and removing nodes from your Kafka cluster. ■ Tuning a topic's configuration helps use consumers efficiently. ● Determine the ratio between the number of partitions in a topic and the number of consumers. ● ADB traffic is throttled upon data transfers to ensure network bandwidth.
  • 13. Event Sourcing ■ Every change to the state of an application is captured in an event object. ● The order of the events needs to be maintained. ■ Ability to recreate state in your application and the supporting database. ● CQRS provides the benefit of event sourcing, analogous to a materialized view. ● Need to keep track of lineage and the transformations that were run on the data. ■ Newer versions of ML algorithms can operate on the raw event data to recreate the state in the database. ● Better model serving/benchmarking.
  • 16. Kafka Connect Features ■ A standard framework for Kafka connectors. ■ Distributed and standalone modes. ■ REST interface for configuration (port 8083). ■ Automatic offset management. ■ Distributed & scalable by default. ■ Streaming/batch integration.
  • 17. Kafka Connect API [Diagram: CDC, Database, Mongo, Cassandra, Elastic, Scylla, and HDFS attached as sources and sinks to a Kafka pipeline through Connect workers and the Kafka Connect API] ■ Auto-recovery and fault tolerant ■ Manage hundreds of data sources and sinks ■ Preserves data schema ■ Integrated within Confluent Control Center ■ Simple parallelism
  • 18. Configuring Kafka Connect (sink)
      # sample cassandra-sink.properties file
      name=sink
      topics=temperature
      tasks.max=1
      connector.class=io.confluent.connect.cassandra.CassandraSinkConnector
      cassandra.contact.points=<PUBLIC IPs of your SCYLLA Cluster (IP1,IP2,IP3)>
      cassandra.keyspace=demo
      cassandra.compression=SNAPPY
      cassandra.consistency.level=LOCAL_QUORUM
      transforms=prune
      transforms.prune.type=org.apache.kafka.connect.transforms.ReplaceField$Value
      transforms.prune.whitelist=CreatedAt,Id,Text,Source,Truncated
      1. Update the sink.properties file.
      2. Update the connect-distributed.properties file.
      3. Start the Connect framework using the Cassandra connector in distributed mode.
      ref: https://www.scylladb.com/2018/12/19/scylla-and-confluent-for-iot/
  • 19. Kafka Connect Security Encryption ■ Kafka Connect works with SSL-encrypted connections to the brokers. Authentication ■ Kafka Connect works with SASL (e.g. Kerberos, Active Directory). Authorization ■ Restrict who can create, write to, and read from topics, and more. ■ The REST API for Kafka Connect nodes is not secured. ● When configuring a secure cluster, an external proxy (e.g. Apache HTTP) is required to act as a secure gateway to the REST services.
  • 20. Confluent Hub ■ Discover and share connectors. ■ Cassandra (OSS) and DynamoDB source/sink connectors available. ■ Scylla shard-aware connector to be published soon! https://www.confluent.io/hub/confluentinc/kafka-connect-cassandra
  • 22. Takeaways ■ Message queues are useful for a variety of reasons. ■ The Scylla Kafka Connector (sink and CDC source) will be coming out soon! ■ Event streaming and event-driven microservices are useful: try them out!
  • 23. Thank you Stay in touch Any questions? Maheedhar Gunturu maheedhar@scylladb.com @vanguard_space
  • 24. Some useful links. Here are some useful links for further reading/watching.
      1. Useful video explaining most things for a low level of understanding: https://www.confluent.io/kafka-summit-sf18/so-you-want-to-write-a-connector
      2. Confluent's developer guide to connectors, which covers most basics: https://docs.confluent.io/current/connect/devguide.html
      3. The source for the above developer guide is available through Maven here: https://mvnrepository.com/artifact/org.apache.kafka/connect-file/2.1.1
      4. Useful guide providing additional best practices (now deprecated, though still useful): https://docs.google.com/document/d/1jEn_G-KDsrhdecPTGIWIcke1I4gw4fR0G8OVj8e3iAI/edit#
      5. Verification guide, though a little generic as it covers both connectors and consumers/producers: https://www.confluent.io/wp-content/uploads/Verification-Guide-Confluent-Platform-Connectors-Integrations.pdf
      6. https://opencredo.com/blogs/kafka-connect-source-connectors-a-detailed-guide-to-connecting-to-what-you-love/
      7. https://www.confluent.io/blog/simplest-useful-kafka-connect-data-pipeline-world-thereabouts-part-1/
      8. https://www.confluent.io/blog/the-simplest-useful-kafka-connect-data-pipeline-in-the-world-or-thereabouts-part-2/
      9. https://www.confluent.io/blog/simplest-useful-kafka-connect-data-pipeline-world-thereabouts-part-3/
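
The sink settings on slide 18 are written as a properties file, while slide 16 mentions a REST interface for configuration on port 8083. In distributed mode, Kafka Connect accepts a connector definition as a JSON body with a `name` and a `config` map sent to `POST /connectors`. The following is a minimal sketch of that conversion, not the presenter's tooling: the helper names `parse_properties` and `connector_payload` are hypothetical, and the contact-point addresses are placeholders.

```python
import json

# Properties adapted from the sample sink file on slide 18.
# The contact points below are placeholder addresses (assumption).
SINK_PROPERTIES = """\
# sample cassandra-sink.properties
name=sink
topics=temperature
tasks.max=1
connector.class=io.confluent.connect.cassandra.CassandraSinkConnector
cassandra.contact.points=10.0.0.1,10.0.0.2,10.0.0.3
cassandra.keyspace=demo
cassandra.consistency.level=LOCAL_QUORUM
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def connector_payload(props):
    """Build the JSON body for POST /connectors: a name plus a config map."""
    config = dict(props)
    name = config.pop("name")
    return {"name": name, "config": config}


payload = connector_payload(parse_properties(SINK_PROPERTIES))
print(json.dumps(payload, indent=2))
```

To register the connector, the printed JSON would be sent with `Content-Type: application/json` to the `/connectors` endpoint of a Connect worker on port 8083; the host name depends on the deployment, and slide 19 notes that this endpoint should sit behind a secure proxy.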

Editor's Notes

  1. Kafka is highly available, resilient to node failures, and supports automatic recovery. These properties make Apache Kafka well suited to communication and integration between the components of large-scale, real-world data systems.
  2. A typical ratio of the number of partitions in a topic to the number of consumers in a group would be (1:1) or (1:2) https://www.confluent.io/blog/how-choose-number-topics-partitions-kafka-cluster https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines https://www.confluent.io/blog/apache-kafka-supports-200k-partitions-per-cluster
  3. The topic partition is the unit of parallelism in Kafka. On both the producer and the broker side, writes to different partitions can be done fully in parallel. Kafka can replicate partitions across a configurable number of Kafka servers. Each partition has a leader server and zero or more follower servers. Leaders handle all read and write requests for a partition. Kafka also uses partitions for parallel consumer handling within a group. Each broker handles its share of data and requests by sharing partition leadership. The partitions in each topic that all of the consumers are subscribed to are assigned dynamically to the consumers in round-robin fashion.
  4. Phrase coined by Martin Fowler. CQRS (Command Query Responsibility Segregation) and Event Sourcing: https://martinfowler.com/eaaDev/EventSourcing.html
  5. Kafka Connect is an open source framework, built as another layer on core Apache Kafka, to support large scale streaming data.
  6. SoC (Separation of Concerns)
  7. Includes transformations (SMT). Has the ability to communicate with the schema registry. Currently the API is primarily Java and Scala only.
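
Note 3 says partitions are assigned to the consumers of a group in round-robin fashion, and note 2 discusses the ratio of partitions to consumers. The sketch below illustrates that idea only; it is a simplified model, not Kafka's actual assignor code, and the function name `assign_round_robin` and the consumer ids are hypothetical.

```python
def assign_round_robin(partitions, consumers):
    """Deal partitions out to consumers one at a time, in order.

    Each partition goes to exactly one consumer in the group, so any
    consumers beyond the partition count end up with nothing to read.
    """
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(sorted(partitions)):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# Six partitions shared by three consumers: two partitions each.
balanced = assign_round_robin(range(6), ["c1", "c2", "c3"])
print(balanced)

# Two partitions but three consumers: the third consumer stays idle.
oversubscribed = assign_round_robin(range(2), ["c1", "c2", "c3"])
print(oversubscribed)
```

The second call shows why the partition count caps consumer parallelism within a group: adding consumers beyond the number of partitions does not increase throughput.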