1. Learn Kafka and event-driven architecture
Vinay Kumar
@vinaykuma201
2. • Oracle ACE
• Global Integration Architect
• Author of “Beginning Oracle WebCenter Portal 12c”
• Blogger - http://www.techartifact.com/blogs
• https://www.linkedin.com/in/vinaykumar2/
About me
3.
4. Agenda
• Interaction styles
• Traditional SOA approach
• Event-driven architecture
• Events with microservices
• Event streaming with an event hub
• Integration with legacy
• Kafka
• Kafka internals
• Async APIs
5. Communication Styles
Type of Interaction | Initiator | Participants
Time-driven | Time | The specified system
Request-driven | Client | Client & server
Event-driven | Event | Open-ended
9. Event Notification
10. ■ Anything that happened (or didn’t happen).
■ A change in state.
■ An event is always named in the past tense and is immutable.
■ A condition that triggers a notification.
CustomerAddressChanged
InventoryUpdated
SalesOrderCreated
PurchaseOrderCreated
Events
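Immutability and past-tense naming can be sketched in code. The following is an illustrative Python sketch (the field names are assumptions, not from the slides): a frozen dataclass makes the event record unchangeable after creation.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Events are named in the past tense and never change after creation;
# frozen=True makes every instance immutable.
@dataclass(frozen=True)
class CustomerAddressChanged:
    customer_id: str
    new_address: str
    occurred_at: datetime

event = CustomerAddressChanged("c-42", "1 Main St", datetime.now(timezone.utc))
# Any attempt to mutate the event raises dataclasses.FrozenInstanceError.
```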
11. ■ “Real-time” events, as they happen at the producer
■ Push notifications
■ One-way “fire-and-forget”
■ Immediate action at the consumers
■ Informational (“someone logged in”), not commands (“audit this”)
Characteristics of Events
12. Typical EDA Architecture
[Diagram: event producers (systems) → event transport (Event Bus) → event consumers (systems)]
13. ■ Supports the business demand for better service (no batches, less waiting)
■ No point-to-point integrations (fire & forget)
■ Fault tolerance, scalability, versatility, and other benefits of loose coupling
■ Powerful real-time response and analytics
■ Greater operational efficiency
Benefits of EDA
15. Monolithic
[Diagram: a monolithic application containing Shop, Customer, Inventory, and Payment; each might be an individual bounded context in the new world]
16. Traditional Architecture (Point to Point)
[Diagram: Shop, Customer, Marketing, Inventory, Payment, and Reporting connected by point-to-point integrations]
17. Traditional Architecture - ESB
[Diagram: Shop, Customer, Marketing, Inventory, Payment, and Reporting connected through an Enterprise Service Bus]
18. Traditional Architecture - ESB
Let’s add a fraud check and a new version.
[Diagram: the same systems on the Enterprise Service Bus, now with fraud checks wired into both V1 and V2 of the flow]
19. Let’s add loyalty features
[Diagram: the same ESB landscape with the V1/V2 fraud checks, plus a new Loyalty integration]
20. When we scale up, we see the integration ripple effect
[Diagram: the same ESB landscape where fraud and loyalty logic is now duplicated across V1 and V2 flows]
21. ■ SOA is all about dividing domain logic into separate systems and exposing it as services.
■ In some cases business logic will, by its very nature, be spread out over many systems or across various domains (cross-domain).
■ The result is domain pollution and bloat in the SOA systems.
What’s the problem?
23. ■ Domain-driven design promotes exposing business logic as services, with the focus on the domain and its logic.
■ A domain event is something that happened in the domain that domain experts care about.
■ By exposing relevant domain events on a shared event bus, we can isolate cross-cutting functions into separate systems.
SOA and Domain events
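The decoupling idea above can be sketched with a minimal in-process event bus (the class and handler names are illustrative, not from any specific framework): cross-cutting systems such as fraud and loyalty subscribe to domain events instead of being called point-to-point by the shop system.

```python
from collections import defaultdict

# Minimal event-bus sketch: publishers and subscribers only share
# the event name, never direct references to each other.
class EventBus:
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        for handler in self._subscribers[event_type]:
            handler(payload)

bus = EventBus()
handled = []
# Cross-cutting concerns attach themselves to the shared bus.
bus.subscribe("SalesOrderCreated", lambda e: handled.append(("fraud-check", e["order_id"])))
bus.subscribe("SalesOrderCreated", lambda e: handled.append(("loyalty-points", e["order_id"])))

# The shop system only publishes; it knows nothing about fraud or loyalty.
bus.publish("SalesOrderCreated", {"order_id": "o-1"})
```

Adding a new subscriber (e.g. BAM) then requires no change to the publishing system, which is the point of the shared event bus.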
24. SOA + Domain events
[Diagram: Shop, Customer, Marketing, Reporting, Inventory, and Payment connected to a shared Event Bus, with Loyalty, Fraud, and BAM as additional subscribers]
26. Event Driven (Async) in Microservices
[Diagram: Shop, Customer, and Payment microservices, each with its own API, logic, and DB, exchanging events through an Event Hub; Shop acts as producer and consumer, the others as consumers]
27. Event Streaming Data Sources
[Diagram: the same microservices and Event Hub, now also fed by event streams from mobile apps, social media, stocks, blockchain, location, and IoT sources]
28. Microservice events and stream processing
[Diagram: a microservices cluster and external sources stream events into an Event Hub; a stream-processing cluster consumes the streams against a reference model and feeds results to stream-analytics dashboards, BI tools, search/discovery, SQL, and mobile & online apps]
29. Domain events and event sourcing
■ Domain event: in domain-driven design, domain events are described as something that happens in the domain and is important to domain experts.
- A user has registered
- An order has been cancelled
- A payment has been received
Domain events are relevant both within a bounded context and across bounded contexts for implementing processes within the domain. They are best suited for communication between bounded contexts.
■ Event sourcing: event sourcing ensures that all changes to application state are stored as a sequence of events. It stores the events that lead to a specific state, as well as the state itself.
- MobileNumberProvided (MobileNumber)
- VerificationCodeGenerated (VerificationCode)
- MobileNumberValidated (no additional state)
- UserDetailsProvided (FullName, Address, …)
These events are sufficient to reconstruct the current state of the UserRegistration aggregate at any time. Event sourcing is a persistence strategy; it makes it easier to fix inconsistencies, and it is local to a domain.
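Reconstructing the aggregate from its events can be sketched as follows. The event names come from the slide; the handler logic and payloads are illustrative assumptions.

```python
# Replay the stored events to rebuild the current UserRegistration state.
def apply(state, event):
    kind, data = event
    new = dict(state)  # keep state transitions side-effect free
    if kind == "MobileNumberProvided":
        new["mobile_number"] = data
    elif kind == "VerificationCodeGenerated":
        new["verification_code"] = data
    elif kind == "MobileNumberValidated":
        new["validated"] = True
    elif kind == "UserDetailsProvided":
        new["details"] = data
    return new

events = [
    ("MobileNumberProvided", "+49-151-0000000"),
    ("VerificationCodeGenerated", "1234"),
    ("MobileNumberValidated", None),
    ("UserDetailsProvided", {"full_name": "Jane Doe"}),
]

state = {}
for e in events:
    state = apply(state, e)
# state now holds the current aggregate, derived purely from the event log.
```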
32. Kafka Overview
• Distributed publish-subscribe messaging system
• Designed for processing real-time activity stream data (logs, metrics, collections, social media streams, …)
• Does not use the JMS API or standards
• Kafka maintains feeds of messages in topics
• Initially developed at LinkedIn, now part of Apache
33. Kafka History
34. ■ Reliability. Kafka is distributed, partitioned, replicated, and fault tolerant. It replicates data and is able to support multiple subscribers. Additionally, it automatically rebalances consumers in the event of failure.
■ Scalability. Kafka is a distributed system that scales quickly and easily without incurring any downtime.
■ Durability. Kafka uses a distributed commit log, which means messages persist on disk as fast as possible with intra-cluster replication, hence it is durable.
■ Performance. Kafka has high throughput for both publishing and subscribing to messages. It maintains stable performance even when dealing with many terabytes of stored messages.
Benefits of Kafka
35. ■ Kafka is a messaging system that is designed to be fast, scalable, and durable.
■ A producer is an entity/application that publishes data to a Kafka cluster, which is made up of brokers.
■ A broker is responsible for receiving and storing the data when a producer publishes.
■ A consumer then consumes data from a broker at a specified offset, i.e. position.
■ A topic is a category/feed name to which records are stored and published. Topics have partitions, and order is guaranteed per partition.
■ All Kafka records are organized into topics. Producer applications write data to topics and consumer applications read from topics.
What is Kafka?
37. Kafka Architecture
38. ■ A topic is divided into partitions.
■ Message order is only guaranteed inside a partition.
■ Consumer offsets are persisted by Kafka with a commit/auto-commit mechanism.
■ Consumers subscribe to topics.
■ Consumers with different group-ids receive all messages of the topics they subscribe to. They consume the messages at their own speed.
■ Consumers sharing the same group-id are each assigned one (or several) partitions of the topics they subscribe to. They only receive messages from their partitions. So a constraint appears here: the number of partitions in a topic gives the maximum number of parallel consumers.
■ The assignment of partitions to consumers can be automatic and performed by Kafka (through ZooKeeper). If a consumer stops polling or is too slow, a process called “re-balancing” is performed and the partitions are re-assigned to other consumers.
Key Concepts of Kafka
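The group semantics above can be illustrated with a toy assignment function. This is a simplified round-robin sketch, not Kafka's actual assignor logic: within a group, partitions are split among the members; a second group independently gets all partitions.

```python
# Toy consumer-group assignment: distribute partitions round-robin
# among the consumers of one group.
def assign(partitions, consumers):
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

partitions = [0, 1, 2, 3]

# Within group g1, the four partitions are split between two members.
group1 = assign(partitions, ["g1-c1", "g1-c2"])

# A different group-id (g2) still sees every partition independently.
group2 = assign(partitions, ["g2-c1"])
```

With more consumers than partitions, some members would receive no partitions at all, which is why the partition count caps the useful parallelism.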
39. ■ Kafka normally divides a topic into multiple partitions.
■ Each partition is an ordered, immutable sequence of messages that is continually appended to.
■ A message in a partition is identified by a sequence number called the offset.
■ FIFO ordering is only guaranteed inside a partition.
■ When a topic is created, the number of partitions should be given.
■ The producer can choose which partition gets the message, or let Kafka decide based on a hash of the message key (recommended). So the message key is important and will be used to ensure message order.
■ Moreover, as a consumer is assigned one or several partitions, the key also “groups” messages to the same consumer.
Key Concepts of Kafka - continued
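Key-based routing can be sketched as hash-then-modulo. Note the hedge: Kafka's default partitioner actually uses murmur2 on the key bytes; md5 here merely illustrates the property that matters, namely that the same key always lands on the same partition, preserving per-key order.

```python
import hashlib

# Simplified key-to-partition routing (Kafka really uses murmur2;
# md5 stands in to show the deterministic-routing property).
def partition_for(key: str, num_partitions: int) -> int:
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

p1 = partition_for("customer-42", 8)
p2 = partition_for("customer-42", 8)
# p1 == p2: identical keys always route to the same partition,
# so all events for one customer stay in order.
```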
40. ■ A data source writes messages to the log, and one or more consumers read from the log at the point in time they choose.
■ In the diagram below, a data source is writing to the log and consumers A and B are reading from the log at different offsets.
Log Anatomy
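The log anatomy described above reduces to an append-only list where each consumer tracks its own read position. A minimal sketch:

```python
# Append-only log: the offset of a record is simply its index.
log = []

def append(record):
    log.append(record)
    return len(log) - 1  # the offset assigned to this record

for r in ["r0", "r1", "r2", "r3"]:
    append(r)

# Consumers A and B read independently at different offsets.
offset_a, offset_b = 1, 3
unread_a = log[offset_a:]  # A still has r1, r2, r3 ahead of it
unread_b = log[offset_b:]  # B only has r3 left
```

Because consumers only hold an offset, a slow consumer never blocks a fast one; each advances its own position through the same immutable log.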
41. ■ We have a broker with three topics, where each topic has 8 partitions.
■ The producer sends a record to partition 1 in topic 1, and since the partition is empty the record ends up at offset 0.
Record flow in Apache Kafka
42. ■ The next record added to partition 1 will end up at offset 1, the next record at offset 2, and so on.
■ This is a commit log: each record is appended to the log, and there is no way to change the existing records in the log (they are immutable). This is also the same offset that the consumer uses to specify where to start reading.
Record flow in Apache Kafka - continued
43. Apache Kafka Architecture
44. ■ Each broker holds a number of partitions, and each of these partitions can be either a leader or a replica for a topic.
■ All writes and reads to a topic go through the leader, and the leader coordinates updating replicas with new data. If a leader fails, a replica takes over as the new leader.
Kafka - Partitions and Brokers
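The failover rule above can be sketched for a single partition. This is an illustrative toy, not Kafka's actual controller/ISR election logic: when the leading broker fails, one of the surviving replicas is promoted.

```python
# Toy leader failover for one partition's replica set.
replicas = ["broker-1", "broker-2", "broker-3"]
leader = replicas[0]  # all reads and writes go through the leader

def on_broker_failure(failed, leader, replicas):
    survivors = [b for b in replicas if b != failed]
    # If the failed broker was the leader, promote the first survivor;
    # otherwise the leader is unchanged.
    new_leader = survivors[0] if leader == failed else leader
    return new_leader, survivors

leader, replicas = on_broker_failure("broker-1", leader, replicas)
# Clients transparently redirect reads/writes to the new leader.
```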
45. ■ Producers write to a single leader; this provides a means of load balancing production, so that each write can be serviced by a separate broker and machine.
■ In the image, the producer is writing to partition 0 of the topic, and partition 0 replicates that write to the available replicas.
Kafka - Producers writing to brokers
46. Kafka Architecture - Topic Replication Factor
47. ■ A producer is a process that can publish a message to a topic.
■ A consumer is a process that can subscribe to one or more topics and consume messages published to those topics.
■ A topic category is the name of the feed to which messages are published.
■ A broker is a process running on a single machine.
■ A cluster is a group of brokers working together.
■ Broker management is done by ZooKeeper.
Flow of a record in Kafka
48. ■ Auto-scalable infrastructure
■ Multi-language support (SDKs)
■ Event streaming database
■ GUI-driven management and monitoring
■ Enterprise-grade security
■ Diagnostic logs
■ Data monitoring
■ Global resilience
■ Disaster recovery
■ Connectors for legacy applications
■ Retention management
■ Flexible DevOps
■ …
What are the capabilities of an event hub?
49. ■ Traditional message broker (not really event-driven)
Alternatives to an event hub or Kafka?
50. What about legacy apps?
[Diagram: an existing app writes to an RDBMS; change data capture streams the changes into the Event Hub, from which the new app consumes]
51. ■ Attunity Replicate
■ Debezium (open source)
■ IBM IIDR
■ Oracle GoldenGate for Big Data
■ SQData
Change data capture tools
52. Microservice events, stream processing, and legacy applications
[Diagram: the stream-processing architecture from slide 28, extended with legacy Finance, Sales, and Audit systems feeding the Event Hub via change data capture, plus a Big Data store]
54. Integrate Kafka with legacy
• JDBC connector for Kafka Connect
• Use a CDC (change data capture) tool that integrates with Kafka Connect
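As a sketch of the first option, a Kafka Connect JDBC source connector can poll a legacy database table and publish each new row as an event. The connection URL, table, and column names below are illustrative assumptions; the property keys follow the Confluent JDBC source connector.

```json
{
  "name": "legacy-db-source",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:oracle:thin:@legacy-host:1521/ORCL",
    "mode": "incrementing",
    "incrementing.column.name": "ID",
    "table.whitelist": "ORDERS",
    "topic.prefix": "legacy-",
    "tasks.max": "1"
  }
}
```

Incrementing mode only captures inserts; for updates and deletes, a log-based CDC tool such as Debezium is the more complete choice.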
55. ■ Kafka Connect is a tool for scalably and reliably streaming data between Apache Kafka and other data systems.
■ It runs separately from the Kafka brokers.
Kafka Connect
56. ■ How do we manage the event lifecycle?
■ We have API management platforms for APIs; AsyncAPI aims to fill that gap for events.
Event lifecycle:
- Design
- Documentation
- Code generation
- Event management
- Test
- Monitoring
An AsyncAPI document is a file that defines and annotates the different components of a specific event-driven API.
Async API - https://www.asyncapi.com/
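A minimal AsyncAPI document might look like the following sketch; the channel name and payload fields are illustrative assumptions, and the structure follows the AsyncAPI 2.0 specification.

```yaml
# Illustrative AsyncAPI sketch for a SalesOrderCreated event.
asyncapi: '2.0.0'
info:
  title: Shop Events
  version: '1.0.0'
channels:
  sales/order/created:
    subscribe:
      message:
        name: SalesOrderCreated
        payload:
          type: object
          properties:
            orderId:
              type: string
            amount:
              type: number
```

From a document like this, the AsyncAPI tooling can generate documentation and producer/consumer code stubs, covering the lifecycle steps listed above.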
57. OpenAPI - AsyncAPI comparison
58. Summary
• Make the split right: bounded contexts.
• Events are the communication between bounded contexts.
• Events can be the async communication between microservices.
• Kafka is a great platform for messaging.
• An event hub is key in the new enterprise integration world.
• Use CDC for legacy integration.
• Try AsyncAPI for event documentation.