Enterprise wide publish subscribe with Apache Kafka
- 2. Apache Kafka | Johan Louwers | 2018 © 2018 Capgemini. All rights reserved.
[Diagram: monolithic architecture. A client device / client application with UI rendering logic, application logic and interface logic sits on top of an application backend with schemas A, B and C, a single database, and connections to external systems.]
Monolithic Architecture
Still used today
Monolithic Architecture
- Commonly seen in traditional large-scale enterprise deployments
- Not flexible and hard to maintain
- Seen as a dead end
- 3.
[Diagram: microservice architecture. Services A, B and C, each a (micro)service with its own REST API and database, sit behind an API gateway that serves the UI applications and application backend services.]
Microservice Architecture
Decompose the monolith
Microservice Architecture
- Decomposition of the monolith
- Commonly deployed in containers or as functions
- Commonly self-scaling
- Commonly loosely coupled
- Commonly makes use of REST / gRPC
- Commonly makes use of isolated persistence
- 4.
Microservice Architecture
Distribute transactions (option 1)
Microservice Architecture
Distributed transactions over multiple services
- UI calls all services involved in a business transaction
- High chance of inconsistency
- Complex UI logic that is hard to maintain
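The inconsistency risk of option 1 can be made concrete with a minimal sketch. The in-memory `Service` class below is a hypothetical stand-in for a real REST service; if one service fails mid-transaction, the earlier writes stay committed and there is no rollback.

```python
# Hypothetical in-memory stand-ins for Services A, B and C.
class Service:
    def __init__(self, name, fail=False):
        self.name, self.fail, self.records = name, fail, []

    def create(self, payload):
        if self.fail:
            raise RuntimeError(f"{self.name} unavailable")
        self.records.append(payload)

def ui_business_transaction(services, payload):
    """Option 1: the UI calls every service itself -- no rollback."""
    for svc in services:
        svc.create(payload)  # may raise halfway through the transaction

a, b, c = Service("A"), Service("B", fail=True), Service("C")
try:
    ui_business_transaction([a, b, c], {"order": 42})
except RuntimeError:
    pass

# Service A committed its write, B and C did not: the system is inconsistent.
print(len(a.records), len(b.records), len(c.records))  # 1 0 0
```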
- 5.
Microservice Architecture
Distribute transactions (option 2)
Microservice Architecture
Distributed transactions over multiple services
- UI calls the initial service for the business transaction
- That service propagates the transaction to the other services
- High number of point-to-point connections
- Complex to add new services
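The point-to-point growth can be quantified: with direct propagation, every service may need a client for every other service, so the number of possible connections grows quadratically. A small illustration (plain Python, no real services):

```python
def point_to_point_links(n_services: int) -> int:
    # Every service may need to call every other service directly.
    return n_services * (n_services - 1)

for n in (3, 5, 10):
    print(n, "services ->", point_to_point_links(n), "possible links")
# 3 services -> 6 possible links
# 5 services -> 20 possible links
# 10 services -> 90 possible links
```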
- 6.
[Diagram: microservice architecture with Kafka pub/sub. Services A, B and C each expose a REST API and act as both publisher and subscriber on a shared Kafka pub/sub layer; an API gateway fronts the UI applications and application backend services.]
Microservice Architecture
Distribute transactions (option 3)
Microservice Architecture
Distributed transactions over multiple services
- UI calls the initial service for the business transaction
- The service publishes an event to a Kafka topic
- A topic can be read by every subscriber
- Care about your consumers by not caring who they are
- 7.
Microservice Architecture
Distribute transactions (option 3)
Microservice Architecture
All services are publishers
- “event”: a significant change of state
- Do not share the data, share the event
- CRUD events:
- CREATE & UPDATE: standard
- DELETE: do you really want that?
- READ: only in some very specific cases
{
  "event": {
    "event_topic": "customerdata",
    "event_type": "create",
    "event_application": "CRMAPAC",
    "event_service": "customers",
    "event_data_object": "deliveryLocations",
    "event_data_location": "https://api.crmapac.acme.com/customers/1264/deliveryLocations/26/",
    "classification": {
      "classified_gdpr": true,
      "classified_internalviews": true,
      "classified_publicviews": false,
      "classified_privateviews": true
    }
  }
}
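The event above carries no business data, only a pointer to it. A sketch of building that payload in Python, using the field names from the example (the `make_event` helper is illustrative, not part of any library):

```python
import json

# Build the event payload shown above as a plain dict, then serialize it.
def make_event(topic, etype, application, service,
               data_object, data_location, classification):
    return {
        "event": {
            "event_topic": topic,
            "event_type": etype,
            "event_application": application,
            "event_service": service,
            "event_data_object": data_object,
            "event_data_location": data_location,
            "classification": classification,
        }
    }

event = make_event(
    "customerdata", "create", "CRMAPAC", "customers",
    "deliveryLocations",
    "https://api.crmapac.acme.com/customers/1264/deliveryLocations/26/",
    {"classified_gdpr": True, "classified_internalviews": True,
     "classified_publicviews": False, "classified_privateviews": True},
)

# Bytes, ready to hand to producer.send('customerdata', payload).
payload = json.dumps(event).encode("utf-8")
print(json.loads(payload)["event"]["event_type"])  # create
```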
- 8.
import json
import logging

import msgpack

from kafka import KafkaProducer
from kafka.errors import KafkaError

log = logging.getLogger(__name__)

producer = KafkaProducer(bootstrap_servers=['broker1:1234'])

# Asynchronous by default
future = producer.send('my-topic', b'raw_bytes')

# Block for 'synchronous' sends
try:
    record_metadata = future.get(timeout=10)
except KafkaError:
    # Decide what to do if the produce request failed...
    log.exception('produce request failed')
else:
    # Successful result returns the assigned partition and offset
    print(record_metadata.topic)
    print(record_metadata.partition)
    print(record_metadata.offset)

# Produce keyed messages to enable hashed partitioning
producer.send('my-topic', key=b'foo', value=b'bar')

# Encode objects via msgpack
producer = KafkaProducer(value_serializer=msgpack.dumps)
producer.send('msgpack-topic', {'key': 'value'})

# Produce json messages
producer = KafkaProducer(value_serializer=lambda m: json.dumps(m).encode('ascii'))
producer.send('json-topic', {'key': 'value'})

# Produce asynchronously
for _ in range(100):
    producer.send('my-topic', b'msg')

def on_send_success(record_metadata):
    print(record_metadata.topic)
    print(record_metadata.partition)
    print(record_metadata.offset)

def on_send_error(excp):
    log.error('I am an errback', exc_info=excp)
    # handle exception

# Produce asynchronously with callbacks
producer.send('my-topic', b'raw_bytes').add_callback(on_send_success).add_errback(on_send_error)

# Block until all async messages are sent
producer.flush()

# Configure multiple retries
producer = KafkaProducer(retries=5)
Microservice Architecture
Distribute transactions (option 3)
- 9.
Microservice Architecture
Distribute transactions (option 3)
Microservice Architecture
Services can be a subscriber
- Easily add new services by subscribing to topics
- Kafka topic consumption:
- REST API / Native Kafka client
- PULL/PUSH
- Subscriber defines the required actions on an event
- Subscriber calls publisher server API to obtain data
- Data should never be stored outside the context of
the owning service
- or with care and understanding{
"event": {
"event_topic": "customerdata",
"event_type": "create",
"event_application": "CRMAPAC",
"event_service": "customers",
"event_data_object": "deliveryLocations",
"event_data_location": "https://api.crmapac.acme.com/customers/1264/deliveryLocations/26/",
"classification": {
"classified_gdpr": true,
"classified_internalviews": true,
"classified_publicviews": false,
"classified_privateviews": true
}
}
}
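A sketch of a subscriber-side handler that follows these rules: it decodes the event, then goes back to the owning service's API for the actual data instead of expecting the data in the message. The `fetch_fn` is a stub standing in for an HTTP GET, since the endpoint in the event is a hypothetical example.

```python
import json

def handle_event(raw_message: bytes, fetch_fn):
    """Decode a Kafka message carrying an event like the JSON above.

    fetch_fn stands in for an HTTP GET against the owning service's API,
    e.g. requests.get(url).json() in a real deployment.
    """
    event = json.loads(raw_message)["event"]
    if event["event_type"] != "create":
        return None
    # Share the event, not the data: ask the owner for the payload.
    return fetch_fn(event["event_data_location"])

# Stubbed data the owning service would return (illustrative values).
stub_data = {"deliveryLocation": {"id": 26, "city": "Utrecht"}}

message = json.dumps({"event": {
    "event_type": "create",
    "event_data_location":
        "https://api.crmapac.acme.com/customers/1264/deliveryLocations/26/",
}}).encode("utf-8")

result = handle_event(message, lambda url: stub_data)
print(result["deliveryLocation"]["id"])  # 26
```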
- 10.
import json

import msgpack

from kafka import KafkaConsumer

# To consume latest messages and auto-commit offsets
consumer = KafkaConsumer('my-topic',
                         group_id='my-group',
                         bootstrap_servers=['localhost:9092'])
for message in consumer:
    # message value and key are raw bytes -- decode if necessary!
    # e.g., for unicode: `message.value.decode('utf-8')`
    print("%s:%d:%d: key=%s value=%s" % (message.topic, message.partition,
                                         message.offset, message.key,
                                         message.value))

# Consume earliest available messages, don't commit offsets
KafkaConsumer(auto_offset_reset='earliest', enable_auto_commit=False)

# Consume json messages
KafkaConsumer(value_deserializer=lambda m: json.loads(m.decode('ascii')))

# Consume msgpack
KafkaConsumer(value_deserializer=msgpack.unpackb)

# StopIteration if no message after 1 sec
KafkaConsumer(consumer_timeout_ms=1000)

# Subscribe to a regex topic pattern
consumer = KafkaConsumer()
consumer.subscribe(pattern='^awesome.*')

# Use multiple consumers in parallel w/ 0.9+ kafka brokers
# Typically you would run each on a different server / process / CPU
consumer1 = KafkaConsumer('my-topic',
                          group_id='my-group',
                          bootstrap_servers='my.server.com')
consumer2 = KafkaConsumer('my-topic',
                          group_id='my-group',
                          bootstrap_servers='my.server.com')
Microservice Architecture
Distribute transactions (option 3)
- 11.
[Diagram: microservice architecture with Kafka pub/sub as on the previous slides; an external API service now allows external consumers to subscribe, alongside the internal services behind the API gateway.]
Microservice Architecture
Inform the world
Microservice Architecture
Enable the “world” to subscribe as much as possible
- Allow all known services to subscribe to topics
- Allow unknown parties to subscribe to topics via the external service API
- Ensure both known (internal) and unknown (external) parties can only access the data in a secured manner (OAuth2)
- 12.
[Diagram: microservice architecture with Kafka pub/sub, an external API service for external consumers, and an outbound webhook component pushing events to external and internal subscribers.]
Microservice Architecture
Inform the world
Microservice Architecture
Enable the “world” to subscribe as much as possible
- Do provide the option to subscribe via a webhook
- Push updates to HTTP(S) endpoints
- Prevents aggressive polling
- Allows for notification “spreading”
- Prevents request storms
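The outbound-webhook idea in plain Python: subscribers register an HTTPS endpoint, and the dispatcher pushes each event to all of them instead of letting them poll. Real delivery would be an HTTP POST (e.g. `requests.post`); here a `deliver_fn` stub stands in so the sketch runs offline, and the URLs are hypothetical.

```python
import json

class WebhookDispatcher:
    def __init__(self, deliver_fn):
        self.endpoints = []           # registered HTTP(S) callback URLs
        self.deliver_fn = deliver_fn  # e.g. requests.post in real life

    def register(self, url):
        self.endpoints.append(url)

    def publish(self, event: dict):
        body = json.dumps(event)
        # Push to every subscriber: no polling, no request storms.
        return [self.deliver_fn(url, body) for url in self.endpoints]

delivered = []
dispatcher = WebhookDispatcher(
    lambda url, body: delivered.append((url, body)) or "202")
dispatcher.register("https://consumer.example.com/hooks/customerdata")
dispatcher.register("https://partner.example.com/kafka-events")

statuses = dispatcher.publish(
    {"event_topic": "customerdata", "event_type": "create"})
print(len(delivered), statuses)  # 2 ['202', '202']
```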
- 13.
[Diagram: simplified view. Applications 1, 2 and 3 (backends) share a Kafka pub/sub layer; an external API service and outbound webhooks reach the external consumers.]
Microservice Architecture
Inform everyone
Microservice Architecture
Simplify the picture
- External and internal consumers
- Provide APIs, native clients and webhooks
- Strive for an event-driven architecture
- 14.
Remember – It is NOT a database
More than pub/sub - KSQL
Kafka, more than pub/sub
Query Kafka
- Allow developers to query Kafka
- Stream analytics
- KSQL, a bit more than SQL
- Kafka is NOT a database
- Keep retention costs in mind
- 15.
CREATE STREAM pageviews
    (viewtime BIGINT,
     userid VARCHAR,
     gender VARCHAR,
     regionid VARCHAR,
     pageid VARCHAR)
  WITH (KAFKA_TOPIC='pageviews',
        VALUE_FORMAT='DELIMITED');

SELECT regionid, COUNT(*) FROM pageviews
  WINDOW HOPPING (SIZE 30 SECONDS, ADVANCE BY 10 SECONDS)
  WHERE UCASE(gender)='FEMALE' AND LCASE(regionid) LIKE '%_6'
  GROUP BY regionid;
Remember – It is NOT a database
More than pub/sub - KSQL
- 16.
How to deploy Kafka
Make it highly available
Make it highly available
It is harder than you think
- Single DC cluster
- Issues with DC failure
- Stretched DC cluster
- Issues with split brain and CAP theorem
- Cluster per DC
- Issues with topic sync
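Whichever topology is chosen, durability also depends on replication settings. A minimal configuration sketch of the knobs involved (real Kafka config keys; the broker addresses and the values chosen are illustrative, assuming a four-node cluster as in the diagrams):

```python
# Producer-side settings that matter when brokers or a DC can fail.
producer_config = {
    "acks": "all",       # wait for all in-sync replicas before acknowledging
    "retries": 5,        # retry transient broker failures
    "bootstrap_servers": ["kafka-a0:9092", "kafka-a1:9092"],
}

# Topic-level settings (set at topic creation / on the broker).
topic_config = {
    "replication.factor": 3,    # copies spread across brokers (or DCs)
    "min.insync.replicas": 2,   # acks='all' needs at least this many alive
}

print(producer_config["acks"], topic_config["min.insync.replicas"])  # all 2
```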
[Diagram: three deployment options. (1) Single-DC cluster: DC-A with Kafka nodes A-0 to A-3. (2) Stretched cluster: nodes spread across DC-A and DC-B. (3) Cluster per DC: independent clusters in DC-A and DC-B with topic sync between them.]
- 17.
How to get Kafka
Build versus buy versus consume
Kafka provides high value
It comes with a lot of complexity
- Build (& deploy)
- Invest in R&D and development
- Invest in (virtual) hardware
- You build it, you run it
- You know and own everything
- Buy
- High support costs
- Invest in (virtual) hardware
- Still requires deep knowledge
- Could require a maintenance partner
- Consume
- Pay-per-use PaaS models
- Most solutions are in the cloud
- Some solutions are private cloud on converged infra