Kafka Streams has captured the hearts and minds of many developers who want to build streaming applications on top of Kafka. But as powerful as the framework is, it has never escaped the requirement of writing Java code and setting up build pipelines. There have been attempts to rebuild Kafka Streams in other languages, but until now popular languages like Python lacked an equally powerful and well-maintained stream processing framework. In this session we will present KSML, a new declarative approach that unlocks Kafka Streams. After this session you will be able to write streaming applications yourself, using only a few simple basic rules and Python snippets.
Introducing KSML: Kafka Streams for low code environments | Jeroen van Disseldorp and Ton van Bart, Axual
1. Introducing KSML:
Kafka Streams for
Low-code Environments
Kafka Summit Europe 2021
Jeroen van Disseldorp
CEO
Ton van Bart
Senior Engineer
Streaming made simple
2. Axual in a nutshell
• Easy setups
• Kafka++
• On-premise & clouds
• Streaming made simple
• Battle-tested security
• Fully supported
3. Data Governance
• Role-based access control
• Distributed governance
• Metadata catalogs
• Data lineage

Security
• Network and storage encryption
• Strict access control
• Regularly tested by independent third parties

Operations
• High availability
• Software-defined and automated deployments
• Metric collection and dashboards
• Log collection and analysis

App development & process
• Self-service for DevOps
• DTAP support
• Client libraries
• Test tooling
• CI/CD integration

Platform
• Publish/subscribe messaging
• High performance
• Multi-tenant
• Multi-cluster with transparent message replication
• Integration options & protocol support

Infrastructure
• Fully containerized for on-premise & clouds
Axual provides Kafka in an Enterprise-Ready package
[Architecture diagram spanning Logging and Monitoring, Streaming Platform, Data Governance and Infrastructure layers: Kafka, REST, Connect, Distributor, Prometheus, Grafana, Alert Manager, Logstash, ElasticSearch, Kibana, Axual Deploy, Helm, Discovery, Schema, Platform Manager, Metadata, Self-service UI, Cluster Mgr, Instance Mgr, Docker, Kubernetes, VMs, Linux, hardware; deployable on-premise, AWS, Azure, Google, ...]
4. Self-service allows DevOps teams to collaborate independently
Applications can be of multiple types:
• Custom: custom code (Java, Python, .NET), runs on the team's own platform
• Connector: low code (JDBC, Snowflake, ...), runs on a central platform
• Streaming App: low code (Filter, Aggregate, ...), runs on a central platform
7. YAML tags are interpreted as a DSL
• Requires no “coding”
• Domain Specific Language for Home Assistant (HA)
Automations: tasks executed upon some trigger
Scripts: functions that can be called from anywhere (a function library)
Sensors: inputs from many different plugins, such as smart lights, smart cars or even the Jumbo supermarket
...
• No code compilation takes place, only interpretation
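To make this concrete, here is a minimal sketch of what such an interpreted Home Assistant automation looks like; the alias, entity ID and trigger are hypothetical examples, not taken from the talk:

```yaml
# Hypothetical Home Assistant automation. The YAML is read and
# interpreted by the runtime; nothing is compiled.
automation:
  - alias: "Turn on hallway light at sunset"
    trigger:
      - platform: sun           # fires on a sun event
        event: sunset
    action:
      - service: light.turn_on  # call the light integration
        target:
          entity_id: light.hallway
```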
9. What if we apply such a mechanism to Axual / Kafka?
With YAML we could specify a Kafka Streams topology. Example:

    pipeline:
      main:
        from: some_topic
        to: some_other_topic
[Diagram: YAML + interpreter = Kafka Streams topology]
But wait... what about custom functions like
• Predicates
• Transformations of keys and values
• ValueJoiners
• ...
Could we somehow specify them without requiring compilation?
11. KSML: a library to generate Kafka Streams topologies
KSML reads YAML and generates a complete Kafka Streams topology
[Diagram: KSML input + KSML topology generator = Kafka Streams topology]
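As a sketch of what such KSML input might look like, including a custom function written as an inline Python expression (the topic names, function name and exact keywords here are illustrative assumptions extrapolated from the example on the previous slides, not the definitive syntax):

```yaml
# Hypothetical KSML definition: a pipeline that filters messages
# with a predicate written as an inline Python expression.
streams:
  sensor_source:
    topic: sensor_data_avro
    keyType: string
    valueType: avro:SensorData

functions:
  sensor_is_blue:
    type: predicate
    expression: value is not None and value["color"] == "blue"

pipelines:
  main:
    from: sensor_source
    via:
      - type: filter
        if: sensor_is_blue
    to: sensor_data_blue
```

The generator would read this definition and emit the corresponding Kafka Streams topology, with the Python snippet evaluated at runtime by an interpreter rather than compiled.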
12. Demonstration
SensorData
• Name (String): the name of the sensor
• Timestamp (Long): the timestamp of the sensor reading
• Value (String): the value of the sensor, represented as a string
• Type (Enum): the type of the sensor ("AREA", "HUMIDITY", "LENGTH", "STATE", "TEMPERATURE")
• Unit (String): the unit of the sensor
• Color (String): the color of the sensor
• City (String): the city of the sensor
• Owner (String): the owner of the sensor
13. DEMO
1. Inspect: print messages using Python
2. Copy: copy all messages to a given output topic
3. Filter: copy only if the input matches a criterion (a Python condition)
4. Branch: multiple filters combined in one construct
5. Router: output messages to a dynamically determined topic name
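Demo steps 4 and 5 could be sketched roughly as follows (again a hypothetical KSML-style fragment; the keywords, topic names and conditions are assumptions, not the syntax shown in the live demo):

```yaml
# Hypothetical sketch of the branch demo: each message goes down
# the first branch whose condition matches; the last branch is a
# catch-all for everything else.
pipelines:
  main:
    from: sensor_source
    branch:
      - if:
          expression: value["type"] == "TEMPERATURE"
        to: temperature_readings
      - if:
          expression: value["type"] == "HUMIDITY"
        to: humidity_readings
      - to: other_readings
```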
15. Complete the Kafka Streams DSL
• All stream types implemented
KStream, KTable, GlobalKTable, KGroupedStream, KGroupedTable, SessionWindowedKStream,
TimeWindowedKStream
• All stream operations implemented
Filter, FilterNot, ForEach, Aggregate, Count, Reduce, GroupBy, ...
• All custom function types implemented
Aggregator, ForEachAction, Initializer, KeyTransformer, Merger, Predicate, Reducer, ...
• Extending the list of supported data types
Primitive types implemented (String, Int, Long, Float, Double, Binary)
Basic AVRO works
JSON support coming very soon
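For example, grouping and counting sensor readings per city might be expressed as follows (a hypothetical fragment; the operation and parameter names are assumptions modelled on the Kafka Streams operations listed above):

```yaml
# Hypothetical sketch: re-key the stream by city, then count
# readings per city, mirroring Kafka Streams groupBy + count.
pipelines:
  count_per_city:
    from: sensor_source
    via:
      - type: groupBy
        mapper:
          expression: value["city"]
      - type: count
    to: sensor_counts_per_city
```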
16. Next steps...
• Test, test, test
Current code is alpha
• Build community
Get feedback on the idea
Imagine potential use cases
Work out deployment model(s), such as
Simple inclusion library for Java applications
Docker image with Spring Boot
Kafka Connect-like container with multiple threads
• Work towards beta release
Implement unit tests
Refine the typing model, extending the current AVRO support
Refine syntax (currently very much alpha state)
Refine implementation
Join us at
https://ksml.io