"Capturing geospatial data of aircraft is data-intensive: position and velocity change quickly and must be updated frequently. A real-time system well tuned for high throughput and low latency is a must, because data that arrives even a few seconds late is useless to operators, who cannot make meaningful real-time decisions with it. In addition, collaboration between operators must also happen in real time. This system, called Raft's Data Fabric, modernizes this process for the United States Air Force as the communication layer for the systems acting upon real-time geospatial data of all aircraft over United States airspace.
Data Fabric is built with Apache Kafka at its core. It uses Keycloak for authentication, Open Policy Agent for authorization, Kafka Streams for stateful processing, database systems (relational, OLAP, and document) for search, WebSockets for visualization integration, and Grafana for observability.
From this real-world use case, you will walk away knowing how to build end-to-end real-time systems with Apache Kafka at their core. Specifically:
* Tuning Apache Kafka for high throughput and low latency, for both small and large messages.
* Building a WebSocket Kafka consumer for web-based visualization.
* Securely fusing and moving real-time data from unsecured to secured environments to provide better decision-making abilities.
* Integration with database systems for real-time and historical use cases.
* Reasons to consider leveraging OAuth for authentication and Open Policy Agent for authorization for your Kafka cluster."
March 2024 · Kafka Summit London
Message Format XML
● XML ⇒ JSON ✔
● JSON ⇒ XML ✖
downstream consumers expected XML (valid against an XSD schema)
● Additional Challenges
○ marshaling bytes into XML
○ schema validation speed
○ thread safety
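The thread-safety challenge can be sketched with the JDK's own `javax.xml.validation` API: a `Schema` is thread-safe and expensive to build, while a `Validator` is not, so one common pattern is a `ThreadLocal` `Validator` per consumer thread. The one-element schema below is hypothetical, not the project's actual XSD:

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class XmlValidation {
    // Hypothetical single-element schema for illustration only.
    private static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"track\" type=\"xs:string\"/>"
        + "</xs:schema>";

    // Schema is thread-safe and expensive to build: create it once.
    private static final Schema SCHEMA = buildSchema();

    // Validator is NOT thread-safe: give each consumer thread its own.
    private static final ThreadLocal<Validator> VALIDATOR =
        ThreadLocal.withInitial(SCHEMA::newValidator);

    private static Schema buildSchema() {
        try {
            return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("bad schema", e);
        }
    }

    public static boolean isValid(String xml) {
        try {
            VALIDATOR.get().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (Exception e) { // SAXException on invalid documents
            return false;
        }
    }
}
```

Reusing one cached `Validator` per thread also addresses the validation-speed point: building a `Schema` or `Validator` per message is far more expensive than validating with one already in hand.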
Message Format XML
● Staying with XML is easier than converting back to XML.
● Optimize Read-Only for speed by compromising on Storage.
● Lessons learned here would apply to other formats; they are just more obvious with XML.
Configurations - Topics
Challenge 2 – Throughput and Latency
● 12 partitions
○ even distribution across availability zones (÷3)
○ even consumer workloads
■ 1/2/3/4/6/12 (6)
● Currently evaluating 24 for some topics
○ also evenly distributed across availability zones (÷3)
○ 1/2/3/4/6/8/12/24 (8)
■ 30->1/2/3/5/6/10/15/30 (8)
■ 36->1/2/3/4/6/9/12/18/36 (9)
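The divisor lists above are simply the consumer-group sizes that divide the partition count evenly, so every consumer owns the same number of partitions. A small helper (illustrative, not from the talk) makes the trade-off easy to explore:

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionDivisors {
    // Consumer counts that divide the partition count evenly —
    // each consumer then owns exactly partitions / count partitions.
    static List<Integer> evenConsumerCounts(int partitions) {
        List<Integer> counts = new ArrayList<>();
        for (int c = 1; c <= partitions; c++) {
            if (partitions % c == 0) counts.add(c);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(evenConsumerCounts(12)); // [1, 2, 3, 4, 6, 12]
        System.out.println(evenConsumerCounts(24)); // [1, 2, 3, 4, 6, 8, 12, 24]
    }
}
```

This is why 12 and 24 are attractive counts: they keep the partition total divisible by the three availability zones while offering many even consumer-group sizes.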
Configurations - Producer
Challenge 2 – Throughput and Latency
● batch.size=200_000
○ or more
● linger.ms=10-50
○ balance of latency & throughput
● compression.type=lz4
○ Always test your data against your compression
○ Never compress compressed data
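Assembled as producer configuration, the slide's settings look roughly like this; the batch size is the slide's 200 KB floor, while linger.ms is pinned to one value from the suggested 10–50 ms range:

```java
import java.util.Properties;

public class ProducerTuning {
    public static Properties producerProps() {
        Properties props = new Properties();
        // Larger batches amortize per-request overhead (200 KB or more).
        props.setProperty("batch.size", "200000");
        // Wait up to 25 ms to fill a batch: trades a little latency for
        // throughput; tune within the 10-50 ms range for your workload.
        props.setProperty("linger.ms", "25");
        // lz4 is cheap to compress and decompress; always benchmark against
        // your own data, and never compress already-compressed payloads.
        props.setProperty("compression.type", "lz4");
        return props;
    }
}
```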
Configurations - Consumer
Challenge 2 – Throughput and Latency
● 12 partitions
○ even distribution across availability zones (÷3)
○ even consumer workloads
■ 1/2/3/4/6/12
● max.partition.fetch.bytes & fetch.max.bytes
○ adding partitions can increase latency
(especially if number of consumers isn't increased)
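As configuration, the two fetch knobs look like this; the values below are illustrative, since the slide names the settings but not the numbers:

```java
import java.util.Properties;

public class ConsumerTuning {
    public static Properties consumerProps() {
        Properties props = new Properties();
        // Per-partition cap on a fetch response. When each consumer owns more
        // partitions, raise this (or add consumers) or latency creeps up.
        props.setProperty("max.partition.fetch.bytes", "2097152"); // 2 MiB (example)
        // Overall cap on a fetch response across all assigned partitions.
        props.setProperty("fetch.max.bytes", "52428800"); // 50 MiB (example)
        return props;
    }
}
```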
Configurations - Performance
Challenge 2 – Throughput and Latency
● Start with the producer
● Then the partitions
● Kafka Streams
○ State Store - Caching (Disable for Latency)
○ Commit Interval (Reduce for Visibility and Latency)
○ Threading - depends on topology & number of containers.
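The Kafka Streams tunings above might look like this as configuration. The values are illustrative, and `statestore.cache.max.bytes` assumes Kafka 3.4+ (older releases use `cache.max.bytes.buffering`):

```java
import java.util.Properties;

public class StreamsTuning {
    public static Properties streamsProps() {
        Properties props = new Properties();
        // Disable state-store caching so updates flow downstream immediately
        // (lower latency at the cost of more downstream records).
        props.setProperty("statestore.cache.max.bytes", "0");
        // Shorter commit interval = faster visibility; 100 ms is the low end
        // of the 100-5000 ms range discussed in the talk.
        props.setProperty("commit.interval.ms", "100");
        // Thread count depends on the topology and how many containers run;
        // one thread per container is just this sketch's starting point.
        props.setProperty("num.stream.threads", "1");
        return props;
    }
}
```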
Custom Kafka Rest API
Challenge 4 – Pattern of Life Analysis
● One of the easiest parts to build
○ Producer
■ linger.ms - major impact - choose wisely
○ Consumer -- not RESTful ☞ WebSockets
● One of the easiest mistakes to make - waiting...
○ producer flushing
○ linger.ms
○ Callback vs. Waiting on Future
Leverage framework support, e.g. Spring's DeferredResult
● HTTP Response Codes
○ 200 - OK
○ 201 - Created (try to avoid using this one, but some clients....)
○ 202 - Accepted
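The "callback vs. waiting on a future" pitfall can be shown with a plain `CompletableFuture` standing in for `producer.send(record, callback)`; the `send` helper here is a hypothetical stub, not the project's code:

```java
import java.util.concurrent.CompletableFuture;

public class AsyncSend {
    // Hypothetical stand-in for an async producer send; real code would
    // complete the future from the producer.send(record, callback) callback.
    static CompletableFuture<String> send(String msg) {
        return CompletableFuture.supplyAsync(() -> "ack:" + msg);
    }

    public static void main(String[] args) {
        // Anti-pattern: blocking the HTTP request thread until the ack arrives.
        String blocked = send("a").join();

        // Preferred: chain a callback and return immediately; in Spring you
        // would complete a DeferredResult here instead of blocking.
        CompletableFuture<Integer> status = send("b").thenApply(ack -> 202);

        System.out.println(blocked);       // ack:a
        System.out.println(status.join()); // 202
    }
}
```

Returning 202 Accepted from the callback path matches the REST semantics above: the write is acknowledged for processing, not synchronously completed.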
WebSocket - Consumer
Challenge 5 – Real Time Data – Hot Data
● consumer.subscribe() vs. consumer.assign()
● handling backpressure
● tried Java 21 and Virtual Threads - did not help...
● 2 implementations
○ web-socket consumption drains queue
○ 30 second eviction (independent of consumption)
● garbage collection
topic → poll() → LinkedBlockingQueue → push() → websocket
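The queue hand-off in the diagram can be sketched with a bounded `LinkedBlockingQueue`: the consumer thread offers after each `poll()`, and when the websocket client falls behind, the oldest entries are evicted rather than buffering without bound. The capacity and drop policy here are illustrative, not the talk's actual values:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;

public class WebSocketBridge {
    // Bounded: if the websocket client falls behind, we drop the oldest
    // message instead of growing the heap. (capacity is illustrative)
    private final LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>(10_000);

    // Called from the Kafka consumer thread after poll().
    public void onRecord(String message) {
        while (!queue.offer(message)) {
            queue.poll(); // backpressure: evict the oldest entry to make room
        }
    }

    // Called from the websocket's push thread: drain a batch and send it.
    public List<String> drainBatch(int max) {
        List<String> batch = new ArrayList<>(max);
        queue.drainTo(batch, max);
        return batch;
    }
}
```

Draining in batches keeps the push side efficient, and an eviction sweep (the slide's 30-second variant) can run independently of consumption against the same queue.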
WebSocket - Consumer
Challenge 5 – Real Time Data – Hot Data
● Data Mashing over Websocket
○ XML - not great
○ JSON - better
○ Apache Arrow - best
topic → poll() → LinkedBlockingQueue → push() → websocket
(consumer thread)                      (one thread per websocket)
Kafka Streams
Challenge 6 – Real Time Data Enrichment
● Avoid initial rekeying (trust but verify); every rekey adds latency.
filter((k, v) -> {
    if (!k.equals(v.id())) {
        log.error("invalid key {} for value {}", k, v.id());
        return false;
    }
    return true;
})
● Global Tables / KTables
Kafka Streams
Challenge 6 – Real Time Data Enrichment
● Commit Interval
○ 100ms - 5000ms
● Threads
○ Containers vs Stream Threads
● If Scaling down and up (and not using static membership)
internal.leave.group.on.close = true
● If Scaling down and up (and static membership + Kafka Streams 3.3+)
KafkaStreams.CloseOptions closeOptions =
    new KafkaStreams.CloseOptions().timeout(SHUTDOWN).leaveGroup(true);
streams.close(closeOptions);