Apache Kafka as Event Streaming Platform for Microservice Architectures

This session introduces Apache Kafka, an event-driven open source streaming platform. Apache Kafka goes far beyond scalable, high volume messaging. In addition, you can leverage Kafka Connect for integration and the Kafka Streams API for building lightweight stream processing microservices in autonomous teams. The Confluent Platform adds further components such as a Schema Registry, REST Proxy, KSQL, Clients for different programming languages and Connectors for different technologies.

The session discusses how tech giants like LinkedIn, eBay or Airbnb leverage Apache Kafka as an event streaming platform to solve a variety of business problems, and how to create a scalable, flexible microservice architecture. A live demo shows how you can easily process and analyze streams of events using Apache Kafka and KSQL.

Published in: Software

Apache Kafka as Event Streaming Platform for Microservice Architectures

  1. 1. Introduction to Apache Kafka as Event-Driven Open Source Streaming Platform for Microservice Architectures. Kai Waehner, Technology Evangelist. kontakt@kai-waehner.de, LinkedIn, @KaiWaehner, www.confluent.io, www.kai-waehner.de
  2. 2. A need for integration in every enterprise: Search, Sensors / IoT, RDBMS, Monitoring, NoSQL, Real-time Analytics, Data Warehouse, Apps, Microservices, Big Data Integration
  3. 3. Business Digitalization Trends are Driving the Need to Process Events at a Whole New Scale, Speed and Efficiency. The World has Changed: Mobile, Cloud, Microservices, Internet of Things, Machine Learning
  4. 4. Before: many ad hoc pipelines. Search Security Fraud Detection Application User Tracking Operational Logs Operational Metrics Big Data App Data Warehouse Mainframes NoSQL Relational DB Databases Storage Interfaces Monitoring App Databases Storage Interfaces
  5. 5. After: streaming platform with Kafka. Search Security Fraud Detection Application User Tracking Operational Logs Operational Metrics Mainframes Relational DB Big Data App Monitoring App Data Warehouse Event Streaming Platform NoSQL
  6. 6. Events: What is an event?
  7. 7. Events
  8. 8. Events: A Sale, An Invoice, A Trade, A Customer Experience
  9. 9. Where are they? Events haven’t had a proper home in infrastructure or in code. They are implicit. Here!
  10. 10. Haven’t we seen all this before?
  11. 11. What’s different this time around? (Published in 2009) (Published in 2004)
  12. 12. A Streaming Platform is the Underpinning of an Event-driven Architecture. Ubiquitous connectivity: globally scalable platform for all event producers and consumers. Immediate data access: data accessible to all consumers in real time. Single system of record: persistent storage to enable reprocessing of past events. Continuous queries: stream processing capabilities for in-line data transformation. Diagram: Microservices, DBs, SaaS apps, Mobile, Customer 360, Real-time fraud detection, Data warehouse; Producers, Consumers; Database change, Microservices events, SaaS data, Customer experiences; Streams of real time events; Stream processing apps
  13. 13. The beginning of a new Era https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying The first use case. This is why Kafka was created!
  14. 14. Apache Kafka: The De-facto Standard for Real-Time Event Streaming. ● Global-scale ● Real-time ● Persistent Storage ● Stream Processing. Diagram: Edge, Cloud, Data Lake, Databases, Datacenter, IoT, SaaS Apps, Mobile, Microservices, Machine Learning
  15. 15. Apache Kafka at Scale at Tech Giants: > 4.5 trillion messages / day, > 6 Petabytes / day, “You name it”. * Kafka is not just used by tech giants. ** Kafka is not just used for big data.
  16. 16. Confluent’s Business Value per Use Case. Improve Customer Experience (CX) Increase Revenue (make money) Business Value Decrease Costs (save money) Core Business Platform Increase Operational Efficiency Migrate to Cloud Mitigate Risk (protect money) Key Drivers Strategic Objectives (sample) Fraud Detection IoT sensor ingestion Digital replatforming/ Mainframe Offload Connected Car: Navigation & improved in-car experience: Audi Customer 360 Simplifying Omni-channel Retail at Scale: Target Faster transactional processing / analysis incl. Machine Learning / AI Mainframe Offload: RBC Microservices Architecture Online Fraud Detection Online Security (syslog, log aggregation, Splunk replacement) Middleware replacement Regulatory Digital Transformation Application Modernization: Multiple Examples Website / Core Operations (Central Nervous System) The [Silicon Valley] Digital Natives; LinkedIn, Netflix, Uber, Yelp... Predictive Maintenance: Audi Streaming Platform in a regulated environment (e.g. Electronic Medical Records): Celmatix Real-time app updates Real Time Streaming Platform for Communications and Beyond: Capital One Developer Velocity - Building Stateful Financial Applications with Kafka Streams: Funding Circle Detect Fraud & Prevent Fraud in Real Time: PayPal Kafka as a Service - A Tale of Security and Multi-Tenancy: Apple Example Use Cases $↑ $↓ $ Example Case Studies (of many)
  17. 17. Example: An Airbnb Booking Event. Booked event happens: { rentalId: 4124, rentalPrice: 58, userId: 5893381 …. } Rental availability, Rental pricing, Recommended experiences, Account history, Account Updates, Store Updates, Report Updates, User engagement, Localized supply. Topic: rentalOrders
  18. 18. A Modern, Distributed Platform for Data Streams. Messaging + Storage + Processing!
  19. 19. Apache Kafka is made up of distributed, immutable, append-only commit logs
  20. 20. Apache Kafka - A Distributed Commit Log Writers Kafka cluster Readers
  21. 21. Scalability of a filesystem • Hundreds of MB/s throughput • Many TB per server • Commodity hardware
  22. 22. Guarantees of a Database • Strict ordering • Persistence
  23. 23. Distributed by design • Replication • Fault Tolerance • Partitioning • Elastic Scaling
  24. 24. Kafka Topics: my-topic; my-topic-partition-0, my-topic-partition-1, my-topic-partition-2; broker-1, broker-2, broker-3
  25. 25. Producing to Kafka. Diagram: P, Time
  26. 26. Producing to Kafka. Diagram: P, Time, C1, C2, C3
  27. 27. Partition Leadership and Replication. Diagram: Broker 1 to Broker 4 hold Topic1 partition1 to partition4; each partition appears on three brokers, marked Leader or Follower
  28. 28. Apache Kafka (kafka.apache.org) includes Kafka Connect and Kafka Streams
  29. 29. Kafka Connect is an integration framework on top of Kafka’s Core
  30. 30. Kafka’s Streams API: Build real-time applications for your core business • Easiest way to process data in Apache Kafka • Apps are standard Java applications that run on client machines • Powerful yet easy-to-use library, part of Apache Kafka • https://github.com/apache/kafka/tree/trunk/streams Diagram: Streams API, Your App, Kafka Cluster
  31. 31. Example: complete app, ready for production at large scale. Word Count: App configuration; Define processing (here: WordCount); Start processing
  32. 32. Confluent Delivers a Mission-Critical Event Streaming Platform. Confluent Platform: Operations and Security; Development & Stream Processing; Support, Services, Training & Partners. Apache Kafka. Security plugins | Role-Based Access Control. Control Center | Replicator | Auto Data Balancer | Operator. Connectors. Clients | REST Proxy. MQTT Proxy | Schema Registry. KSQL. Connect, Continuous Commit Log, Streams. Complete Event Streaming Platform, Mission-critical Reliability, Freedom of Choice. Datacenter, Public Cloud, Confluent Cloud. Self-Managed Software, Fully-Managed Service
  33. 33. KSQL – A Streaming SQL Engine for Apache Kafka
  34. 34. Confluent Control Center (C3) Monitors all pipelines end-to-end • Lost Messages? • Duplicates? • Latency Issues? • What is the problem? • Where is the problem? • Etc.
  35. 35. Event Streaming with Confluent’s Event Streaming Platform. KSQL, Kafka Streams. Splunk Security Fraud Detection Application User Tracking Operational Logs Operational Metrics Mainframes Oracle DB Hadoop Business App Monitoring App Confluent Control Center Kafka Mongo DB Cassandra Kafka Connect Schema Registry REST Proxy
  36. 36. Domain-Driven Design for your Event Streaming Platform. Kafka Connect, Kafka Cluster, CRM Integration, Legacy Integration, Custom Application, ESB Connector, Java / KSQL / Kafka Streams, Schema Registry, Event Streaming Platform; CRM Domain, Legacy Domain, Payment Domain. Independent and loosely coupled, but scalable, highly available and reliable!
  37. 37. Best-of-breed Platforms, Partners and Services for Multi-cloud Streams. Private Cloud: Deploy on bare-metal, VMs, containers or Kubernetes in your datacenter with Confluent Platform and Confluent Operator. Public Cloud: Implement self-managed in the public cloud or adopt a fully managed service with Confluent Cloud. Hybrid Cloud: Build a persistent bridge between datacenter and cloud with Confluent Replicator. Diagram: Confluent Replicator, VM, SELF MANAGED, FULLY MANAGED
  38. 38. Confluent’s Streaming Maturity Model - where are you? Axes: Value, Maturity (Investment & time). Stages: 1 Developer Interest / Pre-Streaming; 2 Enterprise Streaming Pilot / Early Production; 3 SLA Ready, Integrated Streaming; 4 Global Streaming; 5 Central Nervous System. Other labels: Pub + Sub, Store, Process; Projects, Platform
  39. 39. Highly Scalable Microservices with Apache Kafka + Mesos. Kai Waehner, Technology Evangelist. kontakt@kai-waehner.de, @KaiWaehner, www.confluent.io, www.kai-waehner.de, LinkedIn. Questions? Feedback? Please contact me!
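
Slides 19 to 26 describe Kafka as a distributed, append-only commit log split into partitions, with producers appending over time and consumers C1, C2 and C3 reading independently. The following is a minimal in-memory sketch of that idea in plain Python, not the Kafka client API. The class names `Partition`, `Topic` and `Consumer` are hypothetical, the CRC32 hash is a stand-in for Kafka's own partitioner, and the payload reuses the booking event and the `rentalOrders` topic name from slide 17.

```python
import json
import zlib
from dataclasses import dataclass, field


@dataclass
class Partition:
    """An append-only log: records are only ever added at the end."""
    records: list = field(default_factory=list)

    def append(self, record: bytes) -> int:
        self.records.append(record)
        return len(self.records) - 1  # offset of the record just written

    def read(self, offset: int, max_records: int = 10) -> list:
        return self.records[offset:offset + max_records]


class Topic:
    """A named set of partitions (hypothetical helper, not a Kafka class)."""

    def __init__(self, name: str, num_partitions: int):
        self.name = name
        self.partitions = [Partition() for _ in range(num_partitions)]

    def partition_for(self, key: str) -> int:
        # Assumption: CRC32 as a stand-in hash; Kafka's partitioner differs.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def produce(self, key: str, value: dict) -> tuple:
        p = self.partition_for(key)
        offset = self.partitions[p].append(json.dumps(value).encode("utf-8"))
        return p, offset


class Consumer:
    """Tracks its own offsets; reading never removes records from the log."""

    def __init__(self, topic: Topic):
        self.topic = topic
        self.offsets = [0] * len(topic.partitions)

    def poll(self) -> list:
        out = []
        for p, partition in enumerate(self.topic.partitions):
            batch = partition.read(self.offsets[p])
            self.offsets[p] += len(batch)
            out.extend(json.loads(r) for r in batch)
        return out


topic = Topic("rentalOrders", num_partitions=3)
booking = {"rentalId": 4124, "rentalPrice": 58, "userId": 5893381}

# Same key, so both records land in the same partition, in order.
first = topic.produce(key=str(booking["rentalId"]), value=booking)
second = topic.produce(key=str(booking["rentalId"]), value=booking)

c1, c2 = Consumer(topic), Consumer(topic)
seen_by_c1 = c1.poll()
seen_by_c2 = c2.poll()
```

Because each consumer keeps its own offsets, `c1` and `c2` both see every record, and a second `c1.poll()` returns nothing new. This is the property that lets many downstream services read the same booking events without interfering with each other.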

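
Slide 31 outlines a word count application in three steps: app configuration, define processing, start processing. The slide's code is not part of the transcript, so the following is only a plain-Python sketch of the same three steps, not the Kafka Streams API (which is a Java library). The function name `word_count` and the `input.topic` key and both config values are illustrative assumptions.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple

# Step 1: app configuration (illustrative keys and values, not read by Kafka).
config = {"application.id": "wordcount-sketch", "input.topic": "text-lines"}


# Step 2: define processing. Each incoming line updates a running count and
# emits the new count for every word, like a continuously updated table.
def word_count(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    counts = Counter()
    for line in lines:
        for word in line.lower().split():
            counts[word] += 1
            yield word, counts[word]


# Step 3: start processing. A real app would read an unbounded stream from
# the input topic; here a short list stands in for that stream.
updates = list(word_count(["hello kafka", "hello streams"]))
```

The output is a stream of updates rather than one final result: `hello` is emitted twice, first with count 1 and then with count 2, which mirrors how a stateful stream processor revises its results as new events arrive.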