Apache Kafka vs. Integration Middleware (MQ, ETL, ESB)

Learn the differences between an event-driven streaming platform and middleware like MQ, ETL and ESBs – including best practices and anti-patterns, but also how these concepts and tools complement each other in an enterprise architecture.

Extract-Transform-Load (ETL) is still a widely used pattern for moving data between different systems via batch processing. Because batch processing struggles in a world where real time is the new standard, many enterprises use an Enterprise Service Bus (ESB) as the integration backbone between any kind of microservice, legacy application or cloud service, moving data via SOAP / REST Web Services or other technologies. Stream processing is often added as its own component in the enterprise architecture to correlate different events and implement contextual rules and stateful analytics. Using all these components introduces challenges and complexity in development and operations.

This session discusses how teams in different industries solve these challenges by building a native streaming platform from the ground up instead of using ETL and ESB tools in their architecture. This allows them to build and deploy independent, mission-critical, real-time streaming applications and microservices. The architecture leverages distributed processing and fault tolerance with fast failover, no-downtime rolling deployments, and the ability to reprocess events, so you can recalculate output when your code changes. Integration and stream processing are still key functionality, but they can be realized natively in real time instead of with additional ETL, ESB or stream processing tools.
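The reprocessing idea mentioned above can be illustrated with a small sketch. It is not Kafka client code: `EventLog`, `totals_v1` and `totals_v2` are hypothetical names, and an in-memory list stands in for a persistent topic. The point is only that when events are retained, a changed piece of logic can be re-run from offset 0 to recalculate its output.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class EventLog:
    """Hypothetical in-memory stand-in for one persistent, append-only topic."""
    events: List[dict] = field(default_factory=list)

    def append(self, event: dict) -> int:
        """Store the event and return its offset (its position in the log)."""
        self.events.append(event)
        return len(self.events) - 1

    def read_from(self, offset: int = 0) -> List[dict]:
        """Return all events starting at the given offset; nothing is deleted."""
        return self.events[offset:]


def totals_v1(events: List[dict]) -> Dict[str, int]:
    """First version of the logic: sum every amount per customer."""
    totals: Dict[str, int] = {}
    for e in events:
        totals[e["customer"]] = totals.get(e["customer"], 0) + e["amount_cents"]
    return totals


def totals_v2(events: List[dict]) -> Dict[str, int]:
    """Changed logic: ignore refunds (negative amounts)."""
    totals: Dict[str, int] = {}
    for e in events:
        if e["amount_cents"] >= 0:
            totals[e["customer"]] = totals.get(e["customer"], 0) + e["amount_cents"]
    return totals


log = EventLog()
log.append({"customer": "alice", "amount_cents": 456})
log.append({"customer": "bob", "amount_cents": 3214})
log.append({"customer": "alice", "amount_cents": -456})

# The code changed, so the same stored events are replayed from offset 0.
print(totals_v1(log.read_from(0)))
print(totals_v2(log.read_from(0)))
```

With a message queue that deletes messages once they are consumed, the second run would have nothing to read; retaining the log is what makes the recalculation possible.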

Published in: Software

  1. 1. 1 Apache Kafka vs. Traditional Middleware (MQ, ETL, ESB) Friends, Enemies or Frenemies? Kai Waehner Technology Evangelist kontakt@kai-waehner.de LinkedIn @KaiWaehner www.confluent.io www.kai-waehner.de
  2. 2. 2 Agenda 1.Traditional Middleware 2.Event Streaming Platform 3.Enemies 4.Friends 5.Frenemies
  3. 3. 3 Apache Kafka vs. Traditional Middleware Agreement: Kafka is the de facto standard for … • messaging at scale! • decoupling of microservices! • reliable, lightweight stream processing! Controversial discussion: Use Apache Kafka as middleware!
  4. 4. 4 Agenda 1.Traditional Middleware 2.Event Streaming Platform 3.Enemies 4.Friends 5.Frenemies
  5. 5. 5 Source A Source B Sink X Sink Y Connector A REST / SOAP Connector X JMS ETL / ESB Product XYZ Dream
  6. 6. 6 ETL / ESB Passive Server Source A Source B Sink X Sink Y Connector A REST / SOAP Connector X JMS ETL / ESB Active Server Messaging System A (Real Time) Messaging System B (Big Data) Database (Important Data) In-Memory Data Grid (Cache) API Gateway Stream Processing Engine Build your custom integration layer… At least 4-5 different products or open source frameworks Another integration tool (because one tool cannot integrate everything, buy more…) Reality
  7. 7. 7 Challenges with current integration architectures (MQ / ETL / ESB) Zoo of technologies Integration platform (Extract-Transform-Load / Enterprise Service Bus) + additional “optional” components Messaging System (Message Queue, Peer-to-Peer) In-Memory Cache or Data Grid Database Streaming Engine API Gateway Architectures with limited scalability and availability No end-to-end scalability (only parts are built for high volume of messages and high throughput) No native, built-in scalability (you cannot just add a broker to the running system) Broker-per-scenario (e.g. hundreds of MQ clusters) Active-passive clustering Downtime for maintenance, upgrades, configuration changes No backwards-compatibility Tight coupling Platform-centric integration implementation (→ no separation of concerns, vendor lock-in) Synchronous communication via request-response (SOAP / REST / gRPC / …) No handling of backpressure or unavailable consumers (push-based messaging queues)
  8. 8. 8 Business Digitalization Trends are Driving the Need to Process Events at a whole new Scale, Speed and Efficiency The World has Changed Mobile Cloud Microservices Internet of Things Machine Learning
  9. 9. 9 #NoMQ #NoESB #NoETL can handle this well! Tight coupling, limited scale, zoo of components You need to think event-based to realize these use cases! You need to leverage a single, scalable and loosely coupled platform Event Streaming Platform! ?
  10. 10. 10 Agenda 1.Traditional Middleware 2.Event Streaming Platform 3.Enemies 4.Friends 5.Frenemies
  11. 11. 11 Source A Source B Sink X Sink Y Connector A SOAP Connector X REST Event Streaming Platform Hmm… Yet another one of those magic boxes in the middle, huh?
  12. 12. Events What is an event?
  13. 13. Events
  14. 14. 14 Events A Sale An Invoice A Trade A Customer Experience
  15. 15. 15 Where are they? Events haven’t had a proper home in infrastructure or in code. They are implicit. Here!
  16. 16. 16 Implementing an Event-driven Architecture Requires a Paradigm Shift Relational DB From data represented in static tables and accessed with RPC... ...to data represented as streams of events Apps Microservices SaaS apps Custom apps Data warehouse
  17. 17. 17 Haven’t we seen all this before?
  18. 18. 18 What’s different this time around? (Published in 2009) (Published in 2004)
  19. 19. 19 A Streaming Platform is the Underpinning of an Event-driven Architecture Ubiquitous connectivity Globally scalable platform for all event producers and consumers Immediate data access Data accessible to all consumers in real time Single system of record Persistent storage to enable reprocessing of past events Continuous queries Stream processing capabilities for in-line data transformation Microservices DBs SaaS apps Mobile Customer 360 Real-time fraud detection Data warehouse Producers Consumers Database change Microservices events SaaS data Customer experiences Streams of real time events Stream processing apps
  20. 20. 20 Agenda • Traditional Middleware • Event Streaming Platform • Enemies • Friends • Frenemies
  21. 21. 21 Source A Source B Sink X Sink Y Connector A SOAP Connector X REST Integration Layer Middleware: Message Queue… ETL… Enterprise Service Bus… Event Streaming Platform: Apache Kafka. Spoilt for Choice!
  22. 22. 22 Business Digitalization Trends are Driving the Need to Process Events at a whole new Scale, Speed and Efficiency Why do you need an Event Streaming Platform? Remember: Mobile Cloud Microservices Internet of Things Machine Learning The World has Changed!
  23. 23. 23 23 Fast (Low Latency) Event Streaming Paradigm
  24. 24. 24
  25. 25. 25 ● Global-scale ● Real-time ● Persistent Storage ● Stream Processing Apache Kafka: The De-facto Standard for Real-Time Event Streaming Edge Cloud Data Lake Databases Datacenter IoT SaaS Apps Mobile Microservices Machine Learning Apache Kafka
  26. 26. 26 Apache Kafka at Scale at Tech Giants > 4.5 trillion messages / day > 6 Petabytes / day “You name it”
  27. 27. 27 Event Streaming Platform - Value per Use Case Improve Customer Experience (CX) Increase Revenue (make money) Business Value Decrease Costs (save money) Core Business Platform Increase Operational Efficiency Migrate to Cloud Mitigate Risk (protect money) Key Drivers Strategic Objectives (sample) Fraud Detection IoT sensor ingestion Digital replatforming/ Mainframe Offload Connected Car: Navigation & improved in-car experience: Audi Customer 360 Simplifying Omni-channel Retail at Scale: Target Faster transactional processing / analysis incl. Machine Learning / AI Mainframe Offload: RBC Microservices Architecture Online Fraud Detection Online Security (syslog, log aggregation, Splunk replacement) Middleware replacement Regulatory Digital Transformation Application Modernization: Multiple Examples Website / Core Operations (Central Nervous System) The [Silicon Valley] Digital Natives; LinkedIn, Netflix, Uber, Yelp... Predictive Maintenance: Audi Streaming Platform in a regulated environment (e.g. Electronic Medical Records): Celmatix Real-time app updates Real Time Streaming Platform for Communications and Beyond: Capital One Developer Velocity - Building Stateful Financial Applications with Kafka Streams: Funding Circle Detect Fraud & Prevent Fraud in Real Time: PayPal Kafka as a Service - A Tale of Security and Multi-Tenancy: Apple Example Use Cases $↑ $↓ $ Example Case Studies (of many)
  28. 28. 28 Why Apache Kafka instead of traditional middleware? Event Streaming Platform The core is event-based Supports real time stream processing But also Fire-and-Forget, Publish / Subscribe, Request-Response / RPC, Batch, … Single infrastructure Messaging, storage, processing Extreme Scale and throughput Reliability and zero downtime (“built for failure”) High availability Rolling upgrades and dynamic configuration changes Backwards compatibility Decoupling of clients Agile microservices Dumb pipes, smart endpoints Handling backpressure No vendor lock-in
  29. 29. 29 Why Apache Kafka instead of traditional middleware? “Eat your own dog food!”
  30. 30. 30 Enemies! ESB MQ Storage Streaming Engine Messaging: Kafka Core Storage: Kafka Core Caching: Kafka Core Real-Time, Batch: Kafka Clients Integration: Kafka Connect Stream Processing: Kafka Streams / KSQL Request-Response: REST Proxy “Eat your own dog food” vs. Enemy 1 Enemy 2 Enemy 3 Enemy 4 …. More components, clusters, technologies means more conflicts, incompatibility, operations burden! Enemy 5
  31. 31. 31 ETL / ESB Kafka Broker X Source A Source B Sink X Sink Y Kafka Connect Python Kafka Connect Java Kafka Broker 1 Schema Registry REST Proxy Kafka Streams / KSQL No need for another infrastructure, cluster, database… High availability and scale handled by Kafka Topics! Monitoring Tool “Yet another Kafka addon”
  32. 32. 32 Independent, scalable, reliable components read, write App (Kafka Streams) Kafka (data) More Apps (KSQL, Connect, Python, REST, “You-name-it”) BookingsTeam FraudTeam … MobileTeam …
  33. 33. 34 Example - Kafka Streams / KSQL : Fault-tolerance, powered by Kafka Processing fails over automatically, without data loss or miscomputation. 1 Kafka consumer group rebalance is triggered 2 Processing and state of #3 is migrated via Kafka to remaining servers #1 + #2 #3 died, so #1 and #2 take over 1 Kafka consumer group rebalance is triggered 2 Part of processing incl. state is migrated via Kafka from #1 + #2 to server #3 #3 is back, so work is split again
  34. 34. 35 Example - Kafka Streams / KSQL: Elasticity and Scalability, powered by Kafka You can add, remove, restart servers during live operations. “We need more processing power!” “Ok, we can scale down again.”
  35. 35. 36 Don’t use your ESB knowledge with Kafka! https://www.thoughtworks.com/radar/techniques/recreating-esb-antipatterns-with-kafka
  36. 36. 37 Agenda • Traditional Middleware • Event Streaming Platform • Enemies • Friends • Frenemies
  37. 37. 38 Friends! • Kafka Connect connectors (JMS, IBM MQ, RabbitMQ, etc.) • JMS Client (Kafka-native JMS Implementation) • ESB or ETL tools with their own connectors • Kafka’s Client APIs (like Java, .NET, Go, Python, JavaScript) • REST Proxy • Etc. Plenty of integration options between Kafka and traditional middleware Traditional Middleware Apache Kafka
  38. 38. 39 Why friends? Kafka is NOT THE ALLROUNDER for every single problem! Maybe you don’t need a scalable, reliable, distributed system, but “just”: • Integration with legacy components (Cobol, Edifact, …) • Point-to-point messaging with active / passive HA for (“a few”) messages • Very specific messaging solution for “less than millisecond performance” • API Management • “Real” Batch Processing (Hadoop, Flink, Informatica, …) • … X
  39. 39. 40 Choose the right integration technology • Visual coding for complex graphical mappings • Integration components for complex legacy standard software and protocols (SAP BAPI / IDoc, Cobol, Edifact, SOAP / WS-*, etc.)
  40. 40. 41 Agenda • Traditional Middleware • Event Streaming Platform • Enemies • Friends • Frenemies
  41. 41. 42 Big Bang fails for critical 24/7 deployments No!
  42. 42. 43 Some technologies never die… Cough… Can we ever get away from all legacy applications? Probably not! Cobol
  43. 43. 44 RPC / Request-Response… You can use the ESB for it. BUT: REST SOAP RPC JMS Java … it often isn’t enough and doesn’t scale!
  44. 44. 45 Embrace data that lives and flows between services
  45. 45. 46 An Event Streaming Platform gives services independence Orders Customers Payments Stock Freedom to tap into and manage shared data REST JMS ESB REST CRM Mainframe SOAP … Kafka Kafka Kafka Kafka
  46. 46. 47 If you really need Request-Response with Kafka → A Correlation ID is your friend!
  47. 47. 48 Before IQ ☹ With IQ ☺ If you need to combine RPC with Streaming: Interactive Queries (IQ) (Kafka Streams)
  48. 48. 49 Confluent’s Streaming Maturity Model - where are you? Value Maturity (Investment & time) 2 Enterprise Streaming Pilot / Early Production Pub + Sub Store Process 5 Central Nervous System 1 Developer Interest Pre-Streaming 4 Global Streaming 3 SLA Ready, Integrated Streaming Projects Platform
  49. 49. 50 Frenemies? How to integrate the old and new world?
  50. 50. 51 Mainframe offloading to Apache Kafka Date Amount 1/27/2017 $4.56 1/22/2017 $32.14 Transaction Data Vendor Description Starbucks Coffee Walmart Blu-Ray Transaction Description Integration via - Kafka Connect (JMS, MQ) - REST Proxy - 3rd party CDC tool - Etc. Website Client profiles Mainframe MIPS = $$ Ability to migrate back if not working well
  51. 51. 52 MQ Integration (and later Replacement) Date Amount 1/27/2017 $4.56 1/22/2017 $32.14 Transaction Data Messaging Solution (IBM MQ, RabbitMQ, etc.) Application 1) Legacy MQ Communication with App 2) Kafka for decoupling between MQ and App 3) Direct communication via Kafka (no MQ anymore)
  52. 52. 53 New, Innovative Projects and Applications Date Amount 1/27/2017 $4.56 1/22/2017 $32.14 Transaction Data Messaging Solution (IBM MQ, RabbitMQ, etc.) Application Kafka Microservices Agile, lightweight (but scalable robust) Kafka microservice Big Data project (Elastic, Spark, AWS Sagemaker, …) 1) Direct Legacy MQ Communication with App 2) Kafka for decoupling between MQ and App 3) Direct communication via Kafka (no MQ anymore) 4) New projects and applications (independent or related to the existing migration projects) External Solution
  53. 53. 54 Use the right tool for the job (and combine them where it makes sense!)
  54. 54. 55 This is just the beginning of a new era! Confluent Vision for Kafka: Cloud-native Apache Kafka. Global: automated disaster recovery; global applications with geo-awareness. Infinite: efficient and infinite data with tiered storage; unlimited horizontal scalability for single clusters. Elastic: faster elastic scaling when adding brokers and partitions. Easy: container-based orchestration and management.
  55. 55. 56 Kai Waehner Technology Evangelist kontakt@kai-waehner.de @KaiWaehner www.confluent.io www.kai-waehner.de LinkedIn Questions? Feedback? Let’s connect!
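
The correlation ID advice for request-response over Kafka (slide 46) can be sketched as follows. This is a simplified illustration, not Kafka client code: two in-memory deques stand in for a request topic and a reply topic, and `send_request`, `serve_all` and `collect_replies` are hypothetical names. In a real deployment the ID would typically travel in the message payload or a header. The sketch assumes only that the service copies the ID from the request into its reply.

```python
import uuid
from collections import deque

request_topic = deque()  # stand-in for a request topic
reply_topic = deque()    # stand-in for a reply topic
pending = {}             # correlation_id -> payload of the request still awaiting a reply


def send_request(payload: str) -> str:
    """Publish a request tagged with a fresh correlation ID and remember it."""
    correlation_id = str(uuid.uuid4())
    pending[correlation_id] = payload
    request_topic.append({"correlation_id": correlation_id, "payload": payload})
    return correlation_id


def serve_all() -> None:
    """Hypothetical service: answers newest-first to show replies may arrive out of order."""
    while request_topic:
        msg = request_topic.pop()
        reply_topic.append({
            "correlation_id": msg["correlation_id"],  # copied unchanged into the reply
            "result": msg["payload"].upper(),
        })


def collect_replies() -> dict:
    """Match each reply to its original request by correlation ID, not by arrival order."""
    results = {}
    while reply_topic:
        reply = reply_topic.popleft()
        request_payload = pending.pop(reply["correlation_id"], None)
        if request_payload is None:
            continue  # reply for an unknown or already answered request
        results[request_payload] = reply["result"]
    return results


send_request("order-1")
send_request("order-2")
serve_all()
print(collect_replies())
```

Because matching relies on the ID alone, the client stays asynchronous: it can have many requests in flight and does not need replies to come back in the order the requests were sent.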