
What is Kafka & why is it Important? (UKOUG Tech17, Birmingham, UK - December 2017)


Fast data arrives in real time and potentially in high volume. Rapid processing, filtering and aggregation are required to ensure timely reaction and up-to-date information in user interfaces. Doing so is a challenge; making it happen in a scalable and reliable fashion is even more interesting. This session introduces Apache Kafka as the scalable event bus that takes care of the events as they flow in, and Kafka Streams and KSQL for the streaming analytics. Java and Node applications are demonstrated that interact with Kafka and leverage Server Sent Events and WebSocket channels to update the web UI in real time. User activity performed by the audience in the web UI is processed by the Kafka-powered back end and results in live updates on all clients.

This presentation includes a demonstration of remote database synchronization through Twitter.

Published in: Software


  1. 1. What is Apache Kafka & Why is it Important? The Event Fabric bringing IT together | UKOUG Tech17
  2. 2. It would be so nice if I could publish my ideas and actions, accessible near instantly for everyone who is interested. Heck, I do not even know these people and they may not know me [personally], just my pearls of wisdom. And if they are late to the party, they can also check out the historic archives of my eloquence. All without fretting about the number of readers involved, whether they are in the same time zone as me and online when I publish my messages, or which device they use.
  4. 4. • Decoupled communication • 0, 1 or many followers • Scalable number of messages (and parties) • Reliable (mostly available, few messages lost) • Full history • Open: cross device, cross location • Not Sub-second, near real-time fast • Rate limited (#messages/minute) • Size limited (140-280 characters) • Format limited (text) • Not for private interactions • Not (really) for programmatic use
  5. 5. Oracle Database ORDERS Oracle Database DVX_ORDERS
  7. 7. Oracle Database ORDERS Oracle Database DVX_ORDERS µ Oracle Application Container Cloud Oracle DBaaS Cloud µ Locally running Node application
  8. 8. What does the Twitter for System Driven Event Interaction look like? • Decoupled communication – organized per topic • 0, 1 or many Consumers per Topic • Scalable number of messages (and parties) • Reliable (distributed) • Full history • Open: libraries in many technologies & REST APIs
  9. 9. Oracle Database ORDERS Oracle Database DVX_ORDERS µ Oracle Application Container Cloud Oracle DBaaS Cloud µ Locally running Node application Oracle Event Hub
  10. 10. What does the Twitter for System Driven Event Interaction look like? • Decoupled communication – organized per topic • 0, 1 or many Consumers per Topic • Scalable number of messages (and parties) • Reliable (distributed) • Full history • Open: libraries in many technologies & REST APIs • Near real-time fast • No Rate Limit • No enforced size limit • Anything goes (it’s all byte[]) • On premises or in cloud, private or trusted • Very much for programmatic use
  11. 11. Events Producers Consumers Robust, Scalable, Fast, History Retention Containerized/Cloud-enabled Open
  12. 12. Messaging as we know it • JMS, Oracle Advanced Queuing, IBM MQ, MS MQ, RabbitMQ, MQTT, XMPP, WebSockets, Oracle Coherence, … • Challenges • Costs • Scalability (size and speed) • (lack of) Distribution (and therefore availability) • Complexity of infrastructure • Message delivery guarantees • Lack of technology openness • Deal with temporarily offline consumers • Retain history
  13. 13. Introducing Apache Kafka • Up to 2010: created at LinkedIn • Message Bus | Event Broker • High volume, low latency, highly reliable, cross technology • Scalable, distributed, strict message ordering, … • 2011/2012: open-sourced, first in the Apache Incubator, then as an Apache Top-Level Project • Kafka is used by many large corporations: • Walmart, Cisco, Netflix, PayPal, LinkedIn, eBay, Spotify, Uber, Sift Science, Zalando, The New York Times, Airbnb, Coursera, ING Bank, … • And embraced by many software vendors & cloud providers • Client libraries available for Node, Java, C/C++, Python, Ruby, PHP, Go, Rust, .NET, Perl, Scala DSL, Clojure, Swift and more
  14. 14. Producers Consumers tcp tcp
  15. 15. Producers Consumers Topic
  16. 16. KAFKA TERMINOLOGY • Topic • Message • == ByteArray • Broker • Producer • Consumer Producer Consumer Topic Broker Key Value Time Message
  17. 17. Producers Consumers Topic Broker Key Value Time
  18. 18. CONSUMING • Messages are available to consumers only when they have been committed • Kafka does not push • Unlike JMS • Read does not destroy • Unlike JMS Topic • (some) History available • Offline consumers can catch up • Consumers can re-consume from the past • Delivery Guarantees • Ordering maintained (within a partition) • At-least-once (per consumer) by default; at-most-once and exactly-once can be implemented
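The consuming model on this slide (consumers pull, reading does not destroy, offline consumers catch up, and anyone can re-consume from the past) can be illustrated with a small in-memory simulation. This is only a sketch: `TopicLog` and `Consumer` are hypothetical names made up for the illustration, not part of any Kafka client library, and the log stands in for a single partition.

```python
from dataclasses import dataclass, field


@dataclass
class TopicLog:
    """Hypothetical in-memory stand-in for a single-partition topic."""
    messages: list = field(default_factory=list)

    def append(self, key: bytes, value: bytes) -> int:
        """Store a message and return its offset (its position in the log)."""
        self.messages.append((key, value))
        return len(self.messages) - 1

    def read(self, offset: int, max_messages: int = 10) -> list:
        """Return messages from `offset` onward; reading never removes them."""
        return self.messages[offset:offset + max_messages]


@dataclass
class Consumer:
    """Each consumer tracks its own position and pulls when it is ready."""
    log: TopicLog
    offset: int = 0

    def poll(self, max_messages: int = 10) -> list:
        batch = self.log.read(self.offset, max_messages)
        self.offset += len(batch)
        return batch

    def seek(self, offset: int) -> None:
        """Move to a chosen position, for example to re-consume old messages."""
        self.offset = offset


log = TopicLog()
for text in (b"first", b"second", b"third"):
    log.append(b"ukoug_tech17", text)

online = Consumer(log)
first_batch = online.poll()   # pulls all three messages

late = Consumer(log)          # was offline while the messages arrived
catch_up = late.poll()        # still sees the full history

online.seek(1)                # rewind to re-consume from the past
replayed = online.poll()
```

Because the position lives with the consumer rather than with the log, adding a consumer or replaying history does not affect any other consumer.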
  19. 19. Producers Consumers Topic Broker Key Value Time
  20. 20. Producers Consumers Topic Broker tcp tcp
  21. 21. WHAT’S SO SPECIAL? • Durable • Scalable • High volume • High speed • Available • Distributed • Open • Quick start • Free (no license costs) • “Self Fulfilling Prophecy” (positive feedback loop)
  22. 22. CONFLUENT == ENTERPRISE KAFKA • Freemium model • Support • Training • Confluent Cloud • Platform and Tools
  23. 23. <CTRL F5>
  24. 24. Application Server F5 F5 F5 CTRLF5 CTRLF5 CTRLF5
  27. 27. FAST DATA AND ACTIVE UI • Handle influx • Publish findings instantaneously • Update UI & notify end user immediately • Analyze in real time • Decoupled components • No data loss when a component is temporarily down • Scalable with volume of events and of number of clients
  28. 28. THE CASE AT HAND Client Client Client Client Show live tweet feed for conferences Show live tweet aggregates per conference Allow users to like tweets – and show live list of liked tweets Show a live list of top 3 liked tweets per conference Tweets on #ukoug17 #ukoug_tech17 #ukoug_apps17 #ukoug_jde17
  29. 29. DEMO - REAL TIME, CROSS CLOUD, CROSS TECHNOLOGY PUSH Tweets on #ukoug17 #ukoug_tech17 #ukoug_apps17 #ukoug_jde17 Client Client Client Client you
  30. 30. THE CASE AT HAND – STEP ONE Client Client Client Client Tweets on #ukoug17 #ukoug_tech17 #ukoug_apps17 #ukoug_jde17 Show live tweet feed for conferences Tweets Topic
  31. 31. THE CASE AT HAND – STEP ONE AND TWO Client Client Client Client Tweets on #ukoug17 #ukoug_tech17 #ukoug_apps17 #ukoug_jde17 Show live tweet feed for conferences Tweets Topic
  32. 32. KAFKA CONSUMER IN NODE GET EVENTS PUSHED INTO APPLICATION
  33. 33. THE CASE AT HAND SERVER SENT EVENTS FOR PUSH BACK Client Client Client Client Show live tweet feed for conferences Tweets Topic Server Sent Event Tweets on #ukoug17 #ukoug_tech17 #ukoug_apps17 #ukoug_jde17
  34. 34. SERVER SENT EVENT – SERVER SIDE Client Client Client Client Server Sent Event
  35. 35. Client Client Client Client Server Sent Event SERVER SENT EVENT – CLIENT SIDE
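The server side and client side of Server Sent Events meet in a simple text format: each event is a set of `field: value` lines terminated by a blank line. The sketch below shows that framing only, with no HTTP server involved; `format_sse` and `parse_sse` are hypothetical helper names, and the tweet payload is invented for the illustration.

```python
import json
from typing import Optional


def format_sse(data: str, event: Optional[str] = None) -> str:
    """Serialize one message in the text/event-stream wire format."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # Every line of the payload gets its own "data:" field.
    for line in data.split("\n"):
        lines.append(f"data: {line}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


def parse_sse(frame: str) -> dict:
    """Minimal parser for a single frame, roughly what a browser does."""
    event, data_lines = "message", []
    for line in frame.split("\n"):
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # Only one leading space after the colon is dropped.
            data_lines.append(value[1:] if value.startswith(" ") else value)
    return {"event": event, "data": "\n".join(data_lines)}


tweet = {"author": "someone", "text": "Hello #ukoug_tech17"}
frame = format_sse(json.dumps(tweet), event="tweet")
received = parse_sse(frame)
```

In a real application the server would write such frames to a long-lived HTTP response with content type `text/event-stream`, and the browser would parse them for you.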
  36. 36. LIVE TWEET STREAM Server Sent Event
  37. 37. THE CASE AT HAND TWEET LIKES – CLIENT TO SERVER TO ALL CLIENTS Client Client Client Client Show live tweet feed for conferences Tweets Topic SSE Allow users to like tweets – and show live list of liked tweets Tweets on #ukoug17 #ukoug_tech17 #ukoug_apps17 #ukoug_jde17
  38. 38. THE CASE AT HAND WEB SOCKETS – FOR BIDIRECTIONAL PUSH Client Client Client Client Show live tweet feed for conferences Tweets Topic SSE WebSockets Allow users to like tweets – and show live list of liked tweets Tweets on #ukoug17 #ukoug_tech17 #ukoug_apps17 #ukoug_jde17
  39. 39. TWEET LIKES BROADCASTING WebSockets WebSockets
  40. 40. THE CASE AT HAND STREAMING ANALYSIS OF TWEET EVENTS Client Client Client Client Show live tweet feed for conferences Tweets Topic SSE WebSockets Allow users to like tweets – and show live list of liked tweets Show live tweet aggregates per conference Tweets on #ukoug17 #ukoug_tech17 #ukoug_apps17 #ukoug_jde17
  41. 41. THE CASE AT HAND - STREAMING ANALYSIS OF TWEETS Client Client Client Client Show live tweet feed for conferences Tweets Topic WebSockets Allow users to like tweets – and show live list of liked tweets Show live tweet aggregates per conference tweetAnalytics Topic Streaming Tweets Aggregation µ SSE Tweets on #ukoug17 #ukoug_tech17 #ukoug_apps17 #ukoug_jde17
  42. 42. KAFKA STREAMS • Real Time Event [Stream] Processing integrated into Kafka • Aggregations & Top-N • Time Windows • Continuous Queries • Latest State (event sourcing) • Turn Stream (of changes) into Table (of most recent or current state) • Part of the state can be quite old • A Kafka Streams client will have state in memory • Always to be recreated from topic partition log files • Note: Kafka Streams is relatively new • Only support for Java clients
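Two ideas from this slide can be sketched in plain Python: turning a stream of changes into a table of current state, and counting events per time window. The sketch does not use the Kafka Streams API (which, as noted, is Java-only); the function names and sample events are assumptions made for the illustration.

```python
from collections import defaultdict


def to_table(changes):
    """Fold a stream of (key, value) changes into the latest value per key."""
    table = {}
    for key, value in changes:
        table[key] = value  # a newer event overwrites the older state
    return table


def tumbling_window_counts(events, window_seconds):
    """Count events per (window start, key) for fixed, non-overlapping windows."""
    counts = defaultdict(int)
    for timestamp, key in events:
        window_start = timestamp - (timestamp % window_seconds)
        counts[(window_start, key)] += 1
    return dict(counts)


# Hypothetical change events: (conference, latest known room).
changes = [
    ("ukoug_tech17", "Hall 1"),
    ("ukoug_apps17", "Hall 5"),
    ("ukoug_tech17", "Hall 8"),
]
table = to_table(changes)

# Hypothetical tweet events: (epoch seconds, conference).
events = [
    (0, "ukoug_tech17"),
    (30, "ukoug_tech17"),
    (59, "ukoug_apps17"),
    (60, "ukoug_tech17"),
]
per_minute = tumbling_window_counts(events, window_seconds=60)
```

Replaying the same changes always rebuilds the same table, which is why in-memory state can be recreated from the topic's log after a restart.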
  43. 43. KAFKA STREAMS Topic Filter Aggregate Join Topic Map (Xform) Publish Topic
  44. 44. EXAMPLE OF KAFKA STREAMS Topic groupBy Aggregate Join Topic Map (Xform) Publish TweetMessage Conference Text Author Hashtag Set Conference as key Sum/Avg/Top3 by key (==conference) As JSON Round aggregate to nearest 100 Latest Conference Details Topic: CountTweetsPerConference and possibly per time window
  45. 45. KAFKA STREAMS – RUNNING COUNT TWEETS PER CONFERENCE
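The running count follows the pipeline from the previous slide: set the conference as key, aggregate per key, and publish each update as JSON. Below is a minimal sketch of that logic in plain Python, not the Kafka Streams code shown in the session; the tweet fields and the `running_counts` name are assumptions for the illustration.

```python
import json
from collections import Counter


def running_counts(tweets):
    """Yield an updated JSON count each time a tweet arrives for a conference."""
    counts = Counter()
    for tweet in tweets:
        conference = tweet["conference"]  # re-key the event by conference
        counts[conference] += 1           # aggregate per key
        payload = {"conference": conference, "count": counts[conference]}
        yield conference, json.dumps(payload)


# Hypothetical tweet events.
tweets = [
    {"conference": "ukoug_tech17", "author": "a", "text": "Kafka session starting"},
    {"conference": "ukoug_apps17", "author": "b", "text": "Good keynote"},
    {"conference": "ukoug_tech17", "author": "c", "text": "Nice demo"},
]
updates = list(running_counts(tweets))
```

Each incoming tweet produces one update for its own conference only, so downstream clients receive a changelog rather than a full snapshot.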
  46. 46. STREAMING TWEET ANALYTICS PUSHED TO CLIENTS Server Sent Event
  47. 47. THE CASE AT HAND - STREAMING ANALYSIS OF TWEET LIKES Client Client Client Client Show live tweet feed for conferences Tweets Topic WebSockets Allow users to like tweets – and show live list of liked tweets Show live tweet aggregates per conference tweetAnalytics Topic Streaming Tweets Aggregation µ SSE Show a live list of top 3 liked tweets per conference Tweets on #ukoug17 #ukoug_tech17 #ukoug_apps17 #ukoug_jde17
  48. 48. KSQL FOR DECLARATIVE STREAM ANALYTICS THROUGH CONTINUOUS QUERIES: create table tweetAnalytics as select conference, count(*) from tweetsTopic group by conference; create stream retweets as select * from tweetsTopic where text like 'RT%';
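To make the difference between the two statements concrete, here is a rough Python analogue: the stream forwards every matching event, while the table holds one current value per key. This only mirrors the semantics over a finite list; KSQL itself evaluates the queries continuously, and the sample tweets are invented.

```python
def retweets(tweets):
    """Derived stream: forward only events whose text starts with 'RT'."""
    for tweet in tweets:
        if tweet["text"].startswith("RT"):  # same test as LIKE 'RT%'
            yield tweet


def tweet_analytics(tweets):
    """Derived table: the current number of tweets per conference."""
    table = {}
    for tweet in tweets:
        conference = tweet["conference"]
        table[conference] = table.get(conference, 0) + 1
    return table


# Hypothetical contents of tweetsTopic.
tweets_topic = [
    {"conference": "ukoug_tech17", "text": "RT @someone: Kafka talk now"},
    {"conference": "ukoug_tech17", "text": "Great demo"},
    {"conference": "ukoug_apps17", "text": "RT @other: keynote slides"},
]
retweet_stream = list(retweets(tweets_topic))
analytics_table = tweet_analytics(tweets_topic)
```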
  49. 49. VISUALIZING KSQL VS KAFKA STREAMS
  50. 50. THE CASE AT HAND - STREAMING ANALYSIS OF TWEET LIKES Client Client Client Client Show live tweet feed for conferences Tweets Topic WebSockets Allow users to like tweets – and show live list of liked tweets Show live tweet aggregates per conference tweetAnalytics Topic Streaming Tweets Aggregation µ SSE Show a live list of top 3 liked tweets per conference Likes Aggregation µ tweetLike Topic Top3TweetLikes PerConference Tweets on #ukoug17 #ukoug_tech17 #ukoug_apps17 #ukoug_jde17
  51. 51. WEBSOCKETS – SERVER SIDE
  52. 52. RUNNING TOP 3 OF BEST LIKED TWEETS PER CONFERENCE Server Sent Event
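The top 3 aggregation can be sketched as a count of likes per tweet, grouped by conference, followed by a top-N selection. The version below recomputes the result from a batch of like events for clarity, whereas the demo keeps it up to date as each like arrives; the event shape `(conference, tweet_id)` and the `top_liked` name are assumptions for the illustration.

```python
import heapq
from collections import defaultdict


def top_liked(like_events, n=3):
    """Return the n most-liked tweets per conference as (tweet_id, likes) pairs."""
    likes = defaultdict(lambda: defaultdict(int))
    for conference, tweet_id in like_events:
        likes[conference][tweet_id] += 1
    return {
        conference: heapq.nlargest(n, per_tweet.items(), key=lambda item: item[1])
        for conference, per_tweet in likes.items()
    }


# Hypothetical like events taken from a tweetLike topic.
like_events = (
    [("ukoug_tech17", "t1")] * 3
    + [("ukoug_tech17", "t2")] * 2
    + [("ukoug_tech17", "t3")]
    + [("ukoug_tech17", "t4")] * 4
    + [("ukoug_apps17", "a1")]
)
top3 = top_liked(like_events)
```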
  53. 53. END TO END FLOW CLOUD ENABLED API Cache EventHub CS µ Tweets Aggregation µ LikesTweets UI µ Client Chrome Client Firefox Likes Aggregation µ API µ Tweet Count Likes Top3
  54. 54. Key aspects of this demo – What Kafka can do for you • Bridging Cloud(s) and on premises systems • Providing decoupled interaction between microservices • Performing Streaming Analysis • Bridging technologies (Java, Node, …) • Bridging the availability (no | one | multiple instances) • Providing semi-push based synchronization • Open • Scalable • Reliable & Available • Fast • Complete historical record
  55. 55. Oracle embracing Apache Kafka • Event Hub Cloud Service = Managed Apache Kafka platform • Managed Topics have been announced too • Kafka as source for GoldenGate and ODI • Data Pipeline with Data Hub (Apache Cassandra) & Event Hub • Oracle Service Bus Kafka Adapter • Integration Cloud • Stream Analytics (aka Stream Explorer, fka Oracle Event Processing) • Oracle Native Container and Microservices Platform • Fn Serverless Platform • JET and ADF real time push based on Apache Kafka • In general – the bridge between on premises and [public] Cloud
  56. 56. Summary • Apache Kafka is emerging as the platform of choice for message exchange in a world of • Microservices • CQRS and Data Source Synchronization • Clouds • Fast Data (IoT) and Streaming Analysis • Real time data integration & distribution • Oracle is rapidly embracing Apache Kafka on various levels • Getting started with Apache Kafka is not very hard at all • The platform is open source – and has broad client support (Java, Node, …) • Many resources are available – tutorials, blog articles, demonstrations, presentation slides and recordings of conference sessions, samples on GitHub
  57. 57. Thank you! • Blog: technology.amis.nl • Email: lucas.jellema@amis.nl • @lucasjellema • lucas-jellema • www.amis.nl, info@amis.nl
