Apache Kafka

Kafka is used for building real-time data pipelines and streaming apps. It is horizontally scalable, fault-tolerant, wicked fast, and runs in production in thousands of companies.

Agenda
10/25/2016 Confidential 2
• Introduction
• Features
• Architecture
• Storage
• Topic
• Producer and consumer
• Broker
• ZooKeeper
• Use Cases
• Comparison with RabbitMQ (RMQ) and ActiveMQ (AMQ)

Introduction
Open-source message queue written in Scala.
Originally developed at LinkedIn Corp, later moved to Apache.
[Diagram: Producer1 and Producer2 publish to a Queue; Consumer1, Consumer2, and Consumer3 read from it]

Features
• Brokered
• Distributed
• High throughput for publish and subscribe
• Easily scalable
• Fast
• Replicated commit log service
• Partitioned
• Stores messages on disk
• In-order delivery per partition
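
The "partitioned commit log" and "in-order delivery per partition" bullets above can be illustrated with a toy in-memory model. This is only a sketch: the `PartitionedLog` class and its methods are hypothetical names made up for illustration, not Kafka's API, and a real broker persists each partition to disk.

```python
class PartitionedLog:
    """Toy in-memory stand-in for a partitioned, append-only topic log."""

    def __init__(self, num_partitions):
        # One independent, ordered list per partition.
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, partition, message):
        """Append to one partition and return the message's offset there."""
        log = self.partitions[partition]
        log.append(message)
        return len(log) - 1

    def read(self, partition, offset):
        """Return all messages in a partition from `offset` onward, in order."""
        return self.partitions[partition][offset:]


log = PartitionedLog(num_partitions=2)
log.append(0, "a1")
log.append(1, "b1")
log.append(0, "a2")
partition_zero = log.read(0, 0)  # order is preserved within partition 0
```

Ordering is only guaranteed inside a single partition; nothing in this model (or in the slide's claim) orders "b1" relative to "a1" or "a2".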

Architecture
Elements are:
• Messages
• Producer
• Consumer
• Topic
• Broker [Cluster]
• ZooKeeper

Broker
• Stateless brokers: consumers, not brokers, keep track of what has been consumed.
• Transient system: data is deleted once it ages out.
• Time-based SLA for the retention policy.
• Zero copy
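
A time-based retention policy like the one in the bullets above could be sketched as follows. This is an illustration under assumptions: `apply_retention`, the tuple layout, and the segment names are hypothetical, and the timestamps are plain integers rather than real clock values.

```python
def apply_retention(segments, now, retention_seconds):
    """Keep only segments last written within the retention window.

    `segments` is a list of (last_modified_timestamp, name) tuples.
    """
    cutoff = now - retention_seconds
    return [seg for seg in segments if seg[0] >= cutoff]


segments = [(100, "segment-0"), (500, "segment-1"), (900, "segment-2")]
# With now=1000 and a 600-second window, anything older than t=400 is dropped.
kept = apply_retention(segments, now=1000, retention_seconds=600)
```

Note that deletion depends only on age, not on whether any consumer has read the data, which is what makes the broker a transient system.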

Topic
• Unique identification for messages: the offset
• A consumer can change its offset to re-consume or skip messages
• Replicated across a configurable number of servers
• Retention policy
• Parallelism at the partition level
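
The offset bullets above (re-consuming or skipping by changing the offset) can be shown with a small model of a consumer reading one partition. The `OffsetConsumer` class and its `poll`/`seek` methods are hypothetical names for this sketch and are not meant to mirror any client library exactly.

```python
class OffsetConsumer:
    """Toy consumer that tracks its own position in one partition."""

    def __init__(self, partition_log):
        self.log = partition_log
        self.offset = 0

    def poll(self):
        """Return the next message, or None when caught up."""
        if self.offset >= len(self.log):
            return None
        message = self.log[self.offset]
        self.offset += 1
        return message

    def seek(self, offset):
        """Move the position backward to re-consume or forward to skip."""
        self.offset = offset


partition_log = ["m0", "m1", "m2", "m3"]
consumer = OffsetConsumer(partition_log)
first_two = [consumer.poll(), consumer.poll()]  # reads m0, then m1
consumer.seek(0)                                # rewind to the start
replayed = consumer.poll()                      # m0 is delivered again
consumer.seek(3)                                # jump ahead, skipping m1 and m2
skipped_to = consumer.poll()                    # m3
```

Because the position lives in the consumer, rewinding requires no cooperation from the log itself, as long as the data has not yet been deleted by the retention policy.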

Producers and Consumers
Producers:
• Decide which message goes to which partition (load balancing)
• Batch messages
• Send asynchronously
Consumers:
• Pull messages from the broker
• Support both queuing and pub-sub
• Consumer groups: a cluster of consumers
• Ordered per partition
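
The producer-side bullets above (choosing a partition and batching) might look roughly like this. It is a sketch under assumptions: `choose_partition` and `BatchingProducer` are hypothetical names, CRC32 is just a convenient stable hash chosen for the example, and the `send` callback stands in for the network call.

```python
import zlib


def choose_partition(key, num_partitions):
    """Map a key to a partition with a stable hash (CRC32 is an assumption here)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


class BatchingProducer:
    """Buffers messages per partition and sends them in batches."""

    def __init__(self, num_partitions, batch_size, send):
        self.num_partitions = num_partitions
        self.batch_size = batch_size
        self.send = send  # callback: send(partition, list_of_messages)
        self.buffers = {p: [] for p in range(num_partitions)}

    def produce(self, key, value):
        partition = choose_partition(key, self.num_partitions)
        buf = self.buffers[partition]
        buf.append(value)
        if len(buf) >= self.batch_size:
            self.flush(partition)

    def flush(self, partition):
        """Send whatever is buffered for a partition as one batch."""
        if self.buffers[partition]:
            self.send(partition, self.buffers[partition])
            self.buffers[partition] = []


sent = []
producer = BatchingProducer(num_partitions=3, batch_size=2,
                            send=lambda p, batch: sent.append((p, batch)))
for i in range(4):
    producer.produce("user-42", f"event-{i}")
```

Every message with the key "user-42" lands in the same partition, so the four events go out as two batches of two and keep their relative order.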

Use Cases
• Messaging
• Website activity tracking
• Metrics
• Log Aggregation
• Stream Processing
• Event Publishing

Sooo fast… How come?
1. No waiting: Kafka does not wait for acknowledgements.
2. Efficient storage format.
Header is 9 bytes (144 bytes in ActiveMQ, as per JMS).
3. Zero copy
4. Compressing multiple messages together as a message set
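
Point 4 above, compressing a whole message set rather than individual messages, can be demonstrated with the standard library. This is only an illustration: the sample messages, the JSON encoding, and the use of zlib are assumptions made for the example and say nothing about Kafka's actual wire format.

```python
import json
import zlib

# 100 small, similar messages, as an activity-tracking feed might produce.
messages = [{"user": f"user-{i}", "action": "page_view", "page": "/home"}
            for i in range(100)]

# Option A: compress each message on its own and add up the sizes.
individual = sum(len(zlib.compress(json.dumps(m).encode("utf-8")))
                 for m in messages)

# Option B: compress the whole batch (the "message set") in one go.
batch_bytes = json.dumps(messages).encode("utf-8")
compressed_batch = zlib.compress(batch_bytes)
batched = len(compressed_batch)

print(individual, batched)
```

Compressing the batch lets the compressor exploit the repetition across messages (field names, common values), which per-message compression cannot see.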

Thank You
Yogesh..BG

Yogesh..BG, 27th Nov 2015

Editor's Notes

Not JMS-based. Initially developed to track activity on web pages.
Has its own unique design. Communication is over TCP. Supports compression.
Physically, the log is a file. Uses a distributed commit log. Storage is distributed. Kafka is all about the log.
The leader for a partition, hosted on one server, handles all reads and writes. Followers passively copy the leader.
Compression. Pull. Flow control at the consumer side. Aggressive batching. Suppose: Queue -> all the instances have the same group name. Pub-sub -> each instance has a different group name.
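
The group-name rule in the note above (same group name gives queue behaviour, different group names give pub-sub) can be sketched as follows. This is a simplification under assumptions: `deliver` is a hypothetical helper, and it spreads messages round-robin inside a group, whereas the slides describe parallelism at the partition level.

```python
from collections import defaultdict


def deliver(messages, consumers):
    """Deliver each message once per group, round-robin within a group.

    `consumers` maps consumer name -> group name.
    """
    groups = defaultdict(list)
    for name, group in consumers.items():
        groups[group].append(name)

    received = defaultdict(list)
    for i, message in enumerate(messages):
        for members in groups.values():
            # Exactly one member of each group gets this message.
            received[members[i % len(members)]].append(message)
    return dict(received)


messages = ["m0", "m1", "m2", "m3"]

# Queue: every instance shares one group name, so the messages are split.
queue = deliver(messages, {"c1": "g", "c2": "g"})

# Pub-sub: each instance has its own group name, so each sees everything.
pubsub = deliver(messages, {"c1": "g1", "c2": "g2"})
```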
Candidates: Apache Kafka, Apache ActiveMQ version 5.4, RabbitMQ version 2.4. System: Linux machines with 8 2 GHz cores, 16 GB memory, 6 disks with RAID 10, and a 1 Gb network link. One machine served as the broker and another as producer and consumer.
Justification:
1. Kafka doesn't wait for acks.
2. Efficient storage format. The header is 9 bytes, versus 144 bytes in ActiveMQ (as per JMS). The busiest thread in ActiveMQ spends its time accessing the B-Tree that maintains message metadata and state.
3. Zero copy. The conventional path makes four copies: disk -> page cache in kernel space, kernel space -> user space, user space -> socket buffers, socket buffers -> NIC buffer. Zero copy avoids the two copies through user space.
4. Compressing multiple messages together as a message set.
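
The zero-copy point in the note above can be tried out with Python's `socket.sendfile()`, which uses the operating system's `sendfile` call where it is available and otherwise falls back to ordinary sends. This is a sketch under assumptions: the temporary file stands in for a log segment and the local socket pair stands in for a broker-to-consumer connection; none of this is Kafka code.

```python
import os
import socket
import tempfile

payload = b"kafka-log-segment-bytes" * 100

# A temporary file stands in for a log segment on disk.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(payload)
    path = f.name

# A connected socket pair stands in for the broker-to-consumer connection.
broker_side, consumer_side = socket.socketpair()
try:
    with open(path, "rb") as segment:
        # Where os.sendfile() is available, the bytes move from the page
        # cache to the socket without being copied into this process.
        sent = broker_side.sendfile(segment)
    broker_side.close()  # signals end-of-stream to the reader

    received = b""
    while chunk := consumer_side.recv(4096):
        received += chunk
finally:
    broker_side.close()
    consumer_side.close()
    os.unlink(path)
```

The sending side never reads the file contents into a Python buffer; it only hands the file object to the socket, which is the essence of the technique.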
