SlideShare a Scribd company logo
1 of 17
Pilot Kafka Service
Manuel Martín Márquez
Kafka
• Kafka is a distributed streaming platform
• High Scalable (partition)
• Fault Tolerant (replication)
• Allow high level of parallelism and decoupling between
data producers and data consumers
• De facto standard for near real-time store, access
and process data streams
• Critical component of most of the Big Data Platform
and therefore of Hadoop ecosystem
3
Kafka Basic Concepts
4
Source: Hortonworks
Broker: Kafka node on the cluster
Topics: Stream of records category
- Multiple writers and readers
- Partitioned
- Replicated
Consumer: pulls messages off of a Kafka topic
Producer: push messages into a Kafka topic
Data Retention:
- Based on time or size
Zookeeper: Stores Kafka Metadata
Kafka entry points
• Custom implementation of producer and consumer using Kafka client API
• Java, Scala, C++, Python
• Kafka Connectors
• LogFile, HDFS, JDBC, ElasticSearch…
• Logstash
• Source and sink
• Apache Flume out-of-the-box can use Kafka as
• Source, Channel, Sink
• Other ingestion or processing tools support Kafka
• Apache Spark, LinkedIn Gobblin, Apache Storm…
5
Kafka for Data Integration and Processing
6
Stream
Source
Central data buffer
Flush
periodically
HDFS
Big Files
Events
Indexed data
Flush
immediately
Batch
processing
Fast data
access
Real time
stream
processing
Stream
Source
Stream
Source
Stream
Source
No data lost during
downtime (scheduled
and unscheduled) of a
Hadoop cluster
Kafka buffers protects the
recent data from being lost
(before a daily HDFS snapshot
can backed them up)
Kafka at CERN – it monitoring
7
Kafka cluster
(buffering) *
Processing
Data enrichment
Data aggregation
Batch Processing
Transport
Flume
Kafka
sink
Flume
sinks
FTS
Data
Sources
Rucio
XRootD
Jobs
…
Lemon
syslog
app log
DB
HTTP
feed
AMQ
Flume
AMQ
Flume
DB
Flume
HTTP
Flume
Log GW
Flume
Metric
GW
Logs
Lemon
metrics
HDFS
ElasticS
earch
…
Storage &
Search
Others
(influxdb)
Data
Access
CLI, API
Kafka at CERN – it monitoring (Requirements)
8
• Throughput and retention policy
• Currently 200 GB/day (forecast 500 GB/day)
• Retention Policy 12h in qa and 24 hours in prod (largest retention policy to cover potential
problems over weekends)
• ~ 4000 messages, up to 10k peaks
• ~50 topics
• Security (Kerberos)
• Flume can be potentially upgrade to 1.7 early in 2017 (work in progress already).
• Administration Capabilities
• Administrative operations
• Topic configuration, rebalancing, user management, start/stop cluster
• Possibility to increase retention policy, replication factor
Kafka at CERN – CALS
9
Kafka at CERN – CALS (Requirements)
10
• Throughput and retention policy
• Currently 30 GB/hour only including the logging processes
• Plan to incrementally include all the systems with potentially mean several TBs
• Compression with Snappy will be evaluated to determined performance
• Retention policy 24 hours, which is the time they need to buffer data and compact it to send it to Hadoop
• Security (Kerberos)
• Infrastructure
• Openstack under several conditions:
• TN need to be supported for several reasons
• High availability of the service CALS on top of private cloud (No CALS no BEAM in the LHC)
• Administration Capabilities
• Administrative operations
• Topic configuration, rebalancing, user management, start/stop cluster
• Possibility to increase retention policy, replication factor
Kafka at CERN
11
• Security Team
• Already using Kafka for pattern matching
• Data integration
• LHC Postmortem
• Potentially ingested by CALS
• Industrial Control Systems
• WinCCOA Data
Pilot Kafka Service
12
• Scope
• Study the current Kafka use case together with
the different teams involved
• Collect requirements
• Understand feasibility and added value of Kafka
as a central service
Pilot Kafka Service
13
• Collect requirements (5 Major Use Cases):
• CALS, IT-Monitoring, Security Team, Industrial Control, Post-mortem
• Throughput, Retention Policy, Security, Infrastructure, Administration
Capabilities
• Agreement to test the service from the first phase
• Ensure the service cope with their requirements
• More details: https://twiki.cern.ch/twiki/bin/viewauth/DB/CERNonly/KafkaService
Pilot Kafka Service – Current Development
14
• Pilot Implementation - rapid iteration which will help
to understand service and use case.
• On-demand Kafka service approach
• Self-Service Cluster creation, management and expansion
• Allow users to perform administrative tasks that are traditionally carried
out by administrators
• Facilitating operating system and engine updates (Kafka, Zookeeper)
• Transparently integrate all the needed services (Security, Storage,
Procurement, etc)
• Support for service continuity in case of hardware failure
Pilot Kafka Service – Current Development
15
• Configuration and Management REST API
• Security enabled - Kerberos on Kafka and
Zookeeper (SSL optional)
• Monitoring Capabilities
• OpenStack on GPN
• Network storage
• Dedicated Kafka and Zookeeper per user
Towards Kafka Production Service
16
• Service evaluation phase and time line
Towards Kafka Production Service
17
• Consolidation to Production
• Web Interface to manage clusters (Self-service)
• Evolution of the configuration management API
• Functionalities toward the self-service platform
• Integration with Openstack
• Full monitoring beyond JMX metrics
• Kafka-Mirroring (High Availability)
• Deploy service in TN (due to service design that is transparent
for us)
• Kafka as close as possible to consumers and producers

More Related Content

Similar to messaging.pptx

Fundamentals of Apache Kafka
Fundamentals of Apache KafkaFundamentals of Apache Kafka
Fundamentals of Apache KafkaChhavi Parasher
 
Taking the open cloud to 11
Taking the open cloud to 11Taking the open cloud to 11
Taking the open cloud to 11Joe Brockmeier
 
Building High-Throughput, Low-Latency Pipelines in Kafka
Building High-Throughput, Low-Latency Pipelines in KafkaBuilding High-Throughput, Low-Latency Pipelines in Kafka
Building High-Throughput, Low-Latency Pipelines in Kafkaconfluent
 
Distributed messaging through Kafka
Distributed messaging through KafkaDistributed messaging through Kafka
Distributed messaging through KafkaDileep Kalidindi
 
Introduction to Kafka and Zookeeper
Introduction to Kafka and ZookeeperIntroduction to Kafka and Zookeeper
Introduction to Kafka and ZookeeperRahul Jain
 
HBaseCon2016-final
HBaseCon2016-finalHBaseCon2016-final
HBaseCon2016-finalMaryann Xue
 
Kafka Deployment to Steel Thread
Kafka Deployment to Steel ThreadKafka Deployment to Steel Thread
Kafka Deployment to Steel Threadconfluent
 
Open stack ha design & deployment kilo
Open stack ha design & deployment   kiloOpen stack ha design & deployment   kilo
Open stack ha design & deployment kiloSteven Li
 
HPC and cloud distributed computing, as a journey
HPC and cloud distributed computing, as a journeyHPC and cloud distributed computing, as a journey
HPC and cloud distributed computing, as a journeyPeter Clapham
 
Consensus in Apache Kafka: From Theory to Production.pdf
Consensus in Apache Kafka: From Theory to Production.pdfConsensus in Apache Kafka: From Theory to Production.pdf
Consensus in Apache Kafka: From Theory to Production.pdfGuozhang Wang
 
eHarmony @ Hbase Conference 2016 by vijay vangapandu.
eHarmony @ Hbase Conference 2016 by vijay vangapandu.eHarmony @ Hbase Conference 2016 by vijay vangapandu.
eHarmony @ Hbase Conference 2016 by vijay vangapandu.Vijaykumar Vangapandu
 
Apache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New FeaturesApache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New FeaturesHBaseCon
 
Large-scale Web Apps @ Pinterest
Large-scale Web Apps @ PinterestLarge-scale Web Apps @ Pinterest
Large-scale Web Apps @ PinterestHBaseCon
 
Webinar: What's new in CDAP 3.5?
Webinar: What's new in CDAP 3.5?Webinar: What's new in CDAP 3.5?
Webinar: What's new in CDAP 3.5?Cask Data
 
CouchbasetoHadoop_Matt_Michael_Justin v4
CouchbasetoHadoop_Matt_Michael_Justin v4CouchbasetoHadoop_Matt_Michael_Justin v4
CouchbasetoHadoop_Matt_Michael_Justin v4Michael Kehoe
 
Unleashing Real-time Power with Kafka.pptx
Unleashing Real-time Power with Kafka.pptxUnleashing Real-time Power with Kafka.pptx
Unleashing Real-time Power with Kafka.pptxKnoldus Inc.
 
Fundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache KafkaFundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache KafkaAngelo Cesaro
 
Apache Kafka Introduction
Apache Kafka IntroductionApache Kafka Introduction
Apache Kafka IntroductionAmita Mirajkar
 
0-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 2019
0-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 20190-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 2019
0-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 2019confluent
 

Similar to messaging.pptx (20)

Fundamentals of Apache Kafka
Fundamentals of Apache KafkaFundamentals of Apache Kafka
Fundamentals of Apache Kafka
 
Taking the open cloud to 11
Taking the open cloud to 11Taking the open cloud to 11
Taking the open cloud to 11
 
Building High-Throughput, Low-Latency Pipelines in Kafka
Building High-Throughput, Low-Latency Pipelines in KafkaBuilding High-Throughput, Low-Latency Pipelines in Kafka
Building High-Throughput, Low-Latency Pipelines in Kafka
 
Distributed messaging through Kafka
Distributed messaging through KafkaDistributed messaging through Kafka
Distributed messaging through Kafka
 
Introduction to Kafka and Zookeeper
Introduction to Kafka and ZookeeperIntroduction to Kafka and Zookeeper
Introduction to Kafka and Zookeeper
 
HBaseCon2016-final
HBaseCon2016-finalHBaseCon2016-final
HBaseCon2016-final
 
Kafka Explainaton
Kafka ExplainatonKafka Explainaton
Kafka Explainaton
 
Kafka Deployment to Steel Thread
Kafka Deployment to Steel ThreadKafka Deployment to Steel Thread
Kafka Deployment to Steel Thread
 
Open stack ha design & deployment kilo
Open stack ha design & deployment   kiloOpen stack ha design & deployment   kilo
Open stack ha design & deployment kilo
 
HPC and cloud distributed computing, as a journey
HPC and cloud distributed computing, as a journeyHPC and cloud distributed computing, as a journey
HPC and cloud distributed computing, as a journey
 
Consensus in Apache Kafka: From Theory to Production.pdf
Consensus in Apache Kafka: From Theory to Production.pdfConsensus in Apache Kafka: From Theory to Production.pdf
Consensus in Apache Kafka: From Theory to Production.pdf
 
eHarmony @ Hbase Conference 2016 by vijay vangapandu.
eHarmony @ Hbase Conference 2016 by vijay vangapandu.eHarmony @ Hbase Conference 2016 by vijay vangapandu.
eHarmony @ Hbase Conference 2016 by vijay vangapandu.
 
Apache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New FeaturesApache Phoenix: Use Cases and New Features
Apache Phoenix: Use Cases and New Features
 
Large-scale Web Apps @ Pinterest
Large-scale Web Apps @ PinterestLarge-scale Web Apps @ Pinterest
Large-scale Web Apps @ Pinterest
 
Webinar: What's new in CDAP 3.5?
Webinar: What's new in CDAP 3.5?Webinar: What's new in CDAP 3.5?
Webinar: What's new in CDAP 3.5?
 
CouchbasetoHadoop_Matt_Michael_Justin v4
CouchbasetoHadoop_Matt_Michael_Justin v4CouchbasetoHadoop_Matt_Michael_Justin v4
CouchbasetoHadoop_Matt_Michael_Justin v4
 
Unleashing Real-time Power with Kafka.pptx
Unleashing Real-time Power with Kafka.pptxUnleashing Real-time Power with Kafka.pptx
Unleashing Real-time Power with Kafka.pptx
 
Fundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache KafkaFundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache Kafka
 
Apache Kafka Introduction
Apache Kafka IntroductionApache Kafka Introduction
Apache Kafka Introduction
 
0-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 2019
0-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 20190-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 2019
0-60: Tesla's Streaming Data Platform ( Jesse Yates, Tesla) Kafka Summit SF 2019
 

Recently uploaded

A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 

Recently uploaded (20)

A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 

messaging.pptx

  • 1.
  • 2. Pilot Kafka Service Manuel Martín Márquez
  • 3. Kafka • Kafka is a distributed streaming platform • High Scalable (partition) • Fault Tolerant (replication) • Allow high level of parallelism and decoupling between data producers and data consumers • De facto standard for near real-time store, access and process data streams • Critical component of most of the Big Data Platform and therefore of Hadoop ecosystem 3
  • 4. Kafka Basic Concepts 4 Source: Hortonworks Broker: Kafka node on the cluster Topics: Stream of records category - Multiple writers and readers - Partitioned - Replicated Consumer: pulls messages off of a Kafka topic Producer: push messages into a Kafka topic Data Retention: - Based on time or size Zookeeper: Stores Kafka Metadata
  • 5. Kafka entry points • Custom implementation of producer and consumer using Kafka client API • Java, Scala, C++, Python • Kafka Connectors • LogFile, HDFS, JDBC, ElasticSearch… • Logstash • Source and sink • Apache Flume out-of-the-box can use Kafka as • Source, Channel, Sink • Other ingestion or processing tools support Kafka • Apache Spark, LinkedIn Gobblin, Apache Storm… 5
  • 6. Kafka for Data Integration and Processing 6 Stream Source Central data buffer Flush periodically HDFS Big Files Events Indexed data Flush immediately Batch processing Fast data access Real time stream processing Stream Source Stream Source Stream Source No data lost during downtime (scheduled and unscheduled) of a Hadoop cluster Kafka buffers protects the recent data from being lost (before a daily HDFS snapshot can backed them up)
  • 7. Kafka at CERN – it monitoring 7 Kafka cluster (buffering) * Processing Data enrichment Data aggregation Batch Processing Transport Flume Kafka sink Flume sinks FTS Data Sources Rucio XRootD Jobs … Lemon syslog app log DB HTTP feed AMQ Flume AMQ Flume DB Flume HTTP Flume Log GW Flume Metric GW Logs Lemon metrics HDFS ElasticS earch … Storage & Search Others (influxdb) Data Access CLI, API
  • 8. Kafka at CERN – it monitoring (Requirements) 8 • Throughput and retention policy • Currently 200 GB/day (forecast 500 GB/day) • Retention Policy 12h in qa and 24 hours in prod (largest retention policy to cover potential problems over weekends) • ~ 4000 messages, up to 10k peaks • ~50 topics • Security (Kerberos) • Flume can be potentially upgrade to 1.7 early in 2017 (work in progress already). • Administration Capabilities • Administrative operations • Topic configuration, rebalancing, user management, start/stop cluster • Possibility to increase retention policy, replication factor
  • 9. Kafka at CERN – CALS 9
  • 10. Kafka at CERN – CALS (Requirements) 10 • Throughput and retention policy • Currently 30 GB/hour only including the logging processes • Plan to incrementally include all the systems with potentially mean several TBs • Compression with Snappy will be evaluated to determined performance • Retention policy 24 hours, which is the time they need to buffer data and compact it to send it to Hadoop • Security (Kerberos) • Infrastructure • Openstack under several conditions: • TN need to be supported for several reasons • High availability of the service CALS on top of private cloud (No CALS no BEAM in the LHC) • Administration Capabilities • Administrative operations • Topic configuration, rebalancing, user management, start/stop cluster • Possibility to increase retention policy, replication factor
  • 11. Kafka at CERN 11 • Security Team • Already using Kafka for pattern matching • Data integration • LHC Postmortem • Potentially ingested by CALS • Industrial Control Systems • WinCCOA Data
  • 12. Pilot Kafka Service 12 • Scope • Study the current Kafka use case together with the different teams involved • Collect requirements • Understand feasibility and added value of Kafka as a central service
  • 13. Pilot Kafka Service 13 • Collect requirements (5 Major Use Cases): • CALS, IT-Monitoring, Security Team, Industrial Control, Post-mortem • Throughput, Retention Policy, Security, Infrastructure, Administration Capabilities • Agreement to test the service from the first phase • Ensure the service cope with their requirements • More details: https://twiki.cern.ch/twiki/bin/viewauth/DB/CERNonly/KafkaService
  • 14. Pilot Kafka Service – Current Development 14 • Pilot Implementation - rapid iteration which will help to understand service and use case. • On-demand Kafka service approach • Self-Service Cluster creation, management and expansion • Allow users to perform administrative tasks that are traditionally carried out by administrators • Facilitating operating system and engine updates (Kafka, Zookeeper) • Transparently integrate all the needed services (Security, Storage, Procurement, etc) • Support for service continuity in case of hardware failure
  • 15. Pilot Kafka Service – Current Development 15 • Configuration and Management REST API • Security enabled - Kerberos on Kafka and Zookeeper (SSL optional) • Monitoring Capabilities • OpenStack on GPN • Network storage • Dedicated Kafka and Zookeeper per user
  • 16. Towards Kafka Production Service 16 • Service evaluation phase and time line
  • 17. Towards Kafka Production Service 17 • Consolidation to Production • Web Interface to manage clusters (Self-service) • Evolution of the configuration management API • Functionalities toward the self-service platform • Integration with Openstack • Full monitoring beyond JMX metrics • Kafka-Mirroring (High Availability) • Deploy service in TN (due to service design that is transparent for us) • Kafka as close as possible to consumers and producers

Editor's Notes

  1. Critical Service no data no beam