Some Challenges Deploying a Kafka / Elasticsearch Pipeline with Kubernetes
08/28/2019
© 2019 Proofpoint. All rights reserved
Very Attacked People
Message Intelligence Service
Near-real-time connection and metadata omnibus
2B2TB
5000
60 r/s
Prior and Current Generation
Why Kubernetes?
•Available for all public clouds and bare metal
•Widely known
•Highly scalable
•Likely “winner” for a long time
Kubernetes Advantages
•System resources are shared across services
•Automatic service scaling as needed
– Nodes and pods are created and destroyed ad-hoc
•Pods scheduled on any node with capacity
Kubernetes Stateful Service Challenges
•Services are scaled as needed
– Nodes and pods are created and destroyed ad-hoc
•Pods scheduled on any node with capacity
•Service pods do not receive any disk resources.
Kubernetes Challenge:
Upgrades are upsetting
•kops upgrades replace nodes, which prevents using local storage
Solutions:
– EBS volumes (which have their own problems)
– Upgrade in place using a container OS
•CoreOS/Tectonic
•Flatcar/Lokomotive
•Juju
Kubernetes Challenge:
Disk Resources
•StatefulSet (STS)
– Guaranteed ordering and uniqueness of Pods
– Persistent pod identity
– Storage available
•PVC / SC / PV
– Pod specification requests a Persistent Volume Claim (PVC)
– PVC requests a Storage Class (SC)
– SC provisions the Persistent Volume (PV)
– MIS uses Elastic Block Store (EBS) PVs
•EBS Volumes
– Access limited to a single Availability Zone
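The PVC / SC / PV chain can be sketched as a StatefulSet manifest built in Python. This is a minimal illustration, not the MIS configuration: the name "kafka", the storage class "ebs-gp2", the image, and the mount path are hypothetical placeholders, and the 500 value is borrowed from the instance group table only as an example size.

```python
import json


def stateful_set(name: str, replicas: int, storage_gb: int, storage_class: str) -> dict:
    """Build a minimal StatefulSet manifest whose pods each get their own PVC."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,  # headless service that gives pods stable names
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": "example/kafka:latest",  # hypothetical image
                        "volumeMounts": [{"name": "data", "mountPath": "/data"}],
                    }],
                },
            },
            # One PVC per pod is created from this template (data-kafka-0, ...).
            # The PVC names a Storage Class, which provisions the volume.
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "storageClassName": storage_class,  # hypothetical class name
                    "resources": {"requests": {"storage": f"{storage_gb}Gi"}},
                },
            }],
        },
    }


manifest = stateful_set("kafka", replicas=10, storage_gb=500, storage_class="ebs-gp2")
print(json.dumps(manifest, indent=2))
```

Because an EBS volume lives in one Availability Zone, a pod bound to such a claim can only be rescheduled onto nodes in that same zone.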
Kubernetes Challenge:
Unpredictable Nodes
•Node taints prevent stateless pods from being scheduled on stateful nodes
•Each node class is part of an Instance Group
•Pod tolerations allow StatefulSets to be scheduled on stateful nodes
•Pod anti-affinities spread the STS across zones and nodes
•A single-zone cluster would probably still need anti-affinities
•A non-shared cluster still requires taints/tolerations
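A rough sketch of how these pieces fit into a pod spec follows. The taint key and value (dedicated=mis) and the node label (service: kafka) come from the instance group table; the "NoSchedule" effect, the "app" pod label, and the topology keys are assumptions made for illustration (older clusters used a different zone label).

```python
def scheduling_spec(service: str, taint_key: str = "dedicated", taint_value: str = "mis") -> dict:
    """Return the scheduling-related fragment of a pod spec for one stateful service."""
    same_service = {"matchLabels": {"app": service}}  # "app" label is an assumption
    return {
        # Pin the pods to the instance group carrying this node label.
        "nodeSelector": {"service": service},
        # Tolerate the taint that keeps other pods off these nodes.
        "tolerations": [{
            "key": taint_key,
            "operator": "Equal",
            "value": taint_value,
            "effect": "NoSchedule",  # assumed effect; the slides do not state it
        }],
        "affinity": {
            "podAntiAffinity": {
                # Hard rule: never two pods of this service on one node.
                "requiredDuringSchedulingIgnoredDuringExecution": [{
                    "labelSelector": same_service,
                    "topologyKey": "kubernetes.io/hostname",
                }],
                # Soft rule: prefer spreading the pods across zones.
                "preferredDuringSchedulingIgnoredDuringExecution": [{
                    "weight": 100,
                    "podAffinityTerm": {
                        "labelSelector": same_service,
                        "topologyKey": "topology.kubernetes.io/zone",
                    },
                }],
            },
        },
    }


spec = scheduling_spec("kafka")
print(spec["tolerations"])
```

The fragment would be merged into the pod template of the StatefulSet; the taint itself is applied on the node side through the instance group.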
Instance Groups
Instance Group   #    EC2 Type     Node Taint      Node Label             EBS (GB)
kafka            10   m5.xlarge    dedicated=mis   service: kafka         500
es-data          51   m5.4xlarge   dedicated=mis   service: es-data       4,000
es-master        3    m5.large     dedicated=mis   service: es-master     10
es-client        3    m5.2xlarge   dedicated=mis   service: es-client
apps             10   m5.2xlarge   dedicated=mis   service: apps
pod-logging      3    m5.2xlarge   dedicated=mis   service: pod-logging
Applications
Application            Instance Group
Kafka                  kafka-nodes
Zookeeper              kafka-nodes
Secor                  apps-nodes
Elasticsearch Master   es-master-nodes
Elasticsearch Data     es-data-nodes
Elasticsearch Client   es-client-nodes
Kafka Connect          apps-nodes
Search Service         apps-nodes
Logging Service        pod-logging-nodes
Kibana                 apps-nodes
Admin                  apps-nodes
Kubernetes Challenge:
EBS Read-Only Volume
•NVMe driver default I/O timeout is 30 seconds
•When an I/O exceeds the timeout, the driver fails it and the filesystem is remounted read-only
•Solutions:
– Update driver timeout (insufficient)
– Update driver (required a new AMI)
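To check which timeout a node is actually running with, something like the following sketch could be used. It assumes a Linux node where the nvme_core module exposes its io_timeout parameter under /sys/module; the helper names are made up for this example. As the slide notes, raising the timeout alone was not sufficient for MIS.

```python
from pathlib import Path
from typing import Optional

# Assumed sysfs location of the nvme_core I/O timeout, in seconds.
NVME_TIMEOUT_PATH = Path("/sys/module/nvme_core/parameters/io_timeout")


def parse_timeout(text: str) -> Optional[int]:
    """Parse the contents of the parameter file; None if it is not a number."""
    try:
        return int(text.strip())
    except ValueError:
        return None


def read_io_timeout(path: Path = NVME_TIMEOUT_PATH) -> Optional[int]:
    """Return the configured timeout, or None if the file cannot be read."""
    try:
        return parse_timeout(path.read_text())
    except OSError:
        return None


timeout = read_io_timeout()
if timeout is None:
    print("nvme_core io_timeout not available on this host")
elif timeout <= 30:
    print(f"io_timeout is {timeout}s; stalled I/O may be failed by the driver")
else:
    print(f"io_timeout is {timeout}s")
```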
Kubernetes Challenge:
CNI plugin does not scale
•Weave is unstable past 160 nodes
Solutions:
– Limit the number of nodes
– Use a different Container Network Interface (CNI) plugin
Kafka Challenge:
Parsing JSON records is slow
•API Gateway introduced unacceptable latency
•Secor routing was extremely slow
Solutions:
– Incoming gateway uses a header to determine the message topic
– Secor modified to use a Kafka header to determine the partition
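The idea behind both fixes can be sketched in a few lines: read the routing key from a header instead of deserializing the whole record. The header names "x-topic" and "x-partition" and the fallback values are hypothetical; the slides do not name the headers actually used.

```python
import json
from typing import Dict


def topic_from_body(body: bytes, default: str = "unrouted") -> str:
    """Slow path: parse the full JSON record just to find one field."""
    return json.loads(body).get("topic", default)


def topic_from_header(headers: Dict[str, bytes], default: str = "unrouted") -> str:
    """Fast path: the producer already put the topic in a header."""
    raw = headers.get("x-topic")  # hypothetical header name
    return raw.decode("utf-8") if raw else default


def partition_from_header(headers: Dict[str, bytes], default: str = "unknown") -> str:
    """Same idea for the consumer side: pick the output partition from a header."""
    raw = headers.get("x-partition")  # hypothetical header name
    return raw.decode("utf-8") if raw else default


headers = {"x-topic": b"smtp-events", "x-partition": b"2019-08-28"}
body = b'{"topic": "smtp-events", "payload": {"size": 1024}}'

# Both paths agree, but only the first one has to touch the payload.
print(topic_from_body(body), topic_from_header(headers), partition_from_header(headers))
```

The cost of the slow path grows with record size, while the header lookup stays constant.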
Elasticsearch Challenge:
Document Ingestion is slow
•Carefully tune index templates
– Map all fields
– Index only fields that are searched
– Use keyword type where possible
•Tune shard size
– Aim for 50 GB shards
– Add nodes as needed
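A sketch of what such a template and shard calculation might look like is below. The field names, the "mis-*" pattern, the replica count, and the 2,000 GB daily volume are invented for illustration, and the body follows the typeless legacy template shape, which may differ between Elasticsearch versions.

```python
import math


def shards_for(daily_index_gb: float, target_shard_gb: float = 50.0) -> int:
    """Number of primary shards so that each stays near the target size."""
    return max(1, math.ceil(daily_index_gb / target_shard_gb))


def index_template(pattern: str, shards: int, replicas: int = 1) -> dict:
    """Build an index template body with explicit mappings only."""
    return {
        "index_patterns": [pattern],
        "settings": {"number_of_shards": shards, "number_of_replicas": replicas},
        "mappings": {
            "dynamic": "strict",  # map all fields: unmapped fields are rejected
            "properties": {
                # Searched fields use keyword instead of analyzed text.
                "sender": {"type": "keyword"},
                "recipient": {"type": "keyword"},
                "received_at": {"type": "date"},
                # Stored but never searched, so it is not indexed.
                "raw_headers": {"type": "keyword", "index": False, "doc_values": False},
            },
        },
    }


shards = shards_for(2000)  # hypothetical 2,000 GB of primary data per day
template = index_template("mis-*", shards)
print(shards, template["settings"])
```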
Elasticsearch Challenge:
Document deletion is slow
•Drop entire index instead of individual documents
– Create index templates with schema
– Cron job to create daily indices
– Cron job to delete expired indices
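The two cron jobs reduce to date arithmetic on index names. The following sketch assumes a hypothetical "mis-YYYY.MM.DD" naming scheme and a 30-day retention; the calls that would actually create or delete indices in the cluster are left out.

```python
from datetime import date, datetime, timedelta
from typing import Iterable, List


def index_name(day: date, prefix: str = "mis") -> str:
    """Name of the daily index for a given day (naming scheme is an assumption)."""
    return f"{prefix}-{day:%Y.%m.%d}"


def indices_to_create(today: date, days_ahead: int = 1, prefix: str = "mis") -> List[str]:
    """Indices the creation job should ensure exist: today plus a few days ahead."""
    return [index_name(today + timedelta(days=n), prefix) for n in range(days_ahead + 1)]


def expired_indices(existing: Iterable[str], today: date, retention_days: int,
                    prefix: str = "mis") -> List[str]:
    """Indices older than the retention window; each is dropped whole."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in existing:
        if not name.startswith(prefix + "-"):
            continue  # not one of our daily indices
        try:
            day = datetime.strptime(name[len(prefix) + 1:], "%Y.%m.%d").date()
        except ValueError:
            continue  # suffix is not a date; leave the index alone
        if day < cutoff:
            expired.append(name)
    return expired


today = date(2019, 8, 28)
existing = ["mis-2019.07.01", "mis-2019.08.27", "other-2019.01.01"]
print(indices_to_create(today))
print(expired_indices(existing, today, retention_days=30))
```

Dropping a whole index removes its files in one step, whereas deleting documents one by one only marks them as deleted until segments are merged.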
Elasticsearch Challenge:
Document search is slow
•Only index fields that are searched
•Use keyword type where possible
•Limit search to appropriate daily indices
•Use shard routing and index.routing_partition_size to limit queried shards
•Use index.routing.allocation.total_shards_per_node to distribute shards evenly across data nodes
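The three limits above can be illustrated with a little arithmetic. This is a simplified sketch: the "mis-" index naming, the routing value, and the shard counts are hypothetical, and zlib.crc32 stands in for the hash Elasticsearch actually uses, so the shard numbers will not match a real cluster. The 51 data nodes come from the instance group table.

```python
import math
import zlib
from datetime import date, timedelta
from typing import List


def indices_for_range(start: date, end: date, prefix: str = "mis") -> List[str]:
    """Daily indices covering a search window, so other days are never queried."""
    days = (end - start).days
    return [f"{prefix}-{start + timedelta(days=n):%Y.%m.%d}" for n in range(days + 1)]


def candidate_shards(routing: str, partition_size: int, primaries: int) -> List[int]:
    """Shards a routed search has to visit: partition_size of them, not all."""
    base = zlib.crc32(routing.encode("utf-8"))  # stand-in hash, see lead-in
    return sorted({(base + offset) % primaries for offset in range(partition_size)})


def shards_per_node_limit(primaries: int, replicas: int, data_nodes: int) -> int:
    """Smallest total_shards_per_node value that still fits every shard copy."""
    return math.ceil(primaries * (1 + replicas) / data_nodes)


print(indices_for_range(date(2019, 8, 26), date(2019, 8, 28)))
print(candidate_shards("customer-42", partition_size=4, primaries=40))
print(shards_per_node_limit(primaries=40, replicas=1, data_nodes=51))
```

With these example numbers a routed query touches 4 of 40 primary shards per daily index, and no data node holds more than 2 shard copies of one index.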
Elasticsearch Challenge:
Bursts and Resurrected Indices
•Email clusters may reconnect after an extended time offline
– Burst of incoming data
– Data may already be expired
Solutions:
– Network headroom handles bursts
– Hourly cron job deletes expired indices
Future Directions
•I3 EC2 instances
– Local SSD storage
– Decrease costs and increase ingestion speed
•Multiple retention periods
– 30, 60, 90, 365-day retentions
•Operators
– Graceful scaling
– In-place software upgrades
•Additional metadata
Questions? / Contacts
• chonton@proofpoint.com
• ahong@proofpoint.com

Kafka and elastic on kubernetes

  • 1. Some Challenges Deploying a Kafka / Elasticsearch Pipeline with Kubernetes 08/28/2019 © 2019 Proofpoint. All rights reserved
  • 2. Very Attacked People 2© 2019 Proofpoint. All rights reserved
  • 3. Message Intelligence Service Near real time connection and metadata omnibus 3© 2019 Proofpoint. All rights reserved 2B2TB 5000 60 r/s
  • 4. Prior and Current Generation 4© 2019 Proofpoint. All rights reserved © 2019 Proofpoint. All rights reserved
  • 5. Why Kubernetes?
      • Available for all public clouds and bare metal
      • Widely known
      • Highly scalable
      • Likely the “winner” for a long time
  • 6. Kubernetes Advantages
      • System resources are shared across services
      • Automatic service scaling as needed
          – Nodes and pods are created and destroyed ad hoc
      • Pods are scheduled on any node with capacity
  • 7. Kubernetes Stateful Service Challenges
      • Services are scaled as needed
          – Nodes and pods are created and destroyed ad hoc
      • Pods are scheduled on any node with capacity
      • Service pods do not receive any disk resources
  • 8. Kubernetes Challenge: Upgrades are upsetting
      • kops prevents using local storage
      Solutions:
          – EBS volumes (which have their own problems)
          – Upgrade in place using a container OS
              • CoreOS/Tectonic
              • Flatcar/Lokomotiv
              • Juju
  • 9. Kubernetes Challenge: Disk Resources
      • StatefulSet (STS)
          – Guaranteed ordering and uniqueness of pods
          – Persistent pod identity
          – Storage available
      • PVC / SC / PV
          – The pod specification requests a Persistent Volume Claim (PVC)
          – The PVC requests a Storage Class (SC)
          – The SC provisions the Persistent Volume (PV)
          – MIS uses Elastic Block Store (EBS) PVs
      • EBS volumes
          – Access is limited to a single Availability Zone
  • 10. Kubernetes Challenge: Unpredictable Nodes
      • Node taints prevent stateless pods from being scheduled on stateful nodes
      • Each node class is part of an Instance Group
      • Pod tolerations allow StatefulSets to be scheduled on stateful nodes
      • Pod anti-affinities spread the STS across zones and nodes
      • A single-zone cluster would probably still need anti-affinities
      • A non-shared cluster still requires taints/tolerations
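The taint, toleration, and anti-affinity combination on slide 10 could be sketched as the scheduling part of a pod spec. This is a minimal illustration, not the deck's actual manifest: the `dedicated=mis` taint and `service:` node label come from the Instance Groups slide, while the `NoSchedule` effect, the `app` pod label, and the choice of a preferred zone rule plus a required host rule are assumptions.

```python
from typing import Any, Dict


def stateful_pod_scheduling(service: str, app_label: str) -> Dict[str, Any]:
    """Build the scheduling fields of a pod spec for a stateful service.

    Assumptions (not stated in the slides): the taint effect is NoSchedule
    and pods carry an `app` label used by the anti-affinity selectors.
    """
    selector = {"matchLabels": {"app": app_label}}
    return {
        # Pin the pod to the instance group that carries this node label.
        "nodeSelector": {"service": service},
        # Tolerate the taint that keeps stateless pods off these nodes.
        "tolerations": [
            {
                "key": "dedicated",
                "operator": "Equal",
                "value": "mis",
                "effect": "NoSchedule",  # assumed effect
            }
        ],
        "affinity": {
            "podAntiAffinity": {
                # Never place two replicas on the same node.
                "requiredDuringSchedulingIgnoredDuringExecution": [
                    {
                        "labelSelector": selector,
                        "topologyKey": "kubernetes.io/hostname",
                    }
                ],
                # Prefer spreading replicas across availability zones.
                "preferredDuringSchedulingIgnoredDuringExecution": [
                    {
                        "weight": 100,
                        "podAffinityTerm": {
                            "labelSelector": selector,
                            "topologyKey": "topology.kubernetes.io/zone",
                        },
                    }
                ],
            }
        },
    }


spec = stateful_pod_scheduling("kafka", "kafka")
```

The returned dictionary can be merged into a StatefulSet pod template and serialized to YAML or JSON. Clusters from 2019 would have used the older `failure-domain.beta.kubernetes.io/zone` topology key instead.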
  • 11. Instance Groups
      Instance Group   #    EC2 Type     Node Taint      Node Label             EBS (GB)
      kafka            10   m5.xlarge    dedicated=mis   service: kafka         500
      es-data          51   m5.4xlarge   dedicated=mis   service: es-data       4,000
      es-master        3    m5.large     dedicated=mis   service: es-master     10
      es-client        3    m5.2xlarge   dedicated=mis   service: es-client
      apps             10   m5.2xlarge   dedicated=mis   service: apps
      pod-logging      3    m5.2xlarge   dedicated=mis   service: pod-logging
  • 12. Applications
      Application            Instance Group
      Kafka                  kafka-nodes
      Zookeeper              kafka-nodes
      Secor                  apps-nodes
      Elasticsearch Master   es-master-nodes
      Elasticsearch Data     es-data-nodes
      Elasticsearch Client   es-client-nodes
      Kafka Connect          apps-nodes
      Search Service         apps-nodes
      Logging Service        pod-logging-nodes
      Kibana                 apps-nodes
      Admin                  apps-nodes
  • 13. Kubernetes Challenge: EBS Read-Only Volume
      • The NVMe driver's default timeout is 30 seconds
      • When the timeout expires, the driver fails the I/O and the filesystem is remounted read-only
      Solutions:
          – Update the driver timeout (insufficient)
          – Update the driver (required a new AMI)
  • 14. Kubernetes Challenge: CNI plugin does not scale
      • Weave is unstable past 160 nodes
      Solutions:
          – Limit nodes
          – Use a different Container Network Interface (CNI) plugin
  • 15. Kafka Challenge: Parsing JSON records is slow
      • API Gateway introduced unacceptable latency
      • Secor routing was extremely slow
      Solutions:
          – Incoming gateway uses a header to determine the message topic
          – Secor modified to use a Kafka header to determine the partition
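The header-based routing on slide 15 can be illustrated with a small sketch: the router reads a record header and never parses the JSON body. The header names `x-topic` and `x-partition` and the fallback values are hypothetical, and the headers are modeled as the list of `(key, bytes)` pairs that common Kafka clients expose.

```python
from typing import Iterable, Optional, Tuple

Header = Tuple[str, bytes]

# Hypothetical header names; the slides do not say which headers are used.
TOPIC_HEADER = "x-topic"
PARTITION_HEADER = "x-partition"


def header_value(headers: Iterable[Header], key: str) -> Optional[str]:
    """Return the first value for `key`, decoded as UTF-8, or None."""
    for name, value in headers:
        if name == key:
            return value.decode("utf-8")
    return None


def route(headers: Iterable[Header], default_topic: str = "unrouted") -> Tuple[str, Optional[str]]:
    """Pick the topic and output partition from headers alone.

    The message body is deliberately not an argument: routing costs a
    header lookup instead of a JSON parse.
    """
    headers = list(headers)
    topic = header_value(headers, TOPIC_HEADER) or default_topic
    partition = header_value(headers, PARTITION_HEADER)
    return topic, partition


routed = route([("x-topic", b"connections"), ("x-partition", b"2019-08-28")])
fallback = route([])
```

Because the payload is opaque to the router, a malformed or very large JSON document costs the same to route as a small one.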
  • 16. Elasticsearch Challenge: Document ingestion is slow
      • Carefully tune index templates
          – Map all fields
          – Index only fields that are searched
          – Use the keyword type where possible
      • Tune shard size
          – Aim for 50 GB shards
          – Add nodes as needed
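The template and shard-size advice on slide 16 might translate into something like the following sketch. The field names are hypothetical, `"dynamic": "strict"` is one assumed way to enforce "map all fields", and the shard count is simply the expected index size divided by the 50 GB target, rounded up.

```python
import math
from typing import Any, Dict, Iterable


def build_mappings(searched: Iterable[str], stored_only: Iterable[str]) -> Dict[str, Any]:
    """Map every field explicitly; index only the ones that are searched."""
    properties: Dict[str, Any] = {}
    for name in searched:
        # keyword fields skip text analysis, which keeps ingestion cheap.
        properties[name] = {"type": "keyword"}
    for name in stored_only:
        # "index": False keeps the field in _source but builds no index for it.
        properties[name] = {"type": "keyword", "index": False}
    # "strict" rejects documents with unmapped fields (an assumed policy).
    return {"dynamic": "strict", "properties": properties}


def primary_shards(index_size_gb: float, target_shard_gb: float = 50.0) -> int:
    """Number of primary shards so that each stays near the target size."""
    return max(1, math.ceil(index_size_gb / target_shard_gb))


# Hypothetical field names, for illustration only.
mappings = build_mappings(searched=["sender", "recipient"], stored_only=["raw_headers"])
shards = primary_shards(2000)
```

The `mappings` dictionary would go under the `mappings` key of an index template, with `shards` used as `index.number_of_shards` in its settings.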
  • 17. Elasticsearch Challenge: Document deletion is slow
      • Drop the entire index instead of individual documents
          – Create index templates with the schema
          – Cron job to create daily indices
          – Cron job to delete expired indices
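The daily-index scheme on slide 17 reduces deletion to picking index names whose date has passed the retention window. A minimal sketch of what such a cron job could compute, assuming a hypothetical `<prefix>-YYYY.MM.DD` naming convention (the slides do not give the real index names):

```python
from datetime import date, datetime, timedelta
from typing import Iterable, List, Optional


def index_name(prefix: str, day: date) -> str:
    """Name of the daily index for `day`, e.g. mis-2019.08.28."""
    return f"{prefix}-{day:%Y.%m.%d}"


def parse_day(prefix: str, name: str) -> Optional[date]:
    """Extract the date from a daily index name, or None if it does not match."""
    if not name.startswith(prefix + "-"):
        return None
    try:
        return datetime.strptime(name[len(prefix) + 1:], "%Y.%m.%d").date()
    except ValueError:
        return None


def expired_indices(names: Iterable[str], prefix: str, today: date, retention_days: int) -> List[str]:
    """Indices older than the retention window; unrelated indices are ignored."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in names:
        day = parse_day(prefix, name)
        if day is not None and day < cutoff:
            expired.append(name)
    return expired


existing = ["mis-2019.07.28", "mis-2019.07.29", "mis-2019.08.28", "other-2019.01.01"]
to_delete = expired_indices(existing, "mis", date(2019, 8, 28), retention_days=30)
```

Each name in `to_delete` can then be dropped with a single delete-index request, which is far cheaper than deleting documents one by one.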
  • 18. Elasticsearch Challenge: Document search is slow
      • Only index fields that are searched
      • Use the keyword type where possible
      • Limit search to the appropriate daily indices
      • Use shard routing and index.routing_partition_size to limit the shards queried
      • Use index.routing.allocation.total_shards_per_node to evenly distribute shards across data nodes
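Two of the search optimizations on slide 18 are simple calculations, sketched below under assumptions: daily indices follow a hypothetical `<prefix>-YYYY.MM.DD` naming scheme, and the per-node limit is the smallest value that still lets every shard of an index be allocated.

```python
import math
from datetime import date, timedelta
from typing import List


def indices_for_range(prefix: str, start: date, end: date) -> List[str]:
    """Daily index names covering start..end inclusive, so a search
    touches only the days it needs instead of every index."""
    days = (end - start).days
    return [f"{prefix}-{start + timedelta(days=i):%Y.%m.%d}" for i in range(days + 1)]


def shards_per_node_limit(shards_in_index: int, data_nodes: int) -> int:
    """Smallest total_shards_per_node that still fits all shards
    (primaries plus replicas) of one index onto the data nodes."""
    return math.ceil(shards_in_index / data_nodes)


targets = indices_for_range("mis", date(2019, 8, 27), date(2019, 8, 28))
limit = shards_per_node_limit(100, 51)
```

Joining `targets` with commas gives the index list for a search request, and `limit` is a candidate value for `index.routing.allocation.total_shards_per_node`. Setting that value with no slack can leave shards unassigned when a node is lost, so some headroom may be preferable.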
  • 19. Elasticsearch Challenge: Bursts and Resurrected Indices
      • Email clusters may reconnect after an extended time offline
          – Burst of incoming data
          – Data may already be expired
      Solutions:
          – Network headroom handles bursts
          – Hourly cron job deletes expired indices
  • 20. Future Directions
      • I3 EC2 instances
          – Local SSD storage
          – Decrease costs and increase ingestion speed
      • Multiple retention periods
          – 30-, 60-, 90-, and 365-day retentions
      • Operators
          – Graceful scaling
          – In-place software upgrades
      • Additional metadata
  • 21. Questions? / Contacts
      • chonton@proofpoint.com
      • ahong@proofpoint.com