SlideShare a Scribd company logo
Kubernetes Clusters at scale
on AWS @ Intuit
- Shri Javadekar (@shrinandj)
Intuit Confidential and Proprietary 2
● Design and development started in Jan ‘18
● First application was running Kafka on Kubernetes
● Running clusters in dev/test, pre-prod and prod environments
since Apr ‘18.
● Over 150 Kubernetes clusters and 3000 namespaces today…
Journey so far ...
Intuit Confidential and Proprietary 3
Journey so far ...
1634
Preprod Services
1989
Total Services
435
Prod Services
Intuit Confidential and Proprietary 4
● Intuit Kubernetes Service
○ Using Kops today
○ Moving to EKS
● Intuit Kubernetes Service Manager (may open source)
● Custom Resources for cluster lifecycle management (aka. Keiko)
Modern SaaS platform today ...
Intuit Confidential and Proprietary 5
alb-ingress kube-dns fluentd metrics prometheus autoscaler
Addons
User namespace 1 User namespace 2 User namespace 3 User namespace n
Applications
kube-apiserver kube-proxy
K8s Control Plane
kube-scheduler kube-controlleretcd
Each Kubernetes cluster today ...
Intuit Confidential and Proprietary 6
Master Nodes
alb-ingress kiam eventrouter metrics kube-dns autoscaler
Addons
kube-apiserver kube-proxy
K8s Control Plane
kube-scheduler kube-controlleretcd
Each Kubernetes cluster today ...
Intuit Confidential and Proprietary 7
The Problems
Intuit Confidential and Proprietary 8
Addons
- Common functionality needed by all apps on a
cluster
- DNS, log forwarding, metrics, identity, etc.
- Integrate with other AWS services such as ALB.
Intuit Confidential and Proprietary 9
Multi-tenancy
- What does each tenant mean?
- Namespace?
- Kubernetes objects with the same label?
- Some CRD?
We decided to go with Kubernetes Namespaces
Intuit Confidential and Proprietary 10
Multi-tenancy
- What does each tenant mean?
- Namespace?
- Kubernetes objects with the same label?
- Some CRD?
We decided to go with Kubernetes Namespaces
Intuit Confidential and Proprietary 11
More Multi-tenancy issues
- Noisy neighbour
- Customized setup
- Tenant specific AMIs
- Tenant specific instance types
- Cost accounting
Intuit Confidential and Proprietary 12
Multi-tenancy solutions
- Instance Group per Namespace
- Customized labels
- Centralized upgrades
We decided to go with ...
Intuit Confidential and Proprietary 13
Resilience and hardening ...
- Pods stuck in terminating state ...
- EC2 instance networking broken ...
Intuit Confidential and Proprietary 14
Deep monitoring
- Not enough to simply check if components are “up”
- Deep monitoring
- Actually exercise the functionality
- Periodically
- Preferably automatic remediation
Intuit Confidential and Proprietary 15
Cost efficiency
- How do we reduce costs?
Intuit Confidential and Proprietary 16
Keiko
“Keiko provides a set of independent open-source tools
for orchestration and management of multi-tenant,
reliable, secure and efficient Kubernetes clusters at scale.”
github.com/keikoproj
Instance manager Kube forensics
Upgrade
manager
Active monitor Addon manager Governor Minion manager
Intuit Confidential and Proprietary 17
Keiko
github.com/keikoproj
Instance manager Kube forensics
Upgrade
manager
Active monitor Addon manager Governor Minion manager
github.com/keikoproj
twitter.com/keikoproj
Intuit Confidential and Proprietary 18
Instance-manager
- Declaratively provision and manage ASGs (nodes)
- Number and type of nodes
- Labels and taints
- Subnets and security groups
$ kubectl create -f /tmp/hello_world.yaml
instancegroup.instancemgr.keikoproj.io/hello-world created
$ kubectl get igs
NAME STATE MIN MAX GROUP NAME PROVISIONER STRATEGY
AGE
hello-world Ready 2 3 shri-east-2-instance-manager-hello-world-NodeGroup-16Y8ZA1ZJW8JK eks-cf crd 3m
nodes Ready 2 3 shri-east-2-instance-manager-nodes-NodeGroup-1K1T3YSXCCCK9 eks-cf crd 1d
Intuit Confidential and Proprietary 19
Upgrade-manager
- Upgrade Manager provides RollingUpgrade, a
Kubernetes native mechanism for doing
rolling-updates of instances in an AutoScaling group
using a CRD and a controller.
Intuit Confidential and Proprietary 20
Addon-Manager
Addons are critical components within a Kubernetes cluster that
provide basic services needed by applications like DNS,
Ingress, Metrics, Logging, etc. Addon Manager provides a CRD
for lifecycle management of such addons using Argo
Workflows.
Intuit Confidential and Proprietary 21
Addon-Manager
Intuit Confidential and Proprietary 22
Governor
Governor improves the stability of large Kubernetes
clusters by proactively terminating failed but stuck pods
and misbehaving nodes.
Intuit Confidential and Proprietary 23
Minion-manager
Minion-manager enables the intelligent use of Spot
Instances in Kubernetes clusters on AWS. This is done
by factoring in on-demand prices, spot-instance prices
and current state of the AutoScalingGroups.
Intuit Confidential and Proprietary 24
Kube-forensics
Kube-forensics allows a cluster administrator to dump
the current state of a running pod and all its containers
so that security professionals can perform offline
forensic analysis.
Intuit Confidential and Proprietary 25
Active-monitor
Active-Monitor is a Kubernetes custom
resource controller which uses Argo
Workflows for deep cluster monitoring.
Intuit Confidential and Proprietary 26
Keiko
Orchestration
Instance-manager Upgrade-manager
Reliability
Governor
Cost Eff
Minion Manager
Addon Manager
Security
Kube-Forensics
Monitoring
Active-monitor
Intuit Confidential and Proprietary 27
Keiko Demo
Intuit Confidential and Proprietary 28
Coming up ...
- Kubernetes control plane using EKS
- Multi-cluster Service Mesh using Istio
- OpenTelemetry
- GitOps for AWS resources
- Experimentation platform
- And more ...
Intuit Confidential and Proprietary 29
There’s a lot happening ...
<Shameless plug about hiring (referrals) …>
Thank you

More Related Content

What's hot

Elasticsearch on Kubernetes
Elasticsearch on KubernetesElasticsearch on Kubernetes
Elasticsearch on Kubernetes
Joerg Henning
 
Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...
Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...
Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...
Edureka!
 
Kubernetes & Google Kubernetes Engine (GKE)
Kubernetes & Google Kubernetes Engine (GKE)Kubernetes & Google Kubernetes Engine (GKE)
Kubernetes & Google Kubernetes Engine (GKE)
Akash Agrawal
 
Google container engine (GKE)
Google container engine (GKE)Google container engine (GKE)
Google container engine (GKE)
Md. Sadhan Sarker
 
Containers kuberenetes
Containers kuberenetesContainers kuberenetes
Containers kuberenetes
Gayan Gunarathne
 
Enhancing Kubernetes with Autoscaling & Hybrid Cloud IaaS
Enhancing Kubernetes with Autoscaling & Hybrid Cloud IaaSEnhancing Kubernetes with Autoscaling & Hybrid Cloud IaaS
Enhancing Kubernetes with Autoscaling & Hybrid Cloud IaaS
Matt Baldwin
 
A Primer on Kubernetes and Google Container Engine
A Primer on Kubernetes and Google Container EngineA Primer on Kubernetes and Google Container Engine
A Primer on Kubernetes and Google Container Engine
RightScale
 
Intuit Kubernetes Journey
Intuit Kubernetes JourneyIntuit Kubernetes Journey
Intuit Kubernetes Journey
Ravi Hari
 
Kubernetes for Beginners: An Introductory Guide
Kubernetes for Beginners: An Introductory GuideKubernetes for Beginners: An Introductory Guide
Kubernetes for Beginners: An Introductory Guide
Bytemark
 
Are you ready to be edgy? Bringing applications to the edge of the network
Are you ready to be edgy? Bringing applications to the edge of the networkAre you ready to be edgy? Bringing applications to the edge of the network
Are you ready to be edgy? Bringing applications to the edge of the network
Megan O'Keefe
 
Intro to creating kubernetes operators
Intro to creating kubernetes operators Intro to creating kubernetes operators
Intro to creating kubernetes operators
Juraj Hantak
 
Containers kuberenetes
Containers kuberenetesContainers kuberenetes
Containers kuberenetes
csegayan
 
Working with kubernetes
Working with kubernetesWorking with kubernetes
Working with kubernetes
Nagaraj Shenoy
 
Kubernetes Deployment Tutorial | Kubernetes Tutorial For Beginners | Kubernet...
Kubernetes Deployment Tutorial | Kubernetes Tutorial For Beginners | Kubernet...Kubernetes Deployment Tutorial | Kubernetes Tutorial For Beginners | Kubernet...
Kubernetes Deployment Tutorial | Kubernetes Tutorial For Beginners | Kubernet...
Edureka!
 
The evolving container landscape
The evolving container landscapeThe evolving container landscape
The evolving container landscape
Nilesh Trivedi
 
Kubernetes a comprehensive overview
Kubernetes   a comprehensive overviewKubernetes   a comprehensive overview
Kubernetes a comprehensive overview
Gabriel Carro
 
ACDKOCHI19 - Turbocharge Developer productivity with platform build on K8S an...
ACDKOCHI19 - Turbocharge Developer productivity with platform build on K8S an...ACDKOCHI19 - Turbocharge Developer productivity with platform build on K8S an...
ACDKOCHI19 - Turbocharge Developer productivity with platform build on K8S an...
AWS User Group Kochi
 
Role based access control - RBAC - Kubernetes
Role based access control - RBAC - KubernetesRole based access control - RBAC - Kubernetes
Role based access control - RBAC - Kubernetes
Milan Das
 
Deploy Elasticsearch Cluster on Kubernetes
Deploy Elasticsearch Cluster on KubernetesDeploy Elasticsearch Cluster on Kubernetes
Deploy Elasticsearch Cluster on Kubernetes
Ismaeel Enjreny
 
Get started with Kubernetes on GKE
Get started with Kubernetes on GKEGet started with Kubernetes on GKE
Get started with Kubernetes on GKE
Zachary Russell
 

What's hot (20)

Elasticsearch on Kubernetes
Elasticsearch on KubernetesElasticsearch on Kubernetes
Elasticsearch on Kubernetes
 
Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...
Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...
Kubernetes Architecture | Understanding Kubernetes Components | Kubernetes Tu...
 
Kubernetes & Google Kubernetes Engine (GKE)
Kubernetes & Google Kubernetes Engine (GKE)Kubernetes & Google Kubernetes Engine (GKE)
Kubernetes & Google Kubernetes Engine (GKE)
 
Google container engine (GKE)
Google container engine (GKE)Google container engine (GKE)
Google container engine (GKE)
 
Containers kuberenetes
Containers kuberenetesContainers kuberenetes
Containers kuberenetes
 
Enhancing Kubernetes with Autoscaling & Hybrid Cloud IaaS
Enhancing Kubernetes with Autoscaling & Hybrid Cloud IaaSEnhancing Kubernetes with Autoscaling & Hybrid Cloud IaaS
Enhancing Kubernetes with Autoscaling & Hybrid Cloud IaaS
 
A Primer on Kubernetes and Google Container Engine
A Primer on Kubernetes and Google Container EngineA Primer on Kubernetes and Google Container Engine
A Primer on Kubernetes and Google Container Engine
 
Intuit Kubernetes Journey
Intuit Kubernetes JourneyIntuit Kubernetes Journey
Intuit Kubernetes Journey
 
Kubernetes for Beginners: An Introductory Guide
Kubernetes for Beginners: An Introductory GuideKubernetes for Beginners: An Introductory Guide
Kubernetes for Beginners: An Introductory Guide
 
Are you ready to be edgy? Bringing applications to the edge of the network
Are you ready to be edgy? Bringing applications to the edge of the networkAre you ready to be edgy? Bringing applications to the edge of the network
Are you ready to be edgy? Bringing applications to the edge of the network
 
Intro to creating kubernetes operators
Intro to creating kubernetes operators Intro to creating kubernetes operators
Intro to creating kubernetes operators
 
Containers kuberenetes
Containers kuberenetesContainers kuberenetes
Containers kuberenetes
 
Working with kubernetes
Working with kubernetesWorking with kubernetes
Working with kubernetes
 
Kubernetes Deployment Tutorial | Kubernetes Tutorial For Beginners | Kubernet...
Kubernetes Deployment Tutorial | Kubernetes Tutorial For Beginners | Kubernet...Kubernetes Deployment Tutorial | Kubernetes Tutorial For Beginners | Kubernet...
Kubernetes Deployment Tutorial | Kubernetes Tutorial For Beginners | Kubernet...
 
The evolving container landscape
The evolving container landscapeThe evolving container landscape
The evolving container landscape
 
Kubernetes a comprehensive overview
Kubernetes   a comprehensive overviewKubernetes   a comprehensive overview
Kubernetes a comprehensive overview
 
ACDKOCHI19 - Turbocharge Developer productivity with platform build on K8S an...
ACDKOCHI19 - Turbocharge Developer productivity with platform build on K8S an...ACDKOCHI19 - Turbocharge Developer productivity with platform build on K8S an...
ACDKOCHI19 - Turbocharge Developer productivity with platform build on K8S an...
 
Role based access control - RBAC - Kubernetes
Role based access control - RBAC - KubernetesRole based access control - RBAC - Kubernetes
Role based access control - RBAC - Kubernetes
 
Deploy Elasticsearch Cluster on Kubernetes
Deploy Elasticsearch Cluster on KubernetesDeploy Elasticsearch Cluster on Kubernetes
Deploy Elasticsearch Cluster on Kubernetes
 
Get started with Kubernetes on GKE
Get started with Kubernetes on GKEGet started with Kubernetes on GKE
Get started with Kubernetes on GKE
 

Similar to Acd19 kubertes cluster at scale on aws at intuit

Delivering-Off-The-Shelf Software with Kubernetes- November 12, 2020
Delivering-Off-The-Shelf Software with Kubernetes- November 12, 2020Delivering-Off-The-Shelf Software with Kubernetes- November 12, 2020
Delivering-Off-The-Shelf Software with Kubernetes- November 12, 2020
VMware Tanzu
 
Kubernetes Administration from Zero to Hero.pdf
Kubernetes Administration from Zero to Hero.pdfKubernetes Administration from Zero to Hero.pdf
Kubernetes Administration from Zero to Hero.pdf
ArzooGupta16
 
Containers kuberenetes
Containers kuberenetesContainers kuberenetes
Containers kuberenetes
Gayan Gunarathne
 
DevOps Days Boston 2017: Real-world Kubernetes for DevOps
DevOps Days Boston 2017: Real-world Kubernetes for DevOpsDevOps Days Boston 2017: Real-world Kubernetes for DevOps
DevOps Days Boston 2017: Real-world Kubernetes for DevOps
Ambassador Labs
 
1. CNCF kubernetes meetup - Ondrej Sika
1. CNCF kubernetes meetup - Ondrej Sika1. CNCF kubernetes meetup - Ondrej Sika
1. CNCF kubernetes meetup - Ondrej Sika
Juraj Hantak
 
Unleashing k8 s to reduce complexities of an entire middleware platform
Unleashing k8 s to reduce complexities of an entire middleware platformUnleashing k8 s to reduce complexities of an entire middleware platform
Unleashing k8 s to reduce complexities of an entire middleware platform
Lakmal Warusawithana
 
Introduction to kubernetes
Introduction to kubernetesIntroduction to kubernetes
Introduction to kubernetes
Helder Klemp
 
Mattia Gandolfi - Improving utilization and portability with Containers and C...
Mattia Gandolfi - Improving utilization and portability with Containers and C...Mattia Gandolfi - Improving utilization and portability with Containers and C...
Mattia Gandolfi - Improving utilization and portability with Containers and C...
Codemotion
 
Nugwc k8s session-16-march-2021
Nugwc k8s session-16-march-2021Nugwc k8s session-16-march-2021
Nugwc k8s session-16-march-2021
Avanti Patil
 
Google Cloud Platform Kubernetes Workshop IYTE
Google Cloud Platform Kubernetes Workshop IYTEGoogle Cloud Platform Kubernetes Workshop IYTE
Google Cloud Platform Kubernetes Workshop IYTE
Gokhan Boranalp
 
An intro to Kubernetes operators
An intro to Kubernetes operatorsAn intro to Kubernetes operators
An intro to Kubernetes operators
J On The Beach
 
Container Orchestration using kubernetes
Container Orchestration using kubernetesContainer Orchestration using kubernetes
Container Orchestration using kubernetes
Puneet Kumar Bhatia (MBA, ITIL V3 Certified)
 
Intro to Kubernetes & GitOps Workshop
Intro to Kubernetes & GitOps WorkshopIntro to Kubernetes & GitOps Workshop
Intro to Kubernetes & GitOps Workshop
Weaveworks
 
Kubernetes intro
Kubernetes introKubernetes intro
Kubernetes intro
Pravin Magdum
 
Fabio rapposelli pks-vmug
Fabio rapposelli   pks-vmugFabio rapposelli   pks-vmug
Fabio rapposelli pks-vmug
VMUG IT
 
18th Athens Big Data Meetup - 2nd Talk - Run Spark and Flink Jobs on Kubernetes
18th Athens Big Data Meetup - 2nd Talk - Run Spark and Flink Jobs on Kubernetes18th Athens Big Data Meetup - 2nd Talk - Run Spark and Flink Jobs on Kubernetes
18th Athens Big Data Meetup - 2nd Talk - Run Spark and Flink Jobs on Kubernetes
Athens Big Data
 
Virtual Kubernetes Clusters on Amazon EKS
Virtual Kubernetes Clusters on Amazon EKSVirtual Kubernetes Clusters on Amazon EKS
Virtual Kubernetes Clusters on Amazon EKS
Jim Bugwadia
 
Kubermatic.pdf
Kubermatic.pdfKubermatic.pdf
Kubermatic.pdf
LibbySchulze
 
Kubermatic CNCF Webinar - start.kubermatic.pdf
Kubermatic CNCF Webinar - start.kubermatic.pdfKubermatic CNCF Webinar - start.kubermatic.pdf
Kubermatic CNCF Webinar - start.kubermatic.pdf
LibbySchulze
 
01 - VMUGIT - Lecce 2018 - Fabio Rapposelli, VMware
01 - VMUGIT - Lecce 2018 - Fabio Rapposelli, VMware01 - VMUGIT - Lecce 2018 - Fabio Rapposelli, VMware
01 - VMUGIT - Lecce 2018 - Fabio Rapposelli, VMware
VMUG IT
 

Similar to Acd19 kubertes cluster at scale on aws at intuit (20)

Delivering-Off-The-Shelf Software with Kubernetes- November 12, 2020
Delivering-Off-The-Shelf Software with Kubernetes- November 12, 2020Delivering-Off-The-Shelf Software with Kubernetes- November 12, 2020
Delivering-Off-The-Shelf Software with Kubernetes- November 12, 2020
 
Kubernetes Administration from Zero to Hero.pdf
Kubernetes Administration from Zero to Hero.pdfKubernetes Administration from Zero to Hero.pdf
Kubernetes Administration from Zero to Hero.pdf
 
Containers kuberenetes
Containers kuberenetesContainers kuberenetes
Containers kuberenetes
 
DevOps Days Boston 2017: Real-world Kubernetes for DevOps
DevOps Days Boston 2017: Real-world Kubernetes for DevOpsDevOps Days Boston 2017: Real-world Kubernetes for DevOps
DevOps Days Boston 2017: Real-world Kubernetes for DevOps
 
1. CNCF kubernetes meetup - Ondrej Sika
1. CNCF kubernetes meetup - Ondrej Sika1. CNCF kubernetes meetup - Ondrej Sika
1. CNCF kubernetes meetup - Ondrej Sika
 
Unleashing k8 s to reduce complexities of an entire middleware platform
Unleashing k8 s to reduce complexities of an entire middleware platformUnleashing k8 s to reduce complexities of an entire middleware platform
Unleashing k8 s to reduce complexities of an entire middleware platform
 
Introduction to kubernetes
Introduction to kubernetesIntroduction to kubernetes
Introduction to kubernetes
 
Mattia Gandolfi - Improving utilization and portability with Containers and C...
Mattia Gandolfi - Improving utilization and portability with Containers and C...Mattia Gandolfi - Improving utilization and portability with Containers and C...
Mattia Gandolfi - Improving utilization and portability with Containers and C...
 
Nugwc k8s session-16-march-2021
Nugwc k8s session-16-march-2021Nugwc k8s session-16-march-2021
Nugwc k8s session-16-march-2021
 
Google Cloud Platform Kubernetes Workshop IYTE
Google Cloud Platform Kubernetes Workshop IYTEGoogle Cloud Platform Kubernetes Workshop IYTE
Google Cloud Platform Kubernetes Workshop IYTE
 
An intro to Kubernetes operators
An intro to Kubernetes operatorsAn intro to Kubernetes operators
An intro to Kubernetes operators
 
Container Orchestration using kubernetes
Container Orchestration using kubernetesContainer Orchestration using kubernetes
Container Orchestration using kubernetes
 
Intro to Kubernetes & GitOps Workshop
Intro to Kubernetes & GitOps WorkshopIntro to Kubernetes & GitOps Workshop
Intro to Kubernetes & GitOps Workshop
 
Kubernetes intro
Kubernetes introKubernetes intro
Kubernetes intro
 
Fabio rapposelli pks-vmug
Fabio rapposelli   pks-vmugFabio rapposelli   pks-vmug
Fabio rapposelli pks-vmug
 
18th Athens Big Data Meetup - 2nd Talk - Run Spark and Flink Jobs on Kubernetes
18th Athens Big Data Meetup - 2nd Talk - Run Spark and Flink Jobs on Kubernetes18th Athens Big Data Meetup - 2nd Talk - Run Spark and Flink Jobs on Kubernetes
18th Athens Big Data Meetup - 2nd Talk - Run Spark and Flink Jobs on Kubernetes
 
Virtual Kubernetes Clusters on Amazon EKS
Virtual Kubernetes Clusters on Amazon EKSVirtual Kubernetes Clusters on Amazon EKS
Virtual Kubernetes Clusters on Amazon EKS
 
Kubermatic.pdf
Kubermatic.pdfKubermatic.pdf
Kubermatic.pdf
 
Kubermatic CNCF Webinar - start.kubermatic.pdf
Kubermatic CNCF Webinar - start.kubermatic.pdfKubermatic CNCF Webinar - start.kubermatic.pdf
Kubermatic CNCF Webinar - start.kubermatic.pdf
 
01 - VMUGIT - Lecce 2018 - Fabio Rapposelli, VMware
01 - VMUGIT - Lecce 2018 - Fabio Rapposelli, VMware01 - VMUGIT - Lecce 2018 - Fabio Rapposelli, VMware
01 - VMUGIT - Lecce 2018 - Fabio Rapposelli, VMware
 

More from John Varghese

Lessons Learned From Cloud Migrations: Planning is Everything
Lessons Learned From Cloud Migrations: Planning is EverythingLessons Learned From Cloud Migrations: Planning is Everything
Lessons Learned From Cloud Migrations: Planning is Everything
John Varghese
 
Leveraging AWS Cloudfront & S3 Services to Deliver Static Assets of a SPA
Leveraging AWS Cloudfront & S3 Services to Deliver Static Assets of a SPALeveraging AWS Cloudfront & S3 Services to Deliver Static Assets of a SPA
Leveraging AWS Cloudfront & S3 Services to Deliver Static Assets of a SPA
John Varghese
 
AWS Transit Gateway-Benefits and Best Practices
AWS Transit Gateway-Benefits and Best PracticesAWS Transit Gateway-Benefits and Best Practices
AWS Transit Gateway-Benefits and Best Practices
John Varghese
 
Bridging Operations and Development With Observabilty
Bridging Operations and Development With ObservabiltyBridging Operations and Development With Observabilty
Bridging Operations and Development With Observabilty
John Varghese
 
Security Observability for Cloud Based Applications
Security Observability for Cloud Based ApplicationsSecurity Observability for Cloud Based Applications
Security Observability for Cloud Based Applications
John Varghese
 
Who Broke My Crypto
Who Broke My CryptoWho Broke My Crypto
Who Broke My Crypto
John Varghese
 
Building an IoT System to Protect My Lunch
Building an IoT System to Protect My LunchBuilding an IoT System to Protect My Lunch
Building an IoT System to Protect My Lunch
John Varghese
 
Building a Highly Secure S3 Bucket
Building a Highly Secure S3 BucketBuilding a Highly Secure S3 Bucket
Building a Highly Secure S3 Bucket
John Varghese
 
Reduce Amazon RDS Costs up to 50% with Proxies
Reduce Amazon RDS Costs up to 50% with ProxiesReduce Amazon RDS Costs up to 50% with Proxies
Reduce Amazon RDS Costs up to 50% with Proxies
John Varghese
 
Keynote - Lead the change around you
Keynote - Lead the change around youKeynote - Lead the change around you
Keynote - Lead the change around you
John Varghese
 
AWS Systems manager 2019
AWS Systems manager 2019AWS Systems manager 2019
AWS Systems manager 2019
John Varghese
 
Emerging job trends and best practices in the aws community
Emerging job trends and best practices in the aws communityEmerging job trends and best practices in the aws community
Emerging job trends and best practices in the aws community
John Varghese
 
Automating security in aws with divvy cloud
Automating security in aws with divvy cloudAutomating security in aws with divvy cloud
Automating security in aws with divvy cloud
John Varghese
 
AWS temporary credentials challenges in prevention detection mitigation
AWS temporary credentials   challenges in prevention detection mitigationAWS temporary credentials   challenges in prevention detection mitigation
AWS temporary credentials challenges in prevention detection mitigation
John Varghese
 
Securing aws workloads with embedded application security
Securing aws workloads with embedded application securitySecuring aws workloads with embedded application security
Securing aws workloads with embedded application security
John Varghese
 
Of CORS thats a thing how CORS in the cloud still kills security
Of CORS thats a thing how CORS in the cloud still kills securityOf CORS thats a thing how CORS in the cloud still kills security
Of CORS thats a thing how CORS in the cloud still kills security
John Varghese
 
Native cloud security monitoring
Native cloud security monitoringNative cloud security monitoring
Native cloud security monitoring
John Varghese
 
Last year in AWS - 2019
Last year in AWS - 2019Last year in AWS - 2019
Last year in AWS - 2019
John Varghese
 
Gpu accelerated BERT deployment on aws
Gpu accelerated BERT deployment on awsGpu accelerated BERT deployment on aws
Gpu accelerated BERT deployment on aws
John Varghese
 
EKS security best practices
EKS security best practicesEKS security best practices
EKS security best practices
John Varghese
 

More from John Varghese (20)

Lessons Learned From Cloud Migrations: Planning is Everything
Lessons Learned From Cloud Migrations: Planning is EverythingLessons Learned From Cloud Migrations: Planning is Everything
Lessons Learned From Cloud Migrations: Planning is Everything
 
Leveraging AWS Cloudfront & S3 Services to Deliver Static Assets of a SPA
Leveraging AWS Cloudfront & S3 Services to Deliver Static Assets of a SPALeveraging AWS Cloudfront & S3 Services to Deliver Static Assets of a SPA
Leveraging AWS Cloudfront & S3 Services to Deliver Static Assets of a SPA
 
AWS Transit Gateway-Benefits and Best Practices
AWS Transit Gateway-Benefits and Best PracticesAWS Transit Gateway-Benefits and Best Practices
AWS Transit Gateway-Benefits and Best Practices
 
Bridging Operations and Development With Observabilty
Bridging Operations and Development With ObservabiltyBridging Operations and Development With Observabilty
Bridging Operations and Development With Observabilty
 
Security Observability for Cloud Based Applications
Security Observability for Cloud Based ApplicationsSecurity Observability for Cloud Based Applications
Security Observability for Cloud Based Applications
 
Who Broke My Crypto
Who Broke My CryptoWho Broke My Crypto
Who Broke My Crypto
 
Building an IoT System to Protect My Lunch
Building an IoT System to Protect My LunchBuilding an IoT System to Protect My Lunch
Building an IoT System to Protect My Lunch
 
Building a Highly Secure S3 Bucket
Building a Highly Secure S3 BucketBuilding a Highly Secure S3 Bucket
Building a Highly Secure S3 Bucket
 
Reduce Amazon RDS Costs up to 50% with Proxies
Reduce Amazon RDS Costs up to 50% with ProxiesReduce Amazon RDS Costs up to 50% with Proxies
Reduce Amazon RDS Costs up to 50% with Proxies
 
Keynote - Lead the change around you
Keynote - Lead the change around youKeynote - Lead the change around you
Keynote - Lead the change around you
 
AWS Systems manager 2019
AWS Systems manager 2019AWS Systems manager 2019
AWS Systems manager 2019
 
Emerging job trends and best practices in the aws community
Emerging job trends and best practices in the aws communityEmerging job trends and best practices in the aws community
Emerging job trends and best practices in the aws community
 
Automating security in aws with divvy cloud
Automating security in aws with divvy cloudAutomating security in aws with divvy cloud
Automating security in aws with divvy cloud
 
AWS temporary credentials challenges in prevention detection mitigation
AWS temporary credentials   challenges in prevention detection mitigationAWS temporary credentials   challenges in prevention detection mitigation
AWS temporary credentials challenges in prevention detection mitigation
 
Securing aws workloads with embedded application security
Securing aws workloads with embedded application securitySecuring aws workloads with embedded application security
Securing aws workloads with embedded application security
 
Of CORS thats a thing how CORS in the cloud still kills security
Of CORS thats a thing how CORS in the cloud still kills securityOf CORS thats a thing how CORS in the cloud still kills security
Of CORS thats a thing how CORS in the cloud still kills security
 
Native cloud security monitoring
Native cloud security monitoringNative cloud security monitoring
Native cloud security monitoring
 
Last year in AWS - 2019
Last year in AWS - 2019Last year in AWS - 2019
Last year in AWS - 2019
 
Gpu accelerated BERT deployment on aws
Gpu accelerated BERT deployment on awsGpu accelerated BERT deployment on aws
Gpu accelerated BERT deployment on aws
 
EKS security best practices
EKS security best practicesEKS security best practices
EKS security best practices
 

Recently uploaded

"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
Fwdays
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
Cheryl Hung
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
Thijs Feryn
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
Product School
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
Paul Groth
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
DianaGray10
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Product School
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Product School
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 

Recently uploaded (20)

"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 

Acd19 kubertes cluster at scale on aws at intuit

  • 1. Kubernetes Clusters at scale on AWS @ Intuit - Shri Javadekar (@shrinandj)
  • 2. Intuit Confidential and Proprietary 2 ● Design and development started in Jan ‘18 ● First application was running Kafka on Kubernetes ● Running clusters in dev/test, pre-prod and prod environments since Apr ‘18. ● Over 150 Kubernetes clusters and 3000 namespaces today… Journey so far ...
  • 3. Intuit Confidential and Proprietary 3 Journey so far ... 1634 Preprod Services 1989 Total Services 435 Prod Services
  • 4. Intuit Confidential and Proprietary 4 ● Intuit Kubernetes Service ○ Using Kops today ○ Moving to EKS ● Intuit Kubernetes Service Manager (may open source) ● Custom Resources for cluster lifecycle management (aka. Keiko) Modern SaaS platform today ...
  • 5. Intuit Confidential and Proprietary 5 alb-ingress kube-dns fluentd metrics prometheus autoscaler Addons User namespace 1 User namespace 2 User namespace 3 User namespace n Applications kube-apiserver kube-proxy K8s Control Plane kube-scheduler kube-controlleretcd Each Kubernetes cluster today ...
  • 6. Intuit Confidential and Proprietary 6 Master Nodes alb-ingress kiam eventrouter metrics kube-dns autoscaler Addons kube-apiserver kube-proxy K8s Control Plane kube-scheduler kube-controlleretcd Each Kubernetes cluster today ...
  • 7. Intuit Confidential and Proprietary 7 The Problems
  • 8. Intuit Confidential and Proprietary 8 Addons - Common functionality needed by all apps on a cluster - DNS, log forwarding, metrics, identity, etc. - Integrate with other AWS services such as ALB.
  • 9. Intuit Confidential and Proprietary 9 Multi-tenancy - What does each tenant mean? - Namespace? - Kubernetes objects with the same label? - Some CRD? We decided to go with Kubernetes Namespaces
  • 10. Intuit Confidential and Proprietary 10 Multi-tenancy - What does each tenant mean? - Namespace? - Kubernetes objects with the same label? - Some CRD? We decided to go with Kubernetes Namespaces
  • 11. Intuit Confidential and Proprietary 11 More Multi-tenancy issues - Noisy neighbour - Customized setup - Tenant specific AMIs - Tenant specific instance types - Cost accounting
  • 12. Intuit Confidential and Proprietary 12 Multi-tenancy solutions - Instance Group per Namespace - Customized labels - Centralized upgrades We decided to go with ...
  • 13. Intuit Confidential and Proprietary 13 Resilience and hardening ... - Pods stuck in terminating state ... - EC2 instance networking broken ...
  • 14. Intuit Confidential and Proprietary 14 Deep monitoring - Not enough to simply check if components are “up” - Deep monitoring - Actually exercise the functionality - Periodically - Preferably automatic remediation
  • 15. Intuit Confidential and Proprietary 15 Cost efficiency - How do we reduce costs?
  • 16. Intuit Confidential and Proprietary 16 Keiko “Keiko provides a set of independent open-source tools for orchestration and management of multi-tenant, reliable, secure and efficient Kubernetes clusters at scale.” github.com/keikoproj Instance manager Kube forensics Upgrade manager Active monitor Addon manager Governor Minion manager
  • 17. Intuit Confidential and Proprietary 17 Keiko github.com/keikoproj Instance manager Kube forensics Upgrade manager Active monitor Addon manager Governor Minion manager github.com/keikoproj twitter.com/keikoproj
  • 18. Intuit Confidential and Proprietary 18 Instance-manager - Declaratively provision and manage ASGs (nodes) - Number and type of nodes - Labels and taints - Subnets and security groups $ kubectl create -f /tmp/hello_world.yaml instancegroup.instancemgr.keikoproj.io/hello-world created $ kubectl get igs NAME STATE MIN MAX GROUP NAME PROVISIONER STRATEGY AGE hello-world Ready 2 3 shri-east-2-instance-manager-hello-world-NodeGroup-16Y8ZA1ZJW8JK eks-cf crd 3m nodes Ready 2 3 shri-east-2-instance-manager-nodes-NodeGroup-1K1T3YSXCCCK9 eks-cf crd 1d
  • 19. Intuit Confidential and Proprietary 19 Upgrade-manager - Upgrade Manager provides RollingUpgrade, a Kubernetes native mechanism for doing rolling-updates of instances in an AutoScaling group using a CRD and a controller.
  • 20. Intuit Confidential and Proprietary 20 Addon-Manager Addons are critical components within a Kubernetes cluster that provide basic services needed by applications like DNS, Ingress, Metrics, Logging, etc. Addon Manager provides a CRD for lifecycle management of such addons using Argo Workflows.
  • 21. Intuit Confidential and Proprietary 21 Addon-Manager
  • 22. Intuit Confidential and Proprietary 22 Governor Governor improves the stability of large Kubernetes clusters by proactively terminating failed but stuck pods and misbehaving nodes.
  • 23. Intuit Confidential and Proprietary 23 Minion-manager Minion-manager enables the intelligent use of Spot Instances in Kubernetes clusters on AWS. This is done by factoring in on-demand prices, spot-instance prices and current state of the AutoScalingGroups.
  • 24. Intuit Confidential and Proprietary 24 Kube-forensics Kube-forensics allows a cluster administrator to dump the current state of a running pod and all its containers so that security professionals can perform offline forensic analysis.
  • 25. Intuit Confidential and Proprietary 25 Active-monitor Active-Monitor is a Kubernetes custom resource controller which uses Argo Workflows for deep cluster monitoring.
  • 26. Intuit Confidential and Proprietary 26 Keiko Orchestration Instance-manager Upgrade-manager Reliability Governor Cost Eff Minion Manager Addon Manager Security Kube-Forensics Monitoring Active-monitor
  • 27. Intuit Confidential and Proprietary 27 Keiko Demo
  • 28. Intuit Confidential and Proprietary 28 Coming up ... - Kubernetes control plane using EKS - Multi-cluster Service Mesh using Istio - OpenTelemetry - GitOps for AWS resources - Experimentation platform - And more ...
  • 29. Intuit Confidential and Proprietary 29 There’s a lot happening ... <Shameless plug about hiring (referrals) …>