SlideShare a Scribd company logo
1 of 8
Download to read offline
Operationalizing Data Infrastructure on Kubernetes
1
Confidential & Proprietary 2
Operationalizing Data Infrastructure on Kubernetes
Who We Are
Application Delivery Platform for Kubernetes
We’ve bundled around 50 common OSS solutions for
deployment on kubernetes, mostly focusing on data
infrastructure
Have run data stacks in kubernetes in all major clouds
and Equinix metal, for a variety of customers
Engineer with tours at Amazon, Twitter (Vine),
Frame.io and Facebook
Now CTO at Plural.sh
Lots of experience both developing and operating
large scale distributed systems
Michael Guarino
CTO & Co-founder
Confidential & Proprietary
What is the Data Stack
Operationalizing Data Infrastructure on Kubernetes
Visualization
Orchestration
Batch Processing
Datawarehouse/Datalake
Security
Service mesh
Network policies
3
Confidential & Proprietary
How We’re Doing Things
Operationalizing Data Infrastructure on Kubernetes
We use Kubernetes for all orchestration
Powerful helm ecosystem with prebuilt packages for many of our
needs
Eliminates expensive managed infrastructure as much as possible
Operational unification for disparate stacks on disparate clouds
GitOps For Everything
Operational Tooling should be web-based
Easiest on-ramp to train new engineers
Fastest response time if something goes wrong
4
Confidential & Proprietary
Biggest Challenges We’ve Seen
Operationalizing Data Infrastructure on Kubernetes
Managing Upgrade Cycles
Major dependency complexity Validation + testing of new versions
K8s upgrades - super hard and impacts everything Tracking security patches *
Security Hardening
All web interfaces need secure login
(hard because OSS often doesn’t support this)
Software supply chain management
Scan images
Vendor as many images as possible (esp to avoid dockerhub)
Network security internal to cluster
Service meshes seem to not deliver on their promise still
Integration of different tools is tricky
Shared network layer is critical here
(why we like running single-cluster)
Example:
↔
5
Confidential & Proprietary
Misconceptions
Operationalizing Data Infrastructure on Kubernetes
Scale is hard
With the right infra, honestly pretty
tractable. Most scaling tasks are
literally just turning a dial now
Autoscaling can still be tricky
If you’re running multi-region
transactional, distributed databases
maybe tricky
Figuring out how to run
most OSS data stacks are
hard
If you learn a pretty small set of tools,
we can usually get something running
in a day
Virtually everything has an ok-ish helm
chart
Kubernetes itself is hard
API is sprawling, but realistically only
10-20% is needed to grok
Make an opinionated toolchain
choice to simplify management
Use a managed control plane
IS hard in on-prem environments
where networks are irregular
6
Incredible cost savings
Managed infra often at 40%+ markup
to compute, and if you’re doing large
scale batch processing, this is a huge
amount of overhead
Confidential & Proprietary
What Benefits We’ve Seen
Operationalizing Data Infrastructure on Kubernetes
Much simpler security
model
Everything in a hardened network, no
worries about privacy/compliance
Ends up being operationally
simpler when you scale out to
multiple solutions
Can create a single management console
for your entire stack
Upgrade flow all consolidated in one system
Data gravity is a much harder problem than
application deployment
7
Shameless Plug
If you’re interested in playing around with this
model, Plural does make it just a few commands
away. We’d love to see you guys play with the
platform or even contribute:
app.plural.sh docs.plural.sh github discord
Confidential & Proprietary
Operationalizing Data Infrastructure on Kubernetes
8

More Related Content

Similar to Dok Talks #122 - Operationalizing a Data Infrastructure Stack on Kubernetes

Why integration is key in IoT solutions? (Sam Vanhoutte @Integrate2017)
Why integration is key in IoT solutions? (Sam Vanhoutte @Integrate2017)Why integration is key in IoT solutions? (Sam Vanhoutte @Integrate2017)
Why integration is key in IoT solutions? (Sam Vanhoutte @Integrate2017)Codit
 
Deep Dive: a technical insider's view of NetBackup 8.1 and NetBackup Appliances
Deep Dive: a technical insider's view of NetBackup 8.1 and NetBackup AppliancesDeep Dive: a technical insider's view of NetBackup 8.1 and NetBackup Appliances
Deep Dive: a technical insider's view of NetBackup 8.1 and NetBackup AppliancesVeritas Technologies LLC
 
Why Cloud Management Makes Sense
Why Cloud Management Makes SenseWhy Cloud Management Makes Sense
Why Cloud Management Makes SenseRightScale
 
Netezza vs teradata
Netezza vs teradataNetezza vs teradata
Netezza vs teradataAsis Mohanty
 
In-Ceph-tion: Deploying a Ceph cluster on DreamCompute
In-Ceph-tion: Deploying a Ceph cluster on DreamComputeIn-Ceph-tion: Deploying a Ceph cluster on DreamCompute
In-Ceph-tion: Deploying a Ceph cluster on DreamComputePatrick McGarry
 
Data Agility for Devops - OSI 2018
Data Agility for Devops - OSI 2018Data Agility for Devops - OSI 2018
Data Agility for Devops - OSI 2018OpenEBS
 
Maturing IoT solutions with Microsoft Azure (Sam Vanhoutte & Glenn Colpaert a...
Maturing IoT solutions with Microsoft Azure (Sam Vanhoutte & Glenn Colpaert a...Maturing IoT solutions with Microsoft Azure (Sam Vanhoutte & Glenn Colpaert a...
Maturing IoT solutions with Microsoft Azure (Sam Vanhoutte & Glenn Colpaert a...Codit
 
Building Cloud capability for startups
Building Cloud capability for startupsBuilding Cloud capability for startups
Building Cloud capability for startupsSekhar Mohanty
 
OpenStack on Kubernetes (BOS Summit / May 2017 update)
OpenStack on Kubernetes (BOS Summit / May 2017 update)OpenStack on Kubernetes (BOS Summit / May 2017 update)
OpenStack on Kubernetes (BOS Summit / May 2017 update)rhirschfeld
 
Monitoring IAAS & PAAS Solutions
Monitoring IAAS & PAAS SolutionsMonitoring IAAS & PAAS Solutions
Monitoring IAAS & PAAS SolutionsColloquium
 
Securing Sensitive Data in Your Hybrid Cloud
Securing Sensitive Data in Your Hybrid CloudSecuring Sensitive Data in Your Hybrid Cloud
Securing Sensitive Data in Your Hybrid CloudRightScale
 
Cloudstack vs Openstack
Cloudstack vs OpenstackCloudstack vs Openstack
Cloudstack vs OpenstackHuzefa Husain
 
2016 - Open Mic - IGNITE - Open Infrastructure = ANY Infrastructure
2016 - Open Mic - IGNITE - Open Infrastructure = ANY Infrastructure2016 - Open Mic - IGNITE - Open Infrastructure = ANY Infrastructure
2016 - Open Mic - IGNITE - Open Infrastructure = ANY Infrastructuredevopsdaysaustin
 
OpenStack Preso: DevOps on Hybrid Infrastructure
OpenStack Preso: DevOps on Hybrid InfrastructureOpenStack Preso: DevOps on Hybrid Infrastructure
OpenStack Preso: DevOps on Hybrid Infrastructurerhirschfeld
 
Storage os kubernetes clusters need persistent data
Storage os   kubernetes clusters need persistent dataStorage os   kubernetes clusters need persistent data
Storage os kubernetes clusters need persistent dataLibbySchulze
 

Similar to Dok Talks #122 - Operationalizing a Data Infrastructure Stack on Kubernetes (20)

Why integration is key in IoT solutions? (Sam Vanhoutte @Integrate2017)
Why integration is key in IoT solutions? (Sam Vanhoutte @Integrate2017)Why integration is key in IoT solutions? (Sam Vanhoutte @Integrate2017)
Why integration is key in IoT solutions? (Sam Vanhoutte @Integrate2017)
 
Deep Dive: a technical insider's view of NetBackup 8.1 and NetBackup Appliances
Deep Dive: a technical insider's view of NetBackup 8.1 and NetBackup AppliancesDeep Dive: a technical insider's view of NetBackup 8.1 and NetBackup Appliances
Deep Dive: a technical insider's view of NetBackup 8.1 and NetBackup Appliances
 
Cluster computing
Cluster computingCluster computing
Cluster computing
 
Why Cloud Management Makes Sense
Why Cloud Management Makes SenseWhy Cloud Management Makes Sense
Why Cloud Management Makes Sense
 
Innovate on Cloud with AWS
Innovate on Cloud with AWSInnovate on Cloud with AWS
Innovate on Cloud with AWS
 
oracle.pptx
oracle.pptxoracle.pptx
oracle.pptx
 
Netezza vs teradata
Netezza vs teradataNetezza vs teradata
Netezza vs teradata
 
In-Ceph-tion: Deploying a Ceph cluster on DreamCompute
In-Ceph-tion: Deploying a Ceph cluster on DreamComputeIn-Ceph-tion: Deploying a Ceph cluster on DreamCompute
In-Ceph-tion: Deploying a Ceph cluster on DreamCompute
 
Data Agility for Devops - OSI 2018
Data Agility for Devops - OSI 2018Data Agility for Devops - OSI 2018
Data Agility for Devops - OSI 2018
 
Maturing IoT solutions with Microsoft Azure (Sam Vanhoutte & Glenn Colpaert a...
Maturing IoT solutions with Microsoft Azure (Sam Vanhoutte & Glenn Colpaert a...Maturing IoT solutions with Microsoft Azure (Sam Vanhoutte & Glenn Colpaert a...
Maturing IoT solutions with Microsoft Azure (Sam Vanhoutte & Glenn Colpaert a...
 
Building Cloud capability for startups
Building Cloud capability for startupsBuilding Cloud capability for startups
Building Cloud capability for startups
 
OpenStack on Kubernetes (BOS Summit / May 2017 update)
OpenStack on Kubernetes (BOS Summit / May 2017 update)OpenStack on Kubernetes (BOS Summit / May 2017 update)
OpenStack on Kubernetes (BOS Summit / May 2017 update)
 
AWSome Insider's View of NetBackup 8.1
AWSome Insider's View of NetBackup 8.1AWSome Insider's View of NetBackup 8.1
AWSome Insider's View of NetBackup 8.1
 
Monitoring IAAS & PAAS Solutions
Monitoring IAAS & PAAS SolutionsMonitoring IAAS & PAAS Solutions
Monitoring IAAS & PAAS Solutions
 
Securing Sensitive Data in Your Hybrid Cloud
Securing Sensitive Data in Your Hybrid CloudSecuring Sensitive Data in Your Hybrid Cloud
Securing Sensitive Data in Your Hybrid Cloud
 
Cloudstack vs Openstack
Cloudstack vs OpenstackCloudstack vs Openstack
Cloudstack vs Openstack
 
2016 - Open Mic - IGNITE - Open Infrastructure = ANY Infrastructure
2016 - Open Mic - IGNITE - Open Infrastructure = ANY Infrastructure2016 - Open Mic - IGNITE - Open Infrastructure = ANY Infrastructure
2016 - Open Mic - IGNITE - Open Infrastructure = ANY Infrastructure
 
OpenStack Preso: DevOps on Hybrid Infrastructure
OpenStack Preso: DevOps on Hybrid InfrastructureOpenStack Preso: DevOps on Hybrid Infrastructure
OpenStack Preso: DevOps on Hybrid Infrastructure
 
Cluster Computing
Cluster ComputingCluster Computing
Cluster Computing
 
Storage os kubernetes clusters need persistent data
Storage os   kubernetes clusters need persistent dataStorage os   kubernetes clusters need persistent data
Storage os kubernetes clusters need persistent data
 

More from DoKC

Distributed Vector Databases - What, Why, and How
Distributed Vector Databases - What, Why, and HowDistributed Vector Databases - What, Why, and How
Distributed Vector Databases - What, Why, and HowDoKC
 
Is It Safe? Security Hardening for Databases Using Kubernetes Operators
Is It Safe? Security Hardening for Databases Using Kubernetes OperatorsIs It Safe? Security Hardening for Databases Using Kubernetes Operators
Is It Safe? Security Hardening for Databases Using Kubernetes OperatorsDoKC
 
Stop Worrying and Keep Querying, Using Automated Multi-Region Disaster Recovery
Stop Worrying and Keep Querying, Using Automated Multi-Region Disaster RecoveryStop Worrying and Keep Querying, Using Automated Multi-Region Disaster Recovery
Stop Worrying and Keep Querying, Using Automated Multi-Region Disaster RecoveryDoKC
 
Transforming Data Processing with Kubernetes: Journey Towards a Self-Serve Da...
Transforming Data Processing with Kubernetes: Journey Towards a Self-Serve Da...Transforming Data Processing with Kubernetes: Journey Towards a Self-Serve Da...
Transforming Data Processing with Kubernetes: Journey Towards a Self-Serve Da...DoKC
 
The State of Stateful on Kubernetes
The State of Stateful on KubernetesThe State of Stateful on Kubernetes
The State of Stateful on KubernetesDoKC
 
Colocating Data Workloads and Web Services on Kubernetes to Improve Resource ...
Colocating Data Workloads and Web Services on Kubernetes to Improve Resource ...Colocating Data Workloads and Web Services on Kubernetes to Improve Resource ...
Colocating Data Workloads and Web Services on Kubernetes to Improve Resource ...DoKC
 
Make Your Kafka Cluster Production-Ready
Make Your Kafka Cluster Production-ReadyMake Your Kafka Cluster Production-Ready
Make Your Kafka Cluster Production-ReadyDoKC
 
Dynamic Large Scale Spark on Kubernetes: Empowering the Community with Argo W...
Dynamic Large Scale Spark on Kubernetes: Empowering the Community with Argo W...Dynamic Large Scale Spark on Kubernetes: Empowering the Community with Argo W...
Dynamic Large Scale Spark on Kubernetes: Empowering the Community with Argo W...DoKC
 
Run PostgreSQL in Warp Speed Using NVMe/TCP in the Cloud
Run PostgreSQL in Warp Speed Using NVMe/TCP in the CloudRun PostgreSQL in Warp Speed Using NVMe/TCP in the Cloud
Run PostgreSQL in Warp Speed Using NVMe/TCP in the CloudDoKC
 
The Kubernetes Native Database
The Kubernetes Native DatabaseThe Kubernetes Native Database
The Kubernetes Native DatabaseDoKC
 
ING Data Services hosted on ICHP DoK Amsterdam 2023
ING Data Services hosted on ICHP DoK Amsterdam 2023ING Data Services hosted on ICHP DoK Amsterdam 2023
ING Data Services hosted on ICHP DoK Amsterdam 2023DoKC
 
Implementing data and databases on K8s within the Dutch government
Implementing data and databases on K8s within the Dutch governmentImplementing data and databases on K8s within the Dutch government
Implementing data and databases on K8s within the Dutch governmentDoKC
 
StatefulSets in K8s - DoK Talks #154
StatefulSets in K8s - DoK Talks #154StatefulSets in K8s - DoK Talks #154
StatefulSets in K8s - DoK Talks #154DoKC
 
Running PostgreSQL in Kubernetes: from day 0 to day 2 with CloudNativePG - Do...
Running PostgreSQL in Kubernetes: from day 0 to day 2 with CloudNativePG - Do...Running PostgreSQL in Kubernetes: from day 0 to day 2 with CloudNativePG - Do...
Running PostgreSQL in Kubernetes: from day 0 to day 2 with CloudNativePG - Do...DoKC
 
Analytics with Apache Superset and ClickHouse - DoK Talks #151
Analytics with Apache Superset and ClickHouse - DoK Talks #151Analytics with Apache Superset and ClickHouse - DoK Talks #151
Analytics with Apache Superset and ClickHouse - DoK Talks #151DoKC
 
Overcoming challenges with protecting and migrating data in multi-cloud K8s e...
Overcoming challenges with protecting and migrating data in multi-cloud K8s e...Overcoming challenges with protecting and migrating data in multi-cloud K8s e...
Overcoming challenges with protecting and migrating data in multi-cloud K8s e...DoKC
 
Evaluating Cloud Native Storage Vendors - DoK Talks #147
Evaluating Cloud Native Storage Vendors - DoK Talks #147Evaluating Cloud Native Storage Vendors - DoK Talks #147
Evaluating Cloud Native Storage Vendors - DoK Talks #147DoKC
 
Kubernetes Cluster Upgrade Strategies and Data: Best Practices for your State...
Kubernetes Cluster Upgrade Strategies and Data: Best Practices for your State...Kubernetes Cluster Upgrade Strategies and Data: Best Practices for your State...
Kubernetes Cluster Upgrade Strategies and Data: Best Practices for your State...DoKC
 
We will Dok You! - The journey to adopt stateful workloads on k8s
We will Dok You! - The journey to adopt stateful workloads on k8sWe will Dok You! - The journey to adopt stateful workloads on k8s
We will Dok You! - The journey to adopt stateful workloads on k8sDoKC
 
Mastering MongoDB on Kubernetes, the power of operators
Mastering MongoDB on Kubernetes, the power of operators Mastering MongoDB on Kubernetes, the power of operators
Mastering MongoDB on Kubernetes, the power of operators DoKC
 

More from DoKC (20)

Distributed Vector Databases - What, Why, and How
Distributed Vector Databases - What, Why, and HowDistributed Vector Databases - What, Why, and How
Distributed Vector Databases - What, Why, and How
 
Is It Safe? Security Hardening for Databases Using Kubernetes Operators
Is It Safe? Security Hardening for Databases Using Kubernetes OperatorsIs It Safe? Security Hardening for Databases Using Kubernetes Operators
Is It Safe? Security Hardening for Databases Using Kubernetes Operators
 
Stop Worrying and Keep Querying, Using Automated Multi-Region Disaster Recovery
Stop Worrying and Keep Querying, Using Automated Multi-Region Disaster RecoveryStop Worrying and Keep Querying, Using Automated Multi-Region Disaster Recovery
Stop Worrying and Keep Querying, Using Automated Multi-Region Disaster Recovery
 
Transforming Data Processing with Kubernetes: Journey Towards a Self-Serve Da...
Transforming Data Processing with Kubernetes: Journey Towards a Self-Serve Da...Transforming Data Processing with Kubernetes: Journey Towards a Self-Serve Da...
Transforming Data Processing with Kubernetes: Journey Towards a Self-Serve Da...
 
The State of Stateful on Kubernetes
The State of Stateful on KubernetesThe State of Stateful on Kubernetes
The State of Stateful on Kubernetes
 
Colocating Data Workloads and Web Services on Kubernetes to Improve Resource ...
Colocating Data Workloads and Web Services on Kubernetes to Improve Resource ...Colocating Data Workloads and Web Services on Kubernetes to Improve Resource ...
Colocating Data Workloads and Web Services on Kubernetes to Improve Resource ...
 
Make Your Kafka Cluster Production-Ready
Make Your Kafka Cluster Production-ReadyMake Your Kafka Cluster Production-Ready
Make Your Kafka Cluster Production-Ready
 
Dynamic Large Scale Spark on Kubernetes: Empowering the Community with Argo W...
Dynamic Large Scale Spark on Kubernetes: Empowering the Community with Argo W...Dynamic Large Scale Spark on Kubernetes: Empowering the Community with Argo W...
Dynamic Large Scale Spark on Kubernetes: Empowering the Community with Argo W...
 
Run PostgreSQL in Warp Speed Using NVMe/TCP in the Cloud
Run PostgreSQL in Warp Speed Using NVMe/TCP in the CloudRun PostgreSQL in Warp Speed Using NVMe/TCP in the Cloud
Run PostgreSQL in Warp Speed Using NVMe/TCP in the Cloud
 
The Kubernetes Native Database
The Kubernetes Native DatabaseThe Kubernetes Native Database
The Kubernetes Native Database
 
ING Data Services hosted on ICHP DoK Amsterdam 2023
ING Data Services hosted on ICHP DoK Amsterdam 2023ING Data Services hosted on ICHP DoK Amsterdam 2023
ING Data Services hosted on ICHP DoK Amsterdam 2023
 
Implementing data and databases on K8s within the Dutch government
Implementing data and databases on K8s within the Dutch governmentImplementing data and databases on K8s within the Dutch government
Implementing data and databases on K8s within the Dutch government
 
StatefulSets in K8s - DoK Talks #154
StatefulSets in K8s - DoK Talks #154StatefulSets in K8s - DoK Talks #154
StatefulSets in K8s - DoK Talks #154
 
Running PostgreSQL in Kubernetes: from day 0 to day 2 with CloudNativePG - Do...
Running PostgreSQL in Kubernetes: from day 0 to day 2 with CloudNativePG - Do...Running PostgreSQL in Kubernetes: from day 0 to day 2 with CloudNativePG - Do...
Running PostgreSQL in Kubernetes: from day 0 to day 2 with CloudNativePG - Do...
 
Analytics with Apache Superset and ClickHouse - DoK Talks #151
Analytics with Apache Superset and ClickHouse - DoK Talks #151Analytics with Apache Superset and ClickHouse - DoK Talks #151
Analytics with Apache Superset and ClickHouse - DoK Talks #151
 
Overcoming challenges with protecting and migrating data in multi-cloud K8s e...
Overcoming challenges with protecting and migrating data in multi-cloud K8s e...Overcoming challenges with protecting and migrating data in multi-cloud K8s e...
Overcoming challenges with protecting and migrating data in multi-cloud K8s e...
 
Evaluating Cloud Native Storage Vendors - DoK Talks #147
Evaluating Cloud Native Storage Vendors - DoK Talks #147Evaluating Cloud Native Storage Vendors - DoK Talks #147
Evaluating Cloud Native Storage Vendors - DoK Talks #147
 
Kubernetes Cluster Upgrade Strategies and Data: Best Practices for your State...
Kubernetes Cluster Upgrade Strategies and Data: Best Practices for your State...Kubernetes Cluster Upgrade Strategies and Data: Best Practices for your State...
Kubernetes Cluster Upgrade Strategies and Data: Best Practices for your State...
 
We will Dok You! - The journey to adopt stateful workloads on k8s
We will Dok You! - The journey to adopt stateful workloads on k8sWe will Dok You! - The journey to adopt stateful workloads on k8s
We will Dok You! - The journey to adopt stateful workloads on k8s
 
Mastering MongoDB on Kubernetes, the power of operators
Mastering MongoDB on Kubernetes, the power of operators Mastering MongoDB on Kubernetes, the power of operators
Mastering MongoDB on Kubernetes, the power of operators
 

Recently uploaded

Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 

Recently uploaded (20)

Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
Exploring iOS App Development: Simplifying the Process
Exploring iOS App Development: Simplifying the ProcessExploring iOS App Development: Simplifying the Process
Exploring iOS App Development: Simplifying the Process
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 

Dok Talks #122 - Operationalizing a Data Infrastructure Stack on Kubernetes

  • 2. Confidential & Proprietary 2 Operationalizing Data Infrastructure on Kubernetes Who We Are Application Delivery Platform for Kubernetes We’ve bundled around 50 common OSS solutions for deployment on kubernetes, mostly focusing on data infrastructure Have run data stacks in kubernetes in all major clouds and Equinix metal, for a variety of customers Engineer with tours at Amazon, Twitter (Vine), Frame.io and Facebook Now CTO at Plural.sh Lots of experience both developing and operating large scale distributed systems Michael Guarino CTO & Co-founder
  • 3. Confidential & Proprietary What is the Data Stack Operationalizing Data Infrastructure on Kubernetes Visualization Orchestration Batch Processing Datawarehouse/Datalake Security Service mesh Network policies 3
  • 4. Confidential & Proprietary How We’re Doing Things Operationalizing Data Infrastructure on Kubernetes We use Kubernetes for all orchestration Powerful helm ecosystem with prebuilt packages for many of our needs Eliminates expensive managed infrastructure as much as possible Operational unification for disparate stacks on disparate clouds GitOps For Everything Operational Tooling should be web-based Easiest on-ramp to train new engineers Fastest response time if something goes wrong 4
  • 5. Confidential & Proprietary Biggest Challenges We’ve Seen Operationalizing Data Infrastructure on Kubernetes Managing Upgrade Cycles Major dependency complexity Validation + testing of new versions K8s upgrades - super hard and impacts everything Tracking security patches * Security Hardening All web interfaces need secure login (hard because OSS often doesn’t support this) Software supply chain management Scan images Vendor as many images as possible (esp to avoid dockerhub) Network security internal to cluster Service meshes seem to not deliver on their promise still Integration of different tools is tricky Shared network layer is critical here (why we like running single-cluster) Example: ↔ 5
  • 6. Confidential & Proprietary Misconceptions Operationalizing Data Infrastructure on Kubernetes Scale is hard With the right infra, honestly pretty tractable. Most scaling tasks are literally just turning a dial now Autoscaling can still be tricky If you’re running multi-region transactional, distributed databases maybe tricky Figuring out how to run most OSS data stacks are hard If you learn a pretty small set of tools, we can usually get something running in a day Virtually everything has an ok-ish helm chart Kubernetes itself is hard API is sprawling, but realistically only 10-20% is needed to grok Make an opinionated toolchain choice to simplify management Use a managed control plane IS hard in on-prem environments where networks are irregular 6
  • 7. Incredible cost savings Managed infra often at 40%+ markup to compute, and if you’re doing large scale batch processing, this is a huge amount of overhead Confidential & Proprietary What Benefits We’ve Seen Operationalizing Data Infrastructure on Kubernetes Much simpler security model Everything in a hardened network, no worries about privacy/compliance Ends up being operationally simpler when you scale out to multiple solutions Can create a single management console for your entire stack Upgrade flow all consolidated in one system Data gravity is a much harder problem than application deployment 7
  • 8. Shameless Plug If you’re interested in playing around with this model, Plural does make it just a few commands away. We’d love to see you guys play with the platform or even contribute: app.plural.sh docs.plural.sh github discord Confidential & Proprietary Operationalizing Data Infrastructure on Kubernetes 8