Unclouding Container Challenges

Rakuten Group, Inc.
Apr 21st, 2021
Harpratap Singh Layal
Cloud Platform Department
Rakuten Group, Inc.
2
Background – Compute platforms
Bare Metal as a Service (BMaaS)
• 16 cores, 32 GB, 1 Gbps
• 16 cores, 32 GB, 1 Gbps
• 32 cores, 128 GB, 10 Gbps
• 16 cores, 64 GB, 10 Gbps
Container as a Service (CaaS)
• Cluster X: App 1, App 2, App 2
• Cluster Y: App 3, App 2, App 4
3
Background – What is CaaS?
Options, placed on a spectrum between opinionated (less flexibility) and developer control & responsibility:
• PaaS (Heroku, 12-factor apps)
• Managed K8s control plane (GKE, EKS, AKS): full customization
• Simple container scheduler (Fargate, Cloud Run)
• Only expose selected K8s APIs (CaaS)
Default container networking, CI/CD, monitoring, and security for stateless and stateful apps, cron, and GPU workloads
4
Challenge #1 : Communication Cost
5
Challenge #1 : Communication Cost
Doing it the traditional way –
1. Communication lag – it takes too long to gather requirements from developers
2. XY problem – the platform team has no idea what the real problem is
3. Validation and policy injection are done manually
6
Challenge #1 : Communication Cost
Solution: Create an opinionated Internal Developer Platform and form an API-based contract with users
Philosophy:
• When you have APIs and their documentation, users rarely need to communicate with you
• It is easier to explicitly define what you provide and what you don’t
• Standardization = less reinventing the wheel, fewer pets, easier to propagate tech culture
Implementation:
• In CaaS we use K8s APIs to expose features to users. Custom Resource Definitions (CRDs) and Operators fit us well.
• Admission control webhooks, PodSecurityPolicy and NetworkPolicy
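How an admission control webhook can validate requests without a human in the loop is sketched below. This is only an illustration: it assumes the `admission.k8s.io/v1` `AdmissionReview` request and response shape, and the rule it enforces (every container must declare resource limits) and the function names `missing_limits` and `review` are hypothetical, not the platform's actual policy.

```python
from typing import Any


def missing_limits(pod: dict[str, Any]) -> list[str]:
    """Return names of containers that declare no resource limits (hypothetical rule)."""
    containers = pod.get("spec", {}).get("containers", [])
    return [
        c.get("name", "<unnamed>")
        for c in containers
        if not c.get("resources", {}).get("limits")
    ]


def review(admission_review: dict[str, Any]) -> dict[str, Any]:
    """Build an AdmissionReview response that allows or denies the request."""
    request = admission_review["request"]
    offenders = missing_limits(request["object"])
    response: dict[str, Any] = {"uid": request["uid"], "allowed": not offenders}
    if offenders:
        # The API server relays this message to the user, so the rejection explains itself.
        response["status"] = {
            "code": 403,
            "message": "containers without resource limits: " + ", ".join(offenders),
        }
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }
```

In a real deployment this function would sit behind an HTTPS endpoint registered with the API server; the point of the sketch is that the policy feedback reaches the user as an API error instead of a conversation.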
7
Challenge #1 : Communication Cost
Jiange: validation without human communication
(Diagram components: Jiange, etcd, K8s API)
8
Challenge #2: Day 2 Ops
9
Challenge #2 : Day 2 Ops
Day 1 Ops:
• Provisioning
    • Step 1
    • Step 2
    • Step 3… N
• Procedural – easy to automate
Day 2 Ops:
• Maintenance
• Not always the same
• Improvements – need to keep an eye on various components
    • Metrics
    • Logs
    • Traces
10
Challenge #2 : Day 2 Ops
Solution: Infrastructure as Data instead of Infrastructure as Code
• IaC – run scripts one by one (script for X, script for Y, script for Z)
• IaD – store the state as data and reconcile until the state is achieved
(Diagram components: data store, control loop, infra; the control loop reconciles spec and status)
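The IaD idea of storing state as data and reconciling until that state is achieved can be sketched as a small control loop. This is a minimal illustration, not the CaaS controllers themselves: `Record`, `FakeInfra`, and the replica-count model are hypothetical stand-ins for a real data store and real infrastructure.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One data-store entry: what is wanted (spec) and what was last observed (status)."""
    spec: dict[str, int]
    status: dict[str, int] = field(default_factory=dict)


class FakeInfra:
    """Stand-in for real infrastructure; hypothetical, for illustration only."""

    def __init__(self) -> None:
        self.replicas: dict[str, int] = {}

    def observe(self) -> dict[str, int]:
        return dict(self.replicas)

    def scale(self, name: str, count: int) -> None:
        if count == 0:
            self.replicas.pop(name, None)
        else:
            self.replicas[name] = count


def reconcile_once(record: Record, infra: FakeInfra) -> bool:
    """Compare spec with observed state, correct any drift, and record the new status."""
    desired = {name: count for name, count in record.spec.items() if count > 0}
    observed = infra.observe()
    for name in sorted(set(desired) | set(observed)):
        wanted = desired.get(name, 0)
        if observed.get(name, 0) != wanted:
            infra.scale(name, wanted)
    record.status = infra.observe()
    return record.status == desired


def reconcile(record: Record, infra: FakeInfra, max_rounds: int = 5) -> bool:
    """Run the control loop until the desired state is reached or rounds run out."""
    for _ in range(max_rounds):
        if reconcile_once(record, infra):
            return True
    return False
```

Unlike a script that runs once, the same loop also repairs drift: if something is removed from the infrastructure later, the next pass sees the mismatch with the stored spec and restores it.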
11
Challenge #2 : Day 2 Ops
Solution: Infrastructure as Data instead of Infrastructure as Code
In CaaS we have written controllers based on the same approach:
• Klone – binary (written in Go) that provisions master nodes and system components based on git configs
• Node operator – creates worker nodes
• Namespace operator – creates user namespaces with correct permissions, good defaults, Jenkins repositories, Harbor projects, etc. when a user onboards
• Gateway controller – creates Istio ingress gateways
• Wildcard instant domain controller – instantly creates simple domains to test out services
• Cloud controller manager – creates load balancers
• Endpoints controller – creates container-native load balancers
12
Challenge #2 : Day 2 Ops
(Diagram components: Internet, load balancer, K8s cluster nodes, K8s API, node list, Cloud Controller Manager)
13
Challenge #3: Container networks
14
Challenge #3 : Container Networks
• Kubernetes network != host network
• Pods are not first-class citizens (not a flat network)
• Pods are ephemeral
• Fair load balancing does not happen when using NodePorts
• Additional hops (through K8s node iptables)
• Source IP is not preserved
• Network is difficult to use
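One way the NodePort unfairness can arise is shown by the arithmetic below. The sketch assumes the external load balancer splits traffic evenly across nodes and each node forwards only to its own pods (as with `externalTrafficPolicy: Local`); the node names and pod counts are made up for illustration.

```python
from fractions import Fraction


def per_pod_share(pods_per_node: dict[str, int]) -> dict[str, Fraction]:
    """Share of total traffic that each pod on a given node receives.

    Assumes the load balancer splits traffic evenly across nodes hosting at
    least one pod, and each node splits its share evenly across local pods.
    """
    serving = {node: count for node, count in pods_per_node.items() if count > 0}
    node_share = Fraction(1, len(serving))
    return {node: node_share / count for node, count in serving.items()}


shares = per_pod_share({"node-a": 1, "node-b": 3})
for node, share in shares.items():
    print(f"each pod on {node} gets {share} of the traffic")
```

Here the single pod on `node-a` handles half of all traffic while each pod on `node-b` handles one sixth. With the default `externalTrafficPolicy: Cluster` the trade-off is different: a node may forward a request to a pod on another node, which adds a hop and replaces the client source IP.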
15
Challenge #3 : Container Networks
Solution: No one size fits all; provide all solutions with good defaults and let users choose

                         Shared Gateway +        Dedicated Gateway +
                         Auto Assigned Domain    Custom Domain
Domain                   Auto assigned           Any domain
Performance              Not isolated            Isolated
Maintenance (for users)  Zero                    High
Customization            Low                     Fully customizable
Cost                     Low                     High
16
Challenge #3 : Container Networks
Solution: Container-native load balancing

                 Legacy Load Balancer   Container Native Load Balancer
Number of hops   2                      1
IP preservation  Remote IP lost         Remote IP preserved
Load balancing   Across nodes           Across containers
Health checks    Only for nodes         Application-level health checks
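The difference in what each load balancer targets can be sketched as two ways of building a backend list. The dictionaries below are a simplified, hypothetical shape (not the Kubernetes API objects), and the `ready` flag stands in for an application-level health check.

```python
from typing import Any


def node_backends(nodes: list[dict[str, Any]], node_port: int) -> list[str]:
    """Legacy style: one backend per node, regardless of which pods run there."""
    return [f"{node['ip']}:{node_port}" for node in nodes]


def pod_backends(pods: list[dict[str, Any]], port: int) -> list[str]:
    """Container-native style: one backend per ready pod, addressed by pod IP."""
    return [f"{pod['ip']}:{port}" for pod in pods if pod.get("ready")]


nodes = [{"ip": "10.0.0.1"}, {"ip": "10.0.0.2"}]
pods = [
    {"ip": "172.16.0.5", "ready": True},
    {"ip": "172.16.0.6", "ready": False},  # failing its health check, so excluded
    {"ip": "172.16.1.7", "ready": True},
]
print(node_backends(nodes, 30080))
print(pod_backends(pods, 8080))
```

Because the second list contains pod addresses directly, traffic reaches a container in one hop and an unhealthy pod drops out of rotation on its own, which matches the health check and hop count rows of the comparison.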
17
Future Challenges:
• Multicluster CaaS
    • Network
    • Deployments
    • IPv4 not enough (need IPv6 and/or VPCs)
• Stateful apps
    • Local persistence
    • Remote persistence
• GPU
• SR-IOV
• CPU pinning
• Single data proxy