Introduction: Private Cloud in LINE
(2019/01)
Yuki Nishiwaki
(Presented at the OpenStack最新情報セミナー, February 2019)
Agenda
1. Introduction/Background of Private Cloud
2. OpenStack in LINE
3. Challenge of OpenStack
Who are we?
Responsibility
- Develop and maintain common/fundamental functions for the private cloud (IaaS)
- Think about optimization for the whole private cloud
Teams: Network / Service / Operation / Platform / Storage
Software
- IaaS (OpenStack + α)
- Kubernetes
Knowledge
- Software
- Network, Virtualization, Linux
Private Cloud
[Component map]
- OpenStack: VM (Nova), Image Store (Glance), Network Controller (Neutron), Identity (Keystone), DNS Controller (Designate)
- Loadbalancer: L4LB, L7LB
- Kubernetes (Rancher)
- Storage: Block Storage (Ceph), Object Storage (Ceph)
- Database: Search/Analytics Engine (Elasticsearch), RDBMS (MySQL), KVS (Redis)
- Messaging (Kafka)
- Function (Knative)
- Baremetal
- Operation Tools
(Grouped into Platform / Service / Network / Storage / Operation areas)
Today’s Topic
[The same component map, with OpenStack highlighted: VM (Nova), Image Store (Glance), Network Controller (Neutron), Identity (Keystone), DNS Controller (Designate)]
OpenStack in LINE
- Introduced: 2016
- Version: Mitaka + customization
- Clusters: 4
- Hypervisors: 1100+
  ● Dev cluster: 400
  ● Prod cluster: 600 (region 1)
  ● Prod cluster: 76 (region 2)
  ● Prod cluster: 80 (region 3)
- VMs: 26000+
  ● Dev cluster: 15503
  ● Prod cluster: 8870 (region 1)
  ● Prod cluster: 335 (region 2)
  ● Prod cluster: 229 (region 3)
Difficulty of Building an OpenStack Cloud
[Diagram: datacenter network — Core, Aggregation, and ToR switches; racks of hypervisors; OpenStack API and database servers]
● Knowledge of Networking
○ Design/plan the whole DC network
● Knowledge of Operating a Large Product
○ Build operation tools that are not tied to specific software
○ Consider user support
● Knowledge of Server Kitting
○ Communicate with the procurement department
● Knowledge of OpenStack Software
○ Design the OpenStack deployment
○ Deploy OpenStack
○ Customize OpenStack
○ Troubleshooting
■ OpenStack components
■ Related software
Building OpenStack is not completed by one team
Operation team:
● Maintain
○ Golden VM image
○ Elasticsearch for logging
○ Prometheus for alerting
● Develop operation tools
● User support
● Buy new servers
Network team:
● Design/planning
○ DC network
○ Inter-DC network
● Implement a network orchestrator (outside OpenStack)
Platform team:
● Design the OpenStack deployment
● Deploy OpenStack
● Customize OpenStack
● Troubleshooting
Members: 3+ / 4+ / 4+ per team
Challenge of OpenStack
Basically, we are trying to make OpenStack (IaaS) stable.
What we have done
1. Legacy system integration
2. Bring a new network architecture into OpenStack networking
3. Maintain customizations to OSS while keeping up with upstream
What we will do
1. Scale emulation environment
2. Internal communication visualizing/tuning
3. Containerize OpenStack
4. Event hub as a platform
Challenge 1: Integration with Legacy Systems
Even before the cloud, we had many company-wide systems, each for its own purpose:
- CMDB: register spec, OS, location...
- IPDB: register IP address and hostname
- Monitoring system: register the server as a monitoring target
- Server login authority management: register the users allowed on the server
- Configuration management
[Diagram: Dev asks Infra for a new server; Infra sets it up and registers it in each system]
Challenge 1: Integration with Legacy Systems
With the private cloud, "server creation" completes without any involvement from the infrastructure department, so the private cloud itself has to register the new server.
[Diagram: Dev creates a new server; the private cloud registers it in CMDB, IPDB, the monitoring system, server login authority management, and configuration management]
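A minimal sketch of what such a registration hook could look like. It relies on OpenStack's standard behavior of emitting notifications (e.g. compute.instance.create.end) to the notifications queue on RabbitMQ; the CMDB/IPDB endpoints, payload fields, and hostnames below are hypothetical placeholders, not LINE's actual systems.

```python
import json

import pika      # RabbitMQ client
import requests  # plain HTTP calls to the legacy registration APIs

# Hypothetical legacy-system endpoints; the real ones are internal.
CMDB_URL = "https://cmdb.example.test/api/servers"
IPDB_URL = "https://ipdb.example.test/api/records"

def on_notification(channel, method, properties, body):
    msg = json.loads(body)
    # oslo.messaging wraps the payload in a serialized "oslo.message" envelope.
    payload = json.loads(msg["oslo.message"]) if "oslo.message" in msg else msg
    if payload.get("event_type") == "compute.instance.create.end":
        vm = payload["payload"]
        # Register the new "server" in each legacy system on the user's behalf.
        requests.post(CMDB_URL, json={"hostname": vm.get("hostname"),
                                      "flavor": vm.get("instance_type")})
        requests.post(IPDB_URL, json={"hostname": vm.get("hostname"),
                                      "fixed_ips": vm.get("fixed_ips", [])})
    channel.basic_ack(delivery_tag=method.delivery_tag)

conn = pika.BlockingConnection(pika.ConnectionParameters("rabbitmq.internal"))
ch = conn.channel()
# The control plane normally declares this queue already; just assert it exists.
ch.queue_declare(queue="notifications.info", passive=True)
ch.basic_consume(queue="notifications.info", on_message_callback=on_notification)
ch.start_consuming()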
Challenge 2: New Network Architecture in Our DC
For scalability and operability, we introduced a CLOS network architecture and terminate L3 on the hypervisor.
[Diagram: previous vs. new network architecture]
Challenge 2: Support the New Architecture in OpenStack
Network Controller (Neutron):
- neutron-server
- neutron-dhcp-agent
- neutron-metadata-agent
- neutron-linuxbridge-agent (OSS implementation; expects VMs to share an L2 network)
We want VMs not to share an L2 network, so we replaced the linuxbridge agent with a new neutron-custom-agent.
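To make "not sharing L2" concrete: with L3 terminated on the hypervisor, the agent's job shifts from plugging VMs into a shared bridge to installing a per-VM host route. The sketch below is illustrative only, not the actual neutron-custom-agent; the interface name is an assumption, and it uses pyroute2 just to show the end state on the hypervisor.

```python
from pyroute2 import IPRoute  # netlink wrapper; changing routes needs root

def plug_vm(vm_ip: str, tap_ifname: str) -> None:
    """Route one VM IP to its tap device (a /32 host route per VM)
    instead of bridging the VM into a shared L2 segment."""
    ipr = IPRoute()
    try:
        idx = ipr.link_lookup(ifname=tap_ifname)[0]
        ipr.route("add", dst=f"{vm_ip}/32", oif=idx)
    finally:
        ipr.close()

# Hypothetical example: VM 10.0.0.5 behind interface tap-abc123
# plug_vm("10.0.0.5", "tap-abc123")
```

With host routes like this, reachability between hypervisors becomes a pure L3 routing problem, which is what the CLOS fabric is built for.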
Challenge 3: Improve Customization for OSS
● We have customized many OpenStack components
○ e.g., for performance
● Previously we just piled customization on top of customization
[Diagram: a LINE version of Nova forked from a specific upstream version, carrying a stack of commits such as "customize commit for A", "customize commit for B", "customize commit for C"]
It's difficult for us to take a specific patch out of our customized OpenStack.
Challenge 3: Improve Customization for OSS
[Diagram: instead of a fork carrying "customize commit for A/B/C", we keep only a base commit ID plus patch files (patch for A, patch for B, patch for C), all maintained in git]
● Don't fork / stop forking
● Maintain only the patch files in git
=> It is much easier to take a patch out than before (see the sketch below)
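A sketch of the build step this workflow implies: check out the recorded base commit of upstream and apply the patch series on top. The file names (base-commit-id, patches/) are assumptions for illustration; the patches are assumed to be git format-patch output.

```python
import pathlib
import subprocess

REPO = pathlib.Path("nova")          # clone of upstream openstack/nova
PATCH_DIR = pathlib.Path("patches")  # patch files kept in our own git repo

def run(*cmd: str) -> None:
    subprocess.run(cmd, cwd=REPO, check=True)

# The only state we keep besides the patch files themselves.
base = pathlib.Path("base-commit-id").read_text().strip()

run("git", "checkout", base)
# Apply patches in a deterministic order (patch-for-A.patch, ...).
for patch in sorted(PATCH_DIR.glob("*.patch")):
    run("git", "am", str(patch.resolve()))
# Dropping "patch for B" is now just deleting one file and rebuilding,
# instead of rewriting history on a long-lived fork.
```

The same loop also makes rebasing explicit: bumping the base commit ID and re-running immediately shows which patches no longer apply.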
Challenges will differ from Day 1 to Day 2
Day 1 (so far)
● Develop user-facing features
○ Keep the same experience as before (legacy systems)
○ Support the new architecture
● Daily operation
○ Predictable
○ Unpredictable, trouble-driven
Day 2 (from now)
● Enhance operation
● Optimize development
● Reduce daily operation
○ Predictable
○ Unpredictable
Challenge of OpenStack
Basically, we are trying to make OpenStack (IaaS) stable.
What we have done
1. Legacy system integration
2. Bring a new network architecture into OpenStack networking
3. Maintain customizations to OSS while keeping up with upstream
What we will do
1. Scale emulation environment
2. Internal communication visualizing/tuning
3. Containerize OpenStack
4. Event hub as a platform
Future Challenge 1: Scale Emulation Environment
- Introduced: 2016
- Version: Mitaka + customization
- Clusters: 4+1 (WIP: semi-public cloud)
- Hypervisors: 1100+ (Dev: 400; Prod: 600 / 76 / 80 across three regions)
- VMs: 26000+ (Dev: 15503; Prod: 8870 / 335 / 229 across three regions)
The number of hypervisors keeps increasing, and we have faced:
- timing/scale-related errors
- operations that took a long time
We need an environment that simulates scale without preparing the same number of hypervisors, from the following points of view:
● Database access
● RPC over RabbitMQ
Future Challenge 1: Scale Emulation Environment
These are control-plane-specific loads, so we can use this environment for tuning the OpenStack control plane.
Future Challenge 1: Scale Emulation Environment
● Implement fake agents (nova-compute, neutron-agent)
● Use containers instead of actual HVs
[Diagram: Real environment — the control plane orchestrates/manages 600 HVs, each running nova-compute and neutron-agent. Scale environment — the same control plane manages 600 fake HVs, each a Docker container running nova-compute and neutron-agent on a single server.]
It is easy to add a new fake HV => we can emulate any scale, as sketched below.
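A sketch of how the fake-HV fleet could be spawned. Nova ships a fake virt driver (nova.virt.fake.FakeDriver) that reports resources without touching libvirt, which is one way to implement the fake nova-compute; the image name and environment variable below are assumptions. This uses the Docker SDK for Python.

```python
import docker  # Docker SDK for Python (pip install docker)

client = docker.from_env()

# Hypothetical image that runs nova-compute + neutron-agent with fake
# drivers (e.g. Nova's compute_driver=fake.FakeDriver), so only the
# control-plane traffic (DB access, RPC over RabbitMQ) is real.
IMAGE = "registry.example.test/openstack/fake-hv:mitaka-line"

def launch_fake_hvs(count: int) -> None:
    for i in range(count):
        client.containers.run(
            IMAGE,
            name=f"fake-hv-{i:04d}",
            hostname=f"fake-hv-{i:04d}",  # each container registers as one HV
            environment={"NOVA_HOST": f"fake-hv-{i:04d}"},
            detach=True,
        )

# Emulate the 600-HV production cluster on a handful of servers:
# launch_fake_hvs(600)
```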
Future Challenge 2: Communication Visualizing
There are two types of communication among the OpenStack services:
● RESTful API (between components)
● RPC over a messaging bus (inside a component)
[Diagram: Keystone (Authentication), Nova (VM), and Neutron (Network) talk to each other over RESTful APIs; neutron-server and neutron-agent talk over RPC]
Future Challenge 2: Communication Visualizing
Any of these links can break at any time. Communication can fail:
- because of scale
- because of improper configuration
And errors sometimes propagate from one component to another.
1. Such issues are very difficult to troubleshoot because:
- the error has propagated from one component to another
- logs do not always carry enough information
- logs only appear after something has already happened
2. Sometimes the problem can be predicted from metrics such as:
- how many RPCs were received
- how many RPCs are waiting for a reply
Future Challenge 2: Communication Visualizing
[Diagram: a monitoring tool collects communication-related metrics from Keystone, Nova, Neutron, and the neutron-server/neutron-agent RPC path]
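A sketch of the kind of instrumentation this implies, using the prometheus_client library: one counter for RPCs received and one gauge for RPCs still waiting for a reply, wrapped around a hypothetical dispatch function. Metric names, labels, and the port are assumptions.

```python
import time

from prometheus_client import Counter, Gauge, start_http_server

RPC_RECEIVED = Counter(
    "openstack_rpc_received_total",
    "RPC messages received", ["service", "method"])
RPC_PENDING = Gauge(
    "openstack_rpc_pending_replies",
    "RPC calls still waiting for a reply", ["service"])

def dispatch_with_metrics(service: str, method: str, handler, *args):
    """Wrap an RPC handler so scale problems show up as metrics
    before they show up as errors in the logs."""
    RPC_RECEIVED.labels(service=service, method=method).inc()
    RPC_PENDING.labels(service=service).inc()
    try:
        return handler(*args)
    finally:
        RPC_PENDING.labels(service=service).dec()

if __name__ == "__main__":
    start_http_server(9200)  # scrape target for the monitoring tool
    # dispatch_with_metrics("neutron-server", "update_port", real_handler, port)
    while True:
        time.sleep(60)
```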
Future Challenge 3: Containerize OpenStack
Motivation / current pain points
● Complexity of packaging tools like RPM
○ Dependencies between packages
○ Configuration for new files
=> We need to rebuild the RPM every time we change the code
● Impossible to run different versions of OpenStack on the same server
○ Conflicting dependencies on OpenStack's common libraries
=> We actually deployed many more control-plane servers than we really need
● Lack of observability for all software running on the control plane
○ No way to tell which part of a deployment script (Ansible, Chef...) installs dependent libraries and which part installs our software
○ Deployment scripts don't take care of the software after it is deployed
○ We cannot notice if some developer runs a temporary script
Future Challenge 3: Containerize OpenStack
Server Server Server
Ansible
PlaybookAnsible
PlaybookAnsible
Playbook
Install library
Install software
Start software
K8s manifest
K8s manifest
nova-api
neutron-server
common-library
RPM
Server
nova-api
neutron-server
common-library
Docker
Registry
Get package
Server Server Server
nova-api container
nova-api
common-library
nova-api container
nova-api
common-library
Install software
Start software
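A sketch of the "after" state using the official Kubernetes Python client: each OpenStack service becomes a Deployment pulled from the internal registry, so install and start collapse into one declarative manifest. Image name, namespace, and replica count are assumptions.

```python
from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() inside the cluster

def nova_api_deployment() -> client.V1Deployment:
    container = client.V1Container(
        name="nova-api",
        # The container bundles nova-api and its common libraries, so two
        # OpenStack versions can coexist on one server without RPM conflicts.
        image="registry.example.test/openstack/nova-api:mitaka-line",
        ports=[client.V1ContainerPort(container_port=8774)],
    )
    template = client.V1PodTemplateSpec(
        metadata=client.V1ObjectMeta(labels={"app": "nova-api"}),
        spec=client.V1PodSpec(containers=[container]),
    )
    return client.V1Deployment(
        metadata=client.V1ObjectMeta(name="nova-api"),
        spec=client.V1DeploymentSpec(
            replicas=2,
            selector=client.V1LabelSelector(match_labels={"app": "nova-api"}),
            template=template,
        ),
    )

client.AppsV1Api().create_namespaced_deployment(
    namespace="openstack", body=nova_api_deployment())
```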
Future Challenge 4: EventHub for All Components
[The full private-cloud component map again: OpenStack (Nova, Glance, Neutron, Keystone, Designate), Loadbalancer, Kubernetes, Storage, Database, Messaging, Function, Baremetal, Operation Tools]
Future Challenge 4: EventHub for All Components
Components depend on each other: some component or operation script wants to do something...
- when a user (actually a project) in Keystone is deleted
- when a VM is created
- when a real server is added to a load balancer
Pub/Sub Concept in Microservice Architecture
[Diagram: Authentication, VM, and Network components connected through a messaging bus (RabbitMQ)]
- Each component publishes its own important events
- Each component subscribes only to the events it is interested in
A component can act whenever an interesting event happens, and the publishing component doesn't have to know which components it needs to work with.
This mechanism allows us to extend the private cloud (microservices) in the future without changing existing code.
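A minimal pub/sub sketch with pika against RabbitMQ: the authentication component publishes keystone.project.deleted to a topic exchange, and any component binds its own queue to just the event patterns it cares about. The exchange name, routing keys, and host are illustrative.

```python
import json

import pika

conn = pika.BlockingConnection(pika.ConnectionParameters("rabbitmq.internal"))
ch = conn.channel()
ch.exchange_declare(exchange="cloud.events", exchange_type="topic")

# --- publisher side (e.g. the authentication component) --------------------
ch.basic_publish(
    exchange="cloud.events",
    routing_key="keystone.project.deleted",
    body=json.dumps({"project_id": "abc123"}),
)

# --- subscriber side (e.g. an operation script) ----------------------------
q = ch.queue_declare(queue="", exclusive=True).method.queue
ch.queue_bind(queue=q, exchange="cloud.events",
              routing_key="keystone.project.#")  # only interesting events

def handle(channel, method, properties, body):
    event = json.loads(body)
    print("clean up resources for project", event["project_id"])

ch.basic_consume(queue=q, on_message_callback=handle, auto_ack=True)
ch.start_consuming()
```

The publisher never learns who consumed the event, which is exactly what lets new components join later without code changes on the publishing side.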
Future Challenge 4: EventHub for All Components
This notification logic is already implemented in OpenStack, but...
[Diagram: Keystone and Nova publish events to the messaging bus (RabbitMQ); Operation Script A, Operation Script B, the L7LB, and Kubernetes each implement their own RabbitMQ-access logic plus their business logic in order to subscribe]
● Sometimes the RabbitMQ-access code grows bigger than the actual business logic
● Every component/script has to implement that access logic first
Future Challenge 4: EventHub for All Components
We are currently developing a new component that lets us register a program together with the events it is interested in. It will make it much easier to cooperate with other components.
[Diagram: publishers (Keystone, Nova) still publish to RabbitMQ, but a new Function-as-a-Service layer owns the RabbitMQ-access logic and subscribes on behalf of consumers, so Operation Script A/B, the L7LB, and Kubernetes only provide business logic]
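The new component is still under development, so the sketch below only illustrates the intended developer experience: business logic registers itself against an event type, and the EventHub owns all messaging-bus plumbing. Every name here (on_event, dispatch, the event type) is hypothetical.

```python
# Hypothetical client-side API for the EventHub under development:
# the hub, not each script, implements the RabbitMQ access logic.
from typing import Callable, Dict, List

_HANDLERS: Dict[str, List[Callable[[dict], None]]] = {}

def on_event(event_type: str):
    """Register a function to run whenever `event_type` is published."""
    def register(fn: Callable[[dict], None]):
        _HANDLERS.setdefault(event_type, []).append(fn)
        return fn
    return register

@on_event("keystone.project.deleted")
def release_loadbalancers(event: dict) -> None:
    # Pure business logic; no messaging-bus code in sight.
    print("releasing LBs owned by project", event["project_id"])

def dispatch(event_type: str, payload: dict) -> None:
    """Called by the hub's single RabbitMQ consumer (FaaS-style)."""
    for fn in _HANDLERS.get(event_type, []):
        fn(payload)

# dispatch("keystone.project.deleted", {"project_id": "abc123"})
```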
For the further future: IaaS to PaaS, CaaS...
We are currently trying to introduce an additional abstraction layer above IaaS.
● https://engineering.linecorp.com/ja/blog/japan-container-days-v18-12-report/
● https://www.slideshare.net/linecorp/lines-private-cloud-meet-cloud-native-world
Take a glance at "K8s on OpenStack"
Many container-related projects have started in LINE
Published
● https://www.slideshare.net/linecorp/parallel-selenium-test-with-docker
● https://www.slideshare.net/linecorp/test-in-dockerized-system-architecture-of-line-now-line-now-docker
● https://www.slideshare.net/linecorp/local-development-environment-for-micro-services-with-docker
● https://www.slideshare.net/linecorp/clova-92916456 (Japanese only)
Undergoing projects
Currently application engineers maintain it...
[Diagram: Developers A in Japan and Developers B in Taiwan each run their own Kubernetes cluster (containers on top of VM/BM + OS) on the private cloud. The private cloud developers' responsibility border stops at IaaS; everything above it falls on the application developers.]
Operating knowledge is distributed
[Same diagram: each developer group accumulates its own Kubernetes operating knowledge above the IaaS responsibility border]
Problem:
● There is no mechanism to share knowledge between the teams
● Quality will be uneven
● Every new team starts from beginner level
Time to extend our responsibility from IaaS to KaaS
[Same diagram: the responsibility border moves up — the private cloud developers now also operate Kubernetes as a service (KaaS), consolidating the knowledge in one place]
Rancher 2.x based KaaS
[Diagram: a single API in front of many Kubernetes clusters]
Cluster operations:
- create a cluster
- modify a cluster
- add nodes
Cluster usage:
- deploy an application
- update an application
- scale out an application ...
Cluster management:
- create/modify clusters and nodes
- monitor and heal clusters/nodes