Upgrading OpenStack from Kilo to Mitaka

Francesco Pantano, Cloud System Engineer at FASTWEB
Upgrading OpenStack from Kilo to Mitaka (where is Liberty???)
Milan, 28 September 2017
Amedeo Salvati
Francesco Pantano
FASTWEB C1 – PUBLIC
Agenda
Fastweb and FASTcloud
Upgrading OpenStack
Virtualized components
The Ansible way
Lessons learned
Fastweb and FASTcloud
Fastweb S.p.A. is an Italian telecommunications company that provides landline, broadband Internet and digital television services
Fastweb is fully owned by the Swiss telecommunications company Swisscom
Not only a fiber company!
DC Tier IV
Following Milan, a new Tier IV Data Center
Location                           Milan     Rome
Surface                            600 m²    500 m²
Power Usage Effectiveness (PUE)    1.25      1.25
Certification                      Tier IV   Tier IV
2018
FASTcloud
Since 2015 our cloud solution has been based on OpenStack
Our FASTcloud services run under Italian jurisdiction
We offer flexible solutions such as Virtual Server, Virtual Private Data Center, and Private IaaS with dedicated hardware
As a telecommunications company we offer our customers cloud services over the Internet and MPLS VPN
Business requirements to upgrade to Mitaka
Double jump: from Kilo to Mitaka
Need to reduce downtime for the customers, specifically for the L3 agents
The Upgrade Path
Possible ways to upgrade:
1. Big Bang (in-place) upgrade;
2. Side by Side clusters;
3. Control Plane side by side;
4. Rolling upgrades (upgrade levels)
Have you planned a rollback path?
Think about impacts:
1. On the infrastructure
2. On the user side
3. On the application side
Think about Disaster
Clustering OpenStack Services
Provides HA for all our services
Keeps all services consistent by building constraints
Making some services cluster-free could be the answer
Follow the divide et impera (divide and conquer) paradigm
Cons:
Resource constraints make some services difficult to manage
Virtualized components
RADOS Gateway
Virtualized components: Galera cluster
Goal:
1. Be cluster free
2. Replication mode (Galera cluster) for fault tolerance
High-availability service that provides:
1. High system uptime
2. No data loss
3. Scalability for growth
Virtualized components: Nova service
Nova Control Plane:
a. 2 nodes in HA
b. VIP to access services
c. Haproxy + keepalived
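nova.conf (first jump, Kilo to Liberty)
[upgrade_levels]
compute = kilo
nova.conf (second jump, Liberty to Mitaka)
[upgrade_levels]
compute = liberty
Pin the compute RPC version: the control plane may run release X + 1 while the compute nodes are still on X, but never more than one release ahead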
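The pinning rule can be sketched in Python. This is a minimal illustration, not the tooling used in the talk: the release list, the function names `compute_pin` and `render_upgrade_levels`, and the use of `configparser` to render the fragment are all assumptions made for the example. The sketch refuses any control plane that is more than one release ahead of the compute nodes.

```python
import configparser
import io

# Release order for the two jumps discussed in the talk.
RELEASES = ["kilo", "liberty", "mitaka"]


def compute_pin(control_plane: str, computes: str) -> str:
    """Return the [upgrade_levels] compute pin for one upgrade step.

    The control plane may be at most one release ahead of the compute
    nodes; the pin is the release the compute nodes still run.
    """
    gap = RELEASES.index(control_plane) - RELEASES.index(computes)
    if gap not in (0, 1):
        raise ValueError(
            f"control plane {control_plane!r} must match, or be one release "
            f"ahead of, computes {computes!r}"
        )
    return computes


def render_upgrade_levels(pin: str) -> str:
    """Render the nova.conf fragment that pins the compute RPC version."""
    parser = configparser.ConfigParser()
    parser["upgrade_levels"] = {"compute": pin}
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # First jump: control plane on Liberty, computes still on Kilo.
    print(render_upgrade_levels(compute_pin("liberty", "kilo")))
```

Calling `compute_pin("mitaka", "kilo")` raises `ValueError`, which mirrors the reason the double jump has to pass through Liberty.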
Managing OpenStack: The Ansible way
Full-Stack Automation with Ansible
IaaS Software
Host Operating System
Common and common-openstack roles to keep the infrastructure components aligned
Playbooks to update the control plane services from Kilo to Liberty and from Liberty to Mitaka
Playbooks:
● OpenStack services roles
● Ceph RADOS Gateway roles
● Reverse proxy management
● Upgrade to Liberty path
● Upgrade to Mitaka path
It’s time to upgrade: Planning
Make the integration tests
Upgrade Control Plane to Mitaka
Disable virtualized services from PCS
Routers rollback
Upgrade Control Plane to Liberty
Align the haproxy/keepalived config
Add Neutron auxiliary blades and switch routers
Prepare the whole virtual environment (provision the VMs using Ansible roles)
Neutron aux mode: adding two new agents
neutron.conf
● dhcp_agents_per_network = 2
● max_l3_agents_per_router = 2
dhcp_agent.ini
● enable_isolated_metadata
Neutron aux mode: moving routers
Aux Neutron L3 agent
Neutron L3 agent
Compute node
"${NEUTRON_CLIENT}" l3-agent-router-[add|remove] "${AGENT_ID}" "${ROUTER_ID}"
Force the routers to switch:
for router in $(ip netns | grep -o 'qrouter-[^ ]*'); do
    # $interface must name the router interface to bring down
    ip netns exec "$router" ip link set dev "$interface" down
done
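The add/remove step can be sketched as a small Python helper that only builds the command lines. This is an illustration under assumptions: the function name `router_move_commands`, the placeholder IDs, and the remove-then-add order are not taken from the slides; only the `l3-agent-router-add` and `l3-agent-router-remove` subcommands of the neutron client are.

```python
import shlex
from typing import List


def router_move_commands(
    router_id: str,
    source_agent_id: str,
    target_agent_id: str,
    neutron_client: str = "neutron",
) -> List[List[str]]:
    """Build the two CLI calls that move one router between L3 agents."""
    return [
        [neutron_client, "l3-agent-router-remove", source_agent_id, router_id],
        [neutron_client, "l3-agent-router-add", target_agent_id, router_id],
    ]


if __name__ == "__main__":
    # Placeholder IDs; real ones come from the agent and router listings.
    for command in router_move_commands("ROUTER_ID", "OLD_AGENT_ID", "AUX_AGENT_ID"):
        # Dry run: print only. Pass each list to subprocess.run to execute.
        print(" ".join(shlex.quote(part) for part in command))
```

Keeping the commands as argument lists avoids shell quoting problems if they are later handed to `subprocess.run`.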
Test critical sections: update db schema
Fix the Neutron db table ha_router_agent_port_bindings for duplicate entries
● Dump the entire db and replicate it on Instance B;
● Execute the schema update for each service to test that it works correctly:
Ansible bool condition:
when: update_schema
openstack-db --service "${service}" --update
Production Database (Instance A)
Mirrored Database (Instance B)
Services: Nova, Cinder, Neutron, Keystone, Heat, Glance
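A check for duplicate entries on the mirrored copy might look like the following sketch. It rests on assumptions the slides do not confirm: that a "duplicate" means the same (router_id, l3_agent_id) pair appearing more than once, and that the table has the four columns shown. An in-memory SQLite database stands in for the mirrored instance so the example is self-contained; the sample rows are invented.

```python
import sqlite3

# Assumption: a duplicate is a repeated (router_id, l3_agent_id) pair.
DUPLICATES_QUERY = """
SELECT router_id, l3_agent_id, COUNT(*) AS bindings
FROM ha_router_agent_port_bindings
GROUP BY router_id, l3_agent_id
HAVING COUNT(*) > 1
ORDER BY router_id, l3_agent_id
"""


def find_duplicate_bindings(connection):
    """Return (router_id, l3_agent_id, count) for every repeated pair."""
    return connection.execute(DUPLICATES_QUERY).fetchall()


def build_sample_db():
    """Create an in-memory stand-in for the mirrored database (Instance B)."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE ha_router_agent_port_bindings ("
        "port_id TEXT PRIMARY KEY, router_id TEXT, l3_agent_id TEXT, state TEXT)"
    )
    connection.executemany(
        "INSERT INTO ha_router_agent_port_bindings VALUES (?, ?, ?, ?)",
        [
            ("port-1", "router-a", "agent-1", "active"),
            ("port-2", "router-a", "agent-2", "standby"),
            ("port-3", "router-b", "agent-1", "active"),
            ("port-4", "router-b", "agent-1", "standby"),  # repeated pair
        ],
    )
    return connection


if __name__ == "__main__":
    for row in find_duplicate_bindings(build_sample_db()):
        print(row)
```

Running the query against the mirror first keeps any cleanup decision away from the production database until the schema update has been rehearsed.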
Upgrade compute nodes: the Big Picture
Lesson Learned: the MTU issue
The MTU on the qbrXYZ and qrouter-XYZ interfaces is 1500, unlike the rest of the infrastructure, where jumbo frames are enabled
neutron.conf
[DEFAULT]
global_physnet_mtu = 9000
ml2_conf.ini
[ml2]
path_mtu = 9000
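A quick consistency check of the two MTU settings can be sketched with the standard library. The embedded config text, the function names, and the choice of `configparser` (OpenStack itself reads these files with oslo.config) are assumptions for illustration; the sketch also assumes `path_mtu` is read from the `[ml2]` section, as in upstream Neutron.

```python
import configparser

# Inline stand-ins for the real files; paths and contents are illustrative.
NEUTRON_CONF = """
[DEFAULT]
global_physnet_mtu = 9000
"""

ML2_CONF = """
[ml2]
path_mtu = 9000
"""


def read_mtu(config_text: str, section: str, option: str) -> int:
    """Parse one integer MTU option from INI-style text."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(config_text)
    return parser.getint(section, option)


def mtu_mismatches(expected: int) -> dict:
    """Return every option whose value differs from the expected MTU."""
    found = {
        "global_physnet_mtu": read_mtu(NEUTRON_CONF, "DEFAULT", "global_physnet_mtu"),
        "path_mtu": read_mtu(ML2_CONF, "ml2", "path_mtu"),
    }
    return {name: value for name, value in found.items() if value != expected}


if __name__ == "__main__":
    print(mtu_mismatches(9000) or "MTU settings are consistent")
```

An empty result means both options match the jumbo-frame MTU used by the rest of the infrastructure.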
Best Practices
● Review the release notes for each release to learn about new, updated and deprecated parameters
● Keep a mirrored OpenStack environment
● Identify critical update paths (e.g. the OpenStack db schema update)
● Parallelize as much as possible (e.g. package updates)
● Make use of Ansible templates (ready to go to Newton)
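The parallelization advice can be illustrated with a small Python sketch. It is not how the talk's playbooks work: the host names are invented and `update_packages` is a hypothetical placeholder that performs no real update. The sketch only shows the fan-out pattern of running one independent task per host concurrently.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def update_packages(host: str) -> str:
    """Hypothetical placeholder; a real task would run the package update remotely."""
    return f"{host}: packages updated"


def run_in_parallel(
    hosts: List[str],
    task: Callable[[str], str] = update_packages,
    max_workers: int = 4,
) -> Dict[str, str]:
    """Run one task per host concurrently and collect the results by host."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input hosts.
        return dict(zip(hosts, pool.map(task, hosts)))


if __name__ == "__main__":
    for host, result in run_in_parallel(["compute-01", "compute-02"]).items():
        print(result)
```

Only steps that do not depend on each other, such as package updates on separate hosts, are safe to fan out this way.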
Questions
Thanks
Amedeo Salvati
amedeo@linux.com
@amedeosalvati on twitter
Francesco Pantano
fmount@inventati.org
@fmount9 on twitter