Contrail at AllegroGroup
Plan / Prepare / Production (3P)
History
➢ What we had:
○ Two separate cloud environments (Essex and Havana)
○ Floating IPs in Essex, flat network with VLANs in
Havana
○ Network complexity in Havana
○ Network performance problems in Havana
Goals (Plan)
➢ Easy to maintain and grow
➢ Network simplicity
➢ Network isolation for tenants
➢ Floating IP and flat network
➢ New region in new DC
Fabric (Prepare - 2 weeks)
➢ Easy and fast deployment (a couple of corrections to the Fabric
scripts); we used version 1.20 at that time
➢ Environment ready for testing (adding a new HV from “any”
location in the server room)
➢ Basic performance tests, LBaaS
Where to find quick implementation tools:
http://www.opencontrail.org/opencontrail-quick-start-guide/
Our way
➢ Our own Puppet manifests, based on the available ones
➢ Reasons:
○ Existing infrastructure
○ Customized deployment
○ More work at the beginning, fewer problems later
○ Easy procedure to add hosts (compute nodes,
controllers)
○ Building new region in near future
Implementation
➢ We had everything prepared for version 1.20, and then we
got the 2.01 production version (what to do?!)
➢ Environment deployment (OpenStack with Contrail,
one region: 2 CC; 3 CoC; 50 HV); the number of compute nodes
grows during the DC migration, target 250
➢ Move tenants/users/quotas from the old environment to the new one
○ we used a Keystone server built from scratch, upgraded it,
then pumped the data into Icehouse/Contrail (issue:
missing users); quotas were migrated as SQL tables;
exporter.py (script to merge users/tenants/quotas)
between regions (target for the future: one Keystone)
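The slide names exporter.py but does not show it. The sketch below is only an illustration of what merging users/tenants/quotas between two regions could look like; the record layout, the function names and the "keep the larger quota" rule are all assumptions, not the script that was actually used. It also includes a small check for the "missing users" issue mentioned above.

```python
from typing import Dict, List

# Hypothetical record shape (the real exporter.py is not shown in the slides):
# {tenant_name: {"users": [user, ...], "quota": {resource: limit, ...}}}
Region = Dict[str, dict]

EMPTY = {"users": [], "quota": {}}


def merge_regions(old: Region, new: Region) -> Region:
    """Merge tenant records from two regions, keyed by tenant name."""
    merged: Region = {}
    for name in sorted(set(old) | set(new)):
        a = old.get(name, EMPTY)
        b = new.get(name, EMPTY)
        # Users: union of both regions, sorted for stable output.
        users = sorted(set(a["users"]) | set(b["users"]))
        # Quotas: assumed rule, keep the larger limit per resource.
        quota = dict(a["quota"])
        for key, value in b["quota"].items():
            quota[key] = max(quota.get(key, 0), value)
        merged[name] = {"users": users, "quota": quota}
    return merged


def missing_users(source: Region, target: Region) -> List[str]:
    """List 'tenant/user' pairs present in source but absent from target."""
    missing = []
    for name, record in sorted(source.items()):
        present = set(target.get(name, EMPTY)["users"])
        for user in sorted(set(record["users"]) - present):
            missing.append(f"{name}/{user}")
    return missing


old = {"shop": {"users": ["ann", "bob"], "quota": {"instances": 10}}}
new = {
    "shop": {"users": ["bob", "eve"], "quota": {"instances": 20, "cores": 40}},
    "search": {"users": ["dan"], "quota": {}},
}
merged = merge_regions(old, new)
print(merged["shop"])
print(missing_users(old, new))
```

Running missing_users against the target region after an import is one way to catch the kind of gap the slide reports before tenants are moved.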
Implementation
➢ DNS - we are using Designate with two handlers (one for
floating IPs, the second for fixed routable IPs)
➢ Required image modifications (target: automated builds
with Ansible)
➢ Two days before production we updated to the latest
available packages from release 2.01
➢ Breaking environment
➢ Clients in the new environment
Results
➢ 500 VMs spawned simultaneously
➢ Network performance
Problems
➢ Cassandra (we increased the number of nodes 3 => 5),
configuration tuning (TTLs in contrail-collector.conf),
compaction throughput, migration of Cassandra data
to RAID0 SSD disks
➢ Open files limit issue (user, supervisor, init)
➢ Collector was flooded by data from the compute nodes
iptables -A OUTPUT -p tcp --dport 8086 -m string --algo bm --string "flowuuid" -j DROP
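The open files issue is listed with three layers (user, supervisor, init). One reading, which the slide does not spell out, is that the limit had to be raised at each layer before a service actually inherited it. A minimal sketch for checking what a running process really received, using Python's standard resource module; the 80% margin and the function names are assumptions for illustration.

```python
import os
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def has_headroom(open_fds: int, soft_limit: int, margin: float = 0.8) -> bool:
    """True if the descriptor count is below margin * soft limit.

    The 0.8 margin is an arbitrary illustrative threshold.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return open_fds < soft_limit * margin


def open_fd_count() -> int:
    """Count open descriptors of this process (Linux only, via /proc)."""
    return len(os.listdir("/proc/self/fd"))


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print("soft limit:", soft, "hard limit:", hard)
    print("headroom ok:", has_headroom(open_fd_count(), soft))
```

Run from inside the service's own environment (for example as a child of the supervisor), it shows the effective limit rather than the one configured for an interactive shell.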
Problems
➢ When 500K flows are not enough (vr_flow_entries, vr_oflow_entries)
➢ Flow on Hold issue
➢ vRouter CPU consumption too high compared to the VMs
(TBB_THREAD_COUNT in /etc/contrail/supervisord_vrouter.conf)
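The slides name vr_flow_entries and vr_oflow_entries but give neither the values chosen nor how they were picked. As a rough sizing aid only, the sketch below computes flow table utilisation and proposes a larger table by doubling until utilisation falls under a target; the doubling policy, the 50% target and the function names are assumptions, not a Contrail recommendation.

```python
def flow_table_utilisation(active_flows: int, table_entries: int) -> float:
    """Fraction of the flow table in use."""
    if table_entries <= 0:
        raise ValueError("table_entries must be positive")
    return active_flows / table_entries


def next_table_size(active_flows: int, table_entries: int,
                    target: float = 0.5) -> int:
    """Double the table size until utilisation is at or below target.

    Both the doubling step and the default target are illustrative.
    """
    if table_entries <= 0:
        raise ValueError("table_entries must be positive")
    size = table_entries
    while active_flows / size > target:
        size *= 2
    return size


# Example: a host running close to a 500K-entry table.
print(flow_table_utilisation(450_000, 500_000))
print(next_table_size(450_000, 500_000))
```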
Problems
➢ Rebuild instance - the interface was deleted after the VM was
respawned
➢ Lack of support for Ironic - we will build a region for Ironic
➢ Disabled tenant - not able to log in to the Contrail UI (Keystone
2.0)
➢ Tuning configuration files required
➢ Metadata packets not sent in one session
➢ RBAC for the Contrail UI
Environment expansion and further plans
➢ 1350 VMs on 150 HV in one DC at this moment
➢ Second region on its way
➢ 250-300 HV per region
➢ Migration from Essex and Havana
➢ OpenStack and Contrail upgrades
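The figures above imply an average density of 1350 / 150 = 9 VMs per HV. A small sketch of what the 250-300 HV target would mean if that density stayed the same; constant density is an assumption made here for illustration, not a plan stated in the slides.

```python
def projected_vms(current_vms: int, current_hvs: int, target_hvs: int) -> int:
    """Project the VM count at target_hvs, assuming constant VMs per HV."""
    density = current_vms / current_hvs
    return round(density * target_hvs)


print(1350 / 150)                     # average VMs per HV today
for target in (250, 300):
    print(target, projected_vms(1350, 150, target))
```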
Q/A?
Thank you!
Check us: allegrotech.io
Join us: kariera.allegro.pl
Twitter: allegrotechblog
e-commerce full of technology
