Q2 MeetUp
May 31st 2017
Agenda
17:30-18:15 - Check-in, food, drinks and networking
18:15-18:30 - Intro, Summit Recap and Reminders
18:30-19:10 - Operational War Stories
19:10-19:25 - Break
19:25-20:05 - Let’s Talk OpenContrail
Let’s Say Thanks To Our Sponsors
Introductions
Stacy Véronneau - EGO Slide
● Director of OpenStack Solutions and Lead OpenStack
Architect at CloudOps.
● Using public cloud resources since 2007
○ When AWS only had 3 options :)
● Started ‘exploring’ OpenStack at Folsom
● OpenStack MeetUp organizer
○ Montreal, Ottawa, Edmonton and Toronto (Co-Org)
● Speaker and Mentor at OpenStack Summit
○ Austin, Barcelona, Boston
● Lazy Man repo
○ github.com/sveronneau
A word from the OpenStack Ottawa UG Lead
Noura Daadaa
Solutions Architect at Nokia
OpenStack Day Canada
Noura Daadaa
Introduction to OpenStack User Groups
Canada - Ottawa
•OpenStack Day Canada's objectives are to educate and raise
interest in OpenStack, bring new members and individuals into the
OpenStack community, and support the OpenStack and cloud community in
our country, in our community and globally.
•OpenStack Canada – Ottawa User Group: created in October 2016
Save the Date!
October 2017
https://groups.openstack.org/groups/canada-ottawa
Audience :
Cloud Architects
Engineers
Cloud Application Developer
IT Managers
OpenStack User / Operator
SysAdmin
Upstream Developer, etc.
Around 250-300 Attendees
OpenStack Day Canada – Ottawa: Led by the OpenStack Ottawa User Group
OpenStack Day Canada – Ottawa 2017
OpenStack Day Canada – Ottawa Organizing Committee
Noura Daadaa : OpenStack User Group Ottawa Leader – Technical Organization & Steering Committee
Kevin Gray : Technical Organization & Steering Committee
Jason Sones : Technical Organization & Steering Committee
Adam Nadeau : Program Committee
Tom Brewer : Program Committee
For information please contact Noura Daadaa:
openstackottawa@gmail.com
n.daadaa.openstack@gmail.com
https://openstackid.org/accounts/user/login
Sign Up Today! Keep up to date!
Now Let’s Talk OpenStack!
Summit Recap and Reminders
Stacy Véronneau - CloudOps
Summit Recap
● 5000+ Attendees
● 1014 Companies Represented
● 63 Countries Represented
● 750+ Sessions
● Average Day 1 Keynotes
○ Except the following:
■ U.S. Army Cyber School
● Going from 3 borrowed servers connected to users via CAT6
cables running through a drop ceiling to a 2000-core cluster
backed by a 4PB Ceph array that is 100% code-driven.
Summit Recap
● SuperUser award goes to
○ Paddy Power Betfair (Online gambling)
■ Built a true DevOps and continuous delivery model for developers using OpenStack
as the middleware. They’ve also migrated 25 percent of production applications onto
OpenStack in a year, for over 100 applications total. They grew from 500
deployments a week to over 1,000 deployments a day using the OpenStack APIs to
speed up time-to-market.
(https://betsandbits.com/2016/10/25/openstack-reference-architecture/)
○ UKCloud (Government)
■ The leading infrastructure-as-a-service (IaaS) provider to the United Kingdom
public sector with a 38 percent share on G-Cloud, the framework set up by the UK
government for IT procurement. From online tax returns with HMRC, complex data
analytics at Genomics England and integrated vehicle and driver records at the
Driver and Vehicle Licensing Agency (all hosted on UKCloud), pooling central
government resources and moving them to the cloud has resulted in £600 million in
savings, leading to the UK being recognized by the United Nations as the most
digitally advanced government in the world.
Summit Recap
● Day 2 Keynotes and demos saved the day!
○ Great demos (some working, some not)
○ Interop challenge
■ A big ‘Woot!’ to MO from Vexxhost
○ Edward Snowden live
■ Privacy in an always connected world
Summit Recap
● Day 2 is still a challenge
for many OpenStack
deployments.
● Almost more monitoring
vendors than storage ones at
this summit.
Summit Recap
● What application tools run on your OpenStack???
○ K8s-45% ; OpenShift-18% ; CloudFoundry-18% ; Built Our Own-17% ;
Mesos-14% ; Docker Swarm-14% ; Other-17%
● PCaaS is getting more and more traction
○ Hosted private or remotely managed private clouds
● All the cool videos from the Summit can be found at:
○ https://www.openstack.org/videos/summits/boston-2017
Summit Recap
● OpenStack Canada
Slack Channel
Gathering and Party
Crashing :)
Summit Recap
● Best T-Shirt???
Reminders
● Next MeetUp will be in
September
○ Paul Belanger will talk CI/CD and
the life of an upstreamer
○ But we need more!
● Submit your talk proposal via
MeetUp page
Reminders
● Link to today’s presentations will be
shared on the Meetup page.
● Join your fellow Stackers on
○ http://openstack-canada-slack-invite.herokuapp.com
○ Pages section on the MeetUp site
New tools for the community
● GitHub Repo
○ https://github.com/openstack-canada-repo
● Etherpad
○ https://etherpad.openstack.org/p/openstack-canada
Stacker Talk
Operational War Stories
Mohammed Naser - VEXXHOST
● Mohammed Naser
● Deploying and contributing since 2011
● CEO @ VEXXHOST, Inc.
● OpenStack Corporate Sponsor & Infrastructure Donor
● Architected many cloud deployments, migrations
● Find me on Twitter: @_mnaser
● Always accessible on Freenode IRC: mnaser (or the Canadian users Slack!)
So, who are you again?
No clouds were harmed in the
production of this presentation
Upgrades
● Production installs absolutely need 3 controllers
● Load balancers are your life line
● Automation is an absolute must (I personally recommend Ansible)!
● A single controller should have enough capacity to run a given OpenStack service on its own
● The magic mix:
○ Backups, backups, backups!
○ Shut down a specific OpenStack service on all nodes except one
○ Upgrade the service on that single node with automation (you tested this before, right?)
■ If the upgrade succeeds, complete the process on the other controllers and re-add them to the load balancers
■ If the upgrade fails, collect all logs, shut down the upgraded controller, restore the database, start up the old
controllers, and leave the upgraded controller in place for further inspection
○ Go out and celebrate, only to be paged in that something stopped working.
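The "magic mix" above can be sketched as a small planning function. This is only an illustration of the decision flow, not the tooling used at VEXXHOST: the function name, the controller names and the step wording are all hypothetical, and a real run would drive these steps through your automation (Ansible or otherwise).

```python
from typing import List


def plan_upgrade(controllers: List[str], canary_ok: bool) -> List[str]:
    """Return the ordered steps for a one-node-first service upgrade.

    `controllers` and the step strings are hypothetical placeholders;
    `canary_ok` stands in for "did the upgrade on the first node work?".
    """
    if len(controllers) < 2:
        raise ValueError("need a canary plus at least one fallback controller")
    canary, others = controllers[0], controllers[1:]

    # Common prefix: back up, stop the service everywhere but one node,
    # then upgrade that single node.
    steps = ["back up the database"]
    steps += [f"stop service on {host}" for host in others]
    steps.append(f"upgrade service on {canary}")

    if canary_ok:
        # Success path: finish the remaining controllers one by one.
        for host in others:
            steps.append(f"upgrade service on {host}")
            steps.append(f"re-add {host} to load balancer")
    else:
        # Failure path: roll back, but keep the broken node for inspection.
        steps.append(f"collect logs from {canary}")
        steps.append(f"shut down {canary}")
        steps.append("restore the database")
        steps += [f"start old service on {host}" for host in others]
        steps.append(f"keep {canary} for inspection")
    return steps


if __name__ == "__main__":
    for step in plan_upgrade(["ctl1", "ctl2", "ctl3"], canary_ok=False):
        print(step)
```

Keeping the plan as plain data makes it easy to review the rollback path before the upgrade window, which is the part people tend to skip.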
Testing & Monitoring
● Upgrade went great, or so you thought.
● OpenStack is a huge system; a simple smoke test of “hey I can create a VM” will likely not cover
the most common use cases (or it could in a private cloud, but the context of a public cloud differs).
● Take advantage of Tempest
● Tempest is the most advanced set of OpenStack tests covering many different tools. Run that
before and after upgrades and check on any failures.
● Take it a step further and run it on a nightly basis to see what sort of things might have changed
in your environment
● If you really have some spare time while running OpenStack (maybe you should speak about how you
do that next time?), run Rally against your cloud to benchmark it or run smoke tests.
● Now, you think you can get some peace as everything is monitored, but management wants
centralized storage...
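Comparing a Tempest run from before an upgrade with one from after it can be as simple as the sketch below. It assumes the results have already been reduced to a mapping of test name to status ("pass", "fail" or "skip"); that mapping and the test names are illustrative assumptions, not Tempest's own output format.

```python
from typing import Dict, List


def regressions(before: Dict[str, str], after: Dict[str, str]) -> List[str]:
    """List tests that passed before the upgrade and fail after it.

    Both arguments map a test name to "pass", "fail" or "skip".
    Tests that were already failing, or that are new, are not reported.
    """
    return sorted(
        name
        for name, status in after.items()
        if status == "fail" and before.get(name) == "pass"
    )


if __name__ == "__main__":
    # Hypothetical test names, for illustration only.
    before = {"boot_server": "pass", "attach_volume": "pass", "resize": "fail"}
    after = {"boot_server": "pass", "attach_volume": "fail", "resize": "fail"}
    print(regressions(before, after))
```

The same function works for the nightly runs: feed it last night's results and tonight's to see what changed in the environment.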
Centralized Storage
● There are many great open source centralized storage systems, I ♥ Ceph
● You might have the luxury of redeploying your cloud. We can’t really do that.
● With local storage, instances are stored as qcow2 files in /var/lib/nova/instances/<uuid>/..
● With Ceph, instances are stored in <pool_name>/<uuid>_disk
● Quick guide to changing the entire architecture of your OpenStack cloud (bonne chance!):
○ Deploy a Ceph cluster to host all your infrastructure
○ Configure new compute nodes which use that Ceph cluster for storage
○ Disable all the compute nodes not using Ceph
○ Convert all your images from QCOW2 to RAW in Glance
○ WARNING: close your eyes and breathe heavily now
■ Shut down the VMs, use qemu-img to convert the qcow2 disks directly into Ceph under the correct
name, edit the Nova database to point each instance to a new compute node that fits, then start the VM again.
● Now you have centralized storage, but you also have a bunch of machines with unused drives.
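The disk conversion step can be sketched as follows. The helper builds the Ceph image name from the instance UUID, following the <pool_name>/<uuid>_disk layout above, and assembles a qemu-img convert command that writes straight to an rbd: target. The pool name "vms", the file name "disk" and the UUID in the example are assumptions for illustration; the command is only printed, never executed, and the Nova database edit is deliberately left out.

```python
from pathlib import PurePosixPath
from typing import List


def rbd_target(disk_path: str, pool: str) -> str:
    """Map a local instance disk to its Ceph image, <pool>/<uuid>_disk.

    The UUID is taken from the directory that contains the disk file,
    i.e. /var/lib/nova/instances/<uuid>/<file>.
    """
    uuid = PurePosixPath(disk_path).parent.name
    return f"rbd:{pool}/{uuid}_disk"


def convert_command(disk_path: str, pool: str) -> List[str]:
    """Build (but do not run) a qcow2-to-raw conversion into Ceph."""
    return [
        "qemu-img", "convert",
        "-f", "qcow2",   # source format on local storage
        "-O", "raw",     # RAW is what gets stored in Ceph
        disk_path,
        rbd_target(disk_path, pool),
    ]


if __name__ == "__main__":
    # Hypothetical UUID, file name and pool name.
    path = "/var/lib/nova/instances/3f2a9c1e-0000-4000-8000-123456789abc/disk"
    print(" ".join(convert_command(path, "vms")))
```

Printing the command for every instance first gives you a list to review before any VM is shut down.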
Hyperconverged infrastructure
● It is possible! We run it on our public cloud and all of our customer private clouds.
● Hyperconverged Ceph + Compute for OpenStack
● Few things to keep in mind:
○ We use SSD storage only; we don’t think this would make sense for non-SSD storage.
○ SSD specific note: Don’t cheap out on drives. Seriously. Intel SSDs work best in our experience.
○ Keep your OSD per machine count small. Don’t run 24 OSDs and 400 VMs. Be a bit realistic.
○ CPU pinning sounds good on paper, until your cluster sustains high load: the pinned CPUs are waiting for IO
and the OSD can’t do anything.
○ Use `cgroups` instead to give OSDs fully dedicated cores on the machine.
● This architecture minimizes the hardware you have to set up and makes the cloud a lot easier to
scale.
● Everything is going great, except your Linux distro is being a PITA?
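One way to reason about dedicating cores to OSDs is to split the machine's core list up front. The sketch below is a simplification under stated assumptions: it hands the lowest-numbered cores to the OSDs and the rest to VMs, ignores NUMA layout and hyperthread siblings, and the one-core-per-OSD default is an arbitrary placeholder rather than a recommendation from the talk. The resulting strings are in the comma-separated list form accepted by a cpuset cgroup's cpuset.cpus file.

```python
from typing import Dict, List


def split_cores(total_cores: int, osd_count: int,
                cores_per_osd: int = 1) -> Dict[str, List[int]]:
    """Reserve cores for Ceph OSDs and leave the remainder for VMs."""
    reserved = osd_count * cores_per_osd
    if reserved >= total_cores:
        raise ValueError("OSDs would leave no cores for VMs")
    cores = list(range(total_cores))
    return {"osd": cores[:reserved], "vm": cores[reserved:]}


def cpuset(cores: List[int]) -> str:
    """Render a core list in cpuset.cpus list format, e.g. "0,1,2"."""
    return ",".join(str(core) for core in cores)


if __name__ == "__main__":
    layout = split_cores(total_cores=16, osd_count=4)
    print("osd:", cpuset(layout["osd"]))
    print("vm: ", cpuset(layout["vm"]))
```

The ValueError is the "be a bit realistic" check: if the OSDs alone would eat every core, the node is oversized on storage for a hyperconverged role.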
Switching operating systems
● We originally built our cloud off of the Ubuntu Cloud Archives
● David Moreau Simard presented RDO which is (IMHO) the best way to package OpenStack
● Exciting. We can fix bugs now!
● So now you’ve got a hyperconverged infrastructure with open source technology and live migration?
○ Deploy a few CentOS based nodes using RDO packaging with Ceph installed (side note: use the SIG pkgs!)
○ The RPC layer of OpenStack communicates with no problems; it doesn’t care about the operating system.
○ Run a live migration and VMs will be populated on the new operating system and everything will be great.
○ Just kidding, you didn’t think it was that easy?
■ It’s a mess. AppArmor is the security model on Ubuntu and SELinux is the model on CentOS. It took many
patches and some questionable decisions to make live migrations succeed! I’d recommend cold
migrations instead (but downtime, sigh!)
● Now you want to migrate your control plane?
Migrating OpenStack control plane
● Many reasons you might want to do this: change of deployment tool, upgrades, hardware replacement
● For most of the OpenStack services which are stateless, you just run the binaries and update your
load balancers
● The “fun” part comes with the underlying stateful infrastructure of OpenStack such as RabbitMQ
and Galera
● Fun stories (aka: “that one time at band camp”):
○ That time some clients’ interface order was reversed after we migrated our Galera cluster
○ Orchestrating a RabbitMQ cluster cutover with the least downtime possible (chicken and egg problem)
○ RabbitMQ meltdown featuring Neutron self-destruction
● Closing thought: Remove your old services. Seriously. A customer once started up a
decommissioned controller node, so Nova compute and Neutron agents were being fed different
information from two different databases, with a race condition.
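A quick audit for leftover services after a control plane migration might look like the sketch below. The record shape (a dict with "binary" and "host" keys) and the host names are hypothetical; in practice the input would come from whatever service listing your deployment exposes.

```python
from typing import Dict, Iterable, List


def stale_services(services: Iterable[Dict[str, str]],
                   active_hosts: Iterable[str]) -> List[Dict[str, str]]:
    """Return service records whose host is no longer part of the cloud."""
    active = set(active_hosts)
    return [svc for svc in services if svc["host"] not in active]


if __name__ == "__main__":
    # Hypothetical records: old-ctl1 was decommissioned but never cleaned up.
    registered = [
        {"binary": "nova-scheduler", "host": "ctl1"},
        {"binary": "nova-conductor", "host": "old-ctl1"},
        {"binary": "neutron-server", "host": "ctl2"},
    ]
    for svc in stale_services(registered, active_hosts=["ctl1", "ctl2"]):
        print(f"remove {svc['binary']} on {svc['host']}")
```

Anything this reports should be removed from the service registry and, more importantly, disabled on the old host so it cannot come back after a reboot.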
Everything works in
OpenStack, except
for when it doesn’t.
Thanks!
Reach out:
Email: mnaser@vexxhost.com
IRC: mnaser @ freenode
Twitter: @_mnaser
LinkedIn: linkedin.com/in/mdnaser
Break Time!
Stacker Talk
Let’s Talk OpenContrail!
Stuart Mackie - Juniper Networks
OpenContrail
Presentation from Stuart Mackie
● https://www.slideshare.net/StacyVronneau/openstack-meetup-opencontrail-presentation
Thanks Everyone! --- See you in Q3

OpenStack Ottawa Q2 MeetUp - May 31st 2017
