Attilla de Groot | Sr. Systems Engineer
Cumulus Networks
Network change before beer
CONFIDENTIAL © 2019 Cumulus Networks
What problem are we solving?
• Human error
• Software bugs (a.k.a. “features”)
• Slow change cycle
• Resource issues
• Scalability constraints
What is CI/CD?
Continuous Integration (CI)
A system where all changes are
automatically tested before being
pushed to production or seen by others
Continuous Deployment (CD)
Built on a CI system where changes
are automatically pushed to production
after tests pass, often multiple times per
day
Why aren’t you doing this?
Not for everyone
Automated testing
Validation
Linting tests
• Code validation
• Test YAML / Jinja2
• Enforce a style
• Easy troubleshooting
Unit and System tests
• Individual tests
• Is the intended configuration
deployed?
• Unit testing alone is not useful
• Combine unit and system tests
• Is my configured BGP session up?
• Am I receiving routes?
AND
• Is my service redundant?
• Does my application still work?
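A linting stage like the one described above can start very small. The sketch below is a stand-in for real linters such as yamllint or ansible-lint (the deck does not name specific tools, so treat the rules and file names as assumptions): it writes an example YAML fragment and enforces two trivial style rules with plain POSIX tools.

```shell
#!/bin/sh
# Stand-in for a real linter (yamllint / ansible-lint would normally do this).
# The YAML fragment below is an invented example, not from the deck.
cat > /tmp/demo.yml <<'EOF'
interfaces:
  - name: swp1
    mtu: 9216
EOF

fail=0
grep -q "$(printf '\t')" /tmp/demo.yml && fail=1   # style rule: no hard tabs
grep -q ' $' /tmp/demo.yml && fail=1               # style rule: no trailing spaces
echo "lint_result=$fail"
```

In a real pipeline this step runs before any deployment, so a style violation fails the build before it can touch a device.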
Cumulus Linux Network Operating System
CLI
Cumulus Linux is Linux
STP +
MLAG
OSPF +
BGP
SNMP +
Telemetry
Ansible +
Salt
VMs +
Containers
Cumulus Linux Network Operating System
CLI
Cumulus VX: Modern Network Simulation
STP +
MLAG
OSPF +
BGP
SNMP +
Telemetry
Ansible +
Salt
VMs +
Containers
• Fully featured, no hardware limitations
• Layer 2, Multicast, VXLAN, ACLs
• Scale to 128+ ports per VM
• Low 768 MB memory footprint
• Fast boot (1–2 minutes average)
• Rename interfaces to match production
• Easily build a 1:1 production replica
• Plan, test, train and troubleshoot
ESX / VirtualBox / KVM
Cumulus VX
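Under the hypervisors listed above, a single-switch simulation can be brought up with Vagrant. A minimal command sketch (the box name `CumulusCommunity/cumulus-vx` is the publicly published Vagrant box; the provider choice and the NCLU command shown are assumptions, so adapt to your environment):

```shell
# Pull the public Cumulus VX box and boot it; VirtualBox and libvirt/KVM
# are both listed on the slide as supported providers.
vagrant init CumulusCommunity/cumulus-vx
vagrant up --provider=virtualbox
vagrant ssh -c "net show interface"   # NCLU command on Cumulus Linux
```

These are interactive commands rather than a runnable script; in a CI job they typically appear in `before_script` (vagrant up) and `after_script` (vagrant destroy), as the pipeline slide later shows.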
Automated testing
Testing infrastructure
Traditional networking
•Testing environment
•Physical lab
•Virtualization environment
•DevOps integration
•Proprietary modules
•Vendor tools
Cumulus Linux
•Testing environment
•Physical lab
•Cumulus VX
•DevOps integration
•Native modules
•Vagrant, VirtualBox, Libvirt / KVM
Infrastructure as Code
Implementing CI/CD
•Homegrown
•Python / Vagrant
•Common tools
•Jenkins
•Travis CI
•Bamboo
•GitLab
GitLab
Infrastructure validation
Tools
•Homegrown
•Python
•Behave
•Ansible
•Prometheus / Zabbix / xxx
•Vendor tools
•StackStorm
•Batfish
•Veriflow
•Cumulus NetQ
End-to-end visibility with Cumulus NetQ
Cumulus NetQ is a Linux streaming telemetry agent and NoSQL database
NetQ Server
Records and streams Linux kernel events (+ more)
Supported on Cumulus Linux, Ubuntu, Debian, RHEL and CentOS
Data, events, validation across compute + network
Cumulus NetQ: Fabric API
One database,
many interfaces
CLI
REST API
GUI
Infrastructure as Code
CI/CD Pipeline
Infrastructure 1.0 → Git change push → Automated testing (build tool pipeline)
• Successful validation → Infrastructure 1.1
• Validation failed → Infrastructure 1.0 (unchanged)
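The branch point in the pipeline flow above reduces to a few lines of shell. A toy sketch (`run_checks` is a stand-in for the real test suite, and the version strings are just the labels from the diagram):

```shell
#!/bin/sh
# Toy model of the pipeline flow: 1.1 only goes live if validation passes.
DEPLOYED="1.0"
CANDIDATE="1.1"

run_checks() { return 1; }   # stand-in test suite; pretend validation failed

if run_checks; then
  DEPLOYED="$CANDIDATE"      # successful validation: 1.1 goes live
fi                           # validation failed: infrastructure stays at 1.0
echo "deployed=$DEPLOYED"
```

The point of the diagram is exactly this property: a failed validation leaves the running infrastructure untouched rather than half-deployed.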
Infrastructure as Code
GitLab Pipeline
•Stages with before/after scripts
•Stages run in order on success
•*_script runs at each stage
•script defines the testing steps
•Git repository is cloned on build
•Each script step runs in the environment
•If a step fails, validation fails
•Simple bash scripts can be added
•before_script / after_script for setup
•vagrant up / vagrant destroy
stages:
  - staging
  - production

staging:
  tags:
    - staging
  before_script:
    - cd automation
  stage: staging
  script:
    - 'ansible-playbook deploy.yml'
    - sleep 25
    - netq check bgp
    - netq check mtu
    - netq check vxlan

production:
  tags:
    - production
  before_script:
    - cd automation
  stage: production
  when: manual
  script:
    - 'ansible-playbook deploy.yml'
    - sleep 10
    - netq check bgp
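The `tags:` keys in this pipeline only take effect if runners carrying those tags exist. A sketch of registering one (the URL and token are placeholders; the flags are standard `gitlab-runner` options, but the executor choice is an assumption):

```shell
# Register a runner that will pick up the "staging" jobs from the pipeline.
# URL and registration token are placeholders, not real values.
gitlab-runner register \
  --non-interactive \
  --url https://gitlab.example.com/ \
  --registration-token "<TOKEN>" \
  --executor shell \
  --tag-list staging
```

A second runner registered with `--tag-list production` would serve the manual production stage, keeping staging and production environments on separate machines.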
Future work
More Infra as code
More possibilities
• Single source of truth
• RBAC with CI/CD
• More git magic
• Pre- and post-checks
CI/CD Demo
Thank you!
VISIT US AT
cumulusnetworks.com
FOLLOW US
@cumulusnetworks

PLNOG23 - Attilla De Groot - Network change before beer


Editor's Notes

  • #3 The way that networking has always worked is that the hardware and software are welded together. There are no options to pick what is best in class. No way to run JunOS on Nexus hardware. No way to run Arista EOS on Juniper QFX switches. The idea of open networking is to break these components apart and provide best-in-class products at each layer of the stack: hardware, software and applications. Cumulus Linux provides a best-in-class network operating system with full routing and switching features, and Cumulus NetQ provides next-generation streaming telemetry for all Linux systems, including Cumulus Linux.
  • #6 Diving deeper into the architecture of Cumulus Linux, we start with the switch hardware: the CPU, memory, fans and, most importantly, the switch silicon, or ASIC. This ASIC is what gives a switch its line-rate performance, so it can pass 32 ports of 100G or 48 ports of 25G without breaking a sweat. On top of this hardware sits Cumulus Linux. We speak directly to the hardware. We spin the fans, flash the lights and program the forwarding information into the switch silicon. In order to have this level of control over the hardware we have incredibly tight relationships with all of our hardware vendors. We write device drivers to program each component of the switch. On top of Cumulus Linux are a number of applications that perform different functions, like a full CLI, Layer 2 and Layer 3 connectivity, and automation engines like Ansible or Salt. Each of these applications only has to talk to Linux; it doesn’t need any knowledge of the underlying switch hardware. If an application can put configuration or routing information into the Linux kernel, we handle programming it into the switch silicon. You can take something off the shelf like Docker Engine or KVM for virtual machines and run them directly on Cumulus Linux, just like a server. You can use the system as a traditional network device, with a CLI and SNMP, or explore new operational workflows like streaming telemetry or automation. Cumulus Linux provides the power and flexibility for as much or as little change as you’d like.
  • #7 Since Cumulus VX is just another Linux distro running on standard hypervisors, there are almost no limits. We have a small memory footprint of only 768 MB per switch; combined with the ability to scale to 128+ interfaces, you can replicate the exact devices deployed in your datacenter. Each of these VMs boots quickly and you can rename the interfaces to match exactly what’s in production. With legacy vendor VMs, if you provision 4 interfaces you are given Ethernet 1, 2, 3 and 4, but this isn’t how you cabled your environment. You may have two 10G server ports and two 100G uplinks. Cumulus Linux allows you to rename the interfaces to match exactly what you have in production. This means you can build and test production configurations against the virtual environment without the need to translate between lab and production configs. When it’s all put together, for the first time in the industry, you can make an exact simulation of your production environment for testing, change management, training or any other use case. We even have customers, some with the assistance of our consulting services, doing automated testing as part of their change management system, sometimes called Network CI/CD (Continuous Integration/Continuous Delivery).
  • #11 The way NetQ works is a central server, either on prem or in the cloud, runs a modern webscale timeseries database
  • #12 As mentioned there is a GUI, CLI and API interface into NetQ. However you wish to consume the data from NetQ we have an option for you.
  • #15 If there is the requirement for inter-tenant traffic … through router-on-a-stick or firewall. A “service” tenant for services like DNS or general tenant services. Not yet implemented in FRR. Micro-segmentation inside a tenant for application security. Kernel development for doing filtering with BPF on hosts. Using flowspec for distributing the ACLs.