Scaling an Extreme Temporary Event Network for Burning Man


  1. Burning Man: Scaling an extreme temporary event network. Matt Peterson, matt@burningman.com
  2. Black Rock City, NV (photo credits: Terry Ratcliff; Reuters/Jim Urquhart; duncan.co)
  3. 2014 Burning Man IP Network (network diagram; matt@burningman.com, 2014-10-02 18:10)
     • Team I.T.S. is Backbone, Camera Girl, Cat, Domo, Huckleberry, Little Meat, MattStep, Mushroom, Prof. Fox, PornStar, Ralf, Reset, Sawdust, Spank Me, Taz, Whiskey Devil, and Wild Card
     • Legend: Radio Frequency, UTP Ethernet, Licensed, Unlicensed, OSPF weight, Transit
     • Other labels: High Desert Internet, Poito, Eagle Ridge
     • OSPF link costs: Primary 100, Secondary 150, Tertiary 400
     • Gateway loopbacks (lo0): gw-noc 162.212.145.252, gw-depot 162.212.145.254, gw-light 162.212.145.248, gw-quad4 162.212.145.249, gw-box 162.212.145.250, gw-power 162.212.145.247, gw-heavy 162.212.145.251; gw-esd also appears
     • Addressing: backbone radios, stations, and switches carry 162.212.144.x through 162.212.146.x addresses; local access bridges carry 100.96.x.x addresses (for example apb-noc-local br0 100.96.94.2 alongside v94 NOC LAN, apb-heavy-local br0 100.96.96.2 alongside v96 Heavy Camp)
     • Radio links are labeled with channels and frequencies from 2.412GHz to 19.325GHz
     • VLAN labels: v20 ESD Comm., v91 Depot, Logistics, v94 NOC LAN, v96 Heavy Camp, v100 Accounting, v103 Artery, v104 BMIR, v105 Box Office, v107 Commissary Office, v108 DMV, v109 HGH "Rampart", v111 First Camp, v111 Incident Comm. Post, v112 Heavy Office, v113 Laminates, v114 Media Mecca, v115 NOC Inside, v116 GPE, v118 Playa Info, v119 Power Camp, v120 Ranger HQ, v121 Webcast, v122 OC1/West Wing, v125 Big Office, v126 Human Resources, v127 Container Office, v197 Commissary Public, v210 Digerati & Devas, v215 GPE Camp, v221 Ticketfly, v224 Boob
  4. Where We Started
     • The network worked, but it wasn’t easy
       – Large L2 bridged architecture, minimal L3 segmentation, multiple NAT layers
       – Two distinct “business units”
       – Manual configuration, “tribal knowledge”
       – Numerous single points of failure
  5. Where We Went
     • Needed to operate as a unified team
       – Consistent support experience, improved RF spectral efficiency, coordinated IP allocations
     • Standardized COTS equipment
       – “CCIE off the street” factor, escalation path
     • Standardized service offerings
       – Org department handoffs always wired gigE; as aggregated “islands” or single demarc
       – Participant camps supply very prescriptive equipment, “self-install” provisioning
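One way to picture "coordinated IP allocations" is a rule that derives a subnet from a VLAN number. Several labels on the slide 3 diagram are consistent with such a rule: v94 NOC LAN sits next to apb-noc-local at 100.96.94.2, v96 Heavy Camp next to apb-heavy-local at 100.96.96.2, and v115 NOC Inside next to apb-noc1-inside at 100.96.115.2. The sketch below is an illustration only: the function name `subnet_for_vlan` and the /24 prefix length are assumptions, and not every label on the diagram follows the pattern.

```python
import ipaddress

# 100.64.0.0/10 is the shared address space reserved by RFC 6598.
SHARED_SPACE = ipaddress.ip_network("100.64.0.0/10")


def subnet_for_vlan(vlan_id: int) -> ipaddress.IPv4Network:
    """Hypothetical rule: VLAN N maps to 100.96.N.0/24 (prefix length assumed)."""
    if not 1 <= vlan_id <= 255:
        raise ValueError(f"VLAN {vlan_id} does not fit in a single octet")
    return ipaddress.ip_network(f"100.96.{vlan_id}.0/24")


# (VLAN, bridge address) pairs read off the diagram labels.
observed = {
    94: "100.96.94.2",    # v94 NOC LAN, apb-noc-local
    96: "100.96.96.2",    # v96 Heavy Camp, apb-heavy-local
    115: "100.96.115.2",  # v115 NOC Inside, apb-noc1-inside
}

for vlan, addr in observed.items():
    net = subnet_for_vlan(vlan)
    # Each observed address should fall inside the derived subnet,
    # and each derived subnet should stay inside the shared space.
    assert ipaddress.ip_address(addr) in net
    assert net.subnet_of(SHARED_SPACE)
    print(f"v{vlan}: {net} contains {addr}")
```

A rule like this lets anyone on the team predict a subnet from a VLAN number without consulting a spreadsheet, which is the kind of coordination the slide describes.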
  6. Where We Went
     • Route, always
       – No L2 segments past a single device
       – OSPF everywhere, core backbone & “islands”
       – Segment where possible, even over WiFi
     • Automation
       – Initially covering all routers & switches
       – Target goal to cover any device with a config or supplemental service (DNS, monitoring)
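OSPF prefers the path with the lowest total cost, which is why the diagram labels links as Primary (cost 100), Secondary (cost 150), and Tertiary (cost 400). The sketch below runs Dijkstra's algorithm, the shortest-path computation OSPF is built on, over a hypothetical three-node triangle. The node name "tower" and the assignment of each cost to a particular link are assumptions made for illustration, not a reading of the real topology.

```python
import heapq


def shortest_path(links, src, dst):
    """Dijkstra over undirected weighted links; returns (cost, path) or None."""
    graph = {}
    for (a, b), cost in links.items():
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))

    queue = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, link_cost in graph.get(node, []):
            if neighbor not in visited:
                heapq.heappush(queue, (cost + link_cost, neighbor, path + [neighbor]))
    return None  # destination unreachable


# Hypothetical assignment of the three labeled costs to three links.
links = {
    ("tower", "gw-noc"): 100,     # primary
    ("tower", "gw-depot"): 150,   # secondary
    ("gw-noc", "gw-depot"): 400,  # tertiary
}

print(shortest_path(links, "gw-noc", "tower"))

# Simulate losing the primary link: traffic falls back to the longer path.
degraded = {k: v for k, v in links.items() if k != ("tower", "gw-noc")}
print(shortest_path(degraded, "gw-noc", "tower"))
```

With all links up, gw-noc reaches the tower directly at cost 100. With the primary link removed, the only remaining route goes through gw-depot at cost 400 + 150 = 550, with no manual intervention needed.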
  7. Automation!
     • Held bakeoff (mid 2011 evaluation)
       – Homegrown YAML config templates
       – Prototyped NCG (see NANOG49 Tutorial “Automating Network Configuration”)
     • NCG won (3 yrs ago)
       – Open source, vendor agnostic
       – Initial steep curve, very easy to embrace
       – Principal developer already a team member
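The general idea behind template-driven configuration, whichever tool wins the bakeoff, is to keep per-device data separate from a shared template and render one from the other. The sketch below uses only Python's standard `string.Template`. It does not show NCG's actual input format or the team's homegrown templates, and the output lines are a made-up generic syntax rather than any vendor's CLI. The hostnames and loopback addresses are taken from the slide 3 diagram.

```python
from string import Template

# Made-up generic config syntax; not any real vendor's CLI.
DEVICE_TEMPLATE = Template(
    "hostname $hostname\n"
    "interface lo0\n"
    "  address $loopback\n"
    "router ospf\n"
    "  router-id $loopback\n"
)


def render_config(device: dict) -> str:
    """Render one device's config; raises KeyError if a field is missing."""
    return DEVICE_TEMPLATE.substitute(device)


# Per-device data: the only part that changes from one gateway to the next.
devices = [
    {"hostname": "gw-quad4", "loopback": "162.212.145.249"},
    {"hostname": "gw-box", "loopback": "162.212.145.250"},
]

for device in devices:
    print(render_config(device))
```

Using `substitute` rather than `safe_substitute` is deliberate: a device record with a missing field fails loudly at render time instead of producing a config with a stray placeholder in it.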
  8. Actual git example
  9. Summary Overview
     • {Automation} data modeling isn’t easy
       – Imagine all your inputs & outputs (device configs, DNS, monitoring, billing, etc.)
     • Single source of truth
       – Git, a wiki, fancy IPAM: choose what fits your organization’s workflow, stress level, & budget
     • Start at L8 and L1, meet in the middle
       – People + physical layer = organic processes
       – End goal is to be efficient, not become a SW dev
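The "single source of truth" point can be made concrete with one inventory that feeds several outputs. In the sketch below, a small list of devices produces both DNS records and a monitoring target list, so the two can never disagree. The device names and addresses come from the slide 3 diagram; the `monitor` flag, the function names, and the `example.com` zone are assumptions for illustration.

```python
# One inventory, several generated outputs.
INVENTORY = [
    {"name": "gw-noc", "address": "162.212.145.252", "monitor": True},
    {"name": "gw-depot", "address": "162.212.145.254", "monitor": True},
    {"name": "sw-depot", "address": "162.212.144.130", "monitor": False},
]


def dns_records(inventory, zone="example.com"):
    """Build one A record per device (zone name is a placeholder)."""
    return [f"{d['name']}.{zone}. IN A {d['address']}" for d in inventory]


def monitoring_targets(inventory):
    """List the names of devices flagged for monitoring, sorted for stable output."""
    return sorted(d["name"] for d in inventory if d["monitor"])


for record in dns_records(INVENTORY):
    print(record)
print(monitoring_targets(INVENTORY))
```

Adding a device means editing the inventory once; every downstream output picks it up on the next run. Whether that inventory lives in Git, a wiki, or an IPAM tool is the workflow choice the slide leaves open.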
  10. More... unconference? Matt Peterson, matt@burningman.com

Editor's Notes

  • Thank you for the opportunity this morning
    Originally just automation - background on design, technology
  • Burning Man is an event … held on federal land … northern Nevada, 2hrs north-east of Reno
    Called the playa or Black Rock City
    Just under 70k participants
    Described as social experiment, festival, party, … - it’s a city
    Leave no trace event
    Zero infrastructure before & after event
    ~3 weeks to build out & 3 weeks to tear down
    BM provides basic life safety (port-a-potties, medical care) along with “guard rails” (mutual aid – law enforcement, ice for purchase)
    Everything else – water, food, shelter, Bring Your Own
  • Every city has infrastructure
    This is what BM looks like to me
  • When I returned, took the approach of
    Toyota manufacturing, Six Sigma, consultant evaluation phase
    Team = sysadmin/helpdesk, not network engineers or architects
    Some level of routing, done with shell scripts that adjusted local routing tables
    Limited investment to add redundancy, power outages and physical stability
    For historical reasons, two different customer bases
    Departments pre-event, camps during = staff exhausted
  • Common team with common goal of an effective service, regardless of whether the end-user is a department OR camp
    COTS is an old term, “commercial off the shelf” – products that offer technical support, warranty, known best practices
    For switching - wanted fanless, active PoE, gigE
    Handoff isn’t known to the customer, be it wired or wireless backhaul
    A truck roll is incredibly expensive
  • Same L2 across playa, bring your own VPN or tunnel mechanism
    OSPF just works, most devices are L2 bridges – no need for exotic mesh networking
    Tired of sitting in a shipping container for a week, too much hands-on L1 work to be done
    Can’t afford to manually config
  • Consistent service, standardized equipment = made it easier to automate!
    You don’t have to buy all the same equipment, but in our circumstances, it helped
  • Only did switch + router configs, then added DNS, monitoring
    Added wireless equipment later
    Take baby steps
    NCG + static configs in Git very powerful, offline distributed database
    Change is difficult for everyone to handle, tackle people and physical layer first = automation becomes a natural extension
