4. Why should I listen to you?
Just a guy who’s been in the trenches a long time.
• Learned to code in C long ago. BSD kernel hacking, secure messaging, managed security appliances, nomadic file systems.
• >1000 wireless access points deployed to 14 cruise ships
• 6 Cisco core network replacements from Nortel Passport
• First live-voyage core network replacement (Diamond Princess)
• Built 22 broadband wireless towers (of 75)
• Regional Voice-over-IPX (DSP on OS/2 over Novell!)
5. Why HootSuite went physical
“unique” workload:
• 95% write
• 12TB dimension
• I/O bound
• Noisy neighbours
• Pre-PIOPS (AWS 100io/vol)
• Need >68GB
• No lock-in
6. What is “cloud”
Not a cloud definition slide!
• Just datacenter best practices from 1998 (infrastructures.org)
• Gold disk deploy - AMI
• Version Control - config mgmt
• Automate everything - APIs
"Cloud is like cutting your legs off at the knee - stop trying to walk somewhere, just clone a new server in place." – me
7. Compromising
Balancing best vs. budget
• We chose software routers. OpenBSD + OpenBGPD on Dell
• We chose Cisco core switching
• We chose software firewalls. OpenBSD + PF on Dell
• We chose CloudStack on VMware
• We chose SAN + iSCSI
8. Compromising
We chose software routers. OpenBSD + OpenBGPD on Dell
• OpenBSD is secure, OpenBGPD is stable
• Scales to 1.5-2 Gbps per host, depending on packet size
• Redundant pairs instead of internally redundant (live upgrades!)
• Ops team understands BSD tools
• Added support for Intel X520 (82599) 10GE NICs
• Much lower cost than hardware routers
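Why throughput depends on packet size: a minimal sketch, assuming a software router is limited mainly by packets per second rather than bits per second. The 165,000 pps budget and the packet sizes are hypothetical illustration values, not measurements from this deployment.

```python
def throughput_gbps(packets_per_second: float, packet_bytes: int) -> float:
    """Throughput in Gbps for a given packet rate and packet size."""
    return packets_per_second * packet_bytes * 8 / 1e9


# Hypothetical per-host forwarding budget (an assumption, not from the slides).
PPS_BUDGET = 165_000

if __name__ == "__main__":
    # Same packet rate, very different bit rate as packets get larger.
    for size in (64, 512, 1500):
        print(f"{size:>5} B packets: {throughput_gbps(PPS_BUDGET, size):.2f} Gbps")
```

With a fixed packet budget, only large packets get close to the upper end of the quoted range; small-packet traffic falls far below it.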
9. Compromising
We chose Cisco core switching
• Cisco is solid. Cisco engineers can be hired
• OSPF with millisecond timers = sub-second convergence
• Wanted 10Gig in the network core
• Needed minimal port count
• Ops team has Cisco experience.
10. Compromising
We chose software firewalls. OpenBSD + PF on Dell
• OpenBSD is secure, PF is stable
• Scales to 1-1.5 Gbps per host, depending on states/rules (~300k)
• CARP + pfsync is great! We run Active+Standby, alternating Masters.
• Redundant pairs instead of internally redundant (live upgrades!)
• Ops team understands BSD tools. Scripts sync security groups from AWS to PF tables.
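What a security-group-to-PF-table sync might look like: a minimal sketch, not the actual HootSuite scripts. It assumes the security groups have already been fetched from AWS into a plain dict (no AWS API call is made here), and the `sg_` table naming convention and the length cap are hypothetical. Only the `pfctl -t <table> -T replace` and `-T flush` invocations are real PF commands.

```python
import ipaddress
import re
from typing import Dict, List


def table_name(group_name: str) -> str:
    """Map a security group name to a PF table name (hypothetical convention)."""
    # Replace anything that is not alphanumeric or underscore; the length cap
    # is an assumption to keep names short.
    return "sg_" + re.sub(r"[^A-Za-z0-9_]", "_", group_name)[:28]


def pfctl_commands(groups: Dict[str, List[str]]) -> List[List[str]]:
    """Build one pfctl command per security group, ready for subprocess.run()."""
    commands = []
    for name, cidrs in sorted(groups.items()):
        # Validate and de-duplicate; ip_network raises ValueError on bad input.
        addrs = sorted({str(ipaddress.ip_network(c, strict=False)) for c in cidrs})
        if addrs:
            # Replace the table contents atomically with the new address list.
            commands.append(["pfctl", "-t", table_name(name), "-T", "replace", *addrs])
        else:
            # An empty group empties the table.
            commands.append(["pfctl", "-t", table_name(name), "-T", "flush"])
    return commands


if __name__ == "__main__":
    example = {"web-frontend": ["10.0.1.0/24", "192.0.2.7/32"], "empty": []}
    for cmd in pfctl_commands(example):
        print(" ".join(cmd))
```

Generating the commands as data keeps the sync easy to dry-run before anything touches a live firewall.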
11. Compromising
We chose CloudStack on VMware
• 2012: CloudStack more mature than OpenStack
• Wanted VMware hypervisor for core data services (MySQL, Mongo)
• We use vMotion + HA on core services
• Did not want vendor lock-in, layered CloudStack for future options
• Original plan was mixed VMware + XenServer, but small Ops team
12. Compromising
We chose SAN + iSCSI
• We chose iSCSI for flexibility:
• We need snapshots. Most backups are sync+snap
• We like live migration of virtual machines
• We tolerate latency penalty of SAN for snapshot flexibility
• We run RAID-6 (2 parity disks)
Tolerate 2 disk failures per slice before data loss
Painful on write – 5,000 writes → 30,000 reads + writes
Remote equipment – time to replacement is not instant
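The RAID-6 write arithmetic, as a minimal sketch. It assumes small random writes handled by read-modify-write: each front-end write costs 3 reads (old data, both parity blocks) plus 3 writes (new data, both parity blocks), giving the usual write penalty of 6. The function name is illustrative only.

```python
# RAID-6 small-write penalty: 3 reads + 3 writes per front-end write.
RAID6_READS_PER_WRITE = 3
RAID6_WRITES_PER_WRITE = 3
RAID6_WRITE_PENALTY = RAID6_READS_PER_WRITE + RAID6_WRITES_PER_WRITE


def backend_ios(front_end_writes: int, penalty: int = RAID6_WRITE_PENALTY) -> int:
    """Back-end disk operations generated by a number of front-end writes."""
    return front_end_writes * penalty


if __name__ == "__main__":
    print(backend_ios(5_000))  # 30000
```

On a 95% write workload, nearly every request pays this multiplier, which is why the slide calls it painful.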
13. SJC Stack – Core Network
BGP, OSPF, PF, on OpenBSD and Cisco
Routers, switches, firewalls