9. Atlassian Values
They guide what we do, why we create, and who we hire.
Open company, no bullshit
Build with heart & balance
Be the change you seek
Play, as a team
Don’t #@!% the customer
20. WHAT IS IN A NETWORK POP?
Internet
Networking
• Multiple Redundant 10 Gbps+ AWS Direct Connect
• 10 Gbps+ Public Connectivity (Tier 1) + CloudFront
• Akamai Prolexic or AWS DDoS protection
• Links to BU or Components
Global Edge
• Load Balancers
25. Before
Gigs of logs going into Splunk
Nagios/SNMP triggering PagerDuty
Loads of Custom Checks
Auto-discovery of links and devices
sFlow and NetFlow
MTR and hping
Grafana graphs (so many!)
Pingdom
... but missing actual end-to-end distributed visibility
26. Deployed Agents everywhere!
• Offices (double as cloud agents)
• On x86 in our cages / PoPs
• VPCs in AWS (including sometimes one per route table!)
• Using Puppet, Ansible, or as containers, depending on the environment
27. Monitoring all-the-things!
• Each other (agent-to-agent)
• Public and Private loopbacks
• Key Servers and Systems
• Anycast IPs (DNS)
• Lots of AWS endpoints

• IPsec tunnels
• MPLS Backbone
• Devices
• Offices
• Internet Links + Load Balancers
• VPCs and AWS indirectly
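
One way to read the agent-to-agent item on this slide is as a full mesh: every agent tests every other agent. The sketch below only enumerates the pairs such a mesh would need. The function name and the agent labels are hypothetical, and it assumes one bidirectional test per pair, which the slides do not confirm.

```python
from itertools import combinations
from typing import List, Tuple


def mesh_pairs(agents: List[str]) -> List[Tuple[str, str]]:
    """Return every unordered pair of distinct agents (a full mesh).

    Assumption: one bidirectional test per pair is enough, so (a, b)
    and (b, a) are treated as the same test.
    """
    # De-duplicate and sort so the output is stable between runs.
    unique_agents = sorted(set(agents))
    return list(combinations(unique_agents, 2))


if __name__ == "__main__":
    # Hypothetical agent names, for illustration only.
    for source, target in mesh_pairs(["syd-office", "sfo-pop", "aws-vpc-1"]):
        print(f"{source} <-> {target}")
```

A full mesh grows as n(n-1)/2, so the number of tests rises quickly as agents are added.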
28. Wide-shallow, Tall-Thin alerting
• Alert to PagerDuty, HipChat, Stride, email
• Alert when loss is high for a few agents
• Alert when loss is low for many agents
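
The two alert rules above can be sketched as a small classifier over per-agent packet loss. This is a minimal illustration, not the rule configuration used in the talk: the function name and all four thresholds are assumptions chosen only to make the two shapes concrete.

```python
from typing import Dict, Optional

# All thresholds are illustrative assumptions, not values from the talk.
HIGH_LOSS_PCT = 50.0  # "tall": severe loss seen by an agent
LOW_LOSS_PCT = 2.0    # "shallow": mild loss seen by an agent
FEW_AGENTS = 2        # "thin": minimum agents reporting severe loss
MANY_AGENTS = 10      # "wide": minimum agents reporting at least mild loss


def classify_alert(loss_by_agent: Dict[str, float]) -> Optional[str]:
    """Classify one test round as 'tall-thin', 'wide-shallow', or no alert.

    loss_by_agent maps an agent name to its packet loss in percent.
    """
    severe = [a for a, loss in loss_by_agent.items() if loss >= HIGH_LOSS_PCT]
    mild_or_worse = [a for a, loss in loss_by_agent.items() if loss >= LOW_LOSS_PCT]

    # High loss for a few agents: likely a localized but serious problem.
    if len(severe) >= FEW_AGENTS:
        return "tall-thin"
    # Low loss for many agents: likely a shared path degrading quietly.
    if len(mild_or_worse) >= MANY_AGENTS:
        return "wide-shallow"
    return None
```

Requiring at least two agents for the tall-thin rule is a design choice in this sketch so that a single flapping agent does not page anyone.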
29. Correlating with Datadog
• Monitor with ThousandEyes
• Scrape API with 1 minute Lambda (will open source on Bitbucket…)
• Correlate and Ingest with Datadog
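
The scrape-and-ingest step on this slide could look roughly like the sketch below. It is not the Lambda from the talk. The record fields (`agentName`, `testName`, `roundId`, `loss`, `avgLatency`), the metric names, and the `fetch`/`submit` callables are assumptions for illustration. The output follows the shape of a Datadog v1 series payload (`metric`, `points`, `type`, `tags`). The HTTP calls are left out so the transformation can be checked on its own.

```python
from typing import Any, Callable, Dict, Iterable, List

# Hypothetical mapping from a scraped record field to a Datadog metric name.
METRIC_FIELDS = {
    "loss": "thousandeyes.net.loss",
    "avgLatency": "thousandeyes.net.latency.avg",
}

Payload = Dict[str, List[Dict[str, Any]]]


def to_datadog_series(records: Iterable[Dict[str, Any]]) -> Payload:
    """Convert scraped per-agent records into a Datadog-style series payload."""
    series: List[Dict[str, Any]] = []
    for rec in records:
        # Tag by agent and test so graphs can be sliced next to other data.
        tags = [f"agent:{rec['agentName']}", f"test:{rec['testName']}"]
        for field, metric in METRIC_FIELDS.items():
            if rec.get(field) is None:
                continue  # an agent may report no value for this round
            series.append({
                "metric": metric,
                # Assumption: roundId is the round's epoch timestamp in seconds.
                "points": [[int(rec["roundId"]), float(rec[field])]],
                "type": "gauge",
                "tags": tags,
            })
    return {"series": series}


def run_once(fetch: Callable[[], Iterable[Dict[str, Any]]],
             submit: Callable[[Payload], None]) -> int:
    """One scheduled run: fetch records, convert them, submit if non-empty."""
    payload = to_datadog_series(fetch())
    if payload["series"]:
        submit(payload)
    return len(payload["series"])
```

In a real deployment, a handler invoked once a minute would pass in a `fetch` that calls the monitoring API and a `submit` that posts to the metrics API. Both are injected here so the sketch runs without credentials.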
48. After
40+ Agents (FP&A wtf?)
ThousandEyes triggers a HipChat/Stride alert + PagerDuty
We open TE Dashboard + Datadog
Correlate and (in most cases) identify the immediate problem
Fix the issue or notify the vendor
Vendors and Providers still an issue..
ThousandEyes saved our sanity + time
Before
Gigs of logs going into Splunk
Nagios/SNMP triggering PagerDuty
Custom Checks
Auto-discovery of links and devices
sFlow and NetFlow
MTR and hping
Grafana graphs (so many!)
Pingdom
... but missing actual end-to-end distributed visibility
BENJAMIN MCALARY | PRINCIPAL NETWORK ENGINEER
BMCALARY@ATLASSIAN.COM
Thank you!