In recent years it’s become evident that alerting is one of the biggest challenges facing modern Operations Engineers. Conference talks, hallway tracks, meetups, and the like are rife with discussions about poor signal-to-noise ratios in alerts, fatigue from false positives, and a general lack of actionability. Our talk (informed by real-world experience designing, building, and maintaining our distributed, multi-tenant metrics/alerting service) takes a fundamental approach and examines alerting requirements and practices in the abstract. We put forth a comprehensive abstract model, with best practices that your team should follow and implement regardless of your tool of choice. This talk is equal parts cultural and technical, encompassing both computational capabilities and social practices, such as: defining organizational policy about where and when to set alerts; ensuring the on-call engineer is armed with the proper information to take action; best practices for configuring an alert; fire-fighting after an alert has triggered; and performing analysis across your organization-wide history of alerts.