Site Reliability Engineering (SRE) is responsible for the availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning of services. An SRE team works with an "error budget," the amount of unreliability that the agreed SLA allows: new features can be launched while the service is within its SLA, but if the SLA is not being met, launches are frozen until enough of the error budget has been earned back. SRE teams hire only engineers who can code, so that they speak the same language as developers, and developers are rotated into operations work. The goal of SRE is to minimize the impact of outages and prevent their recurrence through practices such as post-mortem analysis and continuous improvement of processes.
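
The error budget rule described above can be made concrete with a little arithmetic. The following is a minimal sketch, assuming availability is measured as the fraction of successful requests in a fixed window. The class name `ErrorBudget`, its fields, and the 99.9% target are hypothetical illustrations, not part of any specific SRE tooling.

```python
from dataclasses import dataclass


@dataclass
class ErrorBudget:
    """Tracks a request-based error budget over one measurement window.

    All names and the request-based definition are assumptions made
    for illustration only.
    """
    sla_target: float      # e.g. 0.999 means 99.9% of requests must succeed
    total_requests: int    # requests served in the window
    failed_requests: int   # requests that failed in the window

    @property
    def allowed_failures(self) -> float:
        # The budget is the share of requests the SLA permits to fail.
        return (1.0 - self.sla_target) * self.total_requests

    @property
    def remaining(self) -> float:
        # Negative when the service has overspent its budget.
        return self.allowed_failures - self.failed_requests

    def launches_allowed(self) -> bool:
        # Launches are frozen once the budget is exhausted.
        return self.remaining > 0


healthy = ErrorBudget(sla_target=0.999, total_requests=1_000_000, failed_requests=400)
overspent = ErrorBudget(sla_target=0.999, total_requests=1_000_000, failed_requests=1_500)

print(healthy.launches_allowed())    # True: roughly 600 failures of budget remain
print(overspent.launches_allowed())  # False: the budget is overspent by roughly 500
```

In this sketch the budget is "earned back" as the measurement window moves forward and older failures drop out of it, which is one common way to interpret the freeze-and-resume rule.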