Be the first to like this
Many organisations already possess a vast amount of existing data about production systems. As customer expectations evolve, organisations are often challenged to find more proactive ways of dealing with traditionally reactive incident response activity. In this talk, we discuss approaches to unlock value from this data by making it truly actionable. Understanding production failure modes better, enriching technical and business context effectively, decomposing response activity into shared primitives, actions and workflows, and overall, sharing and augmenting this active knowledge repository on a continuous basis are key takeaways. Through case studies, we'll discuss how we can accomplish this by engineering your observability processes and tooling to work for human-in-the-loop interpretation and response rather than a purely human-reliant strategy.