1. Anomaly Detection in the Data Center
Closing the DevOps Feedback Loop
Toufic Boubez, Ph.D.
Co-Founder, CTO
Metafor Software
2. Toufic Intro – who I am
• Co-Founder/CTO Metafor Software
• Co-Founder/CTO Layer 7 Technologies
  – API Management and Security
  – Acquired by Computer Associates in 2013
  – But I escaped
• Building large scale software systems for 20 years (I’m older than I look, I know!)
3. What is Anomaly Detection in the data center?
• One of these things is not like the other ones!
  – Anomalies in environment
  – Anomalies in behaviour
  – How do you find out!?
4. Simple file diffing
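As a minimal sketch of the single-file case, the snippet below diffs two versions of a config file with Python's standard `difflib` module. The function name `diff_configs` and the sample config contents are illustrative assumptions, not something shown on the slide.

```python
import difflib


def diff_configs(old_text, new_text, old_name="old", new_name="new"):
    """Return the unified-diff lines between two versions of a text file."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=old_name,
        tofile=new_name,
        lineterm="",  # inputs have no trailing newlines after splitlines()
    ))


# Hypothetical config contents, used only for illustration.
old = "port=8080\nworkers=4\n"
new = "port=8080\nworkers=8\n"

for line in diff_configs(old, new, "app.conf (before)", "app.conf (after)"):
    print(line)
```

This works well for one file on one machine; the following slides ask what happens when the comparison has to span many servers and many kinds of state.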
5. Distributed diffing?
• What about:
  – Files (code, config, etc.)
  – Directories
  – Packages
  – Date/Time
  – Running services
  – Open ports
  – Other stuff?
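One way to picture distributed diffing is to take a snapshot of each server's attributes and flag any server that disagrees with the majority. The sketch below assumes each snapshot is a plain dictionary of hashable values; `find_outliers`, the server names, and the attribute names are all hypothetical, and majority voting is just one simple choice of baseline.

```python
from collections import Counter


def find_outliers(snapshots):
    """For each attribute, report servers whose value differs from the majority.

    snapshots: {server_name: {attribute: hashable value}}
    returns:   {attribute: {server_name: deviating value}}
    """
    attributes = set()
    for attrs in snapshots.values():
        attributes.update(attrs)

    outliers = {}
    for attr in sorted(attributes):
        # A server that lacks the attribute entirely is recorded as None.
        values = {srv: attrs.get(attr) for srv, attrs in snapshots.items()}
        # Ties are resolved in favour of the first value encountered.
        majority, _ = Counter(values.values()).most_common(1)[0]
        deviants = {srv: v for srv, v in values.items() if v != majority}
        if deviants:
            outliers[attr] = deviants
    return outliers


# Hypothetical fleet: web3 has an extra open port.
fleet = {
    "web1": {"open_ports": frozenset({22, 443}), "openssl": "1.0.1e"},
    "web2": {"open_ports": frozenset({22, 443}), "openssl": "1.0.1e"},
    "web3": {"open_ports": frozenset({22, 443, 8080}), "openssl": "1.0.1e"},
}

print(find_outliers(fleet))
```

The same comparison applies to any item in the list above, as long as it can be reduced to a comparable value such as a file hash, a package version, or a set of ports.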
6. Distributed historical diffing? Anyone?
(Diagram: axis labelled "Time")
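Historical diffing adds a time axis: instead of comparing servers with each other, compare each server with its own past. The sketch below assumes snapshots are dictionaries collected at known timestamps; `diff_snapshots`, `change_history`, and the sample data are illustrative names and values only.

```python
def diff_snapshots(before, after):
    """Compare two {key: value} snapshots of the same server."""
    added = {k: after[k] for k in after.keys() - before.keys()}
    removed = {k: before[k] for k in before.keys() - after.keys()}
    changed = {
        k: (before[k], after[k])
        for k in before.keys() & after.keys()
        if before[k] != after[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


def change_history(timeline):
    """Return (t_prev, t_next, diff) for each interval in which something changed.

    timeline: list of (timestamp, snapshot) pairs, sorted by time.
    """
    history = []
    for (t0, s0), (t1, s1) in zip(timeline, timeline[1:]):
        d = diff_snapshots(s0, s1)
        if any(d.values()):
            history.append((t0, t1, d))
    return history


# Hypothetical timeline for one server.
timeline = [
    ("09:00", {"nginx": "1.4.4"}),
    ("10:00", {"nginx": "1.4.4"}),
    ("11:00", {"nginx": "1.4.7", "port_8080": "open"}),
]

for t0, t1, d in change_history(timeline):
    print(t0, "->", t1, d)
```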
7. Open Loop Control System: Heating your house – the wrong way!
• Steps:
  – Tweak heater input
  – Get to ideal temperature
  – Lock gas valve
  – Hope nothing changes
(Diagram: Controller (gas valve), System (heater), Sensor (thermometer))
8. Closed Loop Control System: Heating your house – the right way
• Steps:
  – Set the desired temperature
  – Sit back and let the system deal with changes
(Diagram: desired temperature (+) and current temperature (-) are compared to give a delta; Controller (gas valve), System (heater), Sensor (thermometer))
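To make the contrast between the two heating slides concrete, here is a toy simulation. It assumes a very simple room model (a fixed fraction of the indoor/outdoor difference leaks away each step) and an on/off thermostat; the function names, `heater_power`, `leak`, and all the numbers are made up for illustration and are not taken from the talk.

```python
def run_open_loop(start, outside_temps, fixed_heat, leak=0.1):
    """Valve locked at a fixed setting: nothing reacts to the sensor."""
    temp = start
    readings = []
    for outside in outside_temps:
        temp += fixed_heat                 # system: constant heater input
        temp -= leak * (temp - outside)    # disturbance: heat leaks outside
        readings.append(temp)
    return readings


def run_closed_loop(desired, start, outside_temps, heater_power=3.0, leak=0.1):
    """On/off thermostat: the sensed temperature feeds back into the valve."""
    temp = start
    readings = []
    for outside in outside_temps:
        delta = desired - temp             # desired minus current temperature
        if delta > 0:                      # controller: open valve only if too cold
            temp += heater_power           # system: heater adds heat
        temp -= leak * (temp - outside)    # disturbance: heat leaks outside
        readings.append(temp)
    return readings


# The weather changes halfway through: 10 degrees outside, then 0.
outside = [10.0] * 40 + [0.0] * 40

# Open loop: the valve was tuned to hold 21 degrees while it was 10 outside.
open_loop = run_open_loop(15.0, outside, fixed_heat=11.0 / 9.0)
closed_loop = run_closed_loop(21.0, 15.0, outside)

print("open loop, final temperature:  ", round(open_loop[-1], 1))
print("closed loop, final temperature:", round(closed_loop[-1], 1))
```

In this toy model the open-loop house drifts far below the target once the weather changes, while the closed-loop house keeps oscillating within a couple of degrees of the desired temperature.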
9. What’s missing to get to self-healing systems
• We have most of the tools already
• Need to add:
  – Error tracking (anomaly detection)
  – Corrective action
(Diagram: Controller (Puppet, Chef, CFEngine, …), System (My Infrastructure), Sensor (Nagios, Cacti, Zabbix, …); desired state (+) and current state (-) are compared to give a delta, with a "?" on the missing piece)
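The two missing pieces can be sketched as a small reconcile step: compute the delta between desired and current state (error tracking), then hand each mismatch to a corrective action. This is only a sketch under stated assumptions: state is modelled as flat dictionaries, and `detect_drift`, `reconcile`, and `apply_fix` are hypothetical names that do not correspond to any Puppet, Chef, CFEngine, Nagios, or Metafor API.

```python
def detect_drift(desired, current):
    """Error tracking: return {key: (desired_value, actual_value)} per mismatch.

    A key missing on one side is reported with None on that side.
    """
    keys = desired.keys() | current.keys()
    return {
        k: (desired.get(k), current.get(k))
        for k in keys
        if desired.get(k) != current.get(k)
    }


def reconcile(desired, current, apply_fix):
    """Corrective action: call apply_fix(key, desired_value) for each mismatch."""
    drift = detect_drift(desired, current)
    for key, (want, _have) in sorted(drift.items()):
        apply_fix(key, want)
    return drift


# Hypothetical states; in practice these would come from a configuration
# management tool (desired) and a monitoring tool (current).
desired_state = {"nginx": "1.4.7", "port_80": "open"}
current_state = {"nginx": "1.4.4", "port_80": "open", "port_8080": "open"}


def apply_fix(key, want):
    """Stand-in corrective action that edits the in-memory state directly."""
    if want is None:
        current_state.pop(key, None)   # not in the desired state: remove it
    else:
        current_state[key] = want      # bring the value back to the desired one


print(reconcile(desired_state, current_state, apply_fix))
print(detect_drift(desired_state, current_state))  # empty once reconciled
```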
10. Metafor demo: environmental anomaly detection
• Agent-based SaaS model
  – Agents installed on user servers (a couple of minutes)
  – Agents collect data
  – Metafor servers perform analytics and detect anomalies
  – Send reports/alerts
  – API
• Currently in open beta
  – Get it NOW!
  – http://metaforsoftware.com/
• Currently static anomalies
  – Coming soon: Behaviour anomalies
• Being used by a cloud provider as the error tracking component in a self-healing system