Highly responsive round the clock cloud operations with InfluxCloud

In this talk, Sanket Naik, VP of Cloud Operations and Security, and Hans Gustavson, Director of Site Reliability Engineering at Coupa, share how they use InfluxData as a key component for deriving operational metrics from their spend management platform. In particular, they share their team’s best practices for using InfluxData, which helped them achieve a consistent track record of delivering close to 100% uptime SLA across 13 major product releases and 5 major product module offerings.

  1. Highly responsive round the clock cloud operations with Influx
     Nov 7, 2018
     Sanket Naik, VP, Cloud Engineering, Operations & Security
     Hans Gustavson, Senior Director, Site Reliability Engineering
     ©2018 Coupa Software, Inc. – Confidential – All Rights Reserved
  2. Coupa Is the Cloud Platform for Business Spend Management
     Comprehensive | Designed for Everyone
     Sourcing, Suppliers, Payments, Procurement, Invoicing, Expenses
     4M+ Suppliers; 717* Customers; 100+ Countries; $840Bn+ Spend Under Management
  3. Savings Leads to Givings
  4. 7 Step Monitoring Maturity Model
     • Collect
     • Correlate & Triage
     • Trend
     • Forecast
     • Auto root cause identification
     • Auto remediation (Self-healing)
     • Predictive & What-if analysis
  5. Correlation & Triage – Finding a Needle in the Haystack
     Slide labels: Global, Region, Server, Pod, Customer and Queue
  6. Trend & Forecast – Plan and solve for the future
     Slide labels: Customer A (Bank), Customer B (Healthcare), Customer C (Retail), Aggregate, ServerPod1
  7. Predictive & What-if analysis – Get ML based Insights from the Data
     Slide labels: InfluxDB, Kapacitor, Platform, Application, System, Capacity Event, Product Features, Incident Event
  8. WHAT DO YOU THINK? WE WOULD LOVE FEEDBACK!
     sanket.naik@coupa.com
     hans.gustavson@coupa.com
  9. [No text on this slide]
  10. BACKUP SLIDES
  11. What the future holds
      • Trigger automation such as auto-scaling, auto-remediation based on Kapacitor events
      • Use Chronograf to create Kapacitor TICK scripts; more self-service
      • Enrich metrics and dashboards with Annotations and Markers
      • Conditional alert routing based on host state
      • Traceability
      • Expose KPI based metrics with customers
  12. SOME VALUABLE LESSONS WERE LEARNT FROM THE PROTOTYPE
      Benefits
      • Telegraf plugins provided 10x as many metrics as our legacy solution
      • High resolution of metric data
      • Easy to create use case specific dashboards
      • Big improvement with capacity planning
      Challenges
      • Be careful with queries; performance impact
      • Too many Grafana endpoints
      • Dashboard proliferation
      • Learning Kapacitor TICK script language
  13. WE ARE GETTING GOOD MILEAGE IN A SHORT PERIOD OF TIME SO FAR FROM OUR INVESTMENT IN TICK + GRAFANA…
      Benefits
      • Significant performance improvements
      • Ability to overlay application, platform and system metrics
      • Development is adding roadmap items to start using TICK (builds, application errors, security events)
      • Teams solving more advanced use cases using various algorithms and models
      • Extended to more teams
      Challenges
      • Developer guide with standards and best practices
      • Wrestling with statsd and application metrics
      • Learning Kapacitor TICK script language
      • Standard static threshold based monitors do not work at scale; too many alerts
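
Slide 13 lists static threshold based monitors as a challenge because they generate too many alerts at scale. The slides do not say what replaced them, so the following is only a minimal Python sketch of one common alternative: comparing each point with a rolling baseline of recent values. The function names, the window size, the 3-sigma cutoff and the sample data are all assumptions made for illustration, not Coupa's implementation.

```python
from collections import deque
from statistics import mean, pstdev


def static_alerts(values, threshold):
    """Return indices where a value exceeds one fixed threshold."""
    return [i for i, v in enumerate(values) if v > threshold]


def baseline_alerts(values, window=5, sigmas=3.0):
    """Return indices where a value deviates from the rolling mean of the
    previous `window` points by more than `sigmas` standard deviations.

    `window` and `sigmas` are illustrative defaults, not tuned values.
    """
    recent = deque(maxlen=window)
    alerts = []
    for i, v in enumerate(values):
        if len(recent) == window:
            mu = mean(recent)
            sd = pstdev(recent)
            # Skip perfectly flat windows to avoid alerting on a zero spread.
            if sd > 0 and abs(v - mu) > sigmas * sd:
                alerts.append(i)
        recent.append(v)
    return alerts


# Hypothetical CPU readings from a host that normally runs hot.
cpu = [82, 84, 83, 85, 84, 83, 85, 84, 99, 84]

print(static_alerts(cpu, threshold=80))  # every sample fires
print(baseline_alerts(cpu))              # only the spike fires
```

With a fixed threshold of 80, this host alerts on every sample; the baseline check flags only the jump to 99. A real deployment would also need to handle seasonality and stop anomalous points from polluting the baseline.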

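Slide 11 mentions "conditional alert routing based on host state" as a future goal without describing a design. As a rough sketch of the idea, assuming hypothetical host states (`in_service`, `provisioning`, `maintenance`), severities and destinations that do not come from the talk, a routing step might look like this:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    host: str
    metric: str
    severity: str  # assumed values: "critical" or "warning"


# Hypothetical routing table: (host state, severity) -> destination.
ROUTES = {
    ("in_service", "critical"): "page",
    ("in_service", "warning"): "ticket",
    ("provisioning", "critical"): "ticket",
}


def route(alert, host_states):
    """Pick a destination for an alert from the host's current state.

    `host_states` maps host name to a state string; hosts missing from
    the mapping are treated as "unknown".
    """
    state = host_states.get(alert.host, "unknown")
    if state == "maintenance":
        # Hosts under planned maintenance should not notify anyone.
        return "suppress"
    # Anything without an explicit rule is only logged.
    return ROUTES.get((state, alert.severity), "log")


states = {"web-1": "in_service", "web-2": "maintenance", "web-3": "provisioning"}
print(route(Alert("web-1", "cpu", "critical"), states))
print(route(Alert("web-2", "cpu", "critical"), states))
```

The point of the sketch is that the same alert can page, open a ticket, or be suppressed depending on host state, which reduces noise from hosts that are expected to misbehave.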