Prometheus Overview

Founder at Robust Perception
Oct. 1, 2015

More Related Content

Similar to Prometheus Overview(20)


Prometheus Overview

  1. Prometheus Overview The Promethean Ideal of Monitoring
  2. Why monitor? ● Know when things go wrong ○ To call in a human to prevent a business-level issue, or prevent an issue in advance ● Be able to debug and gain insight ● Trending to see changes over time, and drive technical/business decisions ● To feed into other systems/processes (e.g. QA, security, automation)
  3. Common Monitoring Challenges Themes common among companies I’ve talk to: ● Monitoring tools are limited, both technically and conceptually ● Tools don’t scale well and are unwieldy to manage ● Operational practices don’t align with the business For example: Your customers care about increased latency and it’s in your SLAs. You can only alert on individual machine CPU usage. Result: Engineers continuously woken up for non-issues, get fatigued
  4. Fundamental Challenge is Limited Visibility
  5. Prometheus Inspired by Google’s Borgmon monitoring system. Started in 2012 by ex-Googlers working in Soundcloud as an open source project, mainly written in Go. Publically launched in early 2015, and continues to be independent of any one company. Over 100 companies have started relying on it since then.
  6. What does Prometheus offer? ● Inclusive Monitoring ● Powerful data model ● Powerful query language ● Manageable and Reliable ● Efficient ● Scalable ● Easy to integrate with ● Dashboards
  7. Services have Internals
  8. Monitor the Internals
  9. Monitor as a Service, not as Machines
  10. Inclusive Monitoring Don’t monitor just at the edges: ● Instrument client libraries ● Instrument server libraries (e.g. HTTP/RPC) ● Instrument business logic Library authors get information about usage. Application developers get monitoring of common components for free. Dashboards and alerting can be provided out of the box, customised for your organisation!
  11. Powerful Data Model All metrics have arbitrary multi-dimensional labels. No need to force your model into dotted strings. Can aggregate, cut, and slice along them. Supports any double value, labels support full unicode.
  12. Powerful Query Language Can multiply, add, aggregate, join, predict, take quantiles across many metrics in the same query. Can evaluate right now, and graph back in time. Answer questions like: ● What’s the 95th percentile latency in the European datacenter? ● How full will the disks be in 4 hours? ● Which services are the top 5 users of CPU? Can alert based on any query.
  13. Manageable and Reliable Core Prometheus server is a single binary. Doesn’t depend on Zookeeper, Consul, Cassandra, Hadoop or the Internet. Only requires local disk (SSD recommended). No potential for cascading failure. Pull based, so easy to on run a workstation for testing and rogue servers can’t push bad metrics. Advanced service discovery finds what to monitor.
  14. Efficient Instrumenting everything means a lot of data. Prometheus is best in class for lossless storage efficiency, 3.5 bytes per datapoint. A single server can handle: ● millions of metrics ● hundreds of thousands of datapoints per second
  15. Scalable Prometheus is easy to run, can give one to each team in each datacenter. Federation allows pulling key metrics from other Prometheus servers. When one job is too big for a single Prometheus server, can use sharding+federation to scale out. Needed with thousands of machines.
  16. Easy to integrate with Many existing integrations: Java, JMX, Python, Go, Ruby, .Net, Machine, Cloudwatch, EC2, MySQL, PostgreSQL, Haskell, Bash, Node.js, SNMP, Consul, HAProxy, Mesos, Bind, CouchDB, Django, Mtail, Heka, Memcached, RabbitMQ, Redis, RethinkDB, Rsyslog, Meteor.js, Minecraft... Graphite, Statsd, Collectd, Scollector, Munin, Nagios integrations aid transition. It’s so easy, most of the above were written without the core team even knowing about them!
  17. Dashboards
  18. What do we do? Robust Perception is an independent provider of Prometheus-related services. We can help you: ● Decide if Prometheus is for you ● Manage your transition to Prometheus ● Resolve issues that arise ● Use Prometheus to run and scale your production systems efficiently We are proud to be among the core contributors to the Prometheus project.
  19. Resources Official Project Website: Official Mailing List: Demo: Robust Perception Website: Queries: