This document discusses strategies for scaling Prometheus monitoring to millions of machines. It describes the stages of scaling from manual Prometheus configuration to per-region Prometheus instances. Issues with input/output and retention led to further scaling techniques like sharding, using Kafka as a message bus, downsampling metrics, and developing an agent and query API. The talk introduces "Vulcan" which implements Prometheus APIs with Cassandra storage and Kafka input to enable monitoring at large scale.