This document summarizes Coveo's cloud infrastructure, which runs primarily on Kubernetes and is monitored with Prometheus and Thanos. Prometheus scrapes the metrics, and Thanos stores and shards the resulting time series data across multiple regions. Grafana queries the global Thanos database to display dashboards, while Alertmanager handles alerts. The infrastructure has grown to 12 Kubernetes clusters across 7 regions, running more than 12,500 pods on 370 nodes.
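
To make the query path more concrete, here is a minimal sketch of how a client such as Grafana might ask the global Thanos query layer for a per-cluster pod count. Thanos Query exposes the Prometheus-compatible HTTP API (`/api/v1/query`), so the request and response shapes below follow that API. The host name, the `cluster` label, and the use of the `kube_pod_info` metric are assumptions made for illustration; the source does not say which labels or metrics Coveo actually uses. The sketch only builds the URL and parses a response body, so it makes no network call.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for the Prometheus-compatible HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def pods_per_cluster(response_body: str) -> dict[str, int]:
    """Map each `cluster` label (hypothetical) to its value in an instant-query response."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError("query failed")
    counts: dict[str, int] = {}
    for series in payload["data"]["result"]:
        cluster = series["metric"].get("cluster", "unknown")
        # An instant-vector sample is a [timestamp, "value-as-string"] pair.
        _timestamp, value = series["value"]
        counts[cluster] = int(float(value))
    return counts


# Hypothetical endpoint and query; adjust both to the real deployment.
url = build_query_url(
    "http://thanos-query.example:9090/",
    "count by (cluster) (kube_pod_info)",
)

# Hand-written sample body in the documented response format, not real data.
sample_body = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"cluster": "cluster-a"}, "value": [1700000000, "3"]},
            {"metric": {"cluster": "cluster-b"}, "value": [1700000000, "2"]},
        ],
    },
})
counts = pods_per_cluster(sample_body)
```

Summing the values of `counts` would give a fleet-wide pod total of the kind quoted above, provided every cluster reports the same metric under a distinguishing label.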