Building a Centralized
Observability Platform
Gabriel Moskovicz
Principal Architect - LATAM & US
Role of Centralized Observability Platform Team
Builds and maintains solutions that make getting insights from this data
turnkey and efficient
Builds the tools and systems that every engineering team uses to
develop, scale, understand & monitor their systems. These systems are
absolutely critical to Uber.
Give our internal customers a simple and reliable set of tools to process,
store and ship logs.
Benefits of a Centralized Observability Platform
Productivity
Self-service observability for Devs and SREs in minutes
Helping drive MTTR/D towards zero
Standardization
Simplify operations (upgrades, security, etc.)
Standardize practices and processes to achieve software reliability goals
Consolidation
Reduce license, deployment, and operations costs
Smaller platform team supports observability needs for the entire org
Centralized Observability Platform Needs
Centralized Observability
with Elastic
Built for scale
(and speed)
Powered by Elasticsearch that is
highly available, distributed by design,
and scales with ease
5M metrics/sec
1.73M events/min
3M events/min
2 PB events/day
Operational Ease
Single observability platform
One platform to deploy & manage for all your
observability needs
Observe the observability platform
Rich monitoring features enable you to provide a
reliable and performant observability platform
Multi-hybrid platform lifecycle made easy
Simplify orchestration with Elastic Cloud, Elastic
Cloud Enterprise, and Elastic Cloud on Kubernetes
[Screenshot labels: Stack monitoring, ECE]
Multitenancy
Cluster per tenant made easy
Easily isolate teams without increasing admin
load with ECE / ECK / Elastic Cloud
Tenants in a single cluster simplified
Handle multiple users on a single cluster with
Kibana Spaces, RBAC, background jobs, etc.
Deliver on mixed needs
Balance speed vs. cost with flexible data tiering
and lifecycle management policy
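The single-cluster tenancy and data-tiering points above can be sketched as two request bodies. This is a minimal sketch, not the presenter's setup: the per-team index prefixes, the space id, and the phase ages are all hypothetical. The first body has the shape accepted by Kibana's role API (PUT /api/security/role/<name>), the second the shape accepted by Elasticsearch's ILM API (PUT _ilm/policy/<name>).

```python
import json


def tenant_role(team: str, space_id: str) -> dict:
    """Role body that limits one team to its own indices and Kibana space."""
    return {
        "elasticsearch": {
            "cluster": [],
            "indices": [
                {
                    # Hypothetical naming convention: one index prefix per team.
                    "names": [f"logs-{team}-*", f"metrics-{team}-*"],
                    "privileges": ["read", "view_index_metadata"],
                }
            ],
        },
        # Full Kibana access inside the team's own space, nothing elsewhere.
        "kibana": [{"base": ["all"], "feature": {}, "spaces": [space_id]}],
    }


def lifecycle_policy(warm_after_days: int, cold_after_days: int,
                     delete_after_days: int) -> dict:
    """ILM policy body that moves data to cheaper phases as it ages."""
    if not (0 < warm_after_days < cold_after_days < delete_after_days):
        raise ValueError("phase ages must be positive and strictly increasing")
    return {
        "policy": {
            "phases": {
                # Hypothetical rollover thresholds for the hot phase.
                "hot": {"actions": {"rollover": {
                    "max_age": "1d", "max_primary_shard_size": "50gb"}}},
                "warm": {"min_age": f"{warm_after_days}d",
                         "actions": {"set_priority": {"priority": 50}}},
                "cold": {"min_age": f"{cold_after_days}d",
                         "actions": {"set_priority": {"priority": 0}}},
                "delete": {"min_age": f"{delete_after_days}d",
                           "actions": {"delete": {}}},
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(tenant_role("payments", "payments"), indent=2))
    print(json.dumps(lifecycle_policy(7, 30, 90), indent=2))
```

A team that needs fast search over recent data and cheap long retention would get short warm and cold ages with a long delete age; a cost-sensitive team could shorten all three.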
Async/Background searches
Cost Management
Don’t wait to buy to prove value
Start (or expand) with the forever free (and open)
tier, and validate value — when budgets are tight
Scale and evolve, without exploding costs
Control spend without compromising visibility or
engineering goals with resource-based pricing
Optimize resource utilization
Provide low latency and low network egress costs
through local data clusters and a central analysis
cluster
Measure usage and chargeback
Telemetry to understand usage across all use cases
to identify growing or underused areas and also for
chargeback purposes
[Figure: subscription tiers FREE & OPEN, GOLD, PLATINUM, ENTERPRISE. Features listed: Logs app, APM app, Metrics app, Uptime app, Integrations / Beats, APM agents, Distributed tracing, OpenTelemetry/Jaeger, Alerting, Machine learning, Service maps]
No pricing per:
hosts, agents, functions, containers, metrics, logs ingested
Our resource-based pricing is based only on the underlying infrastructure
resources used by any given deployment, across all data types and use cases.
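Because pricing follows the resources a deployment uses rather than hosts or agents, one way to approach the chargeback mentioned above is to split a deployment's cost by each team's share of a measured resource. The sketch below assumes storage bytes per team as the usage measure and a made-up monthly cost; both are illustrative assumptions, not figures from the talk.

```python
from typing import Dict


def chargeback(total_cost: float, usage_bytes: Dict[str, int]) -> Dict[str, float]:
    """Split a deployment's cost across teams in proportion to stored bytes."""
    total_bytes = sum(usage_bytes.values())
    if total_bytes <= 0:
        raise ValueError("no usage recorded, nothing to allocate")
    # Each team pays its fraction of the total, rounded to cents.
    return {
        team: round(total_cost * used / total_bytes, 2)
        for team, used in usage_bytes.items()
    }


if __name__ == "__main__":
    # Hypothetical usage figures, e.g. collected from per-team index sizes.
    usage = {"payments": 300, "search": 700}
    print(chargeback(1000.0, usage))
```

Rounding each share independently means the shares can differ from the total by a cent or two; a real allocation would assign that remainder explicitly.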
[Diagram: App / Infra in DC1, DC2, and Azure each send logs / metrics to a local cluster, with a Central Cluster for analysis]
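The slides do not name the mechanism that links the local clusters to the central one. Assuming it is Elasticsearch cross-cluster search, the sketch below builds the remote-cluster settings body (for PUT _cluster/settings on the central cluster) and the search target that fans a query out to every site. The cluster aliases and seed host names are hypothetical.

```python
from typing import Dict, List


def remote_cluster_settings(seeds: Dict[str, List[str]]) -> dict:
    """Cluster settings body registering each local cluster under an alias."""
    return {
        "persistent": {
            "cluster": {
                "remote": {
                    alias: {"seeds": hosts} for alias, hosts in seeds.items()
                }
            }
        }
    }


def cross_cluster_target(aliases: List[str], index_pattern: str) -> str:
    """Search target in the <cluster_alias>:<index_pattern> form, one per site."""
    return ",".join(f"{alias}:{index_pattern}" for alias in aliases)


if __name__ == "__main__":
    # Hypothetical transport addresses for the three sites in the diagram.
    seeds = {
        "dc1": ["dc1.example.internal:9300"],
        "dc2": ["dc2.example.internal:9300"],
        "azure": ["azure.example.internal:9300"],
    }
    print(remote_cluster_settings(seeds))
    print(cross_cluster_target(list(seeds), "logs-*"))
```

With this layout the raw data stays in the site where it was produced, and only query results cross the network to the central cluster, which is where the egress saving would come from.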
Self-Served Onboarding
Self-service enablement for Devs
Create self-service mechanisms for teams to
create clusters, build log parsing pipelines,
and enable APM
Self-service enablement for SREs
OOTB integrations that include dashboards
and built-in alerts and anomaly detection jobs
Education
SRE training, observability engineering, and SRE
practices enablement
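As one illustration of the self-service log parsing pipelines mentioned above, a platform team could expose a small helper that turns a team's request into an Elasticsearch ingest pipeline body (for PUT _ingest/pipeline/<id>). This is a sketch under assumptions: the helper name, the labels.team field, and the example log format are hypothetical; the grok and set processors are standard ingest processors.

```python
def log_pipeline(team: str, grok_pattern: str) -> dict:
    """Ingest pipeline body that parses a team's log lines and tags the owner."""
    if not team or not grok_pattern:
        raise ValueError("team and grok_pattern are required")
    return {
        "description": f"Parse application logs for team {team}",
        "processors": [
            # Extract structured fields from the raw message.
            {"grok": {"field": "message", "patterns": [grok_pattern]}},
            # Tag each document with its owning team (hypothetical field name).
            {"set": {"field": "labels.team", "value": team}},
        ],
    }


if __name__ == "__main__":
    # Hypothetical log format: timestamp, level, free text.
    pattern = "%{TIMESTAMP_ISO8601:@timestamp} %{LOGLEVEL:log.level} %{GREEDYDATA:message}"
    print(log_pipeline("payments", pattern))
```

Tagging documents with the owning team at ingest time also gives the platform team a field to filter on for per-team dashboards and usage reports.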
Centralized Observability Platform with Elastic
Productivity
Self-service observability for Devs and SREs in minutes
Helping drive MTTR/D towards zero
Standardization
Simplify operations (upgrades, security, etc.)
Standardize practices and processes to achieve software reliability goals
Consolidation
Reduce license, deployment, and operations costs
Smaller platform team supports observability needs for the entire org
Thank you!
