SlideShare a Scribd company logo
1 of 22
Download to read offline
Ticketmaster confidential. Do not distribute.
Ticketmaster confidential. Do not distribute.
Prometheus +
Thanos
Thanos long term storage for Prometheus
Ticketmaster confidential. Do not distribute.
About me...
• Rocking it at Ticketmaster
• Student in Master of Business Administration (MBA) Strategic project management
• Bachelor of Applied Science (B.A.Sc.) in Computer Science
• Devops, Cloud, Kubernetes, Kafka, etc.
• Sport lover (Hockey, Dek Hockey, Ultimate Frisbee, Football…)
2
SECTION
Ticketmaster confidential. Do not distribute.
Where Prometheus & Thanos are in the CNCF landscape...
3
Prometheus
Ticketmaster confidential. Do not distribute.
Insert photography and crop to size of grey box.
(24.25 in. x 7.21 in.)
What is Prometheus?
4
Prometheus
Prometheus was the second project to be graduated of the Cloud Native Computing Foundation.
Ticketmaster confidential. Do not distribute.
Prometheus Architecture
Ticketmaster confidential. Do not distribute.
How we are using Prometheus at Ticketmaster?
• We are using the Prometheus-Operator made by CoreOS (RedHat (IBM))
• We have created a Helm chart that created common pulling jobs, settings and exporters
• Scrape EC2 instances based on Tags, Kubernetes
• Exporters like cloudwatch-metrics, kafka, blackbox
• Ingresses
• Thanos
• Federations
6
Prometheus
Ticketmaster confidential. Do not distribute.
Why moving away from the federation?
• Calculate the right ingestion rate of our scrape job is not really easy
• When the disk is full
• Prometheus stop working
• We needed to delete the PVC to recreated it bigger.
• Long term storage is costly SSD == $$$$
• Single point of failure
• Availability
• Operator error
• Hardware failure
• Rollout
7
Prometheus
Ticketmaster confidential. Do not distribute.
Solution
Ticketmaster confidential. Do not distribute.
Goals
9
Thanos
- Easy Deployment model
- Minimal number of dependencies
- Minimal baseline cost
Have a global view Seamless
integration with
Prometheus
Increase retentionHave a HA in place
Ticketmaster confidential. Do not distribute.
Global view
10
Thanos
Ticketmaster confidential. Do not distribute.
Global view + HA
11
Thanos
Ticketmaster confidential. Do not distribute.
Increase retention (Persist data)
12
Thanos
Ticketmaster confidential. Do not distribute.
Increase retention (Querying)
• A series is made up of one or more “chunks”
• A chunk contains ~120 samples each
• Chunks can be retrieved through HTTP byte range queries
Example:
• 1000 series @ 30s scrape interval
• Query 1 year
• 8.7 million chunks/range queries
• Chunks of the same series are aligned
• Similar series are aligned due to same metric name
This reduce request count by 4=6 orders of magnitude.
8.7 million requests turned into O(20) requests
•
13
Thanos
Ticketmaster confidential. Do not distribute.
Increase retention (Querying)
14
Thanos
group chunk same series group by same metric name
Ticketmaster confidential. Do not distribute.
Compaction
15
Thanos
Ticketmaster confidential. Do not distribute.
Full Architecture
16
Thanos
Ticketmaster confidential. Do not distribute.
Deployment Model( Example)
• Federation through Store API
17
Thanos
Ticketmaster confidential. Do not distribute.
Cost
• Store + Query node + Compaction ~ Savings on Prometheus side (+/- 0)
• Fewer SSD space on Prometheus side (Savings)
• Basically we are only paying for your data stored in S3/GCS/etc + requests
18
Thanos
Ticketmaster confidential. Do not distribute.
Cortex
19
Alternatives
Ticketmaster confidential. Do not distribute.
Cortex - Reason that we didn’t choose Cortex
• Documentation is not really good
• Need to maintain another dataset (NoSQL DB)
• Interfere with the datapath
20
Alternatives
Ticketmaster confidential. Do not distribute.
Question?
Ticketmaster confidential. Do not distribute.

More Related Content

What's hot

Distributed Tracing with Jaeger
Distributed Tracing with JaegerDistributed Tracing with Jaeger
Distributed Tracing with JaegerInho Kang
 
Docker and the Linux Kernel
Docker and the Linux KernelDocker and the Linux Kernel
Docker and the Linux KernelDocker, Inc.
 
Container Orchestration
Container OrchestrationContainer Orchestration
Container Orchestrationdfilppi
 
Monitoring using Prometheus and Grafana
Monitoring using Prometheus and GrafanaMonitoring using Prometheus and Grafana
Monitoring using Prometheus and GrafanaArvind Kumar G.S
 
Storing 16 Bytes at Scale
Storing 16 Bytes at ScaleStoring 16 Bytes at Scale
Storing 16 Bytes at ScaleFabian Reinartz
 
Intro to Helm for Kubernetes
Intro to Helm for KubernetesIntro to Helm for Kubernetes
Intro to Helm for KubernetesCarlos E. Salazar
 
OpenTelemetry For Operators
OpenTelemetry For OperatorsOpenTelemetry For Operators
OpenTelemetry For OperatorsKevin Brockhoff
 
Understand your system like never before with OpenTelemetry, Grafana, and Pro...
Understand your system like never before with OpenTelemetry, Grafana, and Pro...Understand your system like never before with OpenTelemetry, Grafana, and Pro...
Understand your system like never before with OpenTelemetry, Grafana, and Pro...LibbySchulze
 
Meetup OpenTelemetry Intro
Meetup OpenTelemetry IntroMeetup OpenTelemetry Intro
Meetup OpenTelemetry IntroDimitrisFinas1
 
Kubernetes
KubernetesKubernetes
KubernetesHenry He
 
Grafana introduction
Grafana introductionGrafana introduction
Grafana introductionRico Chen
 
Opentelemetry - From frontend to backend
Opentelemetry - From frontend to backendOpentelemetry - From frontend to backend
Opentelemetry - From frontend to backendSebastian Poxhofer
 
Prometheus
PrometheusPrometheus
Prometheuswyukawa
 
Kubernetes: A Short Introduction (2019)
Kubernetes: A Short Introduction (2019)Kubernetes: A Short Introduction (2019)
Kubernetes: A Short Introduction (2019)Megan O'Keefe
 
OSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdf
OSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdfOSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdf
OSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdfNETWAYS
 
Github Actions and Terraform.pdf
Github Actions and Terraform.pdfGithub Actions and Terraform.pdf
Github Actions and Terraform.pdfVishwas N
 
Grafana Mimir and VictoriaMetrics_ Performance Tests.pptx
Grafana Mimir and VictoriaMetrics_ Performance Tests.pptxGrafana Mimir and VictoriaMetrics_ Performance Tests.pptx
Grafana Mimir and VictoriaMetrics_ Performance Tests.pptxRomanKhavronenko
 
MeetUp Monitoring with Prometheus and Grafana (September 2018)
MeetUp Monitoring with Prometheus and Grafana (September 2018)MeetUp Monitoring with Prometheus and Grafana (September 2018)
MeetUp Monitoring with Prometheus and Grafana (September 2018)Lucas Jellema
 
Cloud Monitoring with Prometheus
Cloud Monitoring with PrometheusCloud Monitoring with Prometheus
Cloud Monitoring with PrometheusQAware GmbH
 

What's hot (20)

Distributed Tracing with Jaeger
Distributed Tracing with JaegerDistributed Tracing with Jaeger
Distributed Tracing with Jaeger
 
Docker and the Linux Kernel
Docker and the Linux KernelDocker and the Linux Kernel
Docker and the Linux Kernel
 
Container Orchestration
Container OrchestrationContainer Orchestration
Container Orchestration
 
Monitoring using Prometheus and Grafana
Monitoring using Prometheus and GrafanaMonitoring using Prometheus and Grafana
Monitoring using Prometheus and Grafana
 
Storing 16 Bytes at Scale
Storing 16 Bytes at ScaleStoring 16 Bytes at Scale
Storing 16 Bytes at Scale
 
Intro to Helm for Kubernetes
Intro to Helm for KubernetesIntro to Helm for Kubernetes
Intro to Helm for Kubernetes
 
OpenTelemetry For Operators
OpenTelemetry For OperatorsOpenTelemetry For Operators
OpenTelemetry For Operators
 
Understand your system like never before with OpenTelemetry, Grafana, and Pro...
Understand your system like never before with OpenTelemetry, Grafana, and Pro...Understand your system like never before with OpenTelemetry, Grafana, and Pro...
Understand your system like never before with OpenTelemetry, Grafana, and Pro...
 
Meetup OpenTelemetry Intro
Meetup OpenTelemetry IntroMeetup OpenTelemetry Intro
Meetup OpenTelemetry Intro
 
Kubernetes
KubernetesKubernetes
Kubernetes
 
Grafana introduction
Grafana introductionGrafana introduction
Grafana introduction
 
Opentelemetry - From frontend to backend
Opentelemetry - From frontend to backendOpentelemetry - From frontend to backend
Opentelemetry - From frontend to backend
 
Prometheus
PrometheusPrometheus
Prometheus
 
Kubernetes: A Short Introduction (2019)
Kubernetes: A Short Introduction (2019)Kubernetes: A Short Introduction (2019)
Kubernetes: A Short Introduction (2019)
 
OSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdf
OSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdfOSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdf
OSMC 2022 | OpenTelemetry 101 by Dotan Horovit s.pdf
 
Github Actions and Terraform.pdf
Github Actions and Terraform.pdfGithub Actions and Terraform.pdf
Github Actions and Terraform.pdf
 
Grafana Mimir and VictoriaMetrics_ Performance Tests.pptx
Grafana Mimir and VictoriaMetrics_ Performance Tests.pptxGrafana Mimir and VictoriaMetrics_ Performance Tests.pptx
Grafana Mimir and VictoriaMetrics_ Performance Tests.pptx
 
MeetUp Monitoring with Prometheus and Grafana (September 2018)
MeetUp Monitoring with Prometheus and Grafana (September 2018)MeetUp Monitoring with Prometheus and Grafana (September 2018)
MeetUp Monitoring with Prometheus and Grafana (September 2018)
 
Terraform
TerraformTerraform
Terraform
 
Cloud Monitoring with Prometheus
Cloud Monitoring with PrometheusCloud Monitoring with Prometheus
Cloud Monitoring with Prometheus
 

Similar to Ticketmaster's use of Thanos for long term Prometheus storage

Presto at Tivo, Boston Hadoop Meetup
Presto at Tivo, Boston Hadoop MeetupPresto at Tivo, Boston Hadoop Meetup
Presto at Tivo, Boston Hadoop MeetupJustin Borgman
 
Implementing a canonical IoT backend in Azure with Azure Stream Analytics
Implementing a canonical IoT backend in Azure with Azure Stream AnalyticsImplementing a canonical IoT backend in Azure with Azure Stream Analytics
Implementing a canonical IoT backend in Azure with Azure Stream AnalyticsMarco Parenzan
 
Apache Tajo: Query Optimization Techniques and JIT-based Vectorized Engine
Apache Tajo: Query Optimization Techniques and JIT-based Vectorized EngineApache Tajo: Query Optimization Techniques and JIT-based Vectorized Engine
Apache Tajo: Query Optimization Techniques and JIT-based Vectorized EngineDataWorks Summit
 
Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...
Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...
Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...Gruter
 
Big data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real timeBig data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real timeItai Yaffe
 
Token Design as Optimization Design
Token Design as Optimization DesignToken Design as Optimization Design
Token Design as Optimization DesignTrent McConaghy
 
Practical Data Science Workshop - Recommendation Systems - Collaborative Filt...
Practical Data Science Workshop - Recommendation Systems - Collaborative Filt...Practical Data Science Workshop - Recommendation Systems - Collaborative Filt...
Practical Data Science Workshop - Recommendation Systems - Collaborative Filt...Chris Fregly
 
Monitoring your Python with Prometheus (Python Ireland April 2015)
Monitoring your Python with Prometheus (Python Ireland April 2015)Monitoring your Python with Prometheus (Python Ireland April 2015)
Monitoring your Python with Prometheus (Python Ireland April 2015)Brian Brazil
 
A Production Quality Sketching Library for the Analysis of Big Data
A Production Quality Sketching Library for the Analysis of Big DataA Production Quality Sketching Library for the Analysis of Big Data
A Production Quality Sketching Library for the Analysis of Big DataDatabricks
 
AWS Roadshow Herbst 2013: Datenanalyse und Business Intelligence
AWS Roadshow Herbst 2013: Datenanalyse und Business IntelligenceAWS Roadshow Herbst 2013: Datenanalyse und Business Intelligence
AWS Roadshow Herbst 2013: Datenanalyse und Business IntelligenceAWS Germany
 
Introduction to Data streaming - 05/12/2014
Introduction to Data streaming - 05/12/2014Introduction to Data streaming - 05/12/2014
Introduction to Data streaming - 05/12/2014Raja Chiky
 
SnappyData at Spark Summit 2017
SnappyData at Spark Summit 2017SnappyData at Spark Summit 2017
SnappyData at Spark Summit 2017Jags Ramnarayan
 
SnappyData, the Spark Database. A unified cluster for streaming, transactions...
SnappyData, the Spark Database. A unified cluster for streaming, transactions...SnappyData, the Spark Database. A unified cluster for streaming, transactions...
SnappyData, the Spark Database. A unified cluster for streaming, transactions...SnappyData
 
Your Timestamps Deserve Better than a Generic Database
Your Timestamps Deserve Better than a Generic DatabaseYour Timestamps Deserve Better than a Generic Database
Your Timestamps Deserve Better than a Generic Databasejavier ramirez
 
AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...
AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...
AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...Amazon Web Services
 
Data Platform at Twitter: Enabling Real-time & Batch Analytics at Scale
Data Platform at Twitter: Enabling Real-time & Batch Analytics at ScaleData Platform at Twitter: Enabling Real-time & Batch Analytics at Scale
Data Platform at Twitter: Enabling Real-time & Batch Analytics at ScaleSriram Krishnan
 
Interactively Querying Large-scale Datasets on Amazon S3
Interactively Querying Large-scale Datasets on Amazon S3Interactively Querying Large-scale Datasets on Amazon S3
Interactively Querying Large-scale Datasets on Amazon S3Amazon Web Services
 

Similar to Ticketmaster's use of Thanos for long term Prometheus storage (20)

Presto at Tivo, Boston Hadoop Meetup
Presto at Tivo, Boston Hadoop MeetupPresto at Tivo, Boston Hadoop Meetup
Presto at Tivo, Boston Hadoop Meetup
 
Implementing a canonical IoT backend in Azure with Azure Stream Analytics
Implementing a canonical IoT backend in Azure with Azure Stream AnalyticsImplementing a canonical IoT backend in Azure with Azure Stream Analytics
Implementing a canonical IoT backend in Azure with Azure Stream Analytics
 
Apache Tajo: Query Optimization Techniques and JIT-based Vectorized Engine
Apache Tajo: Query Optimization Techniques and JIT-based Vectorized EngineApache Tajo: Query Optimization Techniques and JIT-based Vectorized Engine
Apache Tajo: Query Optimization Techniques and JIT-based Vectorized Engine
 
Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...
Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...
Hadoop Summit 2014: Query Optimization and JIT-based Vectorized Execution in ...
 
Big data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real timeBig data serving: Processing and inference at scale in real time
Big data serving: Processing and inference at scale in real time
 
Token Design as Optimization Design
Token Design as Optimization DesignToken Design as Optimization Design
Token Design as Optimization Design
 
Practical Data Science Workshop - Recommendation Systems - Collaborative Filt...
Practical Data Science Workshop - Recommendation Systems - Collaborative Filt...Practical Data Science Workshop - Recommendation Systems - Collaborative Filt...
Practical Data Science Workshop - Recommendation Systems - Collaborative Filt...
 
Monitoring your Python with Prometheus (Python Ireland April 2015)
Monitoring your Python with Prometheus (Python Ireland April 2015)Monitoring your Python with Prometheus (Python Ireland April 2015)
Monitoring your Python with Prometheus (Python Ireland April 2015)
 
Starboard Solutions - 3PL
Starboard Solutions - 3PLStarboard Solutions - 3PL
Starboard Solutions - 3PL
 
A Production Quality Sketching Library for the Analysis of Big Data
A Production Quality Sketching Library for the Analysis of Big DataA Production Quality Sketching Library for the Analysis of Big Data
A Production Quality Sketching Library for the Analysis of Big Data
 
AWS Roadshow Herbst 2013: Datenanalyse und Business Intelligence
AWS Roadshow Herbst 2013: Datenanalyse und Business IntelligenceAWS Roadshow Herbst 2013: Datenanalyse und Business Intelligence
AWS Roadshow Herbst 2013: Datenanalyse und Business Intelligence
 
Introduction to Data streaming - 05/12/2014
Introduction to Data streaming - 05/12/2014Introduction to Data streaming - 05/12/2014
Introduction to Data streaming - 05/12/2014
 
SnappyData at Spark Summit 2017
SnappyData at Spark Summit 2017SnappyData at Spark Summit 2017
SnappyData at Spark Summit 2017
 
SnappyData, the Spark Database. A unified cluster for streaming, transactions...
SnappyData, the Spark Database. A unified cluster for streaming, transactions...SnappyData, the Spark Database. A unified cluster for streaming, transactions...
SnappyData, the Spark Database. A unified cluster for streaming, transactions...
 
Break out of The Box - Part 2
Break out of The Box - Part 2Break out of The Box - Part 2
Break out of The Box - Part 2
 
Jan 2012 HUG: Storm
Jan 2012 HUG: StormJan 2012 HUG: Storm
Jan 2012 HUG: Storm
 
Your Timestamps Deserve Better than a Generic Database
Your Timestamps Deserve Better than a Generic DatabaseYour Timestamps Deserve Better than a Generic Database
Your Timestamps Deserve Better than a Generic Database
 
AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...
AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...
AWS re:Invent 2016: Amazon CloudFront Flash Talks: Best Practices on Configur...
 
Data Platform at Twitter: Enabling Real-time & Batch Analytics at Scale
Data Platform at Twitter: Enabling Real-time & Batch Analytics at ScaleData Platform at Twitter: Enabling Real-time & Batch Analytics at Scale
Data Platform at Twitter: Enabling Real-time & Batch Analytics at Scale
 
Interactively Querying Large-scale Datasets on Amazon S3
Interactively Querying Large-scale Datasets on Amazon S3Interactively Querying Large-scale Datasets on Amazon S3
Interactively Querying Large-scale Datasets on Amazon S3
 

More from CloudOps2005

Defense in Depth: Securing your new Kubernetes cluster from the challenges th...
Defense in Depth: Securing your new Kubernetes cluster from the challenges th...Defense in Depth: Securing your new Kubernetes cluster from the challenges th...
Defense in Depth: Securing your new Kubernetes cluster from the challenges th...CloudOps2005
 
Human No, Machine Yes: Welcome to the CDF with Incremental Confidence
Human No, Machine Yes: Welcome to the CDF with Incremental ConfidenceHuman No, Machine Yes: Welcome to the CDF with Incremental Confidence
Human No, Machine Yes: Welcome to the CDF with Incremental ConfidenceCloudOps2005
 
The Salmon Algorithm Spawning with Kubernetes
The Salmon Algorithm Spawning with KubernetesThe Salmon Algorithm Spawning with Kubernetes
The Salmon Algorithm Spawning with KubernetesCloudOps2005
 
Own your Destiny in the Cloud - Ian Rae - Cloud Native Day Montreal 2019
Own your Destiny in the Cloud - Ian Rae - Cloud Native Day Montreal 2019Own your Destiny in the Cloud - Ian Rae - Cloud Native Day Montreal 2019
Own your Destiny in the Cloud - Ian Rae - Cloud Native Day Montreal 2019CloudOps2005
 
Plateformes et infrastructure infonuagique natif de ville de Montréall
Plateformes et infrastructure infonuagique natif de ville de MontréallPlateformes et infrastructure infonuagique natif de ville de Montréall
Plateformes et infrastructure infonuagique natif de ville de MontréallCloudOps2005
 
Using Rook to Manage Kubernetes Storage with Ceph
Using Rook to Manage Kubernetes Storage with CephUsing Rook to Manage Kubernetes Storage with Ceph
Using Rook to Manage Kubernetes Storage with CephCloudOps2005
 
Kafka on Kubernetes
Kafka on KubernetesKafka on Kubernetes
Kafka on KubernetesCloudOps2005
 
Kubernetes: Crossing the Chasm
Kubernetes: Crossing the ChasmKubernetes: Crossing the Chasm
Kubernetes: Crossing the ChasmCloudOps2005
 
Distributed Logging with Kubernetes
Distributed Logging with KubernetesDistributed Logging with Kubernetes
Distributed Logging with KubernetesCloudOps2005
 
Kubernetes Security with Calico and Open Policy Agent
Kubernetes Security with Calico and Open Policy AgentKubernetes Security with Calico and Open Policy Agent
Kubernetes Security with Calico and Open Policy AgentCloudOps2005
 
Advanced Deployment Strategies with Kubernetes and Istio
Advanced Deployment Strategies with Kubernetes and IstioAdvanced Deployment Strategies with Kubernetes and Istio
Advanced Deployment Strategies with Kubernetes and IstioCloudOps2005
 
GitOps with ArgoCD
GitOps with ArgoCDGitOps with ArgoCD
GitOps with ArgoCDCloudOps2005
 
Kubernetes Services are sooo Yesterday!
Kubernetes Services are sooo Yesterday!Kubernetes Services are sooo Yesterday!
Kubernetes Services are sooo Yesterday!CloudOps2005
 
Amazon EKS: the good, the bad, and the ugly
Amazon EKS: the good, the bad, and the uglyAmazon EKS: the good, the bad, and the ugly
Amazon EKS: the good, the bad, and the uglyCloudOps2005
 
Kubernetes, Terraform, Vault, and Consul
Kubernetes, Terraform, Vault, and ConsulKubernetes, Terraform, Vault, and Consul
Kubernetes, Terraform, Vault, and ConsulCloudOps2005
 
SIG Multicluster and the Path to Federation
SIG Multicluster and the Path to FederationSIG Multicluster and the Path to Federation
SIG Multicluster and the Path to FederationCloudOps2005
 
To Russia with Love: Deploying Kubernetes in Exotic Locations On Prem
To Russia with Love: Deploying Kubernetes in Exotic Locations On PremTo Russia with Love: Deploying Kubernetes in Exotic Locations On Prem
To Russia with Love: Deploying Kubernetes in Exotic Locations On PremCloudOps2005
 
Operator SDK for K8s using Go
Operator SDK for K8s using GoOperator SDK for K8s using Go
Operator SDK for K8s using GoCloudOps2005
 
How to Handle your Kubernetes Upgrades
How to Handle your Kubernetes UpgradesHow to Handle your Kubernetes Upgrades
How to Handle your Kubernetes UpgradesCloudOps2005
 
Kubernetes and Cloud Native Meetup - March, 2019
Kubernetes and Cloud Native Meetup - March, 2019Kubernetes and Cloud Native Meetup - March, 2019
Kubernetes and Cloud Native Meetup - March, 2019CloudOps2005
 

More from CloudOps2005 (20)

Defense in Depth: Securing your new Kubernetes cluster from the challenges th...
Defense in Depth: Securing your new Kubernetes cluster from the challenges th...Defense in Depth: Securing your new Kubernetes cluster from the challenges th...
Defense in Depth: Securing your new Kubernetes cluster from the challenges th...
 
Human No, Machine Yes: Welcome to the CDF with Incremental Confidence
Human No, Machine Yes: Welcome to the CDF with Incremental ConfidenceHuman No, Machine Yes: Welcome to the CDF with Incremental Confidence
Human No, Machine Yes: Welcome to the CDF with Incremental Confidence
 
The Salmon Algorithm Spawning with Kubernetes
The Salmon Algorithm Spawning with KubernetesThe Salmon Algorithm Spawning with Kubernetes
The Salmon Algorithm Spawning with Kubernetes
 
Own your Destiny in the Cloud - Ian Rae - Cloud Native Day Montreal 2019
Own your Destiny in the Cloud - Ian Rae - Cloud Native Day Montreal 2019Own your Destiny in the Cloud - Ian Rae - Cloud Native Day Montreal 2019
Own your Destiny in the Cloud - Ian Rae - Cloud Native Day Montreal 2019
 
Plateformes et infrastructure infonuagique natif de ville de Montréall
Plateformes et infrastructure infonuagique natif de ville de MontréallPlateformes et infrastructure infonuagique natif de ville de Montréall
Plateformes et infrastructure infonuagique natif de ville de Montréall
 
Using Rook to Manage Kubernetes Storage with Ceph
Using Rook to Manage Kubernetes Storage with CephUsing Rook to Manage Kubernetes Storage with Ceph
Using Rook to Manage Kubernetes Storage with Ceph
 
Kafka on Kubernetes
Kafka on KubernetesKafka on Kubernetes
Kafka on Kubernetes
 
Kubernetes: Crossing the Chasm
Kubernetes: Crossing the ChasmKubernetes: Crossing the Chasm
Kubernetes: Crossing the Chasm
 
Distributed Logging with Kubernetes
Distributed Logging with KubernetesDistributed Logging with Kubernetes
Distributed Logging with Kubernetes
 
Kubernetes Security with Calico and Open Policy Agent
Kubernetes Security with Calico and Open Policy AgentKubernetes Security with Calico and Open Policy Agent
Kubernetes Security with Calico and Open Policy Agent
 
Advanced Deployment Strategies with Kubernetes and Istio
Advanced Deployment Strategies with Kubernetes and IstioAdvanced Deployment Strategies with Kubernetes and Istio
Advanced Deployment Strategies with Kubernetes and Istio
 
GitOps with ArgoCD
GitOps with ArgoCDGitOps with ArgoCD
GitOps with ArgoCD
 
Kubernetes Services are sooo Yesterday!
Kubernetes Services are sooo Yesterday!Kubernetes Services are sooo Yesterday!
Kubernetes Services are sooo Yesterday!
 
Amazon EKS: the good, the bad, and the ugly
Amazon EKS: the good, the bad, and the uglyAmazon EKS: the good, the bad, and the ugly
Amazon EKS: the good, the bad, and the ugly
 
Kubernetes, Terraform, Vault, and Consul
Kubernetes, Terraform, Vault, and ConsulKubernetes, Terraform, Vault, and Consul
Kubernetes, Terraform, Vault, and Consul
 
SIG Multicluster and the Path to Federation
SIG Multicluster and the Path to FederationSIG Multicluster and the Path to Federation
SIG Multicluster and the Path to Federation
 
To Russia with Love: Deploying Kubernetes in Exotic Locations On Prem
To Russia with Love: Deploying Kubernetes in Exotic Locations On PremTo Russia with Love: Deploying Kubernetes in Exotic Locations On Prem
To Russia with Love: Deploying Kubernetes in Exotic Locations On Prem
 
Operator SDK for K8s using Go
Operator SDK for K8s using GoOperator SDK for K8s using Go
Operator SDK for K8s using Go
 
How to Handle your Kubernetes Upgrades
How to Handle your Kubernetes UpgradesHow to Handle your Kubernetes Upgrades
How to Handle your Kubernetes Upgrades
 
Kubernetes and Cloud Native Meetup - March, 2019
Kubernetes and Cloud Native Meetup - March, 2019Kubernetes and Cloud Native Meetup - March, 2019
Kubernetes and Cloud Native Meetup - March, 2019
 

Recently uploaded

Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 

Recently uploaded (20)

Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 

Ticketmaster's use of Thanos for long term Prometheus storage

  • 1. Ticketmaster confidential. Do not distribute. Ticketmaster confidential. Do not distribute. Prometheus + Thanos Thanos long term storage for Prometheus
  • 2. Ticketmaster confidential. Do not distribute. About me... • Rocking it at Ticketmaster • Student in Master of Business Administration (MBA) Strategic project management • Bachelor of Applied Science (B.A.Sc.) in Computer Science • Devops, Cloud, Kubernetes, Kafka, etc. • Sport lover (Hockey, Dek Hockey, Ultimate Frisbee, Football…) 2 SECTION
  • 3. Ticketmaster confidential. Do not distribute. Where Prometheus & Thanos are in the CNCF landscape... 3 Prometheus
  • 4. Ticketmaster confidential. Do not distribute. Insert photography and crop to size of grey box. (24.25 in. x 7.21 in.) What is Prometheus? 4 Prometheus Prometheus was the second project to be graduated of the Cloud Native Computing Foundation.
  • 5. Ticketmaster confidential. Do not distribute. Prometheus Architecture
  • 6. Ticketmaster confidential. Do not distribute. How we are using Prometheus at Ticketmaster? • We are using the Prometheus-Operator made by CoreOS (RedHat (IBM)) • We have created a Helm chart that created common pulling jobs, settings and exporters • Scrape EC2 instances based on Tags, Kubernetes • Exporters like cloudwatch-metrics, kafka, blackbox • Ingresses • Thanos • Federations 6 Prometheus
  • 7. Ticketmaster confidential. Do not distribute. Why moving away from the federation? • Calculate the right ingestion rate of our scrape job is not really easy • When the disk is full • Prometheus stop working • We needed to delete the PVC to recreated it bigger. • Long term storage is costly SSD == $$$$ • Single point of failure • Availability • Operator error • Hardware failure • Rollout 7 Prometheus
  • 8. Ticketmaster confidential. Do not distribute. Solution
  • 9. Ticketmaster confidential. Do not distribute. Goals 9 Thanos - Easy Deployment model - Minimal number of dependencies - Minimal baseline cost Have a global view Seamless integration with Prometheus Increase retentionHave a HA in place
  • 10. Ticketmaster confidential. Do not distribute. Global view 10 Thanos
  • 11. Ticketmaster confidential. Do not distribute. Global view + HA 11 Thanos
  • 12. Ticketmaster confidential. Do not distribute. Increase retention (Persist data) 12 Thanos
  • 13. Ticketmaster confidential. Do not distribute. Increase retention (Querying) • A series is made up of one or more “chunks” • A chunk contains ~120 samples each • Chunks can be retrieved through HTTP byte range queries Example: • 1000 series @ 30s scrape interval • Query 1 year • 8.7 million chunks/range queries • Chunks of the same series are aligned • Similar series are aligned due to same metric name This reduce request count by 4=6 orders of magnitude. 8.7 million requests turned into O(20) requests • 13 Thanos
  • 14. Ticketmaster confidential. Do not distribute. Increase retention (Querying) 14 Thanos group chunk same series group by same metric name
  • 15. Ticketmaster confidential. Do not distribute. Compaction 15 Thanos
  • 16. Ticketmaster confidential. Do not distribute. Full Architecture 16 Thanos
  • 17. Ticketmaster confidential. Do not distribute. Deployment Model( Example) • Federation through Store API 17 Thanos
  • 18. Ticketmaster confidential. Do not distribute. Cost • Store + Query node + Compaction ~ Savings on Prometheus side (+/- 0) • Fewer SSD space on Prometheus side (Savings) • Basically we are only paying for your data stored in S3/GCS/etc + requests 18 Thanos
  • 19. Ticketmaster confidential. Do not distribute. Cortex 19 Alternatives
  • 20. Ticketmaster confidential. Do not distribute. Cortex - Reason that we didn’t choose Cortex • Documentation is not really good • Need to maintain another dataset (NoSQL DB) • Interfere with the datapath 20 Alternatives
  • 21. Ticketmaster confidential. Do not distribute. Question?
  • 22. Ticketmaster confidential. Do not distribute.