I’m going to talk to you about Time Series data.
I’ll show you what it is and how we use it.
It's the most important and valuable alerting and diagnostic tool we have.
Focus: a monitoring estate ingesting ~100k data points a second.
A single point in your estate – means nothing
Sound good? What would be even more interesting would be if we could see how that value changed over time. Let’s bring in Bernard’s brothers…..
And finally, what if we could also bring in some of the other metrics we mentioned?
Really useful data.
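To make the "single point means nothing" idea concrete, here is a toy illustration (the metric name and numbers are made up):

```python
# Toy illustration (made-up numbers): a single sample tells you
# nothing, but the same metric over time shows a trend.
single_point = ("cpu.user", 1700000000, 95.0)   # is 95% bad? No idea.

series = [                                       # (timestamp, value)
    (1700000000, 12.0),
    (1700000010, 14.5),
    (1700000020, 13.9),
    (1700000030, 71.2),
    (1700000040, 95.0),
]
# With history, the same 95.0 is clearly a sudden spike.
spike = series[-1][1] - series[0][1]
print(spike)  # 83.0
```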
This data is the lifeblood of your system.
If you don’t think this data is valuable then none of the rest of what I have to say will be of any interest….
* Virtual or physical, including network devices, storage arrays and your good old-fashioned application, web and database servers.
FREE!!! OMG!!!
Our first implementation.
As a side note, this is a pretty effective way of getting the team that owns the hardware to provide you with decent servers in a data centre. You can jump the queue by showing them something like this.
We chose OpenTSDB
We made a more usable visualiser
Ticketmaster made Metrilyx.
TSDB is GREAT for retrospective Root Cause Analysis
We still have ALL of the data since we started.
500 billion data points.
ingesting data from the PRODUCTION estate at 70k points a second.
“if only I could have been notified when this happened”
And this
They wanted a dashboard of graphs that update in real time.
Either way, TSDB doesn’t really support these requirements in a scalable manner.
Let’s go back to the TSDB architecture to see why.
From TSDB website.
The metric data is sampled (by the COLLECTOR)
LOCAL or REMOTE via SNMP (it’s not always possible to deploy a COLLECTOR on every machine)
Sent to the TSD
deduping and compression
writes to HBase, which in most cases runs on HDFS.
alerting and crons.
HTTP or RPC calls to the TSD, which in turn goes to HBase. That's a BIG problem.
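As an aside, the collector-to-TSD leg above is just a plain-text protocol: collectors send "put" lines over TCP (port 4242 by default). A minimal sketch of building one of those lines:

```python
# Build an OpenTSDB telnet-style "put" line; in production this
# would be written to the TSD's TCP socket (port 4242 by default).
def format_put(metric, timestamp, value, tags):
    tag_str = " ".join(f"{k}={v}" for k, v in sorted(tags.items()))
    return f"put {metric} {timestamp} {value} {tag_str}"

line = format_put("sys.cpu.user", 1356998400, 42.5, {"host": "web01"})
print(line)  # put sys.cpu.user 1356998400 42.5 host=web01
```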
So that’s the architecture – and here’s a physical implementation.
We decided to write our own solution, called TSP, available open source.
drop-in replacement for the tcollectors (Forwarders).
More efficient
Write to multiple targets.
Still write to TSDB
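The multiple-target idea can be sketched like this (the class and names are illustrative, not the real TSP internals):

```python
# Illustrative sketch of a Forwarder that fans each sample out to
# multiple targets; the real TSP implementation will differ.
class Forwarder:
    def __init__(self, targets):
        # targets: callables that each accept one metric line
        self.targets = targets

    def write(self, line):
        # every target gets every sample, so TSDB keeps receiving
        # data while new consumers are added alongside it
        for target in self.targets:
            target(line)

tsdb_lines, feed_lines = [], []
fwd = Forwarder([tsdb_lines.append, feed_lines.append])
fwd.write("put sys.cpu.user 1356998400 42.5 host=web01")
```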
The second new component: the Aggregator.
Out of the Aggregator comes the Site Feed.
A stream of the metric data from ALL sources in the estate, and I can add any number of subscribers.
I can now use these valuable metrics in REAL TIME from multiple CONCURRENT consumers.
Currently 3 consumers
PLUS TSDB
Simple Health Check on the feed.
Long delays.
METRICS that stop.
Profiles the source of the metrics.
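A sketch of the "metrics that stop" check, assuming we track the last time each metric was seen (names and thresholds here are made up):

```python
# Illustrative staleness check: flag any metric whose latest sample
# is older than max_age seconds.
def stale_metrics(last_seen, now, max_age=300):
    return sorted(m for m, ts in last_seen.items() if now - ts > max_age)

last_seen = {"sys.cpu.user": 995, "app.requests": 400}
print(stale_metrics(last_seen, now=1000))  # ['app.requests']
```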
In-memory TSDB. Supports Nagios. Not perfect, but easy.
Riemann.
Riemann handles the CONFIGURED alert problem well.
But there are 10s of thousands of metrics captured because we like to capture ALL of the metrics. How do we find the valuable information in there?
Luckily for us, Etsy asked the same question and then provided what we hope is the answer with Kale.
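Kale (via its Skyline component) runs an ensemble of statistical detectors over every metric. As a rough illustration of the idea only, not Kale's actual algorithms, a single three-sigma detector looks like:

```python
# Simplified anomaly detector: flag a point more than `threshold`
# standard deviations from the mean of a recent window of samples.
import statistics

def is_anomalous(window, point, threshold=3.0):
    mean = statistics.mean(window)
    stdev = statistics.stdev(window)
    if stdev == 0:
        return point != mean
    return abs(point - mean) / stdev > threshold

history = [10.0, 11.0, 10.5, 9.8, 10.2, 10.1, 10.4, 9.9]
print(is_anomalous(history, 10.3))  # steady value inside the window
print(is_anomalous(history, 30.0))  # a spike well outside the window
```

Skyline's real value is running many such detectors and taking a consensus, which cuts down false positives from any single test.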
After that… we don't know.
Maybe the future is self-aware artificial intelligence defence network