SplunkLive! London 2019: Paddy Power Betfair


The SRE team from Paddy Power Betfair explain how Splunk is used to handle 13TB of daily ingest.

Published in: Technology

  1. WHAT DO WE DO WITH THE 13TB OF DAILY INGEST? Paddy Power Betfair: Who can handle our data? David Ashe, Senior SRE; Gerry Healy, SRE. SplunkLive! London, June 2019
  2. 26/06/2019 PPB SRE team: David Ashe, Senior SRE, Dublin Office; Gerry Healy, SRE, Dublin. 11 years in Banking; over 9 years in PPB. SRE based in Dublin, London and Porto
      • Consultancy
      • Monitoring and Alerting
      • Automation
  3. Paddy Power Betfair: Part of the Flutter group. Market: Product; Channel
      • UK and Ireland: Sportsbook and Gaming; Online and Retail
      • UK&I, Europe, ROW: Sportsbook, Exchange and Gaming; Online
      • Australia: Sportsbook; Online
      • USA: Sportsbook and Daily-Fantasy-Sports; Online and Retail
      • USA: Advanced Deposit Wagering (Tote) and Television broadcast; Online
      • Georgia, Armenia: Sportsbook and Gaming; Online
      • …plus a growing B2B portfolio…
      Brand revenue mix (segment labels lost in extraction): 51%, 21%, 10%, 18%
  4. Exchange – Sophisticated Sports Bettors
  5. (no text on this slide)
  6. Situation before Splunk: Paddy Power and Betfair merged in 2015
      • After the merger there were a lot of synergies to be made. Single tools chosen across the board
      • Manage a large number of sources, hosts (1000s) and users
      • Scale well with loads of data (7-15TB of daily ingest)
      • Initially required for Dev and ITOps to monitor and get stats
  7. Why Splunk Cloud?
      • Managed by Splunk in the cloud, scales very easily
      • Loads of free training and support resources on the web
      • Splunk support and a CRM/CSM (Gavin Nash) provided. Escalate anything to them
      • Easy to onboard data
      • Easy to automate in our pipeline deployments: we have over 10000 devices, so automating as much as possible is crucial
      • Integrates well with other alerting tools (email, Slack and PagerDuty) when alerting on issues
      • Single sign-on with Windows makes user management simple
  8. Ingestion increase over 24 months: 1TB to 13TB without compromising the effectiveness of the tool
  9. PPB consists of 100s of microservices
  10. Splunk Architecture and metrics: 7 TB average daily ingest; 1700 users; 1m+ daily searches; 1250 dashboards
  11. Use Cases
  12. Fraud
      • Protect customer accounts
      • One of the most active users of Splunk in PPBF
      • Identify accounts that have had a high number of failed login attempts
      • Suspend accounts, contact customers and ask them to use a strong password
      • Attacks from countries where gambling is restricted or banned totally
  13. Fraud - Quickly identify risk accounts. Dashboard panels:
      • Betfair: yesterday’s failed logins per country
      • Betfair: yesterday’s successful logins per country
      • Betfair: last 60 minutes, failed logins per country
      • Betfair: last 60 minutes, successful logins per country
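
The failed-login check described in the Fraud slides can be sketched outside Splunk in a few lines of Python. This is only an illustration of the counting logic, not the team's actual search: the event tuples, account IDs, country codes, the `failure` outcome label and the threshold of 3 are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical login events: (account_id, country_code, outcome).
# Real events would come from the authentication logs indexed in Splunk.
EVENTS = [
    ("acc-1", "IE", "failure"),
    ("acc-1", "IE", "failure"),
    ("acc-1", "IE", "failure"),
    ("acc-1", "IE", "success"),
    ("acc-2", "GB", "failure"),
    ("acc-3", "FR", "failure"),
    ("acc-3", "FR", "failure"),
    ("acc-3", "FR", "failure"),
    ("acc-3", "FR", "failure"),
    ("acc-3", "FR", "failure"),
]


def accounts_over_threshold(events, threshold):
    """Return account IDs with at least `threshold` failed logins, sorted."""
    failures = Counter(
        account for account, _country, outcome in events if outcome == "failure"
    )
    return sorted(acc for acc, count in failures.items() if count >= threshold)


def failures_by_country(events):
    """Count failed logins per country, as in the per-country dashboard panels."""
    return Counter(
        country for _account, country, outcome in events if outcome == "failure"
    )


if __name__ == "__main__":
    print(accounts_over_threshold(EVENTS, threshold=3))
    print(failures_by_country(EVENTS))
```

Accounts returned by `accounts_over_threshold` would be the candidates for suspension and customer contact; the per-country counts mirror the four dashboard panels, which differ only in time window and login outcome.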
  14. Customer Services – Reacting quicker
      • Be aware of issues before an increase in contacts
      • Get Tactical Messages out to stem contact levels
      • Shorter queue and a better service for customers
      • Used to investigate common issues, quicker turnaround
      • Looking to expand to deal with other common issues
  15. Customer Services – Reacting quicker
  16. Capacity Management – REST interfaces
      • Know what your inventory is and plan for future requirements
      • Understand VM distribution and resilience
      • Ingests data produced by nightly jobs that make API calls to OpenStack and ServiceNow
      • Joins the data to build customized dashboards
  17. Capacity Management – Custom built to help manage our private cloud
  18. Capacity Management – Drill down to find TLA (Micro Service) owners
  19. Capacity Management – Distribution of VMs on Hypervisors
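
The join described in the Capacity Management slides can be illustrated with a small Python sketch. It assumes the nightly jobs have already produced two datasets: a VM inventory (as might come from OpenStack) and a TLA-to-owner mapping (as might come from ServiceNow). The field names, hostnames, TLAs and team names below are hypothetical, and the real dashboards are built in Splunk rather than Python.

```python
from collections import Counter

# Hypothetical VM inventory, one record per VM.
VMS = [
    {"vm": "vm-001", "hypervisor": "hv-a", "tla": "abc"},
    {"vm": "vm-002", "hypervisor": "hv-b", "tla": "abc"},
    {"vm": "vm-003", "hypervisor": "hv-b", "tla": "xyz"},
    {"vm": "vm-004", "hypervisor": "hv-a", "tla": "qrs"},
]

# Hypothetical ownership records keyed by TLA (microservice identifier).
OWNERS = {"abc": "team-alpha", "xyz": "team-beta"}


def join_owners(vms, owners):
    """Attach an owner to each VM by TLA; unmatched TLAs get 'unknown'."""
    return [{**vm, "owner": owners.get(vm["tla"], "unknown")} for vm in vms]


def vms_per_hypervisor(vms):
    """Count VMs on each hypervisor to show how load is distributed."""
    return Counter(vm["hypervisor"] for vm in vms)


def single_hypervisor_tlas(vms):
    """Return TLAs whose VMs all sit on one hypervisor (a resilience risk)."""
    hosts = {}
    for vm in vms:
        hosts.setdefault(vm["tla"], set()).add(vm["hypervisor"])
    return sorted(tla for tla, hvs in hosts.items() if len(hvs) == 1)


if __name__ == "__main__":
    for row in join_owners(VMS, OWNERS):
        print(row)
    print(vms_per_hypervisor(VMS))
    print(single_hypervisor_tlas(VMS))
```

Joining on the TLA is what makes the owner drill-down possible, and grouping by hypervisor gives the distribution view; a TLA that lives on a single hypervisor is the kind of resilience gap such a dashboard would surface.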
  20. Value of Splunk – Zero latency during busy days
      • Grand National: busiest day of the year
      • Ingesting 13TB of data
      • Critical to have zero latency
      • Potential loss of revenue, customers and reputation
      • Confirm fully recovered
  21. Top tips
      • Using correct sourcetypes = cleaner data
      • Dashboards should only have enough panels to fill your screen. Save panels as saved searches
      • Splunk Answers is a great resource
      • Tune Splunk: work with Splunk to ensure you are sending data in the most efficient way
      Next steps
      • Promote Splunk’s capabilities to more commercial teams in PPB
      • With the help of our CSM, organize roadshows in our European locations
      • Continuous improvements
  22. Thank You