SlideShare a Scribd company logo
1 of 8
Download to read offline
Detecting fraud
in sport
| Domen Požrl, Stefano Gemin
23.11.2023
2
Agenda
Sportradar’s key datasets
The Fraud Detection System journey
The Model and the Challenges
From Day After to Real Time
The Inference Streaming Pipeline
04
05
3
Sportradar’skeydatasets
MTS BOOKMAKER
BET
- Bookmaker
- Channel (online, retail)
- Bet status (accepted, rejected, …)
- Stake
- Profit / Loss
- Sport event(s)
- Market(s) / outcome(s)
- Odds
- User info
COMPUTER
VISION
LIVESCORE
CRAWLERS
DATA
JOURNALISTS
LIVE DATA
- Event details (status, time, score)
- Timeline (ex. Shots, fouls, goals)
- Tracking data (players, ball, …)
ODDS CRAWLERS
ODDS CHANGE
- Bookmaker
- Sport event
- Event details (status, time, score)
- Market
- Odds
600+ bookmakers
crawled
~ 25 billions odds
movements per year
200+ bookmakers
with bet integration
with Sportradar
Managed Trading
Services
~ 5 billions bets per year
ANALYTICS PLATFORM
900,000+ sport events per year covered live
9,500+ data journalists globally
70+ sports
DATA LAKE
-Model training
- Batch inferencing
- Performance
monitoring
KAFKA
-Real time inferencing
INTEGRITY ANALYSTS
Over 9,000 matches
escalated since 2009
SPORT EVENT ANALYSIS
- Hotlisting data
- Escalation data
4
TheFraudDetectionSystemjourney
Alerting system
- SME driven
- Looking at anomalous odds
movements or values
- 125,000+ alerts per year
- Less than 4% of them
suspicious after manual review
Alert scoring model
- Binary classification model (on alerts)
- Features: Alert data + match properties
- 70% of alerts automatically classified
as not suspicious
Match Fixing Detection model – day
after
- Binary classification model (on
matches)
- Batch inference on matches of
previous day
- Features: alert scores + betting data
+ match properties
Match Fixing Detection model –
real time
- Features: subset of previous
model (not possible to compute
some features in real time)
- Inference time: starting one day
prior to kick off, until match is over
- In 2023 up to Q4 814 escalated
matches were initially flagged by
the model (160 of them uniquely)
2018 2022
2020
5
TheModelandtheChallenges
Binary classification model
Maximising the AUC on validation
set over multiple iterations
Model explainability
Extremely Unbalanced Dataset
9,325 escalations out of 4,522,387
Target Variable: Hotlisted
Weighted by: Escalated
Seasonality of sport events Random split on 3 years of
historical data
Keeping track with evolving
match-fixing trends
&
Unexpected event outcomes
Human in the loop
&
Continuous retraining
Crucial to provide evidence of
fixing
Feature contribution at inference
time
6
FromDayAftertoRealTime
Current time: 45'
Potential payout: 9,000 €
Est. probability: 1%
Expected Profit: 9€
Current time: 60'
Potential payout : 9,000 €
Est. probability: 25%
Expected Profit: -2,175€
Current time: End of match
Potential payout : 9,000 €
Est. probability: 100%
Expected Profit: -9,000€
45' 60' End
• A single bet placed at 45’
• Odds: 90
• Stake: 100€
• Outcome: Green Team wins
• Changes in the probability for the
outcome to occur
7
TheInferenceStreamingPipeline
• Processes up to 2000 messages per second
• Outputs around 3 predictions per second
• Score, time, alert-based predictions
• Feature construction
• Consumes 3 Kafka topics and outputs to 1
• Modelling resources fetched from S3 on
startup
• Aggregated features periodically retrieved
from Redshift
• Outputs to S3 via NiFi, with Athena on top
for analytics
THANK YOU!

More Related Content

Similar to [DSC Europe 23] Stefano Gemin & Domen Pozrl - Detecting Fraud in Sport.pdf

Decentralized Sports Media Platform
Decentralized Sports Media PlatformDecentralized Sports Media Platform
Decentralized Sports Media PlatformTanya McTavish
 
Sportz Interactive Product Portfolio
Sportz Interactive Product PortfolioSportz Interactive Product Portfolio
Sportz Interactive Product Portfolioshaileshg
 
Slot Yield Presentation 2015
Slot Yield Presentation 2015Slot Yield Presentation 2015
Slot Yield Presentation 2015richlehman
 
Financial Races
Financial RacesFinancial Races
Financial Racesepicwebaz
 
WSO2Con USA 2015: Patterns for Deploying Analytics in the Real World
WSO2Con USA 2015: Patterns for Deploying Analytics in the Real WorldWSO2Con USA 2015: Patterns for Deploying Analytics in the Real World
WSO2Con USA 2015: Patterns for Deploying Analytics in the Real WorldWSO2
 
Stock Pitch For Satellite Based Solutions PowerPoint Presentation Ppt Slide T...
Stock Pitch For Satellite Based Solutions PowerPoint Presentation Ppt Slide T...Stock Pitch For Satellite Based Solutions PowerPoint Presentation Ppt Slide T...
Stock Pitch For Satellite Based Solutions PowerPoint Presentation Ppt Slide T...SlideTeam
 
Pycricbuzz - a python library to fetch live cricket scores
Pycricbuzz -  a python library to fetch live cricket scoresPycricbuzz -  a python library to fetch live cricket scores
Pycricbuzz - a python library to fetch live cricket scoresShivam Mitra
 
ArcLight Tournament System
ArcLight Tournament SystemArcLight Tournament System
ArcLight Tournament SystemJason Kaehler
 
[DSC Europe 23] Stefano Gemin & Lidija Jovanovska - Leveraging Language Model...
[DSC Europe 23] Stefano Gemin & Lidija Jovanovska - Leveraging Language Model...[DSC Europe 23] Stefano Gemin & Lidija Jovanovska - Leveraging Language Model...
[DSC Europe 23] Stefano Gemin & Lidija Jovanovska - Leveraging Language Model...DataScienceConferenc1
 
Evenbet Gaming DFS Platform
Evenbet Gaming DFS PlatformEvenbet Gaming DFS Platform
Evenbet Gaming DFS Platformstardmitry
 
Simply Business and Snowplow - Multichannel Attribution Analysis
Simply Business and Snowplow - Multichannel Attribution AnalysisSimply Business and Snowplow - Multichannel Attribution Analysis
Simply Business and Snowplow - Multichannel Attribution AnalysisStewart Duncan
 
Bet strikerz updated
Bet strikerz updatedBet strikerz updated
Bet strikerz updatedAli Al-Enzi
 
M|18 Analytics in the Real World, Case Studies and Use Cases
M|18 Analytics in the Real World, Case Studies and Use CasesM|18 Analytics in the Real World, Case Studies and Use Cases
M|18 Analytics in the Real World, Case Studies and Use CasesMariaDB plc
 

Similar to [DSC Europe 23] Stefano Gemin & Domen Pozrl - Detecting Fraud in Sport.pdf (20)

Decentralized Sports Media Platform
Decentralized Sports Media PlatformDecentralized Sports Media Platform
Decentralized Sports Media Platform
 
Sportz Interactive Product Portfolio
Sportz Interactive Product PortfolioSportz Interactive Product Portfolio
Sportz Interactive Product Portfolio
 
Financial Races Presentation
Financial Races PresentationFinancial Races Presentation
Financial Races Presentation
 
Slot Yield Presentation 2015
Slot Yield Presentation 2015Slot Yield Presentation 2015
Slot Yield Presentation 2015
 
ccs 066790 Algorithm Investement Report
ccs 066790 Algorithm Investement Reportccs 066790 Algorithm Investement Report
ccs 066790 Algorithm Investement Report
 
PgConf_2016_EU.pptx
PgConf_2016_EU.pptxPgConf_2016_EU.pptx
PgConf_2016_EU.pptx
 
Casino online system arplc
Casino online system   arplcCasino online system   arplc
Casino online system arplc
 
Financial Races
Financial RacesFinancial Races
Financial Races
 
WSO2Con USA 2015: Patterns for Deploying Analytics in the Real World
WSO2Con USA 2015: Patterns for Deploying Analytics in the Real WorldWSO2Con USA 2015: Patterns for Deploying Analytics in the Real World
WSO2Con USA 2015: Patterns for Deploying Analytics in the Real World
 
Stock Pitch For Satellite Based Solutions PowerPoint Presentation Ppt Slide T...
Stock Pitch For Satellite Based Solutions PowerPoint Presentation Ppt Slide T...Stock Pitch For Satellite Based Solutions PowerPoint Presentation Ppt Slide T...
Stock Pitch For Satellite Based Solutions PowerPoint Presentation Ppt Slide T...
 
datasolution 263800 Algorithm Investment Report
datasolution 263800 Algorithm Investment Reportdatasolution 263800 Algorithm Investment Report
datasolution 263800 Algorithm Investment Report
 
Pycricbuzz - a python library to fetch live cricket scores
Pycricbuzz -  a python library to fetch live cricket scoresPycricbuzz -  a python library to fetch live cricket scores
Pycricbuzz - a python library to fetch live cricket scores
 
ArcLight Tournament System
ArcLight Tournament SystemArcLight Tournament System
ArcLight Tournament System
 
[DSC Europe 23] Stefano Gemin & Lidija Jovanovska - Leveraging Language Model...
[DSC Europe 23] Stefano Gemin & Lidija Jovanovska - Leveraging Language Model...[DSC Europe 23] Stefano Gemin & Lidija Jovanovska - Leveraging Language Model...
[DSC Europe 23] Stefano Gemin & Lidija Jovanovska - Leveraging Language Model...
 
www-businesswire-com
www-businesswire-comwww-businesswire-com
www-businesswire-com
 
Intelligent Systems for Process Mining
Intelligent Systems for Process MiningIntelligent Systems for Process Mining
Intelligent Systems for Process Mining
 
Evenbet Gaming DFS Platform
Evenbet Gaming DFS PlatformEvenbet Gaming DFS Platform
Evenbet Gaming DFS Platform
 
Simply Business and Snowplow - Multichannel Attribution Analysis
Simply Business and Snowplow - Multichannel Attribution AnalysisSimply Business and Snowplow - Multichannel Attribution Analysis
Simply Business and Snowplow - Multichannel Attribution Analysis
 
Bet strikerz updated
Bet strikerz updatedBet strikerz updated
Bet strikerz updated
 
M|18 Analytics in the Real World, Case Studies and Use Cases
M|18 Analytics in the Real World, Case Studies and Use CasesM|18 Analytics in the Real World, Case Studies and Use Cases
M|18 Analytics in the Real World, Case Studies and Use Cases
 

More from DataScienceConferenc1

[DSC MENA 24] Mostafa_Essa_-_Ai_and_cloud.pdf
[DSC MENA 24] Mostafa_Essa_-_Ai_and_cloud.pdf[DSC MENA 24] Mostafa_Essa_-_Ai_and_cloud.pdf
[DSC MENA 24] Mostafa_Essa_-_Ai_and_cloud.pdfDataScienceConferenc1
 
[DSC MENA 24] Yasser_El_Bendary - How NLP & LLMs model can excel in comprehen...
[DSC MENA 24] Yasser_El_Bendary - How NLP & LLMs model can excel in comprehen...[DSC MENA 24] Yasser_El_Bendary - How NLP & LLMs model can excel in comprehen...
[DSC MENA 24] Yasser_El_Bendary - How NLP & LLMs model can excel in comprehen...DataScienceConferenc1
 
[DSC MENA 24] Medhat_Kandil - Empowering Egypt's AI & Biotechnology Scenes.pdf
[DSC MENA 24] Medhat_Kandil - Empowering Egypt's AI & Biotechnology Scenes.pdf[DSC MENA 24] Medhat_Kandil - Empowering Egypt's AI & Biotechnology Scenes.pdf
[DSC MENA 24] Medhat_Kandil - Empowering Egypt's AI & Biotechnology Scenes.pdfDataScienceConferenc1
 
[DSC MENA 24] Youssef_Kamal - Data governance and quality.pdf
[DSC MENA 24] Youssef_Kamal - Data governance and quality.pdf[DSC MENA 24] Youssef_Kamal - Data governance and quality.pdf
[DSC MENA 24] Youssef_Kamal - Data governance and quality.pdfDataScienceConferenc1
 
[DSC MENA 24] Abdelrahman_Ghallab_-_Data_Product_mgmt.pdf
[DSC MENA 24] Abdelrahman_Ghallab_-_Data_Product_mgmt.pdf[DSC MENA 24] Abdelrahman_Ghallab_-_Data_Product_mgmt.pdf
[DSC MENA 24] Abdelrahman_Ghallab_-_Data_Product_mgmt.pdfDataScienceConferenc1
 
[DSC MENA 24] Asmaa_Eltaher_-_Innovation_Beyond_Brainstorming.pptx
[DSC MENA 24] Asmaa_Eltaher_-_Innovation_Beyond_Brainstorming.pptx[DSC MENA 24] Asmaa_Eltaher_-_Innovation_Beyond_Brainstorming.pptx
[DSC MENA 24] Asmaa_Eltaher_-_Innovation_Beyond_Brainstorming.pptxDataScienceConferenc1
 
[DSC MENA 24] Muhammad_Ezzat_-_Sustianable_Growth_Empowerment.pdf
[DSC MENA 24] Muhammad_Ezzat_-_Sustianable_Growth_Empowerment.pdf[DSC MENA 24] Muhammad_Ezzat_-_Sustianable_Growth_Empowerment.pdf
[DSC MENA 24] Muhammad_Ezzat_-_Sustianable_Growth_Empowerment.pdfDataScienceConferenc1
 
[DSC MENA 24] Basma_Rady_-_Building_a_Data_Driven_Culture_in_Your_Organizatio...
[DSC MENA 24] Basma_Rady_-_Building_a_Data_Driven_Culture_in_Your_Organizatio...[DSC MENA 24] Basma_Rady_-_Building_a_Data_Driven_Culture_in_Your_Organizatio...
[DSC MENA 24] Basma_Rady_-_Building_a_Data_Driven_Culture_in_Your_Organizatio...DataScienceConferenc1
 
[DSC MENA 24] Ahmed_Muselhy_-_Unveiling-the-Secrets-of-AI-in-Hiring.pdf
[DSC MENA 24] Ahmed_Muselhy_-_Unveiling-the-Secrets-of-AI-in-Hiring.pdf[DSC MENA 24] Ahmed_Muselhy_-_Unveiling-the-Secrets-of-AI-in-Hiring.pdf
[DSC MENA 24] Ahmed_Muselhy_-_Unveiling-the-Secrets-of-AI-in-Hiring.pdfDataScienceConferenc1
 
[DSC MENA 24] Ziad_Diab_-_Data-Driven_Disruption_-_The_Role_of_Data_Strategy_...
[DSC MENA 24] Ziad_Diab_-_Data-Driven_Disruption_-_The_Role_of_Data_Strategy_...[DSC MENA 24] Ziad_Diab_-_Data-Driven_Disruption_-_The_Role_of_Data_Strategy_...
[DSC MENA 24] Ziad_Diab_-_Data-Driven_Disruption_-_The_Role_of_Data_Strategy_...DataScienceConferenc1
 
[DSC MENA 24] Mohammad_Essam_- Leveraging Scene Graphs for Generative AI and ...
[DSC MENA 24] Mohammad_Essam_- Leveraging Scene Graphs for Generative AI and ...[DSC MENA 24] Mohammad_Essam_- Leveraging Scene Graphs for Generative AI and ...
[DSC MENA 24] Mohammad_Essam_- Leveraging Scene Graphs for Generative AI and ...DataScienceConferenc1
 
[DSC MENA 24] Ahmed_Fahmy - Navigating the Future.pdf
[DSC MENA 24] Ahmed_Fahmy - Navigating the Future.pdf[DSC MENA 24] Ahmed_Fahmy - Navigating the Future.pdf
[DSC MENA 24] Ahmed_Fahmy - Navigating the Future.pdfDataScienceConferenc1
 
[DSC MENA 24] Hany_Saad_Gheit_-_Azure_OpenAI_service.pptx
[DSC MENA 24] Hany_Saad_Gheit_-_Azure_OpenAI_service.pptx[DSC MENA 24] Hany_Saad_Gheit_-_Azure_OpenAI_service.pptx
[DSC MENA 24] Hany_Saad_Gheit_-_Azure_OpenAI_service.pptxDataScienceConferenc1
 
[DSC MENA 24] Nezar_El_Kady_-_From_Turing_to_Transformers__Navigating_the_AI_...
[DSC MENA 24] Nezar_El_Kady_-_From_Turing_to_Transformers__Navigating_the_AI_...[DSC MENA 24] Nezar_El_Kady_-_From_Turing_to_Transformers__Navigating_the_AI_...
[DSC MENA 24] Nezar_El_Kady_-_From_Turing_to_Transformers__Navigating_the_AI_...DataScienceConferenc1
 
[DSC MENA 24] Amira_Abdelaziz_-_AI_in_Financial_Services.pptx
[DSC MENA 24] Amira_Abdelaziz_-_AI_in_Financial_Services.pptx[DSC MENA 24] Amira_Abdelaziz_-_AI_in_Financial_Services.pptx
[DSC MENA 24] Amira_Abdelaziz_-_AI_in_Financial_Services.pptxDataScienceConferenc1
 
[DSC MENA 24] Omar_Ossama - My Journey from the Field of Oil & Gas, to the Ex...
[DSC MENA 24] Omar_Ossama - My Journey from the Field of Oil & Gas, to the Ex...[DSC MENA 24] Omar_Ossama - My Journey from the Field of Oil & Gas, to the Ex...
[DSC MENA 24] Omar_Ossama - My Journey from the Field of Oil & Gas, to the Ex...DataScienceConferenc1
 
[DSC MENA 24] Ramy_Agieb_-_Advancements_in_Artificial_Intelligence_for_Cybers...
[DSC MENA 24] Ramy_Agieb_-_Advancements_in_Artificial_Intelligence_for_Cybers...[DSC MENA 24] Ramy_Agieb_-_Advancements_in_Artificial_Intelligence_for_Cybers...
[DSC MENA 24] Ramy_Agieb_-_Advancements_in_Artificial_Intelligence_for_Cybers...DataScienceConferenc1
 
[DSC MENA 24] Sohaila_Diab_-_Lets_Talk_Gen_AI_Presentation.pptx
[DSC MENA 24] Sohaila_Diab_-_Lets_Talk_Gen_AI_Presentation.pptx[DSC MENA 24] Sohaila_Diab_-_Lets_Talk_Gen_AI_Presentation.pptx
[DSC MENA 24] Sohaila_Diab_-_Lets_Talk_Gen_AI_Presentation.pptxDataScienceConferenc1
 
[DSC MENA 24] Amal_Elgammal_-_QUALITOP_presentation.pptx
[DSC MENA 24] Amal_Elgammal_-_QUALITOP_presentation.pptx[DSC MENA 24] Amal_Elgammal_-_QUALITOP_presentation.pptx
[DSC MENA 24] Amal_Elgammal_-_QUALITOP_presentation.pptxDataScienceConferenc1
 
[DSC MENA 24] Abdelrahman_Sleem_-_AI_For_Marketing_DSC.pdf
[DSC MENA 24] Abdelrahman_Sleem_-_AI_For_Marketing_DSC.pdf[DSC MENA 24] Abdelrahman_Sleem_-_AI_For_Marketing_DSC.pdf
[DSC MENA 24] Abdelrahman_Sleem_-_AI_For_Marketing_DSC.pdfDataScienceConferenc1
 

More from DataScienceConferenc1 (20)

[DSC MENA 24] Mostafa_Essa_-_Ai_and_cloud.pdf
[DSC MENA 24] Mostafa_Essa_-_Ai_and_cloud.pdf[DSC MENA 24] Mostafa_Essa_-_Ai_and_cloud.pdf
[DSC MENA 24] Mostafa_Essa_-_Ai_and_cloud.pdf
 
[DSC MENA 24] Yasser_El_Bendary - How NLP & LLMs model can excel in comprehen...
[DSC MENA 24] Yasser_El_Bendary - How NLP & LLMs model can excel in comprehen...[DSC MENA 24] Yasser_El_Bendary - How NLP & LLMs model can excel in comprehen...
[DSC MENA 24] Yasser_El_Bendary - How NLP & LLMs model can excel in comprehen...
 
[DSC MENA 24] Medhat_Kandil - Empowering Egypt's AI & Biotechnology Scenes.pdf
[DSC MENA 24] Medhat_Kandil - Empowering Egypt's AI & Biotechnology Scenes.pdf[DSC MENA 24] Medhat_Kandil - Empowering Egypt's AI & Biotechnology Scenes.pdf
[DSC MENA 24] Medhat_Kandil - Empowering Egypt's AI & Biotechnology Scenes.pdf
 
[DSC MENA 24] Youssef_Kamal - Data governance and quality.pdf
[DSC MENA 24] Youssef_Kamal - Data governance and quality.pdf[DSC MENA 24] Youssef_Kamal - Data governance and quality.pdf
[DSC MENA 24] Youssef_Kamal - Data governance and quality.pdf
 
[DSC MENA 24] Abdelrahman_Ghallab_-_Data_Product_mgmt.pdf
[DSC MENA 24] Abdelrahman_Ghallab_-_Data_Product_mgmt.pdf[DSC MENA 24] Abdelrahman_Ghallab_-_Data_Product_mgmt.pdf
[DSC MENA 24] Abdelrahman_Ghallab_-_Data_Product_mgmt.pdf
 
[DSC MENA 24] Asmaa_Eltaher_-_Innovation_Beyond_Brainstorming.pptx
[DSC MENA 24] Asmaa_Eltaher_-_Innovation_Beyond_Brainstorming.pptx[DSC MENA 24] Asmaa_Eltaher_-_Innovation_Beyond_Brainstorming.pptx
[DSC MENA 24] Asmaa_Eltaher_-_Innovation_Beyond_Brainstorming.pptx
 
[DSC MENA 24] Muhammad_Ezzat_-_Sustianable_Growth_Empowerment.pdf
[DSC MENA 24] Muhammad_Ezzat_-_Sustianable_Growth_Empowerment.pdf[DSC MENA 24] Muhammad_Ezzat_-_Sustianable_Growth_Empowerment.pdf
[DSC MENA 24] Muhammad_Ezzat_-_Sustianable_Growth_Empowerment.pdf
 
[DSC MENA 24] Basma_Rady_-_Building_a_Data_Driven_Culture_in_Your_Organizatio...
[DSC MENA 24] Basma_Rady_-_Building_a_Data_Driven_Culture_in_Your_Organizatio...[DSC MENA 24] Basma_Rady_-_Building_a_Data_Driven_Culture_in_Your_Organizatio...
[DSC MENA 24] Basma_Rady_-_Building_a_Data_Driven_Culture_in_Your_Organizatio...
 
[DSC MENA 24] Ahmed_Muselhy_-_Unveiling-the-Secrets-of-AI-in-Hiring.pdf
[DSC MENA 24] Ahmed_Muselhy_-_Unveiling-the-Secrets-of-AI-in-Hiring.pdf[DSC MENA 24] Ahmed_Muselhy_-_Unveiling-the-Secrets-of-AI-in-Hiring.pdf
[DSC MENA 24] Ahmed_Muselhy_-_Unveiling-the-Secrets-of-AI-in-Hiring.pdf
 
[DSC MENA 24] Ziad_Diab_-_Data-Driven_Disruption_-_The_Role_of_Data_Strategy_...
[DSC MENA 24] Ziad_Diab_-_Data-Driven_Disruption_-_The_Role_of_Data_Strategy_...[DSC MENA 24] Ziad_Diab_-_Data-Driven_Disruption_-_The_Role_of_Data_Strategy_...
[DSC MENA 24] Ziad_Diab_-_Data-Driven_Disruption_-_The_Role_of_Data_Strategy_...
 
[DSC MENA 24] Mohammad_Essam_- Leveraging Scene Graphs for Generative AI and ...
[DSC MENA 24] Mohammad_Essam_- Leveraging Scene Graphs for Generative AI and ...[DSC MENA 24] Mohammad_Essam_- Leveraging Scene Graphs for Generative AI and ...
[DSC MENA 24] Mohammad_Essam_- Leveraging Scene Graphs for Generative AI and ...
 
[DSC MENA 24] Ahmed_Fahmy - Navigating the Future.pdf
[DSC MENA 24] Ahmed_Fahmy - Navigating the Future.pdf[DSC MENA 24] Ahmed_Fahmy - Navigating the Future.pdf
[DSC MENA 24] Ahmed_Fahmy - Navigating the Future.pdf
 
[DSC MENA 24] Hany_Saad_Gheit_-_Azure_OpenAI_service.pptx
[DSC MENA 24] Hany_Saad_Gheit_-_Azure_OpenAI_service.pptx[DSC MENA 24] Hany_Saad_Gheit_-_Azure_OpenAI_service.pptx
[DSC MENA 24] Hany_Saad_Gheit_-_Azure_OpenAI_service.pptx
 
[DSC MENA 24] Nezar_El_Kady_-_From_Turing_to_Transformers__Navigating_the_AI_...
[DSC MENA 24] Nezar_El_Kady_-_From_Turing_to_Transformers__Navigating_the_AI_...[DSC MENA 24] Nezar_El_Kady_-_From_Turing_to_Transformers__Navigating_the_AI_...
[DSC MENA 24] Nezar_El_Kady_-_From_Turing_to_Transformers__Navigating_the_AI_...
 
[DSC MENA 24] Amira_Abdelaziz_-_AI_in_Financial_Services.pptx
[DSC MENA 24] Amira_Abdelaziz_-_AI_in_Financial_Services.pptx[DSC MENA 24] Amira_Abdelaziz_-_AI_in_Financial_Services.pptx
[DSC MENA 24] Amira_Abdelaziz_-_AI_in_Financial_Services.pptx
 
[DSC MENA 24] Omar_Ossama - My Journey from the Field of Oil & Gas, to the Ex...
[DSC MENA 24] Omar_Ossama - My Journey from the Field of Oil & Gas, to the Ex...[DSC MENA 24] Omar_Ossama - My Journey from the Field of Oil & Gas, to the Ex...
[DSC MENA 24] Omar_Ossama - My Journey from the Field of Oil & Gas, to the Ex...
 
[DSC MENA 24] Ramy_Agieb_-_Advancements_in_Artificial_Intelligence_for_Cybers...
[DSC MENA 24] Ramy_Agieb_-_Advancements_in_Artificial_Intelligence_for_Cybers...[DSC MENA 24] Ramy_Agieb_-_Advancements_in_Artificial_Intelligence_for_Cybers...
[DSC MENA 24] Ramy_Agieb_-_Advancements_in_Artificial_Intelligence_for_Cybers...
 
[DSC MENA 24] Sohaila_Diab_-_Lets_Talk_Gen_AI_Presentation.pptx
[DSC MENA 24] Sohaila_Diab_-_Lets_Talk_Gen_AI_Presentation.pptx[DSC MENA 24] Sohaila_Diab_-_Lets_Talk_Gen_AI_Presentation.pptx
[DSC MENA 24] Sohaila_Diab_-_Lets_Talk_Gen_AI_Presentation.pptx
 
[DSC MENA 24] Amal_Elgammal_-_QUALITOP_presentation.pptx
[DSC MENA 24] Amal_Elgammal_-_QUALITOP_presentation.pptx[DSC MENA 24] Amal_Elgammal_-_QUALITOP_presentation.pptx
[DSC MENA 24] Amal_Elgammal_-_QUALITOP_presentation.pptx
 
[DSC MENA 24] Abdelrahman_Sleem_-_AI_For_Marketing_DSC.pdf
[DSC MENA 24] Abdelrahman_Sleem_-_AI_For_Marketing_DSC.pdf[DSC MENA 24] Abdelrahman_Sleem_-_AI_For_Marketing_DSC.pdf
[DSC MENA 24] Abdelrahman_Sleem_-_AI_For_Marketing_DSC.pdf
 

Recently uploaded

Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...Suhani Kapoor
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一fhwihughh
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degreeyuu sss
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...ThinkInnovation
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 

Recently uploaded (20)

Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
Call Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort ServiceCall Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort Service
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
Predictive Analysis - Using Insight-informed Data to Determine Factors Drivin...
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 

[DSC Europe 23] Stefano Gemin & Domen Pozrl - Detecting Fraud in Sport.pdf

  • 1. Detecting fraud in sport | Domen Požrl, Stefano Gemin 23.11.2023
  • 2. 2 Agenda Sportradar’s key datasets The Fraud Detection System journey The Model and the Challenges From Day After to Real Time The Inference Streaming Pipeline 04 05
  • 3. 3 Sportradar’skeydatasets MTS BOOKMAKER BET - Bookmaker - Channel (online, retail) - Bet status (accepted, rejected, …) - Stake - Profit / Loss - Sport event(s) - Market(s) / outcome(s) - Odds - User info COMPUTER VISION LIVESCORE CRAWLERS DATA JOURNALISTS LIVE DATA - Event details (status, time, score) - Timeline (ex. Shots, fouls, goals) - Tracking data (players, ball, …) ODDS CRAWLERS ODDS CHANGE - Bookmaker - Sport event - Event details (status, time, score) - Market - Odds 600+ bookmakers crawled ~ 25 billions odds movements per year 200+ bookmakers with bet integration with Sportradar Managed Trading Services ~ 5 billions bets per year ANALYTICS PLATFORM 900,000+ sport events per year covered live 9,500+ data journalists globally 70+ sports DATA LAKE -Model training - Batch inferencing - Performance monitoring KAFKA -Real time inferencing INTEGRITY ANALYSTS Over 9,000 matches escalated since 2009 SPORT EVENT ANALYSIS - Hotlisting data - Escalation data
  • 4. 4 TheFraudDetectionSystemjourney Alerting system - SME driven - Looking at anomalous odds movements or values - 125,000+ alerts per year - Less than 4% of them suspicious after manual review Alert scoring model - Binary classification model (on alerts) - Features: Alert data + match properties - 70% of alerts automatically classified as not suspicious Match Fixing Detection model – day after - Binary classification model (on matches) - Batch inference on matches of previous day - Features: alert scores + betting data + match properties Match Fixing Detection model – real time - Features: subset of previous model (not possible to compute some features in real time) - Inference time: starting one day prior to kick off, until match is over - In 2023 up to Q4 814 escalated matches were initially flagged by the model (160 of them uniquely) 2018 2022 2020
  • 5. 5 TheModelandtheChallenges Binary classification model Maximising the AUC on validation set over multiple iterations Model explainability Extremely Unbalanced Dataset 9,325 escalations out of 4,522,387 Target Variable: Hotlisted Weighted by: Escalated Seasonality of sport events Random split on 3 years of historical data Keeping track with evolving match-fixing trends & Unexpected event outcomes Human in the loop & Continuous retraining Crucial to provide evidence of fixing Feature contribution at inference time
  • 6. 6 FromDayAftertoRealTime Current time: 45' Potential payout: 9,000 € Est. probability: 1% Expected Profit: 9€ Current time: 60' Potential payout : 9,000 € Est. probability: 25% Expected Profit: -2,175€ Current time: End of match Potential payout : 9,000 € Est. probability: 100% Expected Profit: -9,000€ 45' 60' End • A single bet placed at 45’ • Odds: 90 • Stake: 100€ • Outcome: Green Team wins • Changes in the probability for the outcome to occur
  • 7. 7 TheInferenceStreamingPipeline • Processes up to 2000 messages per second • Outputs around 3 predictions per second • Score, time, alert-based predictions • Feature construction • Consumes 3 Kafka topics and outputs to 1 • Modelling resources fetched from S3 on startup • Aggregated features periodically retrieved from Redshift • Outputs to S3 via NiFi, with Athena on top for analytics