2. • Identify variables that matter in terms of winning
• Offer actionable insights for match strategy plan
Problem: Use data to impact soccer match outcome
Solution: NextMatch
“Many organizations spend considerable sums on collecting data, but lack
the capability to … analyze the data using machine learning.”
– Tony Strudwick, Head of Athletic Development, Manchester United FC
“(By doing simple correlation, one finds no single stats matters for
winning), the only stats that matters is winning”
– Jo Clubb, Lead Sport Scientist, Chelsea FC
5. Scraped 3 years of match stats for each team in the Spanish Soccer
League from whoscored.com
Data and Algorithm
Outcome of Team
A’s match
Features of team
A
Features of team A’s
opponent
1: win / 0: not win Passes, Shots, … Passes_OP, Shots_OP, …
Step Algorithm Advantage
1. Feature
Selection
Random Forest Dealing with large No. of
features while maintaining
interpretability
2. Classification Logistic regression
using features
selected above
Simple; Easy interpretation;
Sign of coefficients
A different set of models for each team
6. Model Performance
Model performs well on most teams
Training: 12/13-13/14 season, 76 matches / team; Testing: 14/15 season, 18 matches / team
7. Model Performance
Matches of teams that change squad a lot in a short period are less predictable
Training: 12/13-13/14 season, 76 matches / team; Testing: 14/15 season, 18 matches / team
8. Use Case – for the team Valencia
Valencia Sevilla
Round 1 1 1
Go to NextMatch
Before Round 2
9. Use Case – for the team Valencia
Match Valencia Sevilla
Round 2 3 1
Recommendations
Valencia “followed”