Winning in Sports with Networks

Konstantinos Pelechrinis
University of Pittsburgh
@kpelechrinis

Why now?
• Data analysis and the use of statistics are not new in sports!
• We now have the technology to collect much more detailed information
about the game
• Detailed/Advanced box score
• Play-by-play data
• Player tracking

So…
• If you like data and sports…
• ...and you are too short to play basketball professionally
• You work on sports analytics 

In this talk
• Focus on network methods for sports applications
• Team ranking
• Lineup rating
• Player characterization

Team sports are won by…teams
• Single player evaluation metrics might not capture the interactions between
players when they appear in the same lineup

Team Rankings
• Team rankings are a permanent feature of media content
• Power rankings for the NFL are updated after every week’s games
• Typically decided by pundits
• However, they are also important in the operations of leagues
• Playoff seeding
• NCAA tournament seeding and bowl games selection

Network-based Ranking
• Networks represent connection patterns between objects
• Who-wins-whom networks can utilize vertex ranking metrics to rank teams
Pelechrinis, Konstantinos, Evangelos Papalexakis, and Christos Faloutsos.
"Sportsnetrank: Network-based sports team ranking." ACM SIGKDD
Workshop on Large Scale Sports Analytics. 2016.

• PageRank:
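
As a rough illustration of ranking teams with PageRank, here is a minimal power-iteration sketch. It assumes edges point from the loser to the winner so that rank flows toward winning teams; the damping factor, the edge weights, and the toy results are assumptions for the example, not values from the talk.

```python
def pagerank(edges, damping=0.85, iters=100):
    """Power-iteration PageRank on a weighted directed network.

    edges: dict mapping (loser, winner) -> weight (assumed: number of wins).
    """
    nodes = sorted({team for edge in edges for team in edge})
    n = len(nodes)
    out_weight = {team: 0.0 for team in nodes}
    for (loser, _winner), w in edges.items():
        out_weight[loser] += w

    rank = {team: 1.0 / n for team in nodes}
    for _ in range(iters):
        # Rank held by teams with no outgoing edge (no losses) is spread uniformly.
        dangling = sum(rank[t] for t in nodes if out_weight[t] == 0.0)
        new_rank = {t: (1.0 - damping) / n + damping * dangling / n for t in nodes}
        for (loser, winner), w in edges.items():
            new_rank[winner] += damping * rank[loser] * w / out_weight[loser]
        rank = new_rank
    return rank


# Hypothetical toy season: A beat B and C, B beat C.
games = {("B", "A"): 1.0, ("C", "A"): 1.0, ("C", "B"): 1.0}
ranks = pagerank(games)
ranking = sorted(ranks, key=ranks.get, reverse=True)
print(ranking)
```

Unlike win-loss percentage, a win here is worth more when it comes against a highly ranked opponent.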

[Figure: accuracy (0.5 to 0.8) by season period (1 to 4) for two methods, Base and Net]

• Intransitivity is realized through cycles in the network
• An acyclic network can provide a total ranking
• Compute the correlation between the size of the minimum feedback arc set and
the performance differential between SportsNetRank and the baseline
Correlations: -0.19 (p-value 0.015) and -0.21 (p-value 0.039)
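
To make the link between cycles and intransitivity concrete, the sketch below computes the size of a minimum feedback arc set by brute force: for any ordering of the teams, the edges that point backwards form a feedback arc set, and the best ordering attains the minimum. This is only practical for a handful of nodes and is not presented as the method used in the paper; the toy results are hypothetical.

```python
from itertools import permutations


def min_feedback_arc_set_size(nodes, edges):
    """Smallest number of edges whose removal leaves the network acyclic."""
    best = len(edges)
    for order in permutations(nodes):
        position = {node: i for i, node in enumerate(order)}
        # Edges that go against this ordering would have to be removed.
        backward = sum(1 for u, v in edges if position[u] > position[v])
        best = min(best, backward)
    return best


teams = ["A", "B", "C"]
# Edges point from loser to winner (assumed convention).
transitive = [("C", "B"), ("B", "A"), ("C", "A")]  # A > B > C, no upsets
cyclic = [("C", "B"), ("B", "A"), ("A", "C")]      # A beat B, B beat C, C beat A

print(min_feedback_arc_set_size(teams, transitive))
print(min_feedback_arc_set_size(teams, cyclic))
```

A size of zero means the results admit a total ranking; larger values indicate more intransitivity in the who-wins-whom network.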

Lineup Evaluation
• Summing up individual players’ performance (e.g., plus-minus, PM) is not appropriate
• Why not rely on networks?

LinNet: Network-based Lineup Evaluation
[Figure: lineup matchup network with edge weights w_i, w_j, w_r, passed to node2vec]

Network Embedding for LinNet
• Consider a network G=(V, E)
• node2vec learns a low-dimensional vector representation of the nodes, f: V → ℝ^d,
by optimizing a neighborhood-preserving objective
• Maximum likelihood estimation problem
• With N_R(u) being the neighborhood identified for node u with random walk strategy R,
f is learned through the following maximization problem
node2vec: Scalable Feature Learning for Networks.
A. Grover, J. Leskovec. ACM SIGKDD 2016.
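
For reference, the maximization problem in the cited node2vec paper has the form below. It is written with this deck's N_R(u) notation; the paper itself writes the neighborhood as N_S(u).

```latex
\max_{f} \; \sum_{u \in V} \log \Pr\left( N_R(u) \mid f(u) \right)
```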

Two standard assumptions
• Conditional independence: Likelihood of observing node u_i ∈ N_R(u) is
independent of observing any other node in N_R(u)
• Feature space symmetry: A source node and a neighborhood node have
symmetric effect over each other in feature space

Two standard assumptions
• The objective function simplifies to:
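
A sketch of how the two assumptions lead to the simplified objective, following the cited node2vec paper and using N_R(u) for the sampled neighborhood of u:

```latex
% Conditional independence: the neighborhood likelihood factorizes
\Pr\left( N_R(u) \mid f(u) \right) = \prod_{n_i \in N_R(u)} \Pr\left( n_i \mid f(u) \right)

% Feature space symmetry: each factor is a softmax over dot products
\Pr\left( n_i \mid f(u) \right) =
  \frac{\exp\left( f(n_i) \cdot f(u) \right)}{\sum_{v \in V} \exp\left( f(v) \cdot f(u) \right)}

% Simplified objective, with Z_u = \sum_{v \in V} \exp\left( f(u) \cdot f(v) \right)
\max_{f} \; \sum_{u \in V} \left[ -\log Z_u + \sum_{n_i \in N_R(u)} f(n_i) \cdot f(u) \right]
```

Taking logs literally places a factor |N_R(u)| in front of log Z_u. The paper states the objective without that factor, and the last line follows the paper.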

Sampling strategies
• For every node u we sample several neighborhoods of size k
• This provides the probability of observing node v in N_R(u)
• With c_i being the ith node on a walk started at node c_0 = u:
• π_vx is the unnormalized transition probability between nodes v and x
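
The walk distribution referred to above is given in the cited node2vec paper as follows, where Z is a normalizing constant and E is the edge set:

```latex
P\left( c_i = x \mid c_{i-1} = v \right) =
\begin{cases}
  \dfrac{\pi_{vx}}{Z} & \text{if } (v, x) \in E \\[6pt]
  0 & \text{otherwise}
\end{cases}
```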

Sampling strategies
• In order to allow for a large spectrum of sampling strategies, π_vx is expressed
as the product of a bias term α and the edge weight w_vx
• 2nd order random walk
• The search bias term α is parametrized by two variables p and q and is a
function of the node t via which the random walker reached the current node v
• The bias depends on d_tx, the shortest path distance between t and x
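
A minimal sketch of the biased second-order walk, using the search bias from the cited node2vec paper (1/p when x is the previous node t, 1 when x is also a neighbor of t, 1/q when x is two hops from t). The graph representation, function names, and toy network are assumptions for illustration, not the reference node2vec implementation.

```python
import random


def search_bias(dist_tx, p, q):
    """Bias alpha as a function of the shortest path distance between t and x."""
    if dist_tx == 0:   # x is the previous node t (walk returns)
        return 1.0 / p
    if dist_tx == 1:   # x is also a neighbor of t (walk stays close to t)
        return 1.0
    return 1.0 / q     # x is two hops from t (walk moves outward)


def transition_probs(graph, t, v, p, q):
    """Probabilities of stepping from v to each neighbor x, having arrived from t.

    graph: dict node -> dict neighbor -> edge weight (undirected, stored both ways).
    """
    unnormalized = {}
    for x, w_vx in graph[v].items():
        dist_tx = 0 if x == t else (1 if x in graph[t] else 2)
        unnormalized[x] = search_bias(dist_tx, p, q) * w_vx   # pi_vx
    z = sum(unnormalized.values())
    return {x: pi / z for x, pi in unnormalized.items()}


def biased_walk(graph, start, length, p, q, rng):
    """Sample one walk of `length` nodes starting at `start`."""
    walk = [start]
    while len(walk) < length:
        v = walk[-1]
        if len(walk) == 1:
            # First step has no previous node, so only edge weights are used.
            weights = dict(graph[v])
        else:
            weights = transition_probs(graph, walk[-2], v, p, q)
        nodes = list(weights)
        walk.append(rng.choices(nodes, weights=[weights[x] for x in nodes], k=1)[0])
    return walk


# Hypothetical toy network with unit edge weights.
graph = {
    "t": {"v": 1.0, "x1": 1.0},
    "v": {"t": 1.0, "x1": 1.0, "x2": 1.0},
    "x1": {"t": 1.0, "v": 1.0},
    "x2": {"v": 1.0},
}
probs = transition_probs(graph, t="t", v="v", p=0.5, q=3)
walk = biased_walk(graph, "t", length=6, p=0.5, q=3, rng=random.Random(0))
```

With p = 0.5 and q = 3 on this toy network, stepping back to t gets weight 2, moving to the shared neighbor x1 gets weight 1, and moving outward to x2 gets weight 1/3, so the walk tends to stay near where it came from.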

Sampling strategies: Intuition
• p and q control how fast the random walker explores/leaves the vicinity of
the starting node
• High p (> max(q, 1)) → less likely to sample a previously sampled node
• q > 1 → random walker is biased towards nodes close to node t
node2vec: Scalable Feature Learning for Networks.
A. Grover, J. Leskovec. ACM SIGKDD 2016.
LinNet settings: p = 0.5, q = 3, embedding dimensionality d = 128

Latent Bradley-Terry Model
• The learned features for the nodes – i.e., lineups – can be used as the input
to a Bradley-Terry logistic regression model
• The difference in the abilities of two lineups can be modeled after
lineup-specific explanatory variables z_i
• The model gives the probability that lineup λi outperforms λj, conditioned on
the lineups’ abilities πi and πj
• We use the learned lineup features from the network embedding for z
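
A minimal sketch of a Bradley-Terry model driven by explanatory variables, assuming the standard logistic form logit Pr(λi outperforms λj) = πi − πj with πi linear in z_i. The coefficient name beta, the plain gradient-ascent fit, and the 2-dimensional toy embeddings are assumptions for illustration (LinNet uses d = 128).

```python
import math


def bt_probability(z_i, z_j, beta):
    """Pr(lineup i outperforms lineup j) when ability is pi = beta . z."""
    ability_diff = sum(b * (a - c) for b, a, c in zip(beta, z_i, z_j))  # pi_i - pi_j
    return 1.0 / (1.0 + math.exp(-ability_diff))


def fit_bradley_terry(matchups, dim, learning_rate=0.1, epochs=500):
    """Fit beta by gradient ascent on the logistic log-likelihood.

    matchups: list of (z_i, z_j, y), y = 1 if lineup i outperformed lineup j, else 0.
    """
    beta = [0.0] * dim
    for _ in range(epochs):
        grad = [0.0] * dim
        for z_i, z_j, y in matchups:
            error = y - bt_probability(z_i, z_j, beta)
            for r in range(dim):
                grad[r] += error * (z_i[r] - z_j[r])
        beta = [b + learning_rate * g / len(matchups) for b, g in zip(beta, grad)]
    return beta


# Hypothetical toy matchups between lineups with 2-dimensional embeddings.
matchups = [
    ([1.0, 0.0], [0.0, 1.0], 1),
    ([0.0, 1.0], [1.0, 0.0], 0),
    ([1.0, 1.0], [0.0, 1.0], 1),
    ([0.0, 0.0], [1.0, 0.0], 0),
]
beta = fit_bradley_terry(matchups, dim=2)
p_ij = bt_probability([1.0, 0.0], [0.0, 1.0], beta)
p_ji = bt_probability([0.0, 1.0], [1.0, 0.0], beta)
```

Because only the difference z_i − z_j enters the model, the two probabilities for a matchup always sum to one, and two lineups with identical embeddings get a probability of 0.5.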

Previously unseen lineups
• What if a lineup λi has not played before?
• Not part of the matchup network
• Define the similarity σ(λi,λj) between two lineups λi and λj (of the same team)
as the number of common players between them
• Can we use this similarity value and the latent representation for λj to obtain an
educated approximation of the latent space for λi ?
Yes!
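
The slides do not spell out how the approximation is built. One simple possibility, shown here purely as an assumption, is a similarity-weighted average of the embeddings of the team's known lineups, with σ counted as the number of shared players.

```python
def similarity(lineup_a, lineup_b):
    """sigma: number of players the two lineups have in common."""
    return len(set(lineup_a) & set(lineup_b))


def approximate_embedding(new_lineup, known_embeddings):
    """Similarity-weighted average of known lineup embeddings (assumed scheme).

    known_embeddings: dict mapping a lineup (tuple of player names) of the
    same team to its learned embedding (list of floats).
    """
    weights = {lineup: similarity(new_lineup, lineup) for lineup in known_embeddings}
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no known lineup shares a player with the new lineup")
    dim = len(next(iter(known_embeddings.values())))
    return [
        sum(weights[l] * known_embeddings[l][r] for l in known_embeddings) / total
        for r in range(dim)
    ]


# Hypothetical lineups and 2-dimensional embeddings.
known = {
    ("a", "b", "c", "d", "e"): [1.0, 0.0],
    ("a", "b", "c", "x", "y"): [0.0, 1.0],
}
unseen = ("a", "b", "c", "d", "z")
estimate = approximate_embedding(unseen, known)
```

The weak correlation reported for this similarity suggests any such estimate should be treated as rough.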

Previously unseen lineups
Weak but significant (linear) correlation

Datasets
• Five years of NBA lineup matchup data
• 2007-08 to 2011-12
• basketballvalue.com
• ({Players of lineup λi},{Players of lineup λj}, Total point differential, Total time of
matchup)

Evaluation
• Out-of-sample accuracy: trained on the first 70% of the lineup matchups, how
well does LinNet predict the remaining 30%?
• Probability calibration: how well calibrated are the output probabilities?
• Brier score
• Validation curves
• Four baselines for comparison
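
The two headline metrics can be sketched as follows. The Brier score is the mean squared difference between the predicted probability and the actual binary outcome; the 0.5 decision threshold for accuracy and the sample numbers are assumptions for the example.

```python
def brier_score(probabilities, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)


def accuracy(probabilities, outcomes, threshold=0.5):
    """Fraction of matchups where the favored lineup actually won."""
    hits = sum((p > threshold) == bool(y) for p, y in zip(probabilities, outcomes))
    return hits / len(outcomes)


# Hypothetical predictions for four held-out matchups.
predicted = [0.9, 0.6, 0.3, 0.2]
actual = [1, 0, 0, 1]
print(brier_score(predicted, actual))
print(accuracy(predicted, actual))
```

Lower Brier scores are better, and always predicting 0.5 yields exactly 0.25, which is a useful reference point when comparing methods.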

Plus-Minus (+/-)
• Box score → limited view of players’ contributions
• Points scored s_i and points allowed a_i in each of the player’s n stints (i = 1, …, n)
• Plus-minus: PM = \sum_{i=1}^{n} (s_i - a_i)

Adjusted Plus-Minus
• Controls for the presence of other players on the court
• Both offense and defense
• Each stint is a data point
• Dependent variable (DV): plus-minus per possession
• Independent variables (IVs): dummy variables for all players
• 1 for home team players in the stint, -1 for visiting team players in the stint, and 0 for the rest

Adjusted Plus-Minus
• Pass all the stints through a linear regression
• The coefficient a_i for each player is the adjusted plus-minus of the player
• Using the adjusted plus-minus of each player in a lineup, we can consider the
ability of a lineup to be the average adjusted plus-minus
y = a_1 x_1 + a_2 x_2 + \dots + a_r x_r + \varepsilon
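
A small sketch of the stint encoding and of the lineup-level average described above. The regression fit itself is left out; the resulting rows would go to any least-squares solver. Player names and coefficient values are hypothetical.

```python
def encode_stint(home_players, away_players, all_players):
    """One regression row: +1 home player, -1 visiting player, 0 if not on court."""
    row = []
    for player in all_players:
        if player in home_players:
            row.append(1)
        elif player in away_players:
            row.append(-1)
        else:
            row.append(0)
    return row


def stint_target(point_differential, possessions):
    """Dependent variable: home-team plus-minus per possession."""
    return point_differential / possessions


def lineup_ability(lineup, adjusted_plus_minus):
    """Average adjusted plus-minus of the players in a lineup."""
    return sum(adjusted_plus_minus[player] for player in lineup) / len(lineup)


# Hypothetical league of six players, shown with 2-on-2 stints to keep it short.
players = ["p1", "p2", "p3", "p4", "p5", "p6"]
row = encode_stint({"p1", "p2"}, {"p4", "p5"}, players)
y = stint_target(point_differential=3, possessions=12)

# Hypothetical fitted coefficients a_i.
apm = {"p1": 2.0, "p2": -1.0, "p3": 0.5, "p4": 1.5, "p5": 0.0, "p6": -0.5}
ability = lineup_ability(["p1", "p2"], apm)
```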

PageRank
• We can use network centrality metrics to rank nodes (i.e., lineups) directly
from the matchup network
• Our previous study has shown that this is a better predictor than win-loss percentage
• We calculate the PageRank of each lineup based on the matchup network
and use it as the performance predictor for future lineup matchups
Pelechrinis, Konstantinos, Evangelos Papalexakis, and Christos Faloutsos.
"Sportsnetrank: Network-based sports team ranking." ACM SIGKDD
Workshop on Large Scale Sports Analytics. 2016.

[Figure: accuracy (0 to 1) per season, 2007-08 to 2011-12, for APM, LinNet, and PageRank]

[Figure: Brier score (0 to 0.25) per NBA season, 2007-08 to 2011-12, for APM, LinNet, and PageRank]
The Brier score compares the predicted probability with the actual binary outcome

Lineup-based Plus-Minus
• We also compared the performance of lineup plus-minus with LinNet
• Raw lineup +/-
• Adjusted lineup +/-
• Using a regression similar to the players’ adjusted +/-

                    Raw +/-   Adjusted +/-   LinNet
Mean accuracy       54.5%     59.1%          67%
Mean Brier score    0.24      0.23           0.19

LinNet validation curves
[Figure: fraction of correct predictions vs. predicted lineup matchup probability (both 0 to 1); point size shows count (2000, 4000, 6000)]
Linear fit R2 = 96%
Intercept: 0.1, [0.02, 0.14]
Slope: 0.85, [0.79, 0.98]
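
A validation (reliability) curve of this kind can be sketched by binning the predicted probabilities and comparing each bin's mean prediction with the observed frequency. The bin count and the use of "first lineup outperformed" as the outcome are assumptions; the slide's exact construction is not given.

```python
def validation_curve(probabilities, outcomes, n_bins=10):
    """Return (mean predicted probability, observed frequency, count) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probabilities, outcomes):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[index].append((p, y))
    curve = []
    for members in bins:
        if members:
            mean_p = sum(p for p, _ in members) / len(members)
            observed = sum(y for _, y in members) / len(members)
            curve.append((mean_p, observed, len(members)))
    return curve


# Hypothetical predictions and outcomes.
curve = validation_curve([0.1, 0.1, 0.9, 0.9], [0, 0, 1, 1], n_bins=2)
```

For well calibrated probabilities the points lie near the diagonal, which is what a fitted slope close to 1 and an intercept close to 0 would indicate.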

Season Performance – LinNet Score
• How well can lineup ratings obtained from LinNet explain the win-loss
record for a team?
• With L_T being all the lineups of team T and γ_λ being the time that lineup λ
was on the court, we can define the LinNet rating of team T as:
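
The rating formula itself did not survive in this transcript. The definitions suggest a time-weighted aggregate over the team's lineups, so the sketch below assumes a γ-weighted average of a per-lineup rating. Both the weighting scheme and the per-lineup rating input are assumptions, not the talk's definition.

```python
def team_rating(lineup_ratings, time_on_court):
    """Time-weighted average of lineup ratings for one team (assumed form).

    lineup_ratings: dict lineup id -> rating of that lineup
    time_on_court:  dict lineup id -> gamma, time the lineup was on the court
    """
    total_time = sum(time_on_court.values())
    weighted = sum(time_on_court[l] * lineup_ratings[l] for l in time_on_court)
    return weighted / total_time


# Hypothetical team with two lineups.
ratings = {"L1": 0.6, "L2": 0.4}
minutes = {"L1": 30.0, "L2": 10.0}
rating = team_rating(ratings, minutes)
```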

Season Performance – LinNet Score
• Linear fit has a significant slope (p < 0.001)
• Correlation coefficient: 0.53
• R2 = 27%
• Teams do not use their best lineups (?)
• Time on the court might be important for a lineup’s performance

Embedding dimensionality
[Figure: accuracy (0 to 1) vs. embedding dimensionality (0 to 500)]

Discussion
• How crucial is the time that a lineup stays on the court?
• Is there a lineup ability curve similar to Oliver’s player skill curve?
• Is there a better way to predict outcomes for previously unseen lineups?
• Clearly, there is no strong linearity in the latent space
• Can a task-specific network embedding perform better?
• How can we further optimize node2vec (e.g., choice of p and q)?

Substitution Networks
(work in progress)

Substitution Networks
Who-subs-whom
K. Pelechrinis, “Can NBA Substitution Networks Explain
a Team’s Performance?”, the Athlytics Blog (2017)

Substitution Networks
• Simple network metrics of the substitution network can explain 33% of the
variation in winning percentage
• This is not small!
• Substitution networks say nothing about the quality of the substitutes
• Network embedding can also reveal latent relationships between starters and bench

Conclusions & Discussion
• Networks offer a strong tool for sports-related problems
• Many interesting problems and extensions of the work presented here
