1. Logistic Regression and Markov Chain approach to
NCAA Basketball seeding
Michael Hankin
University of Southern California
mhankin@usc.edu
April 22, 2013
3. Overview of Logistic Regression

Basic idea of Logistic Regression: Given explanatory variables $X$ and a binary response variable $Y$, we wish to determine $P(Y = 1 \mid X)$. Logistic regression allows us to estimate this by modeling

$$Y \sim \mathrm{Bernoulli}\left(\sigma(w^T X)\right), \quad\text{where}\quad \sigma(w^T X) = \frac{1}{1 + e^{w^T X}}.$$

If we model P(i beats j on j's home court | i beat j by x on i's home court) as $\sigma(\alpha + \beta x)$, we obtain the following likelihood:

$$L(\alpha, \beta) = \prod_{g:\,\text{games}} \left(\frac{1}{1 + e^{\alpha + \beta x_g}}\right)^{w_g} \left(1 - \frac{1}{1 + e^{\alpha + \beta x_g}}\right)^{1 - w_g}$$
4. We then find parameters that maximize the likelihood.

$$\ell(\alpha, \beta) = \log L(\alpha, \beta) = \sum_{g:\,\text{games}} \left[ w_g \log\frac{1}{1 + e^{\alpha + \beta x_g}} + (1 - w_g)\log\left(1 - \frac{1}{1 + e^{\alpha + \beta x_g}}\right) \right]$$

$$= \sum_{g:\,\text{games}} \left[ -w_g \log\left(1 + e^{\alpha + \beta x_g}\right) + (1 - w_g)\left(\alpha + \beta x_g - \log\left(1 + e^{\alpha + \beta x_g}\right)\right) \right]$$

$$= \sum_{g:\,\text{games}} \left[ (1 - w_g)(\alpha + \beta x_g) - \log\left(1 + e^{\alpha + \beta x_g}\right) \right]$$
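As a concrete illustration, here is a minimal Python sketch of the simplified log-likelihood on the last line; the function and argument names (neg_log_likelihood, x, w) are illustrative, not taken from the slides.

```python
import numpy as np

def neg_log_likelihood(alpha, beta, x, w):
    """Negative log-likelihood of the model above.

    x : array of victory margins x_g on the winner's home court
    w : array of 0/1 outcomes w_g for the return game
    Uses the slides' convention P(w_g = 1) = 1 / (1 + exp(alpha + beta * x_g)).
    """
    z = alpha + beta * x
    # ell = sum_g (1 - w_g)(alpha + beta x_g) - log(1 + e^{alpha + beta x_g})
    ell = np.sum((1.0 - w) * z - np.logaddexp(0.0, z))
    return -ell
```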
5. Taking partial derivatives of the log-likelihood:

$$\frac{\partial \ell}{\partial \alpha} = \sum_{g:\,\text{games}} \left[ (1 - w_g) - \frac{e^{\alpha + \beta x_g}}{1 + e^{\alpha + \beta x_g}} \right] \tag{1}$$

$$= \sum_{g:\,\text{games}} \left[ (1 - w_g) - \left(1 - \frac{1}{1 + e^{\alpha + \beta x_g}}\right) \right] \tag{2}$$

$$= \sum_{g:\,\text{games}} \left[ \frac{1}{1 + e^{\alpha + \beta x_g}} - w_g \right] \tag{3}$$

$$\frac{\partial \ell}{\partial \beta} = \sum_{g:\,\text{games}} \left[ (1 - w_g)x_g - \frac{e^{\alpha + \beta x_g}}{1 + e^{\alpha + \beta x_g}}\, x_g \right] \tag{4}$$

$$= \sum_{g:\,\text{games}} \left[ (1 - w_g)x_g - \left(1 - \frac{1}{1 + e^{\alpha + \beta x_g}}\right) x_g \right] \tag{5}$$

$$= \sum_{g:\,\text{games}} \left[ \frac{1}{1 + e^{\alpha + \beta x_g}} - w_g \right] x_g \tag{6}$$
6. And the second partial derivatives:

$$\frac{\partial^2 \ell}{\partial \alpha^2} = \sum_{g:\,\text{games}} \left[ -\frac{1}{1 + e^{\alpha + \beta x_g}} \cdot \frac{e^{\alpha + \beta x_g}}{1 + e^{\alpha + \beta x_g}} \right] \tag{7}$$

$$= \sum_{g:\,\text{games}} \left[ -\frac{1}{1 + e^{\alpha + \beta x_g}} \left(1 - \frac{1}{1 + e^{\alpha + \beta x_g}}\right) \right] \tag{8}$$

$$\frac{\partial^2 \ell}{\partial \alpha\,\partial \beta} = \frac{\partial^2 \ell}{\partial \beta\,\partial \alpha} = \sum_{g:\,\text{games}} \left[ -\frac{1}{1 + e^{\alpha + \beta x_g}} \left(1 - \frac{1}{1 + e^{\alpha + \beta x_g}}\right) x_g \right] \tag{9, 10}$$

$$\frac{\partial^2 \ell}{\partial \beta^2} = \sum_{g:\,\text{games}} \left[ -\frac{1}{1 + e^{\alpha + \beta x_g}} \cdot \frac{e^{\alpha + \beta x_g}}{1 + e^{\alpha + \beta x_g}}\, x_g^2 \right] \tag{11}$$

$$= \sum_{g:\,\text{games}} \left[ -\frac{1}{1 + e^{\alpha + \beta x_g}} \left(1 - \frac{1}{1 + e^{\alpha + \beta x_g}}\right) x_g^2 \right] \tag{12}$$
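A short sketch of the gradient and Hessian in code, following equations (3), (6), and (8)-(12); the name gradient_and_hessian and the array-based layout are assumptions for illustration.

```python
import numpy as np

def gradient_and_hessian(alpha, beta, x, w):
    """Gradient and Hessian of the log-likelihood ell(alpha, beta).

    Implements equations (3), (6) and (8)-(12), with the slides' convention
    P(w_g = 1) = 1 / (1 + exp(alpha + beta * x_g)).
    """
    p = 1.0 / (1.0 + np.exp(alpha + beta * x))     # 1 / (1 + e^{alpha + beta x_g})
    grad = np.array([np.sum(p - w),                # d ell / d alpha, eq. (3)
                     np.sum((p - w) * x)])         # d ell / d beta,  eq. (6)
    v = -p * (1.0 - p)                             # common factor in eqs. (8)-(12)
    hess = np.array([[np.sum(v),     np.sum(v * x)],
                     [np.sum(v * x), np.sum(v * x * x)]])
    return grad, hess
```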
7. Want $\alpha, \beta$ s.t. $\nabla\ell(\alpha, \beta) = 0$. For current estimates $\alpha', \beta'$ let $\Delta\alpha = \alpha - \alpha'$, $\Delta\beta = \beta - \beta'$. By Taylor we have:

$$0 = \nabla\ell(\alpha, \beta) = \nabla\ell(\alpha' + \Delta\alpha,\ \beta' + \Delta\beta) \approx \nabla\ell(\alpha', \beta') + \nabla^2\ell(\alpha', \beta')\begin{pmatrix}\Delta\alpha \\ \Delta\beta\end{pmatrix}$$

$$\Rightarrow \quad \begin{pmatrix}\Delta\alpha \\ \Delta\beta\end{pmatrix} \approx -\left[\nabla^2\ell(\alpha', \beta')\right]^{-1}\nabla\ell(\alpha', \beta')$$

Newton to the rescue: Successive updates of the following form should converge to the optimal values.

$$\begin{pmatrix}\alpha \\ \beta\end{pmatrix} = \begin{pmatrix}\alpha' \\ \beta'\end{pmatrix} - \left[\nabla^2\ell(\alpha', \beta')\right]^{-1}\nabla\ell(\alpha', \beta')$$
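A sketch of this Newton-Raphson loop, reusing the gradient_and_hessian sketch above; fit_newton, the zero starting point, and the stopping rule are illustrative choices, not the author's implementation.

```python
import numpy as np

def fit_newton(x, w, n_iter=25):
    """Newton-Raphson updates for (alpha, beta) as in the slide's formula."""
    theta = np.zeros(2)                        # (alpha, beta) starting point
    for _ in range(n_iter):
        grad, hess = gradient_and_hessian(theta[0], theta[1], x, w)
        step = np.linalg.solve(hess, grad)     # [grad^2 ell]^{-1} grad ell
        theta = theta - step                   # Newton update
        if np.max(np.abs(step)) < 1e-10:       # stop once updates are negligible
            break
    return theta
```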
8. Use of Logistic Regression in LRMC

Victory/Defeat margin: We have now found $r^H_x$, the probability that if team i beats team j by x at i's home court, team i will beat team j at j's home court. Assuming home-court advantage is additive, the superiority probability $s^H_x$, the probability that team i would beat team j on a neutral court given that team i beat team j by x on team i's home court, equals $r^H_{x+h}$. Choosing $h$ so that $r^H_{2h} = 1/2$ (a team that wins at home by exactly twice the home-court advantage should be a coin flip on the opponent's court) gives $h = -\alpha_r/(2\beta_r)$ and $s^H_x = \sigma(\alpha_r/2 + \beta_r x)$.
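A small sketch of this computation; neutral_court_prob and the argument names alpha_r, beta_r are illustrative.

```python
import numpy as np

def neutral_court_prob(alpha_r, beta_r, x):
    """Home-court shift h and superiority probability s_x from the fitted model.

    Uses r^H_x = sigma(alpha_r + beta_r * x) with sigma(z) = 1 / (1 + exp(z)),
    as on the earlier slides, so h = -alpha_r / (2 * beta_r) and
    s_x = sigma(alpha_r / 2 + beta_r * x).
    """
    h = -alpha_r / (2.0 * beta_r)            # implied home-court advantage in points
    s_x = 1.0 / (1.0 + np.exp(alpha_r / 2.0 + beta_r * x))
    return h, s_x
```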
9. Alternative assumptions: Because each game has finite length (equal except for overtime), a reasonable estimator for a team's skill is the proportion of time it controls the ball. Going further, the proportion of time a team controls the ball can be estimated by its score divided by the sum of both teams' scores. This motivates a multiplicative home-court advantage (look at the score ratio) and a log-multiplicative one (log of the score ratio).

Reduce overfitting: By penalizing large parameter values (shrinking the model toward the hypothesis that future games are independent of past games) we can reduce overfitting: choose nonnegative $\lambda_\alpha, \lambda_\beta$ and minimize
$$-\ell + \lambda_\alpha \alpha^2 + \lambda_\beta \beta^2.$$
In my regularized examples I placed larger penalties on the $\alpha$'s, operating under the hypothesis that there is no home-court advantage.
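A minimal sketch of the penalized criterion, reusing the neg_log_likelihood sketch above; the names and the commented-out optimizer call are illustrative assumptions.

```python
import numpy as np

def penalized_neg_log_likelihood(alpha, beta, x, w, lam_alpha, lam_beta):
    """Ridge-penalized objective: -ell + lam_alpha * alpha^2 + lam_beta * beta^2."""
    return (neg_log_likelihood(alpha, beta, x, w)
            + lam_alpha * alpha ** 2
            + lam_beta * beta ** 2)

# Example use with a generic optimizer (scipy assumed available):
# from scipy.optimize import minimize
# result = minimize(lambda t: penalized_neg_log_likelihood(t[0], t[1], x, w, 1.0, 0.1),
#                   x0=np.zeros(2))
```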
10. Logistic Regression "Goodness of Fit"

Assumptions for test: Because the number of observations is much larger than the number of "buckets" (for classical LRMC the mean and median number of observations per score differential were approximately 32.9 and 17, respectively), the CLT allows us to normalize the residuals by assuming that each observation is Bernoulli:
$$r_i = \frac{y_i - \hat{y}_i}{\sqrt{\hat{y}_i(1 - \hat{y}_i)}} \quad\text{and thus}\quad \sum_i r_i^2 \sim \chi^2_{n-2}.$$
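A sketch of this test in code, assuming y holds the observed outcomes and y_hat the fitted probabilities; the function name is illustrative.

```python
import numpy as np
from scipy.stats import chi2

def chi_squared_gof_pvalue(y, y_hat):
    """Chi-squared goodness-of-fit p-value from the normalized residuals.

    r_i = (y_i - y_hat_i) / sqrt(y_hat_i * (1 - y_hat_i)),
    sum_i r_i^2 ~ chi^2 with n - 2 degrees of freedom (two fitted parameters).
    """
    r = (y - y_hat) / np.sqrt(y_hat * (1.0 - y_hat))
    stat = np.sum(r ** 2)
    dof = len(y) - 2
    return chi2.sf(stat, dof)                  # upper-tail p-value
```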
11. Chi Squared p-values for logistic regressions

                       2011       2012       2013
additive               0.511777   0.552131   0.569139
additive (reg)         0.500654   0.534811   0.550568
multiplicative         0.495586   0.537728   0.522612
multiplicative (reg)   0.027208   0.001498   0.001819
log mult               0.499545   0.558072   0.593485
log mult (reg)         0.424898   0.440884   0.483908

Table: χ² p-values
16. Overview of Markov Chains

Stochastic process with finite states: A finite-state Markov chain is a stochastic process in which the probability of being in a given state at time t depends only on the state at time t-1.

Steady state: Given some basic conditions, there exists a probability distribution across the states such that, if the Markov chain is run for a long time, we can expect the state at any given time to be "Multinoulli" (categorical) with the steady-state distribution.
17. Use of Markov Chains in LRMC

LRMC states: In LRMC we create one state for each team; being in a team's state indicates that we currently think that team is the best team.

Transition probabilities: Given some probability distribution based on each team's regular-season record, at each "step" we either jump to another team or stay put.

Expected time per state: Eventually a steady-state distribution emerges, representing the proportion of time we expect to spend in each state. In this case, because the transition matrix is sparse and small enough for my laptop to handle, we just find its eigenvector corresponding to eigenvalue 1 and normalize it in $L_1$.
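A minimal sketch of the eigenvector computation, assuming a row-stochastic transition matrix T, so the steady state is the eigenvalue-1 eigenvector of T^T (the left eigenvector of T), normalized in L1; a dense matrix is used here for simplicity and the function name is illustrative.

```python
import numpy as np

def stationary_distribution(T):
    """Steady-state distribution of a row-stochastic transition matrix T."""
    eigvals, eigvecs = np.linalg.eig(T.T)
    k = np.argmin(np.abs(eigvals - 1.0))       # eigenvalue (numerically) closest to 1
    pi = np.real(eigvecs[:, k])
    return pi / pi.sum()                       # L1 normalization
```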
18. Transition Probabilities

Naive Approach: To motivate the more complex LRMC approach we start simple. Take p = P(team i is better than team j | team i beat team j), $w_{ij}$ = the number of times i beat j, $l_{ij}$ = the number of times j beat i, and $N_i$ = the total number of games played by i (required to normalize the transition probabilities). Then we define the transition probability
$$t_{ij} = \frac{1}{N_i}\left(w_{ij}(1 - p) + l_{ij}\,p\right).$$

Better approach: Obviously we can do better by considering the victory margin and game location. Writing $x(g)$ for the home team's margin of victory in game g,
$$t_{ij} = \frac{1}{N_i}\left(\sum_{g:\,i \text{ at } j} r^H_{x(g)} + \sum_{g:\,j \text{ at } i} \left(1 - r^H_{x(g)}\right)\right), \qquad t_{ii} = 1 - \sum_{j \neq i} t_{ij}.$$
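A sketch of building this transition matrix from a game list; the (home, away, margin) data layout, transition_matrix, and the r_H callable are illustrative assumptions.

```python
import numpy as np

def transition_matrix(games, n_teams, r_H):
    """LRMC transition matrix via the 'better approach' formula above.

    games : iterable of (home, away, margin) tuples, with
            margin = home score - away score, i.e. the home team's margin x(g)
    r_H   : callable returning r^H_x for a given margin x,
            e.g. lambda x: 1.0 / (1.0 + np.exp(alpha_r + beta_r * x))
    """
    T = np.zeros((n_teams, n_teams))
    n_games = np.zeros(n_teams)
    for home, away, x in games:
        n_games[home] += 1
        n_games[away] += 1
        T[away, home] += r_H(x)        # evidence the home team is better
        T[home, away] += 1.0 - r_H(x)  # evidence the away team is better
    T /= n_games[:, None]              # divide row i by N_i
    # t_ii = 1 - sum_{j != i} t_ij
    T[np.diag_indices(n_teams)] = 1.0 - T.sum(axis=1)
    return T
```

The resulting matrix can then be passed to the stationary_distribution sketch above to produce the ranking.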
19. 2013 Top 10 projected teams

     Top teams     Top teamsL              TopProb    TopProbL
0    Miami (FL)    Nevada-Las Vegas        0.006619   0.003262
1    Michigan      Notre Dame              0.006619   0.003262
2    Wisconsin     Virginia Commonwealth   0.006670   0.003262
3    Ohio State    James Madison           0.006788   0.003262
4    Syracuse      Louisville              0.006991   0.003262
5    Kansas        North Carolina A&T      0.007234   0.003262
6    Gonzaga       North Carolina State    0.007625   0.003262
7    Indiana       New Mexico              0.008241   0.003361
8    Louisville    Syracuse                0.008352   0.003361
9    Florida       Memphis                 0.008582   0.003361
20. Solitary and comparative accuracy

Proportion of Tournament matchups predicted correctly:

                 2012-2013        2011-2012        2010-2011
Additive         0.630769230769   0.716417910448   0.615384615385
Multiplicative   0.569230769231   0.641791044776   0.615384615385
Log Mult         0.630769230769   0.686567164179   0.630769230769
21. 2012-2013 Linear Regression for Playoff probability
difference vs victory margin
22. References

Paul Kvam and Joel S. Sokol (2006). A Logistic Regression/Markov Chain Model for NCAA Basketball. Naval Research Logistics.

RogueWave Logistic Regression Documentation.
http://www.roguewave.com/portals/0/products/legacy-hpp/docs/anaug/3-3.html