Determinants of College Football Attendance
Connor Weaver
Econometrics 390
Professor Johnson
9 May 2015
Weaver 1
I. Introduction
College football is an exciting sport that mixes student athletes with die-hard
fans. One of the key elements of a college football game is attendance, because a
rowdier crowd makes for a more intense game. Lately, college football has seen a dip
in attendance (Wieberg, 2012), and that startling fact drew my attention to the factors
that affect attendance. Following Wieberg, my model uses average attendance rather
than total attendance as the dependent variable. To evaluate what affects average
attendance, a cross-sectional analysis is performed on the 2010 season. The 2010
season is the last season before the wave of conference realignment, which would
confound any later analysis.
II. Literature Review
The dip in college football attendance is only the starting point; the interesting
question is what actually affects attendance. The first independent variable to
consider is the conference and how competitive it is. The Southeastern Conference
(SEC) should have a higher average attendance than the Sun Belt. Paul, Humphreys, and
Weinbach (2012) state that a more competitive conference produces more games with
unpredictable outcomes. Furthermore, fans are unlikely to attend a game that has an
extreme underdog unless it is a bowl game. Quirk (2004) believes that a more
competitive conference increases attendance and that the variance in attendance is
largely explained by which conference a team is in. To sum up, a higher conference
average attendance should increase the dependent variable, average attendance for a
school.
Another major variable that affects attendance is the size of the school; compare,
for instance, Central Florida with Air Force. Price and Sen (2003) argue that
undergraduate enrollment is a major factor in the attendance of a college football
game. Price and Sen also include the presence of a professional team among their
factors. The presence of a professional team should reduce attendance because fans
can attend only so many games. Since what matters is whether a professional team is
nearby, not how many there are, this is treated as a dummy variable (yes or no).
School size is therefore another independent variable that should increase a team's
average attendance. Conversely, having a pro team nearby is expected to lower the
dependent variable.
Besides school size and pro team presence, Falls and Natke (2014) believe that
win percentage and conference championships affect average attendance. If a team
wins a lot of games, its attendance should increase because it is a good team. This
idea doesn't conflict with Paul et al. (2012), because a team can have a high win
percentage and still play games with uncertain outcomes. Conference championships
are a variable that Falls and Natke think will increase average attendance. More
conference championships in a program's history means that the team plays highly
contested games and has a history of winning. On the other hand, a team without a
strong history of conference success will have fewer sold-out games. Lastly, the
conference championship logic can be applied to national championships. In the end,
win percentage, conference championships and national championships are independent
variables that affect average attendance.
The last two independent variables to consider for our model are recruiting
rankings and strength of schedule (SOS). Langelett (2003) argues that recruiting
rankings have a positive impact on win percentage. Langelett's regression analysis
shows that recruiting rankings and win percentage are positively, but not perfectly,
correlated. A possible reason the relationship isn't perfect is that SOS is not
included. The strength of schedule variable incorporates both in-conference and
out-of-conference games. Furthermore, SOS can be read as a measure of how many games
are played against very good teams, and if both teams are very good then average
attendance should increase (Fair and Oster, 2002). SOS and recruiting rankings are the
last two independent variables for our model.
III. Model Specification
After reviewing the literature, numerous independent variables are proposed to
explain what affects average attendance. In addition to the variables from the
literature, four more are included for further analysis: weeks ranked in the top 25,
whether the team has a new coach (1 if yes and 0 if no), the number of teams in the
conference and the number of home games. Two functional forms are used in our
analysis: linear and semi-log. The semi-log form is a variation of the double-log
equation in which only the dependent variable is expressed in terms of its natural log.
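Linear Model
\[
\begin{aligned}
\text{ATTAVG}_i ={}& \beta_0 + \beta_1\,\text{UNDERGRAD}_i + \beta_2\,\text{CONFATTAVG}_i + \beta_3\,\text{RECRANK}_i + \beta_4\,\text{TEAMCONF}_i \\
&+ \beta_5\,\text{HGAMES}_i + \beta_6\,\text{SOS}_i + \beta_7\,\text{CHAMP}_i + \beta_8\,\text{NEW}_i + \beta_9\,\text{PROTEAM}_i \\
&+ \beta_{10}\,\text{WINPCT}_i + \beta_{11}\,\text{WEEKRANK}_i + \beta_{12}\,\text{CONFCHAMP}_i + \varepsilon_i
\end{aligned}
\]
Semi-Log Model
\[
\begin{aligned}
\text{LNATTAVG}_i ={}& \beta_0 + \beta_1\,\text{UNDERGRAD}_i + \beta_2\,\text{CONFATTAVG}_i + \beta_3\,\text{RECRANK}_i + \beta_4\,\text{TEAMCONF}_i \\
&+ \beta_5\,\text{HGAMES}_i + \beta_6\,\text{SOS}_i + \beta_7\,\text{CHAMP}_i + \beta_8\,\text{NEW}_i + \beta_9\,\text{PROTEAM}_i \\
&+ \beta_{10}\,\text{WINPCT}_i + \beta_{11}\,\text{WEEKRANK}_i + \beta_{12}\,\text{CONFCHAMP}_i + \varepsilon_i
\end{aligned}
\]
* εi is the classical error term; the variable definitions are given below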
Dependent Variable:
ATTAVG= the average home game attendance for the college in 2010
LNATTAVG= the natural log of the average home game attendance for the college in
2010
Independent Variables:
UNDERGRAD= undergrad enrollment for the college
CONFATTAVG= the average attendance of the conference each college belongs to
RECRANK= recruiting ranking prior to the season for each college
TEAMCONF= number of teams in the conference for each college
HGAMES= number of home games for each college
SOS= the strength of schedule for each college
CHAMP= number of national championships for the college, including the 2010 season
NEW= 1 if the team has a new coach, 0 otherwise
PROTEAM= 1 if a professional team is located near the college, 0 otherwise
WINPCT= win percentage for the 2010 season
WEEKRANK= number of weeks ranked in the top 25, including preseason
CONFCHAMP= number of conference championships for the school, including the 2010
season
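As a rough illustration of how the two functional forms differ at the estimation stage, the sketch below fits a linear and a semi-log regression by ordinary least squares. Everything in it is an assumption made for illustration: the six schools and their numbers are made up, only two regressors are used, enrollment is entered in thousands, and the helper name `ols` is hypothetical. The paper's own estimates come from Stata with all twelve regressors.

```python
import math

def ols(X, y):
    """Return OLS coefficients by solving the normal equations (X'X)b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical schools: (undergrad enrollment in thousands, win pct, avg attendance).
data = [
    (12.0, 0.30, 21000), (18.0, 0.45, 33000), (25.0, 0.50, 47000),
    (30.0, 0.75, 71000), (41.0, 0.60, 82000), (22.0, 0.85, 58000),
]

X = [[1.0, u, w] for u, w, _ in data]       # constant, UNDERGRAD, WINPCT
attavg = [a for _, _, a in data]            # linear dependent variable
lnattavg = [math.log(a) for a in attavg]    # semi-log dependent variable

b_linear = ols(X, attavg)
b_semilog = ols(X, lnattavg)
print("linear  :", b_linear)
print("semi-log:", b_semilog)
```

The only difference between the two fits is the dependent variable: the same regressors are used, but the semi-log version regresses the natural log of attendance, so its slopes are read as proportional rather than absolute changes.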
IV. Expected Signs of the Coefficients
The coefficients for UNDERGRAD, CONFATTAVG, WINPCT, WEEKRANK, SOS, HGAMES,
CONFCHAMP and CHAMP are all expected to be positive. The aforementioned articles and
personal knowledge agree that as these variables increase, average attendance will
too. The coefficient on UNDERGRAD should be positive because more students means
more people available to fill the stadium. The coefficients on WINPCT and WEEKRANK
should be positive because more wins during the season will increase average
attendance. Additionally, the coefficients on SOS and CONFATTAVG should be positive
because a harder schedule and a stronger conference increase attendance at games.
Next, more national and conference championships indicate that the team wins, and
therefore more fans will attend. Lastly, the coefficient on HGAMES should be
positive: more home games give each team more chances to win or to schedule a worthy
opponent, so average attendance should be higher.
The coefficients on RECRANK and PROTEAM are expected to be negative. For
PROTEAM the coefficient should be negative because fans can only attend so many games.
If a professional team is nearby, fans may want to see that team as opposed to the
college football team. Since college games are often broadcast, students and fans may
be even more inclined to watch the college game at home and attend only the
professional game. The coefficient on RECRANK should be negative because a
numerically higher recruiting ranking means worse incoming recruits. Fans want to see
a team with great freshmen, so a higher RECRANK will lower ATTAVG.
The signs of the coefficients on NEW and TEAMCONF are ambiguous because a case
can be made for either a negative or a positive effect. A new coach can either hurt or
help average attendance. On one hand, a new coach may bring in a new scheme that
energizes the school and increases the average. On the other hand, a new coach may not
be as well liked, and fans may stay away from games. Likewise, the number of teams in
the conference can either hurt or help average attendance. More conference games in a
subpar conference keep a team from scheduling better opponents that would increase
average attendance. Conversely, if a team plays in the Southeastern Conference (SEC),
then more games in that conference raise attendance.
Hypothesis Tests:
For UNDERGRAD, CONFATTAVG, WINPCT, WEEKRANK, SOS, HGAMES, CONFCHAMP
and CHAMP
H0: $\beta \le 0$
HA: $\beta > 0$
For PROTEAM and RECRANK
H0: $\beta \ge 0$
HA: $\beta < 0$
For NEW and TEAMCONF
H0: $\beta = 0$
HA: $\beta \ne 0$
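The decision rule behind these three groups of hypotheses can be sketched as follows. The function names are hypothetical, the t-scores are illustrative, and the 5% critical values are assumed from a standard t-table for roughly 107 degrees of freedom (120 observations minus 12 slopes and a constant). A one-sided test rejects H0 only when the t-score is large enough in absolute value and has the sign stated in HA.

```python
def one_sided_reject(t_score, critical_t, expected_sign):
    """Reject H0 only if |t| exceeds the critical value AND the sign matches HA."""
    sign_matches = (t_score > 0) if expected_sign == "+" else (t_score < 0)
    return abs(t_score) > critical_t and sign_matches

def two_sided_reject(t_score, critical_t):
    """Reject H0: beta = 0 if |t| exceeds the two-sided critical value."""
    return abs(t_score) > critical_t

# Assumed 5% critical values for about 107 degrees of freedom (t-table lookup).
T_ONE_SIDED_5 = 1.659
T_TWO_SIDED_5 = 1.982

print(one_sided_reject(4.22, T_ONE_SIDED_5, "+"))    # expected positive, large t
print(one_sided_reject(-0.28, T_ONE_SIDED_5, "+"))   # expected positive, wrong sign
print(one_sided_reject(-3.40, T_ONE_SIDED_5, "-"))   # expected negative, large t
print(two_sided_reject(1.59, T_TWO_SIDED_5))         # ambiguous sign, small t
```

Under these assumptions the first and third cases reject H0, while the second and fourth do not.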
V. Data Collection
There are 120 cross-sectional observations from the 2010 college football
season. Data could have been collected from one source, but multiple sources
were used to ensure accuracy.
TABLE 1: Data Sources
Variables                                  Data Source
ATTAVG, CONFATTAVG, TEAMCONF and HGAMES    http://fs.ncaa.org/Docs/stats/football_records/Attendance/2010.pdf
UNDERGRAD                                  http://ope.ed.gov/athletics/InstList.aspx
RECRANK                                    http://247sports.com/Season/2010-Football/CompositeTeamRankings
SOS, CHAMP, WINPCT, WEEKRANK and           http://www.sports-reference.com/cfb/
VI. Model Estimation
Two different models are estimated to explain the variation in the dependent
variable, average attendance. TABLE 3 has the regression results for the linear model
from Stata.
TABLE 3: Regression Results
      Source |       SS        df       MS           Number of obs =     120
-------------+-------------------------------        F( 12,   107) =   63.82
       Model |  6.7446e+10     12  5.6205e+09        Prob > F      =  0.0000
    Residual |  9.4236e+09    107  88071394.5        R-squared     =  0.8774
-------------+-------------------------------        Adj R-squared =  0.8637
       Total |  7.6869e+10    119   645960483        Root MSE      =  9384.6
---------------------------------------------------------------------------------
      attavg |      Coef.   Std. Err.       t   P>|t|        [95% Conf. Interval]
-------------+-------------------------------------------------------------------
   undergrad |   .5306921   .1256966     4.22   0.000***    .2815133    .7798709
  confattavg |   .4309405    .075333     5.72   0.000***    .2816016    .5802794
     recrank |  -195.0137   57.29057    -3.40   0.001***   -308.5856   -81.44182
    teamconf |   -185.404   505.7328    -0.37   0.715       -1187.96    817.1522
      hgames |   1082.241   1378.276     0.79   0.434       -1650.03    3814.512
         sos |  -107.7795   384.3704    -0.28   0.780      -869.7489      654.19
       champ |   1997.315   449.9921     4.44   0.000***    1105.258    2889.372
         new |   3764.991   2373.692     1.59   0.116      -940.5767    8470.558
     proteam |  -3079.677   2128.554    -1.45   0.151      -7299.287    1139.933
      winpct |   12900.44   6100.355     2.11   0.037**     807.1965    24993.68
    weekrank |   155.1882   245.1085     0.63   0.528      -330.7108    641.0872
   confchamp |   369.6474   126.1819     2.93   0.004***    119.5065    619.7884
       _cons |   11221.92   13221.76     0.85   0.398      -14988.68    37432.52
---------------------------------------------------------------------------------
* = significance at the 10% level
** = significance at the 5% level
*** = significance at the 1% level
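The summary statistics in TABLE 3 can be cross-checked from the sums of squares in its upper panel. The short sketch below does that arithmetic; the variable names are my own, and the inputs are the rounded values printed in the table, so the results agree with Stata only up to rounding.

```python
n, k = 120, 12                    # observations and slope coefficients (TABLE 3)
ess = 6.7446e10                   # "Model" sum of squares, as printed
rss = 9.4236e09                   # "Residual" sum of squares, as printed
tss = ess + rss                   # total sum of squares implied by the two above

r2 = 1 - rss / tss                                   # share of variation explained
adj_r2 = 1 - (rss / (n - k - 1)) / (tss / (n - 1))   # penalized for 12 regressors
f_stat = (ess / k) / (rss / (n - k - 1))             # overall F(12, 107)
root_mse = (rss / (n - k - 1)) ** 0.5                # standard error of the regression

print(round(r2, 4), round(adj_r2, 4), round(f_stat, 2), round(root_mse, 1))
```

These reproduce the reported R-squared (0.8774), adjusted R-squared (0.8637), F-statistic (63.82) and Root MSE (9384.6).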
TABLE 4: Linear Model, AIC and BIC
Akaike's information criterion and Bayesian information criterion
-----------------------------------------------------------------------------
       Model |    Obs    ll(null)   ll(model)     df        AIC         BIC
-------------+---------------------------------------------------------------
           . |    120   -1386.945   -1261.012     13   2548.025    2584.262
-----------------------------------------------------------------------------
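Linear Model
\[
\begin{aligned}
\text{ATTAVG}_i ={}& 11221.92 + 0.5306921\,\text{UNDERGRAD}_i + 0.4309405\,\text{CONFATTAVG}_i - 195.0137\,\text{RECRANK}_i \\
&- 185.404\,\text{TEAMCONF}_i + 1082.241\,\text{HGAMES}_i - 107.7795\,\text{SOS}_i + 1997.315\,\text{CHAMP}_i \\
&+ 3764.991\,\text{NEW}_i - 3079.677\,\text{PROTEAM}_i + 12900.44\,\text{WINPCT}_i + 155.1882\,\text{WEEKRANK}_i \\
&+ 369.6474\,\text{CONFCHAMP}_i + \varepsilon_i
\end{aligned}
\]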
The aforementioned tables were run to estimate the coefficients and AIC/BIC
from the data. The R² value is .8774, which means that the regression equation
explains 87.74% of the variation in ATTAVG around its mean. Of the 12 independent
variables, only six are significant at the one, five or ten percent levels. Only one of
the coefficients has an unexpected sign, SOS. Overall, many of the t-scores are low and
the p-values are high. Although the R² is high, the high p-values and low t-scores
make this model unattractive. A possible reason for the poor results is an incorrect
functional form. To address this, the semi-log functional form is estimated next.
TABLE 5: Semi-Log Form
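\[
\begin{aligned}
\text{LNATTAVG}_i ={}& \beta_0 + \beta_1\,\text{UNDERGRAD}_i + \beta_2\,\text{CONFATTAVG}_i + \beta_3\,\text{RECRANK}_i + \beta_4\,\text{TEAMCONF}_i \\
&+ \beta_5\,\text{HGAMES}_i + \beta_6\,\text{SOS}_i + \beta_7\,\text{CHAMP}_i + \beta_8\,\text{NEW}_i + \beta_9\,\text{PROTEAM}_i \\
&+ \beta_{10}\,\text{WINPCT}_i + \beta_{11}\,\text{WEEKRANK}_i + \beta_{12}\,\text{CONFCHAMP}_i + \varepsilon_i
\end{aligned}
\]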
      Source |       SS        df       MS           Number of obs =     120
-------------+-------------------------------        F( 12,   107) =   69.03
       Model |  39.1094473     12   3.2591206        Prob > F      =  0.0000
    Residual |  5.05190147    107  .047214032        R-squared     =  0.8856
-------------+-------------------------------        Adj R-squared =  0.8728
       Total |  44.1613487    119  .371103771        Root MSE      =  .21729
---------------------------------------------------------------------------------
    lnattavg |      Coef.   Std. Err.       t   P>|t|        [95% Conf. Interval]
-------------+-------------------------------------------------------------------
   undergrad |   8.38e-06   2.91e-06     2.88   0.005***    2.61e-06    .0000141
  confattavg |   .0000117   1.74e-06     6.71   0.000***    8.24e-06    .0000152
     recrank |  -.0040281   .0013265    -3.04   0.003***   -.0066577   -.0013985
    teamconf |  -.0216139   .0117095    -1.85   0.068*     -.0448267    .0015988
      hgames |   .0527472    .031912     1.65   0.101      -.0105147     .116009
         sos |   .0191125   .0088995     2.15   0.034**     .0014702    .0367548
       champ |   .0158546   .0104189     1.52   0.131      -.0047997    .0365089
         new |   .0749942   .0549595     1.36   0.175      -.0339565    .1839449
     proteam |  -.0393191   .0492836    -0.80   0.427      -.1370182    .0583799
      winpct |    .672936    .141245     4.76   0.000***    .3929342    .9529379
    weekrank |  -.0099104   .0056751    -1.75   0.084*     -.0211607    .0013399
   confchamp |   .0074229   .0029216     2.54   0.012**     .0016313    .0132146
       _cons |   9.616651   .3061311    31.41   0.000       9.009782    10.22352
---------------------------------------------------------------------------------
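Because the dependent variable in TABLE 5 is the natural log of average attendance, each coefficient is read as a proportional effect. The sketch below converts a few of the reported coefficients into exact percent changes using 100·(exp(β·Δ) − 1). The helper name `pct_effect` is hypothetical, and the WINPCT example assumes win percentage is measured as a fraction between 0 and 1.

```python
import math

def pct_effect(beta, delta=1.0):
    """Exact percent change in ATTAVG when a regressor rises by `delta` units."""
    return 100.0 * (math.exp(beta * delta) - 1.0)

new_coach = pct_effect(0.0749942)          # NEW dummy switches from 0 to 1
pro_team = pct_effect(-0.0393191)          # PROTEAM dummy switches from 0 to 1
win_10pts = pct_effect(0.672936, 0.10)     # WINPCT rises by 0.10 (assumed 0-1 scale)

print(f"new coach: {new_coach:.2f}%")
print(f"pro team : {pro_team:.2f}%")
print(f"win pct  : {win_10pts:.2f}%")
```

On these estimates a new coach is associated with roughly 7.8% higher average attendance, a nearby professional team with roughly 3.9% lower attendance, and a ten-point rise in win percentage with roughly 7.0% higher attendance, although the NEW and PROTEAM coefficients are not statistically significant.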
TABLE 6: Semi-Log Form, AIC and BIC
Akaike's information criterion and Bayesian information criterion
-----------------------------------------------------------------------------
       Model |    Obs    ll(null)   ll(model)     df        AIC         BIC
-------------+---------------------------------------------------------------
           . |    120   -110.2941      19.791     13    -13.582     22.6554
-----------------------------------------------------------------------------
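The AIC and BIC values in the two information-criterion tables follow directly from the reported log likelihoods. A minimal sketch, using the standard definitions AIC = 2k − 2·ll and BIC = k·ln(n) − 2·ll with k = 13 estimated parameters and n = 120 (the function names are my own):

```python
import math

def aic(ll, k):
    """Akaike information criterion from a log likelihood and parameter count."""
    return 2 * k - 2 * ll

def bic(ll, k, n):
    """Bayesian information criterion; the penalty grows with ln(n)."""
    return k * math.log(n) - 2 * ll

n, k = 120, 13
linear = (aic(-1261.012, k), bic(-1261.012, k, n))    # ll(model) from TABLE 4
semilog = (aic(19.791, k), bic(19.791, k, n))         # ll(model) from TABLE 6

print("linear  :", [round(v, 3) for v in linear])
print("semi-log:", [round(v, 3) for v in semilog])
```

The results match the tables up to rounding of the log likelihoods. One caution when reading them side by side: the two log likelihoods are computed for different dependent variables (ATTAVG versus its natural log), so the raw gap between the linear and semi-log criteria partly reflects the change of scale.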
Semi-Log Model
Initially, this estimation looks a lot better because the sign on SOS is now
positive as expected, which wasn't the case in the linear model. The only glaring
issue with this model is that the coefficient on WEEKRANK has the wrong sign; the
coefficient on RECRANK is negative, as expected. This model has lower AIC and BIC
values along with a higher R² and adjusted R². Also, 8 out of the 12 coefficients
are significant at the one, five or ten percent levels. The variables NEW and
PROTEAM both have low t-scores and high p-values. The insignificance of these
variables, along with the wrong sign on WEEKRANK, makes them appear to be
irrelevant variables. To test the significance of RECRANK, NEW, PROTEAM and
WEEKRANK, we conduct an F-test of the joint hypothesis that their coefficients all
equal zero.
H0: $\beta_{RECRANK} = \beta_{NEW} = \beta_{PROTEAM} = \beta_{WEEKRANK} = 0$
HA: H0 is false
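TABLE 7: F-Test Results
 ( 1)  recrank = 0
 ( 2)  new = 0
 ( 3)  proteam = 0
 ( 4)  weekrank = 0
       F(  4,   107) =    4.03
            Prob > F =    0.0044
Because Prob > F = 0.0044 is below 0.01, H0 is rejected at the one percent level:
the four coefficients are jointly significant, so as a group these variables are not
irrelevant.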
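For reference, the F-statistic reported above has the standard form for testing M joint restrictions. The notation is assumed here rather than taken from the paper: RSS is the residual sum of squares of the unrestricted semi-log equation (5.05190147 in TABLE 5), and RSS_M is the residual sum of squares of the restricted equation that drops the four tested variables, which the paper does not report.

```latex
F = \frac{(RSS_M - RSS)/M}{RSS/(N-K-1)},
\qquad M = 4, \quad N - K - 1 = 120 - 12 - 1 = 107, \quad RSS = 5.05190147
```

H0 is rejected when F exceeds the critical value for (4, 107) degrees of freedom, which is roughly 3.5 at the one percent level according to standard F-tables.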
Serial Correlation
Serial correlation violates Classical Assumption IV, that observations of the error
term are uncorrelated with each other. This problem occurs when the current value of
the error term can be explained by its previous value. Serial correlation arises when
the ordering of observations in time matters, usually with time-series data. The
Durbin-Watson test can detect serial correlation, and Prais-Winsten estimation or
Newey-West standard errors can correct for it. Nonetheless, our model is a
cross-sectional analysis, so we need not worry about serial correlation.
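For completeness, the Durbin-Watson statistic mentioned above is simple to compute from a series of residuals ordered in time. The sketch below uses two made-up residual series, not residuals from this paper's cross-sectional model. Values near 2 suggest no first-order serial correlation, values near 0 suggest positive correlation, and values near 4 suggest negative correlation.

```python
def durbin_watson(residuals):
    """d = sum of squared first differences divided by sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Hypothetical residual series, ordered in time.
trending = [1.0, 0.8, 0.9, 0.5, 0.2, -0.1, -0.4, -0.6, -0.9, -1.0]
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

print(durbin_watson(trending))      # well below 2: positive serial correlation
print(durbin_watson(alternating))   # well above 2: negative serial correlation
```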
Heteroskedasticity
Bibliography
Cohen, Ben. 2013. "College Football, Minus the Students." Wall Street Journal - Eastern
Edition, September 26: D5. Academic Search Complete
Groza, Mark D. 2010. "NCAA Conference Realignment and Football Game Day
Attendance." Managerial And Decision Economics 31, no. 8: 517-529. EconLit
Fair, Ray C., and John F. Oster. 2002. "Comparing the Predictive Information Content of
College Football Rankings." 19 pages. EconLit
Falls, Gregory A., and Paul A. Natke. 2014. "College Football Attendance: A Panel Study
of the Football Bowl Subdivision." Applied Economics 46, no. 10-12: 1093-1107.
EconLit
Holmes, Paul. 2011. "Win or Go Home: Why College Football Coaches Get Fired."
Journal Of Sports Economics 12, no. 2: 157-178. EconLit
Langelett, George. 2003. "The Relationship between Recruiting and Team Performance
in Division 1A College Football." Journal Of Sports Economics 4, no. 3: 240-245.
EconLit
Paul, Rodney, Brad R. Humphreys, and Andrew Weinbach. 2012. "Uncertainty of
Outcome and Attendance in College Football: Evidence from Four Conferences."
Economic And Labour Relations Review 23, no. 2: 69-81. EconLit
Price, Donald I., and Kabir C. Sen. 2003. "The Demand for Game Day Attendance in
College Football: An Analysis of the 1997 Division 1-A Season." Managerial And
Decision Economics 24, no. 1: 35-46. EconLit
Quirk, James. 2004. "College Football Conferences and Competitive Balance."
Managerial And Decision Economics 25, no. 2: 63-75. EconLit
Wieberg, Steve. 2012. "Attendance, ratings dip in college football." USA Today, n.d.
Academic Search Complete, EBSCOhost