2. What Affects Attendance at MLB Games
To find out what affects MLB game attendance, we looked at these variables: City
Population, Median Household Income by city, Team Payroll, All Stars on the team,
and age of the stadium.
City Population is the number of people within the metropolitan area
Median Household Income is the median income per household within the
metropolitan area
Team Payroll is the total salary of the active roster for each team
All Stars is the number of players on the team who qualified for the All Star teams
Age of Stadium is a dummy variable (1 if the stadium is over ten years old,
0 otherwise)
*All data collected is for the year 2010
Attendance levels would be of interest to team owners, local businesses, and city
planners, as an increase or decrease in game attendance will affect commerce within
the city and profits within the franchise. This information could also assist in
the planning and location of future stadiums.
3. Process of Data Collection
Our data was collected from:
U.S. Census Bureau – City Population and Median Household Income for 2010
Statistics Canada – City Population and Median Income (for Toronto) for 2010
USA Today – payroll per team for 2010
www.MLB.com – Stadium Age and All Star qualifiers for 2010
www.ESPN.com – total attendance at each stadium for 2010
4. Why these variables were considered
City Population was chosen because we expected that the larger a city’s population,
the more people would attend each MLB game
Team Payroll was used because we expected that the more a team invests in its
players, the more interested the general public would be in the team
Median Household Income by City was selected because we expected higher median
household income to mean higher disposable income, and therefore higher
attendance
All Stars was chosen because we believed that the more All Stars a team has, the
more fans would be interested in watching the game live at the stadium
Stadium Age was used because we believed that a newer stadium (constructed within
the last ten years) would stimulate higher attendance
5. Expected Signs for Coefficients
Variable                  Expected Sign
City Population           +
Median Household Income   +
Team Payroll              +
All Stars                 +
Stadium Age               -
6. Population Regression Model
Stadium Attendance = β0 + β1*(City Population) + β2*(Team Payroll) + β3*(Median
Household Income by City) + β4*(All Stars on Team) + β5*(Stadium Age) + ε
Estimated Regression Model
Stadium Attendance = 766123.0488 – 0.102536271*(City Population) +
0.014357684*(Team Payroll) + 11.11058502*(Median Household Income by City)
+ 129824.9479*(All Stars on Team) – 349284.9632*(Stadium Age)
7. Regression Analysis Results
Variable                      Coefficients   Standard Error  t Stat        P-value      Lower 95%      Upper 95%
Intercept                     766123.0488    429776.0089     1.782610088   0.087305761  -120891.0377   1653137.135
City Population               -0.102536271   0.049448203     -2.073609665  0.049013975  -0.204592346   -0.000480197
Team Payroll                  0.014357684    0.002580158     5.564652373   0.000010028  0.009032499    0.019682868
Median Household City Income  11.11058502    7.29946367      1.522109777   0.141049586  -3.954767547   26.17593759
All Stars                     129824.9479    64193.03228     2.022414946   0.054413842  -2662.95907    262312.8549
Stadium Age                   -349284.9632   180102.8836     -1.939363525  0.064299644  -720999.0456   22429.11914
Regression Statistics
Multiple R          0.817584905
R Square            0.668445077
Adjusted R Square   0.599371135
Standard Error      450975.6013
Observations        30
            Degrees of Freedom  SS           MS           F            Significance F
Regression  5                   9.84074E+12  1.96815E+12  9.677239435  3.70117E-05
Residual    24                  4.8811E+12   2.03379E+11
Total       29                  1.47218E+13
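As a quick check on the table, each t Stat is the coefficient divided by its standard error. The sketch below recomputes them and compares each against the two-tailed 5% critical value for 24 degrees of freedom. The value 2.0639 is taken from a standard t-table and is an assumption of this sketch, as are the names `coefficients`, `T_CRITICAL`, and `t_statistics`; none of these appear in the original analysis.

```python
# Coefficients and standard errors copied from the regression output.
coefficients = {
    "Intercept": (766123.0488, 429776.0089),
    "City Population": (-0.102536271, 0.049448203),
    "Team Payroll": (0.014357684, 0.002580158),
    "Median Household City Income": (11.11058502, 7.29946367),
    "All Stars": (129824.9479, 64193.03228),
    "Stadium Age": (-349284.9632, 180102.8836),
}

# Two-tailed 5% critical value of Student's t with 24 residual degrees of
# freedom (30 observations - 5 variables - 1). Assumed from a standard t-table.
T_CRITICAL = 2.0639


def t_statistics(coefs):
    """Return {variable: coefficient / standard error}."""
    return {name: b / se for name, (b, se) in coefs.items()}


t_stats = t_statistics(coefficients)
significant = [name for name, t in t_stats.items() if abs(t) > T_CRITICAL]

for name, t in t_stats.items():
    flag = "significant" if name in significant else "not significant"
    print(f"{name:30s} t = {t:7.3f}  ({flag} at the 5% level)")
```

By this rule Team Payroll clears the threshold comfortably, while City Population clears it only narrowly, which matches its P-value of 0.049.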
8. Interpretation of Coefficients
β1 = -0.102536271 is the estimated change in Attendance for each one-person
increase in City Population, holding the other variables constant
β2 = 0.014357684 is the estimated change in Attendance for each one-dollar
increase in Team Payroll, holding the other variables constant
β3 = 11.11058502 is the estimated change in Attendance for each one-dollar
increase in Median Household Income by City, holding the other variables constant
β4 = 129824.9479 is the estimated change in Attendance for each additional
qualified All Star on the team, holding the other variables constant
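Because the per-dollar and per-person coefficients are very small, it can help to scale them to more realistic changes. The sketch below is only an illustration: the change sizes ($10 million of payroll, 100,000 residents, $1,000 of income) are hypothetical, and the names `slopes` and `attendance_change` are not from the original analysis.

```python
# Slope estimates from the regression output: change in season attendance
# per one unit of each variable, holding the others constant.
slopes = {
    "City Population": -0.102536271,         # per additional resident
    "Team Payroll": 0.014357684,             # per additional dollar of payroll
    "Median Household Income": 11.11058502,  # per additional dollar of income
    "All Stars": 129824.9479,                # per additional All Star
}


def attendance_change(variable, delta):
    """Estimated change in attendance when `variable` changes by `delta`."""
    return slopes[variable] * delta


# Hypothetical change sizes chosen only for illustration.
examples = [
    ("Team Payroll", 10_000_000),
    ("City Population", 100_000),
    ("Median Household Income", 1_000),
    ("All Stars", 1),
]
for variable, delta in examples:
    change = attendance_change(variable, delta)
    print(f"{variable}: {delta:+,} -> {change:+,.0f} attendance")
```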
11. Residual Plots
[Figure: City Population Residual Plot (Residuals vs. City Population)]
[Figure: Team Payroll Residual Plot (Residuals vs. Team Payroll)]
[Figure: Median Household City Income Residual Plot (Residuals vs. Median Household City Income)]
[Figure: All Stars Residual Plot (Residuals vs. All Stars)]
12. Model Strength
Adjusted R² = 0.599371
Regression Statistics
Multiple R          0.817584905
R Square            0.668445077
Adjusted R Square   0.599371135
Standard Error      450975.6013
Observations        30
59.9371% of the variation in stadium attendance is explained by the variation in
City Population, Median Household Income, Team Payroll, All Stars, and Stadium
Age after adjusting for the number of variables in the model
F-Test
F             Significance F
9.677239435   3.70117E-05
F-test at α = 0.05, H0: β1 = β2 = β3 = β4 = β5 = 0, Ha: at least one β ≠ 0
Because Significance F (the p-value of the F-test), 3.70117E-05, is far lower than
α = 0.05, we reject H0: there is evidence that at least one independent variable
does in fact affect attendance
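The R Square, Adjusted R Square, and F values can all be recovered from the sums of squares in the ANOVA table. The sketch below shows the arithmetic using the standard formulas; the function name `model_strength` is only illustrative, and the inputs are the rounded figures reported in the regression output.

```python
n_obs = 30                    # observations (one per team)
k = 5                         # independent variables in the model
ss_regression = 9.84074e12    # regression sum of squares
ss_residual = 4.8811e12       # residual sum of squares


def model_strength(ssr, sse, n, k):
    """Return (R^2, adjusted R^2, F statistic) from the sums of squares."""
    sst = ssr + sse
    r2 = ssr / sst
    # Adjusted R^2 penalizes R^2 for the number of variables used.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    # F = mean square regression / mean square residual.
    f_stat = (ssr / k) / (sse / (n - k - 1))
    return r2, adj_r2, f_stat


r2, adj_r2, f_stat = model_strength(ss_regression, ss_residual, n_obs, k)
print(f"R Square          = {r2:.6f}")
print(f"Adjusted R Square = {adj_r2:.6f}")
print(f"F                 = {f_stat:.4f}")
```

The results should agree with the reported 0.668445, 0.599371, and 9.6772 up to the rounding of the sums of squares.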
13. Model Strength
The regression model does explain a significant proportion of the variation in
attendance. There is evidence that at least one independent variable (Team
Payroll, with a p-value of 0.00001) affects y, Attendance
P-Test
Variable                       P-value
Intercept                      0.087305761
City Population                0.049013975
Team Payroll                   0.000010028
Median Household City Income   0.141049586
All Stars                      0.054413842
Stadium Age                    0.064299644
14. Confidence Interval
Team Payroll
We are 95% confident that attendance is expected to increase by between 0.0090
and 0.0197 for each additional dollar of team payroll, holding constant the
population of the city, the number of All Stars, the median household income
for each city, and stadium age
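The interval is the coefficient plus or minus the t critical value times the standard error. The sketch below reproduces it for payroll and rescales it to a $1 million change, which is easier to read than a per-dollar figure. The critical value 2.0639 (24 degrees of freedom, two-tailed 5%) is assumed from a standard t-table, and the name `confidence_interval` is illustrative only.

```python
coef = 0.014357684      # Team Payroll coefficient from the regression output
std_err = 0.002580158   # its standard error
T_CRITICAL = 2.0639     # assumed: t-table value for 24 df, two-tailed 5%


def confidence_interval(coef, std_err, t_crit):
    """Return (lower, upper) = coef -/+ t_crit * std_err."""
    margin = t_crit * std_err
    return coef - margin, coef + margin


lower, upper = confidence_interval(coef, std_err, T_CRITICAL)
print(f"Per dollar of payroll: {lower:.4f} to {upper:.4f}")

# The same interval expressed per $1,000,000 of additional payroll.
per_million = (lower * 1_000_000, upper * 1_000_000)
print(f"Per $1M of payroll: {per_million[0]:,.0f} to {per_million[1]:,.0f}")
```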
15. Multicollinearity
Variables                      Attendance (Y)  City Population  Team Payroll  Median Household City Income  All Stars  Stadium Age
Attendance (Y)                 1
City Population                0.2534          1
Team Payroll                   0.7272          0.5055           1
Median Household City Income   0.1170          0.1904           0.0406        1
All Stars                      0.3968          0.3113           0.3008        -0.0986                       1
Stadium Age                    -0.2581         -0.2130          -0.1032       0.0756                        -0.0849    1
The strongest correlation between two independent variables is between City
Population and Team Payroll (0.5055). There is also likely multicollinearity
because the City Population coefficient has what we believe to be the
incorrect sign
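One common way to follow up on a correlation matrix like this is the variance inflation factor (VIF), which for standardized variables equals the diagonal of the inverse of the correlation matrix of the independent variables. This diagnostic was not part of the original analysis; the sketch below is an assumption-laden illustration that uses only the rounded correlations shown above, with a small hand-written matrix inverse so that it needs no external libraries. All function names are hypothetical.

```python
names = ["City Population", "Team Payroll", "Median Household City Income",
         "All Stars", "Stadium Age"]

# Lower triangle of the correlations among the independent variables only
# (the Attendance row and column are left out).
lower_triangle = [
    [1.0],
    [0.5055, 1.0],
    [0.1904, 0.0406, 1.0],
    [0.3113, 0.3008, -0.0986, 1.0],
    [-0.2130, -0.1032, 0.0756, -0.0849, 1.0],
]


def symmetric_from_lower(rows):
    """Expand a lower-triangular list of rows into a full symmetric matrix."""
    n = len(rows)
    return [[rows[max(i, j)][min(i, j)] for j in range(n)] for i in range(n)]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def variance_inflation_factors(corr):
    """VIF of each variable = matching diagonal entry of the inverse."""
    inverse = invert(corr)
    return [inverse[i][i] for i in range(len(corr))]


vifs = variance_inflation_factors(symmetric_from_lower(lower_triangle))
for name, vif in zip(names, vifs):
    print(f"{name:30s} VIF = {vif:.2f}")
```

A VIF of 1 means a variable is uncorrelated with the others; values above roughly 5 to 10 are often treated as a warning sign, though those cutoffs are rules of thumb rather than formal tests.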
16. Predicting with Regression
Regression Model Prediction
We selected the city of San Diego to see how accurately our regression model
predicts Attendance. The true attendance is 2,131,774. Our equation is:
Stadium Attendance = 766123.0488 – 0.102536271*(1307402) +
0.014357684*(37799300) + 11.11058502*(63990) + 129824.9479*(2) –
349284.9632*(0)
Solving the above equation for the City of San Diego puts the estimated
attendance at about 2,145,394 people (2,145,393.56 before rounding). The
difference between the estimate and the true attendance is about 13,620 people
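The San Diego calculation can be reproduced with a few lines of code. The sketch below uses the coefficients as listed in the regression output table and the San Diego inputs from the equation above; the names `predict_attendance` and `san_diego` are illustrative. The result should agree with the estimate above to within about one person, with any gap coming from rounding in the typed coefficients.

```python
intercept = 766123.0488
slopes = {
    "City Population": -0.102536271,
    "Team Payroll": 0.014357684,
    "Median Household Income": 11.11058502,
    "All Stars": 129824.9479,
    "Stadium Age": -349284.9632,
}

# San Diego inputs for 2010, taken from the prediction equation.
san_diego = {
    "City Population": 1307402,
    "Team Payroll": 37799300,
    "Median Household Income": 63990,
    "All Stars": 2,
    "Stadium Age": 0,   # dummy variable set to 0 in the equation
}


def predict_attendance(inputs):
    """Intercept plus the sum of coefficient * value for each variable."""
    return intercept + sum(slopes[name] * inputs[name] for name in slopes)


predicted = predict_attendance(san_diego)
actual = 2_131_774
error = predicted - actual

print(f"Predicted attendance: {predicted:,.0f}")
print(f"Actual attendance:    {actual:,}")
print(f"Difference:           {error:,.0f} ({error / actual:.2%} of actual)")
```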
17. Relevant Findings & Shortcomings
Although the sign of the City Population coefficient did not meet our
expectations, our model predicted attendance closely in the San Diego test case
Team Payroll being by far the most significant variable came as a surprise, as we
would have expected attendance to be affected by income and population more so
than by players’ salaries
All Stars was not significant at the 5% level (p-value 0.0544), so it told us
less about attendance than we would have expected
Aside from Team Payroll and City Population (whose interval only narrowly
excludes zero), the lower and upper limits of every variable’s 95% confidence
interval cross zero, indicating that those variables may have no impact on
attendance
Due to the small sample size, our histogram did not have a normal distribution
18. Business Recommendation
Based on our regression analysis, in order to maximize attendance over the course
of an MLB season, it would seem that the leadership of a given team would be wise
to invest back into the players. This would appear to heavily drive attendance,
leading to profit maximization within the stadium with regard to concessions, team
paraphernalia, and other expenditures that go along with attending a game. In turn,
this has the potential to drive local business within the city as well.
Although the other factors may affect attendance (and should be given proper
attention), it would appear that they could come second to overall team payroll.
Furthermore, acquiring All Star talent would seem not to be necessary to maximize
attendance. Quality players with mid-range salaries would be more beneficial in
the long run.