DREXEL UNIVERSITY | ECON 350
Predicting Salary for MLB Players
An Empirical Project
Patel, Vraj & Greene, Robert
6/4/2015
ABSTRACT
This paper examines the relationship between the log of contractual salary (log(salary), hereafter
referred to as “LSalary”) of qualifying 2013 Major League Baseball players and the following statistics:
age, age squared, games, home-runs, slugging percentage, hits, at bats, and on base percentage. Many
factors contribute to LSalary, but players designated as pitchers have been omitted from the data set
because they are evaluated on dissimilar qualities that would obscure the intended results.
With this restricted data set, the sample consists of four hundred forty-seven observations across
thirty teams within the 2013 season. Using this restricted data set, we regressed LSalary on age, age
squared, games, home-runs, slugging percentage, hits, at bats, on base percentage, strikeouts, times caught
stealing, stolen bases, and runs. After dropping the statistically insignificant variables, our empirical
results suggest that age, age squared, games, home-runs, slugging percentage, hits, at bats, and on base
percentage are significant drivers of a player’s LSalary, with an adjusted R² of 0.5139; that is, the model
explains about 51.39% of the variation in LSalary.
Introduction
The seven- and eight-figure salaries that many professional baseball players receive are not unique to
recent years, and they have long been criticized as gross overpayment. Some critics argue that, because few
working-class individuals can imagine making an annual salary of more than one million dollars, the once
“working-class baseball game” has lost touch with its origins.¹ This raises the question of whether the
players of Major League Baseball are overpaid, or whether their particular skill set has earned them these
outrageous salaries. As such, the purpose of this research is to ascertain whether players signed by Major
League Baseball teams are paid according to their performance, or whether an outside factor acts as the
primary driver of salary. To answer this question in a meaningful way, we must first determine which
factors, or rather which performance-based statistics, deserve close attention.
The scope of this study is limited to players who have a complete set of playing statistics attributed
to them. Previous studies, which are discussed further in the Literature Review section of this report,
have attempted to include every available playing statistic; however, that approach obscures the results,
because some players are expected to excel largely at batting and scoring, while others are paid primarily
for their ability to pitch. Because we are looking for the main driving force behind MLB player salaries,
and because pitchers make up only part of the league and are judged on different statistics, we have
omitted any player designated as a pitcher from our sample. Additionally, although they may well influence
salary, attributes such as emotional draw, optimism, or reactionary self-awareness cannot be quantified, so
these too are omitted from our calculations.
¹ See Phillip VanFossen (2009) for an insightful argument that viewership is largely responsible for MLB
player salaries.
Literature Review
Major League Baseball is the only one of the big four North American sports leagues that imposes no
salary cap. “Salary caps are employed with professional team sports leagues all over the world….
Conventional wisdom suggests that they are a collusive effort of club owners to control labor costs”
(Dietl, 13). However, while there is no binding restriction on paying a given player an egregious salary, a
surcharge applies when a team’s aggregate payroll exceeds a threshold previously established by the league.
This tax acts as an incentive for team management not to sign a large percentage of the A-list players,
thereby protecting the competitive balance needed to maintain viewership.
Upon reviewing previously published articles and scholarly journals on this subject, we found that many
took predicted economic shortcomings, or salary comparisons with other industries, as the focal point of
their research. That is not to say that this research is without merit; of the limited relevant literature
found, two works shared our goal of determining which factors are instrumental in predicting a player’s
salary: Meltzer and VanFossen.
Meltzer investigated various measures of player performance that help determine not only salary but
contract length as well. Specifically, Meltzer conducted his research using data from a 2002 study in a
two-stage least squares estimation. This method enabled Meltzer to estimate salary and contract length each
as a function of the other. Meltzer’s concluding results showed that there are fundamentally two distinct
sources of divergence between contract length and average salary: “[The] first comes from young improving
players who are likely to get long-term contracts at low annual salaries. [The] second comes from players
with chronic injuries, whose salary is not affected by their injuries but who will tend to get shorter
contracts than they otherwise would” (Meltzer, 1).
The variables used in Meltzer’s models are fairly consistent with those in our own regression. Meltzer
used the following variables: average salary, length of contract, OPSChange, Plate Appearance, All Star
Selection, Gold Glove, Age, Age-Sq, Catcher, Short Stop, Outfielder, Free Agent, Arbitration, Hi-Pay, Lux,
Lo-Pay, and Population (of the team’s metropolitan area).² What is interesting about Meltzer’s data is that
he makes the same choice that we, as the authors of this analysis, made in conducting our own research:
dropping any non-hitters, i.e., pitchers. Both Meltzer and we limited the data to hitters, as pitching
statistics are “less universal than hitting statistics” (Meltzer, 13). Introducing pitchers would obscure
the existing data pool, and their records would be incomplete for variables that quantify hitting skill and
performance, such as the “Hits” variable.
VanFossen offered a different perspective, justifying player salaries by means of economic inflation,
risk assessment, marketing strategy, and non-quantitative measures of emotional appeal. Although the article
lacks strict quantitative and statistical analysis of its premises and conclusion, its conceptual argument
is thorough and sound. VanFossen’s basic conclusion was that “[a]thletes are paid based upon their
contribution to fan satisfaction… [fans/viewership] contribute toward their salaries by attending games,
watching them live on the television, and supporting them through apparel purchases...” (VanFossen).
Broadly speaking, the usefulness of this article lies in its ability to account for discrepancies in our
predicted salaries; in other words, it helps explain the portion of the variation not captured by our
calculated R² value, and consequently the error term in our equation.
Data
The data used in our study originally consisted of all the player data available for all Major League
Baseball players during the 2013 season. This data set did not, however, separate players by position. The
full data set contained 837 players. It is also worth noting that the data set did not exclude players for
failing to reach a minimum number of at bats during the season. Our data was collected from a single source,
the site Baseball-Reference.com, and any inaccuracies recorded on the site will also be reflected in our
regression model. The statistics that we collected were salary, age, games played, number of home-runs,
slugging percentage, number of hits, number of runs scored, number of at bats, on base percentage, number of
strikeouts, number of stolen bases, and the number of times caught stealing. We selected these statistics
because they were readily available from our data source, and because we believed that they would be the
best predictors of salary.
² See Josh Meltzer (2005) for his regression in its entirety via step-by-step augmentation and analysis.
Methodology
Our initial approach was to regress salary on all the statistics we had gathered; however, this produced
abnormal results, such as negative coefficients on hits and runs. Economic theory would suggest that these
statistics should have a positive effect on the salary earned by the player. We later realized that this was
most likely caused by including pitchers in the data set. Pitchers in Major League Baseball are evaluated on
different metrics than batters and were thus throwing off our results. We then stripped all the pitchers
from our data set and were left with 447 observations. When we ran a regression on this data set, the
results looked much better. We then changed our dependent variable from salary to the log of salary because
we believed that this would give us a better indication of the relative movement in salary. Our final step
was to drop all statistically insignificant variables that were not helping us explain the variation in the
log of salary, such as stolen bases and strikeouts.
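The steps described above (dropping pitchers, taking the log of salary, and fitting a least squares regression) can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the player records, the field names, and the helper functions `solve` and `ols` are hypothetical, only two of the regressors are used, and the natural log is assumed.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def ols(rows, y):
    """Return OLS coefficients (intercept first) from the normal equations X'X b = X'y."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical toy records; the real study used 837 players from Baseball-Reference.com.
players = [
    {"position": "P",  "salary": 4_000_000,  "hits": 2,   "hr": 0},
    {"position": "P",  "salary": 9_500_000,  "hits": 5,   "hr": 1},
    {"position": "1B", "salary": 1_200_000,  "hits": 100, "hr": 10},
    {"position": "OF", "salary": 6_500_000,  "hits": 150, "hr": 25},
    {"position": "C",  "salary": 800_000,    "hits": 80,  "hr": 5},
    {"position": "SS", "salary": 12_000_000, "hits": 170, "hr": 30},
    {"position": "2B", "salary": 2_500_000,  "hits": 120, "hr": 12},
]

# Step 1: drop pitchers. Step 2: use log(salary) as the dependent variable.
batters = [p for p in players if p["position"] != "P"]
log_salary = [math.log(p["salary"]) for p in batters]
features = [(p["hits"], p["hr"]) for p in batters]

# Step 3: fit the regression on the restricted sample.
coefs = ols(features, log_salary)
print("intercept, hits, hr:", [round(c, 4) for c in coefs])
```

In practice a statistics package would also report standard errors and p-values, which are what the final step (dropping insignificant variables) relies on; this sketch only recovers the point estimates.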
Empirical Results
Our final regression model, after all of the adjustments mentioned previously were made, was:
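$$\text{LogSalary} = \beta_0 + \beta_1\,\text{Age} + \beta_2\,\text{Age}^2 + \beta_3\,\text{Games} + \beta_4\,\text{AtBats} + \beta_5\,\text{Slugging} + \beta_6\,\text{Hits} + \beta_7\,\text{HR} + \beta_8\,\text{OBP} + \mu$$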
where LogSalary is the log of a player’s salary, Age is the player’s age, Age² is age squared, Games is the
number of games played, AtBats is the number of at bats, Slugging is the player’s total bases divided by the
number of at bats, Hits is the number of hits, HR is the number of home-runs, OBP is the rate at which the
player reached base, and μ is the error term. As the tables at the end indicate, the model has an adjusted
R² of .5139, which means that the model explains around 51% of the variation in LogSalary using the eight
aforementioned explanatory variables. This may seem low for a predictive model, but it is in line with
similar studies conducted in the past.
$$\widehat{\text{LogSalary}} = -0.213 + 0.869\,\text{Age} - 0.012\,\text{Age}^2 - 0.018\,\text{Games} + 0.003\,\text{AtBats} - 2.01\,\text{Slugging} + 0.01\,\text{Hits} + 0.03\,\text{HR} + 1.58\,\text{OBP}$$
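To make the fitted equation concrete, the sketch below evaluates it for a single made-up batter. The coefficients are the rounded values printed above; the player's statistics, the function name `predict_log_salary`, and the use of the natural log when converting back to dollars are assumptions for illustration. Because the coefficients are rounded (the age terms in particular), the dollar figure should be read as a rough order of magnitude only.

```python
import math

# Rounded coefficients copied from the fitted equation above.
COEFFICIENTS = {
    "intercept": -0.213,
    "age": 0.869,
    "age_sq": -0.012,
    "games": -0.018,
    "at_bats": 0.003,
    "slugging": -2.01,
    "hits": 0.01,
    "hr": 0.03,
    "obp": 1.58,
}

def predict_log_salary(age, games, at_bats, slugging, hits, hr, obp):
    """Evaluate the fitted log-salary equation for one player."""
    c = COEFFICIENTS
    return (
        c["intercept"]
        + c["age"] * age
        + c["age_sq"] * age ** 2
        + c["games"] * games
        + c["at_bats"] * at_bats
        + c["slugging"] * slugging
        + c["hits"] * hits
        + c["hr"] * hr
        + c["obp"] * obp
    )

# Hypothetical batter; these statistics are invented for the example.
log_salary = predict_log_salary(
    age=28, games=150, at_bats=550, slugging=0.450, hits=150, hr=20, obp=0.340
)
salary = math.exp(log_salary)  # assumes the dependent variable is the natural log

print(f"predicted log salary: {log_salary:.4f}")
print(f"implied salary: ${salary:,.0f}")
```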
Given that all of our observations had a positive age and at least one game played, the constant term has
no meaningful interpretation on its own, and we do not consider it significant to our findings. The
coefficients on the explanatory variables, however, are meaningful. Because the dependent variable is in
logs, each coefficient multiplied by 100 approximates the percentage change in salary associated with a
one-unit increase in that variable, holding the others fixed. For example, one additional hit is expected to
increase salary by about 1 percent. Slugging percentage, on the other hand, is expected to have a negative
effect on LogSalary: specifically, a 0.01 increase in slugging is expected to lower salary by about 2.01
percent. Home-runs, hits, at bats, and on base percentage are all estimated to have a positive effect on
salary: an additional home-run, hit, or at bat corresponds to an increase of roughly 3 percent, 1 percent,
and 0.3 percent respectively, and a 0.01 increase in on base percentage corresponds to an increase of
roughly 1.58 percent.
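Coefficients in a log-level model such as this one are usually read with the approximation that a coefficient b implies roughly a 100·b percent change in salary per unit of the regressor, while the exact figure is 100·(exp(b) − 1). The sketch below compares the two for the reported coefficients. It assumes the natural log was used; the function name `percent_effect` and the choice of a 0.01 step for the two rate statistics are illustrative assumptions.

```python
import math

def percent_effect(coef, delta=1.0):
    """Exact percent change in salary when a regressor rises by `delta` in a log-level model."""
    return 100.0 * (math.exp(coef * delta) - 1.0)

# (rounded coefficient from the text, size of the change considered)
effects = {
    "one more hit": (0.01, 1),
    "one more home-run": (0.03, 1),
    "one more at bat": (0.003, 1),
    "slugging +0.01": (-2.01, 0.01),
    "OBP +0.01": (1.58, 0.01),
}

for name, (coef, delta) in effects.items():
    approx = 100.0 * coef * delta
    exact = percent_effect(coef, delta)
    print(f"{name:>18}: approx {approx:+.2f}%, exact {exact:+.2f}%")
```

For coefficients this small the approximate and exact figures nearly coincide, which is why the simpler reading is normally used.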
Conclusion
Given the results of our model, we believe that it is possible to predict the earnings of professional
baseball players in MLB from their performance during the season. Even though our model only has an adjusted
R² of .51, we still believe that it is usable, given that it has an F statistic of 59.94 with Prob > F =
0.0000. This by no means implies that our model is a perfect predictor of salary; there are various
improvements that we could make to gain a better understanding of the variance in player salary. Possible
improvements include adding career statistics to the model and classifying each player by position using
dummy variables. Although we may be unable to quantify the x factor that a player may have for an
organization and its sponsors, we are able to quantify how much a home-run, hit, or at bat is worth to the
organization, and by using these statistics we can predict how a player’s salary will change given his
performance during the season.
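As a consistency check on the figures quoted above, the overall F statistic can be recovered from the adjusted R², the sample size, and the number of regressors using standard OLS identities. The sketch below assumes n = 447 observations and k = 8 explanatory variables, as stated in the text; the function names are hypothetical, and the implied R² is back-calculated here rather than reported by the authors.

```python
def r_squared_from_adjusted(adj_r2, n, k):
    """Invert adj R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1.0 - (1.0 - adj_r2) * (n - k - 1) / (n - 1)

def f_statistic(r2, n, k):
    """Overall F statistic for a regression with k slope coefficients and an intercept."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))

n, k = 447, 8          # observations and explanatory variables, from the text
adj_r2 = 0.5139        # adjusted R^2 reported in the paper

r2 = r_squared_from_adjusted(adj_r2, n, k)
f_stat = f_statistic(r2, n, k)

print(f"implied R^2 = {r2:.4f}")
print(f"implied F   = {f_stat:.2f}")
```

With these rounded inputs the implied F statistic comes out close to the reported 59.94, which suggests the reported fit statistics are mutually consistent.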
Tables
Summary Statistics
Regression
Correlation Matrix (Full Data Set)
Correlation Matrix (Partial Statistics)
Bibliography
Dietl, Helmut M, Markus Lang, and Alexander Rathke. "The Effect of Salary Caps in Professional Team
Sports on Social Welfare." The B.E. Journal of Economic Analysis & Policy 1, no. 72 (2009): 7-14.
Magel, Rhonda, and Michael Hoffman. "Predicting Salaries of Major League Baseball Players." International
Journal of Sports Science 5, no. 2 (2015): 51-58. doi: 10.5923/j.sports.20150502.02.
Meltzer, Josh. "Average Salary and Contract Length in Major League Baseball: When Do They Diverge?"
2005, Department of Economics, Stanford University, CA. Accessed May 24, 2015.
Sports Reference LLC. "2013 Major League Baseball Standard Batting." Baseball-Reference.com - Major
League Statistics and Information. http://www.baseball-reference.com/. Accessed May 22, 2015.
VanFossen, Phillip. "The Economics of Professional Sports: Underpaid Millionaires?" The Economics of
Professional Sports: Underpaid Millionaires? August 5, 2009. Accessed May 27, 2015.

Predicting Salary for MLB Players

  • 1. DREXEL UNIVERSITY | ECON 350 Predicting Salary for MLB Players An Empirical project Patel, Vraj & Greene, Robert 6/4/2015
  • 2. ABSTRACT The focal point of this paper is an attempt to examine the relationship between the percentage- based contractual salaries (log(salary)) [hereto referred throughout as “LSalary”] of qualifying 2013 Major League Baseball players and the following statistics: age, age squared, games, home-runs, slugs, hits, at bats, and on base percentage. As there are many factors that are contributory towards LSalary, those players who are designated as pitchers have been omitted from the data set as they have dissimilar qualities that obscure the intended data set. With this restricted data set, the sample consists of four hundred forty seven observations across thirty teams within the 2013 season. Utilizing the aforementioned restricted data collection, we have regressed LSalary, age, agesquared, games, home-runs, slugs, hits, at bats, on base percentage, strikeouts, times caught stolen, stolen bases and runs. However, after controlling our variables, our conclusion is drawn from the empirical results suggesting that age, age squared, games, home-runs, slugs, hits, at bats, and on base percentageare arguably significant drivers in determining a player’s LSalary at an adjusted R2 value of 0.5139, or 51.39% explanatory level.
  • 3. Introduction Not prone to recent years, the seven and eight-figure salary that many professional baseball players receive has been criticized as gross overpayment. Some critique that, because few working-class individuals can imagine making a one-million dollar plus annual salary, the once “working-class baseball game” has lost touch with its origin1. This begs the question as to whether or not the players of the Major League Baseball profession are overpaid; but perhaps through their certain skillset, these players have earned these outrageous salaries. As such, the purpose behind this research is to ascertain whether or not signed players of Major League Baseball teams aregiven their contractual salary subsequent to their performance, or if there is an outside factor that acts as the primary driver in this determination. However, in order to present solutions in a meaningful manner, a more thorough understanding and discussion must be held in determining what factors, or rather performance-based statistics, should be closely looked at. The scope of this study resides in observing those players which have a full-detailed list of playing statistics attributed to them. Previous conducted studies, which will be discussed further within the literature review sub-section of this report, have attempted to illustrate meaningfulness via acknowledging every available playing statistic variable; however, approaching the project as such will obstruct the ultimate outcome of the data, as a percentage of players are expected to excel in (largely) batting/scoring, while others are paid primarily for the ability to pitch. Be it that we arelooking for the main driving force behind MLB player salaries, and that pitchers make a minute percentage of the entire league, we have omitted any player designated as a pitcher from our sample. 
Additionally, although perhaps a factorial indicator in salary, as we areunable to quantify such attributes as emotional drawl, optimism, or reactionary self-awareness, these too will be omitted from our calculations. 1 See Phillip VanFossen (2009) for an insightful argument and justification of MLB player salary on howviewership is largely responsiblefor salary contribution.
  • 4. Literature Review Unlike any comparative methodology, Major League Baseball is the only one the big four (4) North American Sports teams that suppress any form of salary limitation. “Salary caps areemployed with professional team sports leagues all over the world…. Conventional wisdom suggests that they are a collusive effort of club owners to control labor costs,” (Dietl, 13). However, while there is no binding restriction on commending a certain player with an egregious salary, there is an incorporated surcharge when the aggregatedpayroll of a team exceeds that which is previously established by the league. This tax restriction acts as an incentive for team management not to sign a largepercentage of the A-List players, ergo rescinding the competitive balance needed to maintain viewership. Upon reviewing previously published articles and scholarly journals pertaining to this subject matter, many of the topics had predicted economical short-comings or comparable salaries of separate industry entities as the focal point of their research. That is not to say that the research went without merit; of the limited relevant documentation found, two had shared a similar goal in determining what factors were detrimental in predicting a player’s salary: Meltzer and VanFossen. Meltzer had investigated various means to measure player’s performance which lead toward determining not only salary, but contract length as well. Explicitly, Meltzer had conducted his experimental research utilizing data from a 2002 study, in a two-stage least squares examination. Utilizing this method enabled Meltzer to estimate both salary and contract length as a function of the other. Meltzer’sconcluding results illustrated that there are fundamentally two distinct areas of deviation for contract length and averagesalary: “[The] first comes from young improving players who are likely to get long-term contracts at low annual salaries. 
[The] second comes from players with chronic injuries, whose salary is not affected by their injuries but who will tend to get shorter contracts that they otherwise would,” (Meltzer, 1). The variables used within Meltzer’smodels are fairly consistent with that of with which we had conducted. Meltzer had used various independent variables in the opportunity to predict salary via the following variables: averagesalary, length of contract, OPSChange, Plate Appearance, All Star Selection, Gold
  • 5. Glove, Age, Age-Sq, Catcher, Short Stop, Outfielder, Free Agent, Arbitration, Hi-Pay, Lux, Lo-Pay, and Population (of the team’s metropolitan area)2. What is interesting about Meltzer’scollection of data is that he introduces an identical hypothesis that we, as the authors of this analysis, had in conducting our own research: Dropping any non-hitters, i.e., Pitchers. Both the research conducted by Meltzer and ourselves had limited the data collection to hitters, as pitching statistics are “less universal than hitting statistics” (Meltzer, 13). Attempting to introduce this data would ultimately obscure the existing data pool while also being incomplete in areas that have been determined as quantitative in terms of skill and performance, such as variable “Hit” for instance. VanFossen had introduced an unexpected perspective upon approaching justification of player salary by means of economical inflation, risk assessment, strict marketing strategy, and non-quantitative measurements of emotional appeal. Where there article lacked in strict quantitative and statistical analysis of his premises and conclusion, his concept and theorized verbal analysis was thorough and argumentatively sound. VanFossen’s elementary conclusion was that “[a]thletes are paid based upon their contribution to fan satisfaction… [fans/viewership] contribute toward their salaries by attending games, watching them live on the television, and supporting them through apparel purchases...” (VanFossen). Speaking-at-large, the usefulness of this article is found within its ability to illuminate any inconsistencies with our predicted salary; otherwise stated, further explaining the calculated R2 value, and consequently any error terms that might suffice in our equation. Data The data used in our study originally consisted of all the player data that was available for all Major League Baseball players during the 2013 season. This data set did not however separate players by position. 
The total number of observations in our full data set was 837 players. It is also worth noting that the data set did not exclude players for failing to reach a minimum number of at bats during the season. Our data was collected from a single source, the site Baseball-Reference.com, and any inaccuracies that are recorded on the site will also be reflected in our regression model. The statistics that we collected included salary, age, games played, number of home-runs, slugging percentage, number of hits, number of runs scored, number of at bats, on base percentage, number of strikeouts, number of stolen bases, and the number of times caught stealing. We selected these statistics because they were readily available from our data source, and also because we believed that they would be the best predictors of salary.
2 See Josh Meltzer (2005) for his regression in its entirety via step-by-step augmentation and analysis.
Methodology
Our initial thought was to regress salary against all the statistics we had gathered; however, this produced very abnormal results, such as negative coefficients on hits and runs. Economic theory would suggest that these statistics should have a positive effect on the salary earned by the player. We later realized that this was most likely caused by including pitchers in the data set. Pitchers in Major League Baseball are evaluated on different metrics than batters and were thus throwing off our results. We then stripped all the pitchers from our data set and were left with 447 observations. When we ran a regression on this data set, the results looked much better. We then changed our dependent variable from salary to the log of salary because we believed that this would give us a better indication of the movement in salary. Our final step in the process was to drop all statistically insignificant variables that were not helping us explain the variation in the log of salary, such as stolen bases and strikeouts.
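The steps above can be sketched in code. What follows is a minimal illustration on synthetic data, not the authors' data set or the software they used: the column names, the simulated salary process, and the choice of NumPy least squares are all assumptions made for the example. It removes pitchers, takes the log of salary, fits an ordinary least squares regression, and flags regressors with small t-statistics as candidates for removal.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300

# Synthetic stand-in for the full player file (hypothetical columns).
is_pitcher = rng.random(n) < 0.45
age = rng.integers(21, 40, size=n).astype(float)
hits = rng.integers(0, 200, size=n).astype(float)
home_runs = rng.integers(0, 40, size=n).astype(float)
stolen_bases = rng.integers(0, 30, size=n).astype(float)

# Made-up salary process so the regression has something to recover;
# stolen bases are given no effect on purpose.
log_salary = 13.0 + 0.02 * age + 0.01 * hits + 0.03 * home_runs
salary = np.exp(log_salary + rng.normal(0.0, 0.3, size=n))

# Step 1: remove pitchers, who are evaluated on different metrics.
hitters = ~is_pitcher

# Step 2: use the log of salary as the dependent variable.
y = np.log(salary[hitters])

# Step 3: ordinary least squares with an intercept.
names = ["const", "age", "hits", "home_runs", "stolen_bases"]
X = np.column_stack([
    np.ones(hitters.sum()),
    age[hitters], hits[hitters], home_runs[hitters], stolen_bases[hitters],
])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Step 4: t-statistics, used to spot regressors that add little.
resid = y - X @ beta
dof = X.shape[0] - X.shape[1]
sigma2 = resid @ resid / dof
std_err = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
t_stats = beta / std_err

for name, b, t in zip(names, beta, t_stats):
    print(f"{name:>12}: coef = {b: .4f}, t = {t: .2f}")

# Conventional 5% cutoff; these are candidates to drop before refitting.
weak = [nm for nm, t in zip(names[1:], t_stats[1:]) if abs(t) < 1.96]
print("candidates to drop:", weak)
```

In practice a statistics package would report the same coefficients and t-statistics directly. The manual calculation is shown only to make each step explicit.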
Empirical Results
Our final regression model, after all of the adjustments mentioned previously were made, was:
LogSalary = β0 + β1Age + β2Age2 + β3Games + β4AtBats + β5Slugging + β6Hits + β7HR + β8OBP + μ
where LogSalary is the log of a player’s salary, Age is the player’s age, Age2 is age squared, Games is the number of games played, AtBats is the number of times a player batted, Slugging is the total number of bases a player had divided by the number of at bats, Hits is the number of hits a player earned, HR is the number of home-runs a player hit, and OBP is the percentage of times a player reached base. As the tables at the end indicate, the model has an Adj R2 value of .5139, which means that the model is able to explain around 51% of the variation in LogSalary given the eight aforementioned explanatory variables. This may seem low for a predictive model, but it is actually in line with similar studies conducted in the past.
Predicted LogSalary = -.213 + .869Age - .012Age2 - .018Games + .003AtBats - 2.01Slugging + .01Hits + .03HR + 1.58OBP
Because every observation has a positive age and at least one game played, the constant term describes a case outside our data, and we do not consider it meaningful for our findings. The beta coefficients on the explanatory variables, however, are meaningful. Since the dependent variable is the log of salary, each coefficient multiplied by 100 approximates the percent change in salary from a one-unit change in that variable. For example, one additional hit is expected to increase salary by approximately 1 percent. Slugging percentage, on the other hand, is expected to have a negative effect on LogSalary: a .01 increase in slugging is expected to lower salary by approximately 2.01 percent. Home-runs, hits, at bats, and on base percentage are all indicated to have a positive effect on salary, with interpretations of approximately a 3 percent increase per additional home-run, a 1 percent increase per additional hit, a 0.3 percent increase per additional at bat, and a 1.58 percent increase per .01 increase in on base percentage, respectively.
Conclusion
Given the results of our model, we believe that it is possible to predict the earnings of professional baseball players in the MLB given their performance during the season. Even though our model only has an Adj R2 of .51, we still believe that this model is usable given that it has an F stat of 59.94 with a corresponding p-value (Prob > F) of 0.0000.
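The reported F statistic can be cross-checked against the reported Adj R2. The short calculation below is a sketch that assumes the final regression used all 447 hitters, eight explanatory variables plus an intercept, and the standard definition of adjusted R2. None of these details are spelled out beyond the figures quoted above, and the variable names are chosen for illustration.

```python
# Figures quoted in the paper.
n_obs = 447       # observations after pitchers were removed
k = 8             # explanatory variables in the final model
adj_r2 = 0.5139   # reported adjusted R-squared

# Residual degrees of freedom, assuming an intercept is included.
df_resid = n_obs - k - 1

# Invert adj_R2 = 1 - (1 - R2) * (n - 1) / (n - k - 1) to recover R2.
r2 = 1.0 - (1.0 - adj_r2) * df_resid / (n_obs - 1)

# Overall F statistic for the null that all slope coefficients are zero.
f_stat = (r2 / k) / ((1.0 - r2) / df_resid)

print(f"R2 = {r2:.4f}, F({k}, {df_resid}) = {f_stat:.2f}")
```

Under these assumptions the implied R2 is about 0.52 and the implied F statistic is about 59.94, which agrees with the value reported above.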
This by no means implies that our model is a perfect predictor of salary; there are various improvements that we can make to gain a better understanding of the variance in player salary. Some possible improvements include adding career statistics to the model and classifying each player by position using dummy variables. Although we may be unable to quantify the x factor that a player may have for an organization and its sponsors, we are able to quantify how much a home-run, hit, or at bat is worth to the organization, and by using these statistics we can predict how a player’s salary will change given his performance during the season.
Tables
Summary Statistics
Regression
Correlation Matrix (Full Data Set)
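The coefficients reported in the Empirical Results section can be converted into percent effects on salary. The sketch below assumes that the log is the natural log, which the text does not state, and uses the rounded coefficients exactly as printed. It compares the usual approximation (100 times the coefficient) with the exact change, 100 × (exp(β) − 1), and computes the age at which the quadratic age profile turns over. The function and variable names are illustrative.

```python
import math

# Coefficients as reported in the fitted equation (rounded in the source).
count_coefs = {"Games": -0.018, "AtBats": 0.003, "Hits": 0.01, "HR": 0.03}
rate_coefs = {"Slugging": -2.01, "OBP": 1.58}
age_coef, age_sq_coef = 0.869, -0.012


def pct_change(beta, delta=1.0):
    """Approximate and exact percent change in salary for a change of
    `delta` in a regressor, given a log-salary coefficient `beta`."""
    approx = 100.0 * beta * delta
    exact = 100.0 * (math.exp(beta * delta) - 1.0)
    return approx, exact


effects = {}
for name, beta in count_coefs.items():
    effects[name] = pct_change(beta)        # one more game, at bat, hit, or HR
for name, beta in rate_coefs.items():
    effects[name] = pct_change(beta, 0.01)  # a .01 change in the rate statistic

for name, (approx, exact) in effects.items():
    print(f"{name:>9}: approx {approx:+.2f}%, exact {exact:+.2f}%")

# Turning point of beta1 * Age + beta2 * Age^2, at Age = -beta1 / (2 * beta2).
peak_age = -age_coef / (2.0 * age_sq_coef)
print(f"implied peak age: {peak_age:.1f}")
```

With these rounded coefficients the implied turning point is roughly age 36. Because the Age2 coefficient is reported to only three decimals, that figure is sensitive to rounding and should be read as indicative only.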
Bibliography
Dietl, Helmut M., Markus Lang, and Alexander Rathke. "The Effect of Salary Caps in Professional Team Sports on Social Welfare." The B.E. Journal of Economic Analysis & Policy 1, no. 72 (2009): 7-14.
Magel, Rhonda, and Michael Hoffman. "Predicting Salaries of Major League Baseball Players." International Journal of Sports Science 5, no. 2 (2015): 51-58. doi: 10.5923/j.sports.20150502.02.
Meltzer, Josh. "Average Salary and Contract Length in Major League Baseball: When Do They Diverge?" Department of Economics, Stanford University, CA, 2005. Accessed May 24, 2015.
Sports Reference LLC. "(2013 Major League Baseball Standard Batting)." Baseball-Reference.com - Major League Statistics and Information. http://www.baseball-reference.com/. Accessed May 22, 2015.
VanFossen, Phillip. "The Economics of Professional Sports: Underpaid Millionaires?" The Economics of Professional Sports: Underpaid Millionaires? August 5, 2009. Accessed May 27, 2015.