Data analytics mostly involves studying trends in data over a given period and then extracting useful information from those trends.
Why Is Data Analytics Important?
More precise decision making: Data analytics helps organizations make more accurate decisions based on insights gained from data trends over time.
For example, a company selling different products can figure out at what time of the year each product sells best. This will enable it to boost production of those products at the right time.
A better decision-making process eliminates the need for guesswork and minimizes losses and avoidable risks.
Improved customer satisfaction: When you're able to serve customers well, you retain them and keep business going. Insights gained from data analytics can help you understand exactly what your customers want and when to act.
Data analytics also enables businesses to identify their target audience easily.
Improved business strategy: Data analytics helps organizations channel their resources towards the most efficient strategies.
Performance evaluation: Data analytics can help organizations evaluate how well or how badly they've performed over a specified period. This will enable them to make important decisions for the future of the organization.
Although the points listed above are framed from a business point of view, business isn't the only area where data analytics is important.
You can see data analytics being used in healthcare, education, agriculture, and so on.
Types of Data Analytics
There are four main types of data analytics:
Descriptive analytics: This type of analytics describes what happened in the analyzed data over a specified period of time.
Diagnostic analytics: Diagnostic data analytics shows the "why" in a data trend. This involves having a deeper look into why certain patterns were present in the data.
Predictive analytics: The goal here is to foretell what is expected to happen in the future based on the outcomes of analyzed data over time.
Prescriptive analytics: In prescriptive analytics, the results from data analysis are used to make recommendations on what to do next.
What Is the Difference Between Data Analysis and Data Analytics?
You'll come across different definitions of data analytics and data analysis.
Some sources define data analytics and data analysis as the same thing, while others use the terms interchangeably.
Although they are closely related, these terms have slightly different meanings. They are similar in that both aid the decision-making process.
What Is Data Analysis?
Data analysis is the process of studying what has happened in the past in a dataset. The definition really is that simple.
Data analysis studies the why and how behind data trends. Yes, it involves data collection, organization, and "analysis", answering questions like:
"How did users respond to a new feature?"
"Why did purchases of a product fall during a particular period?"
Data analysts can make use of a variety of tools and techniques to answer questions like these.
Empowering the Data Analytics Ecosystem: A Laser Focus on Value
The data analytics ecosystem thrives when every component functions at its peak, unlocking the true potential of data. Here's a laser focus on key areas for an empowered ecosystem:
1. Democratize Access, Not Data:
Granular Access Controls: Provide users with self-service tools tailored to their specific needs, preventing data overload and misuse.
Data Catalogs: Implement robust data catalogs for easy discovery and understanding of available data sources.
2. Foster Collaboration with Clear Roles:
Data Mesh Architecture: Break down data silos by creating a distributed data ownership model with clear ownership and responsibilities.
Collaborative Workspaces: Utilize interactive platforms where data scientists, analysts, and domain experts can work seamlessly together.
3. Leverage Advanced Analytics Strategically:
AI-powered Automation: Automate repetitive tasks like data cleaning and feature engineering, freeing up data talent for higher-level analysis.
Right-Tool Selection: Strategically choose the most effective advanced analytics techniques (e.g., AI, ML) based on specific business problems.
4. Prioritize Data Quality with Automation:
Automated Data Validation: Implement automated data quality checks to identify and rectify errors at the source, minimizing downstream issues (a minimal sketch follows this section).
Data Lineage Tracking: Track the flow of data throughout the ecosystem, ensuring transparency and facilitating root cause analysis for errors.
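To make the idea of automated validation concrete, here is a minimal Python sketch; the orders table, its file name, its columns, and the rules themselves are illustrative assumptions, not taken from the source.

```python
import pandas as pd

# Hypothetical orders table; file name, columns, and rules are
# illustrative assumptions.
orders = pd.read_csv("orders.csv")

checks = {
    "missing customer_id": orders["customer_id"].isna(),
    "non-positive amount": orders["amount"] <= 0,
    "duplicate order_id": orders["order_id"].duplicated(),
}

# Report failures at the source so errors can be fixed before they
# propagate downstream.
for name, mask in checks.items():
    if mask.any():
        print(f"{name}: {mask.sum()} rows failed")
```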
5. Cultivate a Data-Driven Mindset:
Metrics-Driven Performance Management: Align KPIs and performance metrics with data-driven insights to ensure actionable decision making.
Data Storytelling Workshops: Equip stakeholders with the skills to translate complex data findings into compelling narratives that drive action.
Benefits of a Precise Ecosystem:
Sharpened Focus: Precise access and clear roles ensure everyone works with the most relevant data, maximizing efficiency.
Actionable Insights: Strategic analytics and automated quality checks lead to more reliable and actionable data insights.
Continuous Improvement: Data-driven performance management fosters a culture of learning and continuous improvement.
Sustainable Growth: Empowered by data, organizations can make informed decisions to drive sustainable growth and innovation.
By focusing on these precise actions, organizations can create an empowered data analytics ecosystem that delivers real value by driving data-driven decisions and maximizing the return on their data investment.
Class Exercise 1: IPL Player Auction Price Prediction

Guided By:
Piyusa Das
Assistant Professor
KSOM

Submitted By:
Aliva Mishra 21202151
Anirban Paul 21202153
Ashim Saraswati 21202026
Deviprasad Ojha 21202080
Ravi Shankar Pandey 21202099
1) Which are the most important predictors of the auction price for an IPL player?

Ans: To determine the most important predictors of the auction price for an IPL player, we first need to conduct exploratory data analysis to identify potential predictor variables and their correlation with the target variable (auction price). We would then use multiple regression analysis in SAS Studio or SAS Enterprise Miner to build the model and evaluate the significance and impact of each predictor on the target variable. The most important predictors of the auction price for an IPL player are (a sketch of the screening step follows this list):
1. BASE_PRICE
2. AGE
3. WKTS
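The exercise doesn't include the SAS code itself, but the exploratory screening step can be sketched in Python as an analog; the file name ipl_auction.csv is a placeholder, and the column names follow the variable list used later in this exercise.

```python
import pandas as pd

# Load the auction data (the file name is a placeholder).
df = pd.read_csv("ipl_auction.csv")

# Correlation of each numeric candidate predictor with the target
# SOLD_PRICE, sorted by absolute strength: the exploratory screening
# step described above.
numeric = df.select_dtypes("number")
corr = numeric.corr()["SOLD_PRICE"].drop("SOLD_PRICE")
print(corr.reindex(corr.abs().sort_values(ascending=False).index))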
2) How will you interpret the model?

Answer: To interpret the model, we examined the coefficients and p-values of the predictor variables, as well as the overall model fit statistics (such as R-squared or adjusted R-squared). A high R-squared value indicates that the model explains a large proportion of the variance in the target variable, and a low p-value for a predictor variable indicates that the predictor is likely to be a significant contributor to the model. The model shows the important predictors to be BASE_PRICE, AGE, and WKTS. We can also remove CAPTAINCY_EXP, TEAM, and ODI_SR_BL to bring the adjusted R-squared value nearer to the R-squared value.
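As a rough Python analog of the regression output being interpreted here (statsmodels in place of SAS Studio; the predictors and target are taken from the answers above, and the file name is assumed):

```python
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("ipl_auction.csv")  # placeholder file name

# Ordinary least squares on the three predictors named above;
# add_constant supplies the intercept term.
X = sm.add_constant(df[["BASE_PRICE", "AGE", "WKTS"]])
model = sm.OLS(df["SOLD_PRICE"], X).fit()

# The summary reports coefficients, p-values, R-squared, and adjusted
# R-squared: exactly the quantities used to interpret the model.
print(model.summary())
```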
3) Are there any differences in the model outputs from SAS Studio and Enterprise Miner?
4) What are the challenges in implementing this model in an actual setting?

Challenges in implementing this model in an actual setting could include obtaining accurate and complete data for all players and teams, dealing with missing or incomplete data, and ensuring that the model is not overfitting the data. Additionally, there may be factors that influence a player's auction price that are not captured by the model, such as player popularity or team needs. Here we found that the economy rate of a bowler comes out to be an insignificant predictor variable, but in reality it is an important factor in determining the price of a player. Similarly, bowling strike rate and batting strike rate come out to be insignificant predictor variables, but in an actual setting these factors should be considered when predicting the price of a player.
5) If the objective is to predict win probability with a team of 15 players (11 playing + 4 extra), what approach will you take? And what additional details (variables) will you require to create the above model?

In SAS, I would likely take the following approach:

I. Data preparation: Before building the model, I would need to import the data into SAS and perform any necessary cleaning and preprocessing. This would include checking for missing values, outliers, and errors, as well as ensuring that the data is in a format that SAS can read. Then, I would need to gather and clean the relevant data, including historical match statistics for all teams and players, as well as any additional information that may be relevant, such as team composition, home-field advantage, etc.

II. Feature engineering and selection:
a. After cleaning the data, I would perform feature engineering, which would involve creating new variables or transforming existing variables to better capture relevant information. For example, I would create new variables such as the average number of runs scored by a team, the win-loss record of a team, etc.
b. Encoding categorical features: Categorical variables need to be encoded using dummy variables before building the model. If a categorical variable has n categories, then we will need n-1 dummy variables. So, in the case of PLAYING_ROLE, we will need three dummy variables, since there are four categories (Batsman, Bowler, Allrounder, and W.Keeper). Similarly, we can create dummy variables for all categorical variables present in the dataset.
c. Next, I would perform feature selection to identify the most important predictors for the model. I would use techniques such as correlation analysis or the chi-squared test to identify which variables are most strongly associated with the target variable (SOLD PRICE).

III. Model building: After feature engineering, I would use SAS procedures or machine learning techniques such as logistic regression, decision trees, random forests, or neural networks to build the model, using the selected predictors from step II and "SOLD PRICE" as the target variable.

IV. Model evaluation: I would evaluate the model's performance using techniques such as cross-validation, a confusion matrix, residual analysis, R-squared, and adjusted R-squared. I would also compare the results with other models to choose the best one.

V. Model deployment: Once I have an accurate model, I can use it to predict the team composition. To do this, I can input the relevant variables for each player, and the model will output the predicted role of the player. Based on the predicted roles, we can build a team of 11 players and 4 extras. (A minimal sketch of steps II-IV follows this list.)
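A compact Python sketch of steps II-IV under stated assumptions: scikit-learn stands in for SAS, ipl_matches.csv and its 0/1 WIN label are hypothetical, and the feature set is a small illustrative subset.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("ipl_matches.csv")  # hypothetical file and columns

# Step II.b: n-1 dummies per categorical column, e.g. three dummies
# for the four PLAYING_ROLE categories.
X = pd.get_dummies(df[["PLAYING_ROLE", "AGE", "SR_B", "ECON"]],
                   columns=["PLAYING_ROLE"], drop_first=True)
y = df["WIN"]  # hypothetical 0/1 match-outcome label

# Steps III and IV: logistic regression for win probability, scored
# with 5-fold cross-validation.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
print(cross_val_score(clf, X, y, cv=5, scoring="accuracy").mean())
```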
Additional details (variables) that I would require to create the above model in SAS would include:

I. Player statistics and past performance: Runs (per match or per season), wickets (per match or per season), etc. in the previous seasons of the IPL.
II. Player experience: Number of matches played in the IPL, number of matches played for the national team, whether the player was in the playing 11 in past matches, etc.
III. Team performance: A player's performance is closely tied to the team's performance, so the team's win-loss record, strengths, weaknesses, etc. would be important predictors.
IV. Player's popularity: The player's fame, fan following, and brand value also play a crucial role in determining the auction price.
V. Player's role in the match: Whether the player has won the Orange Cap, Purple Cap, Man of the Match, or Man of the Series also plays a vital role.
VI. Match-related variables: Pitch condition, weather, opposition team, home-field advantage, time of the match, etc.
After considering all the above factors, I would select players for the team. The perfect composition of a cricket team differs from format to format. In Test cricket, specialists in every field are of prime value. In limited-overs cricket like ODIs and T20Is, cricketers with multiple attributes are of utmost importance; hence, all-rounders are very valuable in these formats. As there must be one wicket-keeper, I can keep at most 2 wicket-keepers among the 15 players. Apart from that, I would eliminate players with a low strike rate (for batsmen) or a high economy rate (for bowlers), and select players with both a good strike rate and a good economy.

The number of players required in each role would depend on several factors, such as the playing conditions, the opposition team's strengths and weaknesses, and the playing style of the players in the team. Here are some possible methods that can be used to determine the number of players needed in each role:

I. Statistical analysis: Using data analytics techniques such as regression analysis or decision trees, you can analyze the relationship between various player performance metrics and the outcome of the game (winning or losing). Based on the results of this analysis, you can determine the optimal number of players in each role that would increase the team's chances of winning.
II. Domain expertise: You can seek the advice of experts in the field who have a deep understanding of the game and the playing conditions. These experts can provide insights into the ideal number of players required in each role based on their experience and knowledge.
III. Rule-based approach: Based on the rules of the game, you can determine the minimum number of players required in each role. For example, in cricket, a team must have a minimum of two batsmen, one wicketkeeper, and three bowlers in the playing eleven.

In conclusion, determining the number of players needed in each role is a crucial step in creating a balanced and effective team. A combination of statistical analysis, domain expertise, and rule-based approaches can be used to arrive at the optimal team composition (a rule-based filtering sketch follows below).
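As an illustration of the rule-based filtering described above, here is a Python sketch; the file name, the thresholds (120 strike rate, 9.0 economy), and the squad split are assumptions, not from the exercise.

```python
import pandas as pd

df = pd.read_csv("ipl_players.csv")  # placeholder file name

# Filter out batsmen with a low strike rate and bowlers with a poor
# (high) economy; the thresholds of 120 and 9.0 are illustrative.
batsmen = df[(df["PLAYING_ROLE"] == "Batsman") & (df["SR_B"] >= 120)]
bowlers = df[(df["PLAYING_ROLE"] == "Bowler") & (df["ECON"] <= 9.0)]
keepers = df[df["PLAYING_ROLE"] == "W.Keeper"].nlargest(2, "SR_B")

# Assemble a 15-player squad that respects the stated limits
# (at most two wicket-keepers, a core of batsmen and bowlers).
squad = pd.concat([batsmen.nlargest(6, "SR_B"),
                   bowlers.nsmallest(5, "ECON"),
                   keepers])
print(squad[["PLAYER_NAME", "PLAYING_ROLE"]].head(15))
```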
Q. How will we predict which player would be a better choice?

Ans: We can use the multiple regression model that we built to predict the auction price of players, and use that information to determine which player would be a better choice.

I. Using the model: After the model is deployed in SAS Studio, we can use the input variables (PLAYER NAME, AGE, COUNTRY, TEAM, PLAYING ROLE, T-RUNS, T-WKTS, ODI-RUNS-S, ODI-SR-B, ODI-WKTS, ODI-SR-BL, CAPTAINCY EXP, RUNS-S, HS, AVE, SR-B, SIXERS, RUNS-C, WKTS, AVE-BL, ECON, SR-BL, AUCTION YEAR, BASE PRICE) to predict the sold price of a player.
II. Choosing the player: Once we have the predicted sold price of a player, we can compare the predicted sold prices of different players and select the one with the highest predicted value (see the sketch after this list).
III. Other considerations: Of course, there may be other considerations that go into selecting a player beyond just their predicted sold price. For example, the team may have a specific need for a certain type of player or a specific role that the player is expected to fill. It's important to take all of these factors into account when making the final decision.
IV. Model evaluation: We need to evaluate the model again on new data to check its performance and make sure it's still accurate.
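Step II, ranking players by predicted sold price, could look like this in Python (statsmodels in place of the deployed SAS model; both file names are placeholders, and the three-predictor model follows the earlier answers):

```python
import pandas as pd
import statsmodels.api as sm

# Refit the auction-price model from earlier on the training data.
train = pd.read_csv("ipl_auction.csv")
model = sm.OLS(train["SOLD_PRICE"],
               sm.add_constant(train[["BASE_PRICE", "AGE", "WKTS"]])).fit()

# Score a set of candidate players and rank them by predicted sold
# price, highest first, as in step II above.
candidates = pd.read_csv("ipl_candidates.csv")
X_new = sm.add_constant(candidates[["BASE_PRICE", "AGE", "WKTS"]],
                        has_constant="add")
candidates["PRED_SOLD_PRICE"] = model.predict(X_new)
print(candidates.sort_values("PRED_SOLD_PRICE", ascending=False)
                [["PLAYER_NAME", "PRED_SOLD_PRICE"]].head())
```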