Quick Regression Review
Mini-case Studies
Vishal Singh
NYU-Stern
What we should know
• Basics of the software
• Know the difference between a continuous and a discrete (nominal) variable
• Know how to summarize a continuous variable (e.g. mean income) and a nominal variable (e.g. % Female)
• Relationship between 2 variables
– Both continuous (correlation)
– Both nominal (cross-tab, mosaic plot)
– One continuous and one nominal (e.g. take the mean of the continuous variable by level of the nominal variable)
• Understand the p-value: the only time we are interested in a ‘statistical’ test is when doing controlled experiments
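The summaries listed above can be sketched in a few lines of Python. This is only an illustration: the records are made-up values (not from the course files), and the standard library is used just to keep the sketch self-contained.

```python
from collections import Counter, defaultdict
from math import sqrt
from statistics import mean

# Hypothetical records: (income in $1000s, age in years, gender, owns_car)
people = [
    (52.0, 34, "F", "yes"),
    (61.5, 45, "M", "yes"),
    (38.0, 23, "F", "no"),
    (75.0, 51, "M", "yes"),
    (44.5, 29, "F", "no"),
    (58.0, 40, "M", "no"),
]

income = [p[0] for p in people]
age = [p[1] for p in people]
gender = [p[2] for p in people]

# Continuous variable: summarize with the mean.
mean_income = mean(income)

# Nominal variable: summarize with the share in a category.
pct_female = 100 * gender.count("F") / len(gender)


def pearson(x, y):
    """Pearson correlation between two continuous variables."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Both continuous: correlation.
corr_income_age = pearson(income, age)

# Both nominal: cross-tab of counts for each (gender, owns_car) pair.
crosstab = Counter((p[2], p[3]) for p in people)

# One continuous, one nominal: mean of the continuous variable by group.
by_gender = defaultdict(list)
for inc, _, g, _ in people:
    by_gender[g].append(inc)
mean_income_by_gender = {g: mean(v) for g, v in by_gender.items()}
```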
Why do we need models?
o Graphs are useful for understanding but don’t scale when we have many potential predictors.
o We want to automate the analysis:
o Which ad to display?
o How to provide an insurance quote based on the information provided by a new customer?
o How to conduct ‘what-if’ analysis for planning Black Friday?
Example
Predicting auto insurance
Traditional measures
Usage based
GPS device
Forecasting sales for a Sony digital camera at Best Buy
Build a demand model based on historical data from 1000 stores
Regression: Key Points
Regression: widely used research tool
• Determine whether the independent variables explain a significant share of the variation in the dependent variable: whether a relationship exists.
• Determine how much of the variation in the dependent variable can be explained by the independent variables: strength of the relationship.
• Control for other independent variables when evaluating the contribution of a specific variable or set of variables: the marginal effect.
• Forecast/predict the values of the dependent variable.
• Use regression results as inputs to additional computations: optimal pricing, promotion, time to launch a product, and so on.
Exercise 1:
Box Office Revenue Prediction (see JMP file Box Office)
D3M
Box Office Prediction
Suppose you are helping Warner Bros. develop a model for forecasting box office revenues for their new movie The Watchman. The file “BoxOffice.csv” provides the opening week revenues (in millions of $) for various past movies along with several predictor variables:
Variable | Description of the Variable
Opening_Week_Revenue | Opening week revenue in millions of $
# of Theaters | Number of movie theaters in which each movie was initially released
Overall Rating | Critic rating for each movie (a higher number implies more favorable ratings)
Genre | 1 for Action, 2 for Comedy, 3 for Kids, and 4 for Other
Data
Movie | Opening_Week_Revenue | Num_Theaters | Overall_Rating | Genre
The Dark Knight | 158.4 | 4366 | 82 | 1
Iron Man | 98.6 | 4105 | 79 | 1
Indiana Jones and the Kingdom of the Crystal Skull | 100.1 | 4260 | 65 | 1
Hancock | 62.6 | 3965 | 49 | 1
Quantum of Solace | 67.5 | 3451 | 58 | 1
The Incredible Hulk | 55.4 | 3505 | 61 | 1
Wanted | 50.9 | 3175 | 64 | 1
Get Smart | 38.7 | 3911 | 54 | 1
The Mummy: Tomb of the Dragon Emperor | 40.5 | 3760 | 31 | 1
Journey to the Center of the Earth | 21 | 2811 | 57 | 1
Eagle Eye | 29.2 | 3510 | 43 | 1
10,000 B.C. | 35.9 | 3410 | 34 | 1
Valkyrie | 21 | 2711 | 56 | 1
Jumper | 27.4 | 3428 | 35 | 1
Cloverfield | 40.1 | 3411 | 64 | 1
The Day the Earth Stood Still (2008) | 30.5 | 3560 | 40 | 1
Hellboy II: The Golden Army | 34.5 | 3204 | 78 | 1
Spider-Man 3 | 151.1 | 4252 | 59 | 1
Transformers | 70.5 | 4011 | 61 | 1
Pirates of the Caribbean: At World's End | 114.7 | 4362 | 50 | 1
Objective
• Develop a regression model with “Opening Week Revenue” as the dependent variable and all other variables as predictors. Interpret your parameters.
• Prediction: The attributes for the movie “Watchman” are as follows:
– Theaters= 3611, Rating= 57, Action= 1
– Given this information, what are the predicted first week revenues for the new movie Watchman?
Bivariate Relationship with Predictors
Developing a Regression Model
D3M
Regression: Forecasting Box-office Revenues
• You need to convert the “Genre” variable into a series of dummy variables. Genre is a nominal variable (i.e. categories such as 1=Action, 2=Comedy, ...). Adding it directly into the regression as a number does not teach us anything, because the codes are arbitrary: our coding could just as well have been 1=Comedy, 2=Action, and so on.
• In addition, note that the total number of dummy variables we need to include is 1 less than the number of categories. The left-out category is absorbed in the intercept.
• It does not matter which category you leave out: all included dummy variables are interpreted relative to the one you leave out.
• For example, suppose we leave out “Action” and include dummy variables for “Comedy”, “Kids” and “Other”. The output of this regression:
Regression with Genre Dummy Variables Only
We left out “Action” as the base. Compare the intercept with the average for Action: they are the same number.
Just looking at the means, we see that “Kids” movies generate 56.66 - 45.10 = 11.56 ($mn) less than Action movies. The coefficient for “Kids” in the regression is therefore -11.56.
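A minimal sketch of the dummy coding described above, in Python. The helper names and the toy revenues are hypothetical (they are not the BoxOffice.csv values), and the fit relies on the known identity for a dummies-only regression (coefficients equal differences in group means) rather than running a general solver.

```python
from statistics import mean

# Genre codes as defined in the variable description.
GENRES = {1: "Action", 2: "Comedy", 3: "Kids", 4: "Other"}


def dummy_columns(genre_code, base="Action"):
    """Return 0/1 indicators for every genre except the left-out base."""
    return {name: int(GENRES[genre_code] == name)
            for name in GENRES.values() if name != base}


def dummy_only_fit(revenues, genre_codes, base="Action"):
    """Coefficients of a regression on genre dummies only.

    With no other predictors, least squares reproduces the group means:
    intercept = mean of the base group, and each dummy coefficient =
    (mean of that group) - (mean of the base group).
    """
    groups = {}
    for rev, code in zip(revenues, genre_codes):
        groups.setdefault(GENRES[code], []).append(rev)
    intercept = mean(groups[base])
    coefs = {g: mean(v) - intercept for g, v in groups.items() if g != base}
    return intercept, coefs


# Hypothetical toy data: two movies per genre, revenue in $mn.
revenues = [60.0, 50.0, 30.0, 34.0, 45.0, 47.0, 20.0, 24.0]
genres = [1, 1, 2, 2, 3, 3, 4, 4]

intercept, coefs = dummy_only_fit(revenues, genres, base="Action")
kids_row = dummy_columns(3)  # a Kids movie: only the Kids indicator is 1
```

Four categories produce three dummy columns, and the intercept carries the mean of the left-out Action group.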
Output from JMP
Note: In the JMP output, click the red triangle and then select Estimates > Indicator Function Parameterization to get the “dummy” variable output
JMP Output
What is the interpretation of
Action here?
Leave out Comedy this time
We left out “Comedy” this time, so Comedy is now absorbed in the intercept. Action is 24.68 more than Comedy. Compare this to the -24.68 coefficient on Comedy in the previous regression.
None of the model fit statistics change. The coefficients simply adjust to the left-out category (Comedy in this case).
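The effect of switching the left-out category can be checked with the numbers quoted on these slides. In the sketch below the Action and Kids means are taken from the slide text, the Comedy mean is derived as 56.66 - 24.68, and the function name is hypothetical; the mean for “Other” is not quoted, so it is left out.

```python
# Group means in $mn. Action and Kids are quoted on the slides;
# Comedy is derived from the -24.68 coefficient (56.66 - 24.68).
means = {"Action": 56.66, "Kids": 45.10, "Comedy": 56.66 - 24.68}


def coefficients(group_means, base):
    """Intercept and dummy coefficients when `base` is the left-out category."""
    intercept = group_means[base]
    coefs = {g: m - intercept for g, m in group_means.items() if g != base}
    return intercept, coefs


b0_action, coef_action = coefficients(means, "Action")
b0_comedy, coef_comedy = coefficients(means, "Comedy")

# The fitted value for a genre is intercept + its coefficient (0 for the base),
# and it is the same whichever category is left out.
fitted_kids_action_base = b0_action + coef_action["Kids"]
fitted_kids_comedy_base = b0_comedy + coef_comedy["Kids"]
```

Only the reference point moves: the Comedy coefficient with Action as base and the Action coefficient with Comedy as base have the same size and opposite signs.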
Add All Predictors
• Regression of OWR (opening week revenue, the dependent variable) on # of Theaters, Overall Rating, and Genre as predictors
# of Theaters: The coefficient is the change in OWR (in $mn) for each additional theater, holding rating and genre fixed.
Overall_Rating: Each additional point in overall rating increases OWR by $0.278mn.
Genre (Kids): Compared to “Other”, Kids movies generate $17.53mn less in OWR after controlling for the effect of # of Theaters and Overall Rating.
Objective
• Develop a regression model with “Opening Week Revenue” as the dependent variable and all other variables as predictors. Interpret your parameters.
• Prediction: The attributes for the movie “Watchman” are as follows:
– Theaters= 3611, Rating= 57, Action= 1
– Given this information, what are the predicted first week revenues for the new movie Watchman?
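The prediction step can be sketched as follows. This is an illustration under strong assumptions, not the course solution: it fits only the 20 Action rows shown on the Data slide (the full file has more movies and genres), it omits the genre dummies because every one of those rows is Action, and it solves the normal equations with a small hand-written solver instead of JMP. The result will therefore differ from the full-data model.

```python
# The 20 Action rows shown on the Data slide:
# (opening-week revenue in $mn, number of theaters, overall rating)
rows = [
    (158.4, 4366, 82), (98.6, 4105, 79), (100.1, 4260, 65), (62.6, 3965, 49),
    (67.5, 3451, 58), (55.4, 3505, 61), (50.9, 3175, 64), (38.7, 3911, 54),
    (40.5, 3760, 31), (21.0, 2811, 57), (29.2, 3510, 43), (35.9, 3410, 34),
    (21.0, 2711, 56), (27.4, 3428, 35), (40.1, 3411, 64), (30.5, 3560, 40),
    (34.5, 3204, 78), (151.1, 4252, 59), (70.5, 4011, 61), (114.7, 4362, 50),
]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


y = [rev for rev, _, _ in rows]
X = [[1.0, theaters, rating] for _, theaters, rating in rows]  # 1.0 = intercept
b0, b_theaters, b_rating = ols(X, y)

# Attributes from the Objective slide: Theaters = 3611, Rating = 57, Action = 1.
predicted = b0 + b_theaters * 3611 + b_rating * 57
print(f"Predicted opening week revenue: ${predicted:.1f}mn")
```

With the full data set, the same formula applies with the genre dummy coefficients added to the sum.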
Exercise 2: Impact of Southwest
Context
Southwest & the Wright Amendment
Click on the article or google “Southwest Wright Amendment” to get context
Impact of Southwest Airlines on Price
Suppose you are representing Southwest and want to claim that the presence of SW in a market is good for consumers because it lowers fares.
For the analysis, you are provided data on fares from approximately 600 “city-pairs” with the following variables:
• Objective: Analyze the impact of Southwest presence on the average fares
Snapshot of Data
Start with Distributions
Anything Unusual?
Compare Mean Fare by SW
NOTE: If you square the t-ratio of 6.71 (6.71 * 6.71), you get about 45.0, which matches the F-ratio of 45.03 up to rounding of the reported t-ratio
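The link between the t-ratio and the F-ratio can be checked numerically. The sketch below assumes the pooled-variance (equal variance) two-sample t-test, which is the version whose square equals the one-way ANOVA F-ratio; the fares are hypothetical values, not the course data.

```python
from statistics import mean


def pooled_t(a, b):
    """Two-sample t statistic with pooled (equal) variance."""
    na, nb = len(a), len(b)
    ma, mb = mean(a), mean(b)
    ss = sum((x - ma) ** 2 for x in a) + sum((x - mb) ** 2 for x in b)
    pooled_var = ss / (na + nb - 2)
    return (ma - mb) / (pooled_var * (1 / na + 1 / nb)) ** 0.5


def anova_f(*groups):
    """One-way ANOVA F ratio: between-group over within-group mean square."""
    values = [x for g in groups for x in g]
    grand = mean(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical fares in $ for markets without and with Southwest.
fares_without_sw = [210.0, 185.0, 240.0, 199.0, 225.0]
fares_with_sw = [150.0, 172.0, 160.0, 141.0]

t = pooled_t(fares_without_sw, fares_with_sw)
f = anova_f(fares_without_sw, fares_with_sw)
print(f"t = {t:.3f}, t squared = {t * t:.3f}, F = {f:.3f}")
```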
Basic intuition of Regression Based Models
o Conceptually, fares do not depend only on the presence of Southwest
o Other factors matter as well. In our example: competition and distance
o Analyze the relationship between these variables and fares
o When analyzing output with a single predictor, note the correspondence between the regression output and ANOVA (t-test)
o We get the same output from regression as from a t-test or ANOVA
o The more important point is to understand the workings of a “dummy” variable in regression
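How a single dummy variable works in a regression can be illustrated with a one-predictor least-squares fit. The data below are hypothetical; the point of the sketch is that the intercept equals the mean fare where the dummy is 0 and the slope equals the difference in mean fares, which is the same comparison a t-test makes.

```python
from statistics import mean


def simple_ols(x, y):
    """Intercept and slope of a one-predictor least-squares regression."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope


# Hypothetical markets: sw = 1 if Southwest serves the city-pair, else 0.
sw = [0, 0, 0, 0, 0, 1, 1, 1, 1]
fare = [210.0, 185.0, 240.0, 199.0, 225.0, 150.0, 172.0, 160.0, 141.0]

intercept, slope = simple_ols(sw, fare)

mean_no_sw = mean([f for s, f in zip(sw, fare) if s == 0])
mean_sw = mean([f for s, f in zip(sw, fare) if s == 1])
print(f"intercept = {intercept:.2f} (mean without SW = {mean_no_sw:.2f})")
print(f"slope = {slope:.2f} (difference in means = {mean_sw - mean_no_sw:.2f})")
```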
Know how to interpret these
What happens when we treat “# of other airlines” as a nominal vs. a continuous variable?
Conceptual & Practical Tip
“Recoding Variable”
• Collapse # of other airlines from 6 categories to 4.
• The grouping is arbitrary and based on the distribution of the data
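A recode like this takes only a few lines. The slide does not say which categories were merged, so the cut point below (pooling 3 or more other airlines into one group) and the sample counts are assumptions for illustration only.

```python
from collections import Counter


def recode_other_airlines(n):
    """Collapse the number of other airlines into 4 categories.

    Assumed grouping: 0, 1, 2, and "3+" (3 or more pooled together).
    """
    return str(n) if n < 3 else "3+"


# Hypothetical counts of other airlines in ten markets (values 0 to 5).
counts = [0, 1, 1, 2, 3, 5, 4, 2, 0, 1]

recoded = [recode_other_airlines(n) for n in counts]
tally = Counter(recoded)  # check how many markets fall in each new category
```

Tallying the recoded variable is a quick way to confirm that no category is left with too few markets.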
Framing this as a Regression Problem
Regression of fares on Southwest. Understand how the dummy variable is coded
Understand Output
Rsquare: Of the total variation in fares, 41.6% is explained by our model
Distance is the most important predictor and Southwest is the least important
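The Rsquare reading (share of total variation explained) follows from its definition, one minus the ratio of unexplained to total variation. A minimal sketch with hypothetical actual and fitted fares:

```python
from statistics import mean


def r_squared(actual, fitted):
    """Share of the total variation in `actual` explained by `fitted`."""
    ybar = mean(actual)
    sse = sum((a - f) ** 2 for a, f in zip(actual, fitted))  # unexplained
    sst = sum((a - ybar) ** 2 for a in actual)               # total
    return 1 - sse / sst


# Hypothetical fares in $ and the values a model fitted for them.
actual = [100.0, 150.0, 200.0, 250.0]
fitted = [110.0, 140.0, 210.0, 240.0]

r2 = r_squared(actual, fitted)
```

A model that always predicts the mean fare scores 0 and a perfect fit scores 1; the 41.6% on the slide sits between those two extremes.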
Interpretation of Coefficients
Southwest: After controlling for distance and competition (# of airlines), the absence of Southwest in a market increases fares by approximately $49.
Distance: Increasing the distance by 100 miles increases the fare by $21.50.
# of Airlines: Increasing the number of airlines serving the market by 1 reduces the fare by approximately $41.
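These coefficients support simple ‘what-if’ comparisons between two otherwise similar markets. The sketch below uses only the three values quoted above; the intercept is not given, so it computes differences in fares rather than fare levels, and the function and argument names are hypothetical.

```python
# Approximate coefficients quoted on the slide, in $.
EFFECT_SW_ABSENT = 49.0      # fare increase when Southwest is absent
EFFECT_PER_100_MILES = 21.5  # fare increase per additional 100 miles
EFFECT_PER_AIRLINE = -41.0   # fare change per additional airline


def fare_change(sw_leaves=0, extra_miles=0.0, extra_airlines=0):
    """Predicted change in fare between two scenarios.

    sw_leaves: +1 if Southwest exits the market, -1 if it enters, 0 if unchanged.
    extra_miles: difference in distance, in miles.
    extra_airlines: difference in the number of airlines serving the market.
    """
    return (EFFECT_SW_ABSENT * sw_leaves
            + EFFECT_PER_100_MILES * extra_miles / 100
            + EFFECT_PER_AIRLINE * extra_airlines)


southwest_enters = fare_change(sw_leaves=-1)
longer_route = fare_change(extra_miles=300)
entry_plus_rival = fare_change(sw_leaves=-1, extra_airlines=1)
```

Each term holds the other variables fixed, which is the ‘after controlling for’ part of the interpretation.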
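How does the software compute the parameters?
• Least Squares Principle: Choose the β’s so that the sum of the squared prediction errors,
\[
\mathrm{SSQ} = \sum_{m=1}^{M} \left( \mathrm{Fare}_m - \beta_0 - \beta_1 \mathrm{SW}_m - \beta_2 \mathrm{Dist}_m - \beta_3 \mathrm{Comp}_m \right)^2
\]
is as small as possible, where m indexes the M markets (city-pairs).
Ok, but what does that mean? Open the file SSQ_Intuition.xls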
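One way to build intuition for the least squares principle is to compute the sum of squared errors directly and watch it rise as the coefficients move away from the least-squares values. The sketch below simplifies to a single predictor (distance) with hypothetical data; it is not a reconstruction of the SSQ_Intuition.xls file.

```python
from statistics import mean


def ssq(b0, b1, x, y):
    """Sum of squared prediction errors for the line b0 + b1 * x."""
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))


def least_squares(x, y):
    """Closed-form least-squares intercept and slope for one predictor."""
    mx, my = mean(x), mean(y)
    b1 = (sum((a - mx) * (b - my) for a, b in zip(x, y))
          / sum((a - mx) ** 2 for a in x))
    return my - b1 * mx, b1


# Hypothetical markets: distance in 100s of miles, fare in $.
distance = [3.0, 5.0, 8.0, 10.0, 14.0, 20.0]
fare = [120.0, 150.0, 210.0, 260.0, 330.0, 470.0]

b0, b1 = least_squares(distance, fare)
best = ssq(b0, b1, distance, fare)

# Nudging either coefficient away from the least-squares values raises SSQ.
nudged = [ssq(b0 + d0, b1 + d1, distance, fare)
          for d0 in (-5.0, 0.0, 5.0)
          for d1 in (-1.0, 0.0, 1.0)
          if (d0, d1) != (0.0, 0.0)]
```

With several predictors the software does the same thing, searching over all the β’s at once for the combination with the smallest SSQ.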
Average Fare by # of Airlines
Split by Presence of Southwest (Interactions—for later)
Conclusion
• The t-test and ANOVA are both used to compare means across different groups
• The t-test is for 2 groups and ANOVA for many groups
• We can always convert the question to a regression problem using dummy variables
• The advantage of regression is that it is straightforward to control for any number of other variables that might impact the outcome
• From now on, we will focus on regression analysis

More Related Content

What's hot

Campaign response modeling
Campaign response modelingCampaign response modeling
Campaign response modelingEsteban Ribero
 
Module5.slp
Module5.slpModule5.slp
Module5.slpGimylin
 
Module5.slp
Module5.slpModule5.slp
Module5.slpGimylin
 
Xue paper-01-13-12
Xue paper-01-13-12Xue paper-01-13-12
Xue paper-01-13-12Yuhong Xue
 
Over Priced Listings
Over Priced ListingsOver Priced Listings
Over Priced ListingsKent Lardner
 
Moderation and Meditation conducting in SPSS
Moderation and Meditation conducting in SPSSModeration and Meditation conducting in SPSS
Moderation and Meditation conducting in SPSSOsama Yousaf
 
Value investing and emerging markets
Value investing and emerging marketsValue investing and emerging markets
Value investing and emerging marketsNavneet Randhawa
 
WeikaiLi_Publication
WeikaiLi_PublicationWeikaiLi_Publication
WeikaiLi_PublicationWeikai Li
 
Capm theory portfolio management
Capm theory   portfolio managementCapm theory   portfolio management
Capm theory portfolio managementBhaskar T
 
Mevsys Data Mining: Knowledge Discovery.
Mevsys Data Mining: Knowledge Discovery.Mevsys Data Mining: Knowledge Discovery.
Mevsys Data Mining: Knowledge Discovery.Mevsys Data Mining
 
Black_JPM93_Beta_And_return
Black_JPM93_Beta_And_returnBlack_JPM93_Beta_And_return
Black_JPM93_Beta_And_returnRussell Abrams
 
Expected value return & standard deviation
Expected value return & standard deviationExpected value return & standard deviation
Expected value return & standard deviationJahanzeb Memon
 
Quantifying an association to predict future events chapt
Quantifying an association to predict future events chaptQuantifying an association to predict future events chapt
Quantifying an association to predict future events chaptMARK547399
 
The X Factor
The X FactorThe X Factor
The X Factoryamanote
 

What's hot (19)

Campaign response modeling
Campaign response modelingCampaign response modeling
Campaign response modeling
 
Module5.slp
Module5.slpModule5.slp
Module5.slp
 
Module5.slp
Module5.slpModule5.slp
Module5.slp
 
Xue paper-01-13-12
Xue paper-01-13-12Xue paper-01-13-12
Xue paper-01-13-12
 
Assignment
AssignmentAssignment
Assignment
 
Over Priced Listings
Over Priced ListingsOver Priced Listings
Over Priced Listings
 
Moderation and Meditation conducting in SPSS
Moderation and Meditation conducting in SPSSModeration and Meditation conducting in SPSS
Moderation and Meditation conducting in SPSS
 
Value investing and emerging markets
Value investing and emerging marketsValue investing and emerging markets
Value investing and emerging markets
 
Feature selection
Feature selectionFeature selection
Feature selection
 
WeikaiLi_Publication
WeikaiLi_PublicationWeikaiLi_Publication
WeikaiLi_Publication
 
Capm theory portfolio management
Capm theory   portfolio managementCapm theory   portfolio management
Capm theory portfolio management
 
Demand Estimation
Demand EstimationDemand Estimation
Demand Estimation
 
Mevsys Data Mining: Knowledge Discovery.
Mevsys Data Mining: Knowledge Discovery.Mevsys Data Mining: Knowledge Discovery.
Mevsys Data Mining: Knowledge Discovery.
 
Black_JPM93_Beta_And_return
Black_JPM93_Beta_And_returnBlack_JPM93_Beta_And_return
Black_JPM93_Beta_And_return
 
Demand forcasting
Demand forcastingDemand forcasting
Demand forcasting
 
Expected value return & standard deviation
Expected value return & standard deviationExpected value return & standard deviation
Expected value return & standard deviation
 
Quantifying an association to predict future events chapt
Quantifying an association to predict future events chaptQuantifying an association to predict future events chapt
Quantifying an association to predict future events chapt
 
The X Factor
The X FactorThe X Factor
The X Factor
 
muthu.shree
muthu.shreemuthu.shree
muthu.shree
 

Similar to Regressioin mini case

Machine learning in credit risk modeling : a James white paper
Machine learning in credit risk modeling : a James white paperMachine learning in credit risk modeling : a James white paper
Machine learning in credit risk modeling : a James white paperJames by CrowdProcess
 
Marketing Research Approaches .docx
Marketing Research Approaches .docxMarketing Research Approaches .docx
Marketing Research Approaches .docxalfredacavx97
 
UNIT - I Reinforcement Learning .pptx
UNIT - I Reinforcement Learning .pptxUNIT - I Reinforcement Learning .pptx
UNIT - I Reinforcement Learning .pptxDrUdayKiranG
 
Neural Network Model
Neural Network ModelNeural Network Model
Neural Network ModelEric Esajian
 
Risk Concept And Management 5
Risk Concept And Management 5Risk Concept And Management 5
Risk Concept And Management 5rajeevgupta
 
Brown bag 2012_fall
Brown bag 2012_fallBrown bag 2012_fall
Brown bag 2012_fallXiaolei Zhou
 
Faster and cheaper, smart ab experiments - public ver.
Faster and cheaper, smart ab experiments - public ver.Faster and cheaper, smart ab experiments - public ver.
Faster and cheaper, smart ab experiments - public ver.Marsan Ma
 
GENETIC ALGORITHM FOR FUNCTION APPROXIMATION: AN EXPERIMENTAL INVESTIGATION
GENETIC ALGORITHM FOR FUNCTION APPROXIMATION: AN EXPERIMENTAL INVESTIGATIONGENETIC ALGORITHM FOR FUNCTION APPROXIMATION: AN EXPERIMENTAL INVESTIGATION
GENETIC ALGORITHM FOR FUNCTION APPROXIMATION: AN EXPERIMENTAL INVESTIGATIONijaia
 
Profit Maximization over Social Networks
Profit Maximization over Social NetworksProfit Maximization over Social Networks
Profit Maximization over Social NetworksWei Lu
 
Causal Inference, Reinforcement Learning, and Continuous Optimization
Causal Inference, Reinforcement Learning, and Continuous OptimizationCausal Inference, Reinforcement Learning, and Continuous Optimization
Causal Inference, Reinforcement Learning, and Continuous OptimizationScientificRevenue
 
Machine learning algorithms and business use cases
Machine learning algorithms and business use casesMachine learning algorithms and business use cases
Machine learning algorithms and business use casesSridhar Ratakonda
 
NON-STATIONARY BANDIT CHANGE DETECTION-BASED THOMPSON SAMPLING ALGORITHM: A R...
NON-STATIONARY BANDIT CHANGE DETECTION-BASED THOMPSON SAMPLING ALGORITHM: A R...NON-STATIONARY BANDIT CHANGE DETECTION-BASED THOMPSON SAMPLING ALGORITHM: A R...
NON-STATIONARY BANDIT CHANGE DETECTION-BASED THOMPSON SAMPLING ALGORITHM: A R...IRJET Journal
 
movieRecommendation_FinalReport
movieRecommendation_FinalReportmovieRecommendation_FinalReport
movieRecommendation_FinalReportSohini Sarkar
 

Similar to Regressioin mini case (20)

Machine learning in credit risk modeling : a James white paper
Machine learning in credit risk modeling : a James white paperMachine learning in credit risk modeling : a James white paper
Machine learning in credit risk modeling : a James white paper
 
Linear_Regression
Linear_RegressionLinear_Regression
Linear_Regression
 
Decision theory
Decision theoryDecision theory
Decision theory
 
Marketing Research Approaches .docx
Marketing Research Approaches .docxMarketing Research Approaches .docx
Marketing Research Approaches .docx
 
UNIT - I Reinforcement Learning .pptx
UNIT - I Reinforcement Learning .pptxUNIT - I Reinforcement Learning .pptx
UNIT - I Reinforcement Learning .pptx
 
Chapter 04
Chapter 04 Chapter 04
Chapter 04
 
Neural Network Model
Neural Network ModelNeural Network Model
Neural Network Model
 
Risk Concept And Management 5
Risk Concept And Management 5Risk Concept And Management 5
Risk Concept And Management 5
 
Brown bag 2012_fall
Brown bag 2012_fallBrown bag 2012_fall
Brown bag 2012_fall
 
Faster and cheaper, smart ab experiments - public ver.
Faster and cheaper, smart ab experiments - public ver.Faster and cheaper, smart ab experiments - public ver.
Faster and cheaper, smart ab experiments - public ver.
 
GENETIC ALGORITHM FOR FUNCTION APPROXIMATION: AN EXPERIMENTAL INVESTIGATION
GENETIC ALGORITHM FOR FUNCTION APPROXIMATION: AN EXPERIMENTAL INVESTIGATIONGENETIC ALGORITHM FOR FUNCTION APPROXIMATION: AN EXPERIMENTAL INVESTIGATION
GENETIC ALGORITHM FOR FUNCTION APPROXIMATION: AN EXPERIMENTAL INVESTIGATION
 
1 chapter 04
1 chapter 041 chapter 04
1 chapter 04
 
Predicting Movie Success Using Neural Network
Predicting Movie Success Using Neural NetworkPredicting Movie Success Using Neural Network
Predicting Movie Success Using Neural Network
 
Pro max icdm2012-slides
Pro max icdm2012-slidesPro max icdm2012-slides
Pro max icdm2012-slides
 
Profit Maximization over Social Networks
Profit Maximization over Social NetworksProfit Maximization over Social Networks
Profit Maximization over Social Networks
 
Causal Inference, Reinforcement Learning, and Continuous Optimization
Causal Inference, Reinforcement Learning, and Continuous OptimizationCausal Inference, Reinforcement Learning, and Continuous Optimization
Causal Inference, Reinforcement Learning, and Continuous Optimization
 
PyGotham 2016
PyGotham 2016PyGotham 2016
PyGotham 2016
 
Machine learning algorithms and business use cases
Machine learning algorithms and business use casesMachine learning algorithms and business use cases
Machine learning algorithms and business use cases
 
NON-STATIONARY BANDIT CHANGE DETECTION-BASED THOMPSON SAMPLING ALGORITHM: A R...
NON-STATIONARY BANDIT CHANGE DETECTION-BASED THOMPSON SAMPLING ALGORITHM: A R...NON-STATIONARY BANDIT CHANGE DETECTION-BASED THOMPSON SAMPLING ALGORITHM: A R...
NON-STATIONARY BANDIT CHANGE DETECTION-BASED THOMPSON SAMPLING ALGORITHM: A R...
 
movieRecommendation_FinalReport
movieRecommendation_FinalReportmovieRecommendation_FinalReport
movieRecommendation_FinalReport
 

More from veesingh

Identification1
Identification1Identification1
Identification1veesingh
 
Brand Asset Case Study
Brand Asset Case StudyBrand Asset Case Study
Brand Asset Case Studyveesingh
 
Fat Tax Slideshow
Fat Tax SlideshowFat Tax Slideshow
Fat Tax Slideshowveesingh
 
Correlation causality
Correlation causalityCorrelation causality
Correlation causalityveesingh
 
Unsupervised learning
Unsupervised learningUnsupervised learning
Unsupervised learningveesingh
 
Field experiments
Field experimentsField experiments
Field experimentsveesingh
 
Brand mining
Brand miningBrand mining
Brand miningveesingh
 
D3M Commodity
D3M Commodity D3M Commodity
D3M Commodity veesingh
 
D3M Online Reviews
D3M Online ReviewsD3M Online Reviews
D3M Online Reviewsveesingh
 
D3M Politics
D3M PoliticsD3M Politics
D3M Politicsveesingh
 

More from veesingh (12)

Slalom
SlalomSlalom
Slalom
 
Identification1
Identification1Identification1
Identification1
 
Brand Asset Case Study
Brand Asset Case StudyBrand Asset Case Study
Brand Asset Case Study
 
Fat Tax Slideshow
Fat Tax SlideshowFat Tax Slideshow
Fat Tax Slideshow
 
Correlation causality
Correlation causalityCorrelation causality
Correlation causality
 
Unsupervised learning
Unsupervised learningUnsupervised learning
Unsupervised learning
 
Obesity
ObesityObesity
Obesity
 
Field experiments
Field experimentsField experiments
Field experiments
 
Brand mining
Brand miningBrand mining
Brand mining
 
D3M Commodity
D3M Commodity D3M Commodity
D3M Commodity
 
D3M Online Reviews
D3M Online ReviewsD3M Online Reviews
D3M Online Reviews
 
D3M Politics
D3M PoliticsD3M Politics
D3M Politics
 

Recently uploaded

(8264348440) 🔝 Call Girls In Mahipalpur 🔝 Delhi NCR
(8264348440) 🔝 Call Girls In Mahipalpur 🔝 Delhi NCR(8264348440) 🔝 Call Girls In Mahipalpur 🔝 Delhi NCR
(8264348440) 🔝 Call Girls In Mahipalpur 🔝 Delhi NCRsoniya singh
 
(8264348440) 🔝 Call Girls In Keshav Puram 🔝 Delhi NCR
(8264348440) 🔝 Call Girls In Keshav Puram 🔝 Delhi NCR(8264348440) 🔝 Call Girls In Keshav Puram 🔝 Delhi NCR
(8264348440) 🔝 Call Girls In Keshav Puram 🔝 Delhi NCRsoniya singh
 
RE Capital's Visionary Leadership under Newman Leech
RE Capital's Visionary Leadership under Newman LeechRE Capital's Visionary Leadership under Newman Leech
RE Capital's Visionary Leadership under Newman LeechNewman George Leech
 
2024 Numerator Consumer Study of Cannabis Usage
2024 Numerator Consumer Study of Cannabis Usage2024 Numerator Consumer Study of Cannabis Usage
2024 Numerator Consumer Study of Cannabis UsageNeil Kimberley
 
Sales & Marketing Alignment: How to Synergize for Success
Sales & Marketing Alignment: How to Synergize for SuccessSales & Marketing Alignment: How to Synergize for Success
Sales & Marketing Alignment: How to Synergize for SuccessAggregage
 
Call Girls in Mehrauli Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Mehrauli Delhi 💯Call Us 🔝8264348440🔝Call Girls in Mehrauli Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Mehrauli Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...lizamodels9
 
M.C Lodges -- Guest House in Jhang.
M.C Lodges --  Guest House in Jhang.M.C Lodges --  Guest House in Jhang.
M.C Lodges -- Guest House in Jhang.Aaiza Hassan
 
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,noida100girls
 
Vip Female Escorts Noida 9711199171 Greater Noida Escorts Service
Vip Female Escorts Noida 9711199171 Greater Noida Escorts ServiceVip Female Escorts Noida 9711199171 Greater Noida Escorts Service
Vip Female Escorts Noida 9711199171 Greater Noida Escorts Serviceankitnayak356677
 
Catalogue ONG NUOC PPR DE NHAT .pdf
Catalogue ONG NUOC PPR DE NHAT      .pdfCatalogue ONG NUOC PPR DE NHAT      .pdf
Catalogue ONG NUOC PPR DE NHAT .pdfOrient Homes
 
(8264348440) 🔝 Call Girls In Hauz Khas 🔝 Delhi NCR
(8264348440) 🔝 Call Girls In Hauz Khas 🔝 Delhi NCR(8264348440) 🔝 Call Girls In Hauz Khas 🔝 Delhi NCR
(8264348440) 🔝 Call Girls In Hauz Khas 🔝 Delhi NCRsoniya singh
 
Call Girls In Kishangarh Delhi ❤️8860477959 Good Looking Escorts In 24/7 Delh...
Call Girls In Kishangarh Delhi ❤️8860477959 Good Looking Escorts In 24/7 Delh...Call Girls In Kishangarh Delhi ❤️8860477959 Good Looking Escorts In 24/7 Delh...
Call Girls In Kishangarh Delhi ❤️8860477959 Good Looking Escorts In 24/7 Delh...lizamodels9
 
Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...
Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...
Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...lizamodels9
 
Non Text Magic Studio Magic Design for Presentations L&P.pptx
Non Text Magic Studio Magic Design for Presentations L&P.pptxNon Text Magic Studio Magic Design for Presentations L&P.pptx
Non Text Magic Studio Magic Design for Presentations L&P.pptxAbhayThakur200703
 
VIP Call Girl Jamshedpur Aashi 8250192130 Independent Escort Service Jamshedpur
VIP Call Girl Jamshedpur Aashi 8250192130 Independent Escort Service JamshedpurVIP Call Girl Jamshedpur Aashi 8250192130 Independent Escort Service Jamshedpur
VIP Call Girl Jamshedpur Aashi 8250192130 Independent Escort Service JamshedpurSuhani Kapoor
 
FULL ENJOY - 9953040155 Call Girls in Chhatarpur | Delhi
FULL ENJOY - 9953040155 Call Girls in Chhatarpur | DelhiFULL ENJOY - 9953040155 Call Girls in Chhatarpur | Delhi
FULL ENJOY - 9953040155 Call Girls in Chhatarpur | DelhiMalviyaNagarCallGirl
 
NewBase 22 April 2024 Energy News issue - 1718 by Khaled Al Awadi (AutoRe...
NewBase  22 April  2024  Energy News issue - 1718 by Khaled Al Awadi  (AutoRe...NewBase  22 April  2024  Energy News issue - 1718 by Khaled Al Awadi  (AutoRe...
NewBase 22 April 2024 Energy News issue - 1718 by Khaled Al Awadi (AutoRe...Khaled Al Awadi
 
Vip Dewas Call Girls #9907093804 Contact Number Escorts Service Dewas
Vip Dewas Call Girls #9907093804 Contact Number Escorts Service DewasVip Dewas Call Girls #9907093804 Contact Number Escorts Service Dewas
Vip Dewas Call Girls #9907093804 Contact Number Escorts Service Dewasmakika9823
 

Recently uploaded (20)

(8264348440) 🔝 Call Girls In Mahipalpur 🔝 Delhi NCR
(8264348440) 🔝 Call Girls In Mahipalpur 🔝 Delhi NCR(8264348440) 🔝 Call Girls In Mahipalpur 🔝 Delhi NCR
(8264348440) 🔝 Call Girls In Mahipalpur 🔝 Delhi NCR
 
(8264348440) 🔝 Call Girls In Keshav Puram 🔝 Delhi NCR
(8264348440) 🔝 Call Girls In Keshav Puram 🔝 Delhi NCR(8264348440) 🔝 Call Girls In Keshav Puram 🔝 Delhi NCR
(8264348440) 🔝 Call Girls In Keshav Puram 🔝 Delhi NCR
 
Best Practices for Implementing an External Recruiting Partnership
Best Practices for Implementing an External Recruiting PartnershipBest Practices for Implementing an External Recruiting Partnership
Best Practices for Implementing an External Recruiting Partnership
 
RE Capital's Visionary Leadership under Newman Leech
RE Capital's Visionary Leadership under Newman LeechRE Capital's Visionary Leadership under Newman Leech
RE Capital's Visionary Leadership under Newman Leech
 
2024 Numerator Consumer Study of Cannabis Usage
2024 Numerator Consumer Study of Cannabis Usage2024 Numerator Consumer Study of Cannabis Usage
2024 Numerator Consumer Study of Cannabis Usage
 
Sales & Marketing Alignment: How to Synergize for Success
Sales & Marketing Alignment: How to Synergize for SuccessSales & Marketing Alignment: How to Synergize for Success
Sales & Marketing Alignment: How to Synergize for Success
 
Call Girls in Mehrauli Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Mehrauli Delhi 💯Call Us 🔝8264348440🔝Call Girls in Mehrauli Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Mehrauli Delhi 💯Call Us 🔝8264348440🔝
 
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
 
M.C Lodges -- Guest House in Jhang.
M.C Lodges --  Guest House in Jhang.M.C Lodges --  Guest House in Jhang.
M.C Lodges -- Guest House in Jhang.
 
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
BEST Call Girls In Greater Noida ✨ 9773824855 ✨ Escorts Service In Delhi Ncr,
 
Vip Female Escorts Noida 9711199171 Greater Noida Escorts Service
Vip Female Escorts Noida 9711199171 Greater Noida Escorts ServiceVip Female Escorts Noida 9711199171 Greater Noida Escorts Service
Vip Female Escorts Noida 9711199171 Greater Noida Escorts Service
 
Catalogue ONG NUOC PPR DE NHAT .pdf
Catalogue ONG NUOC PPR DE NHAT      .pdfCatalogue ONG NUOC PPR DE NHAT      .pdf
Catalogue ONG NUOC PPR DE NHAT .pdf
 
(8264348440) 🔝 Call Girls In Hauz Khas 🔝 Delhi NCR
(8264348440) 🔝 Call Girls In Hauz Khas 🔝 Delhi NCR(8264348440) 🔝 Call Girls In Hauz Khas 🔝 Delhi NCR
(8264348440) 🔝 Call Girls In Hauz Khas 🔝 Delhi NCR
 
Call Girls In Kishangarh Delhi ❤️8860477959 Good Looking Escorts In 24/7 Delh...
Call Girls In Kishangarh Delhi ❤️8860477959 Good Looking Escorts In 24/7 Delh...Call Girls In Kishangarh Delhi ❤️8860477959 Good Looking Escorts In 24/7 Delh...
Call Girls In Kishangarh Delhi ❤️8860477959 Good Looking Escorts In 24/7 Delh...
 
Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...
Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...
Call Girls In Radisson Blu Hotel New Delhi Paschim Vihar ❤️8860477959 Escorts...
 
Non Text Magic Studio Magic Design for Presentations L&P.pptx
Non Text Magic Studio Magic Design for Presentations L&P.pptxNon Text Magic Studio Magic Design for Presentations L&P.pptx
Non Text Magic Studio Magic Design for Presentations L&P.pptx
 
VIP Call Girl Jamshedpur Aashi 8250192130 Independent Escort Service Jamshedpur
VIP Call Girl Jamshedpur Aashi 8250192130 Independent Escort Service JamshedpurVIP Call Girl Jamshedpur Aashi 8250192130 Independent Escort Service Jamshedpur
VIP Call Girl Jamshedpur Aashi 8250192130 Independent Escort Service Jamshedpur
 
FULL ENJOY - 9953040155 Call Girls in Chhatarpur | Delhi
FULL ENJOY - 9953040155 Call Girls in Chhatarpur | DelhiFULL ENJOY - 9953040155 Call Girls in Chhatarpur | Delhi
FULL ENJOY - 9953040155 Call Girls in Chhatarpur | Delhi
 
NewBase 22 April 2024 Energy News issue - 1718 by Khaled Al Awadi (AutoRe...
NewBase  22 April  2024  Energy News issue - 1718 by Khaled Al Awadi  (AutoRe...NewBase  22 April  2024  Energy News issue - 1718 by Khaled Al Awadi  (AutoRe...
NewBase 22 April 2024 Energy News issue - 1718 by Khaled Al Awadi (AutoRe...
 
Vip Dewas Call Girls #9907093804 Contact Number Escorts Service Dewas
Vip Dewas Call Girls #9907093804 Contact Number Escorts Service DewasVip Dewas Call Girls #9907093804 Contact Number Escorts Service Dewas
Vip Dewas Call Girls #9907093804 Contact Number Escorts Service Dewas
 

Regressioin mini case

  • 1. Quick Regression Review Mini-case Studies Vishal Singh NYU-Stern
  • 2. What we should know  Basics of the software  Know the difference between a continuous & discrete (nominal) variable  Know how to summarize a continuous (e.g. mean income) and nominal (e.g. % Female)  Relationship between 2 variables  Both continuous (correlation)  Both Nominal (cross-tab, mosaic plot)  One Continuous & one Nominal (e.g. take Mean of continuous variable by Nominal)  Understand p-value: Only time we are interested in ‘statistical’ test is when doing controlled experiments
  • 3. Why do need models? o Graphs are useful for understanding but don’t scale (when we have too many potential predictors). o We want to automate the analysis o Which Ad to display? o How to provide an insurance quote based on the information provided by a new customer o Conduct ‘what-if’ analysis for planning Black Friday.
  • 4. Example Predicting auto insurance Traditional measures Usage based GPS device Forecasting sales for Sony digital camera at Best Buy Build a demand model based on historical data from 1000 stores
  • 5. Regression: Key Points Regression: widely used research tool  Determine whether the independent variables explain a significant variation in the dependent variable: whether a relationship exists.  Determine how much of the variation in the dependent variable can be explained by the independent variables: strength of the relationship.  Control for other independent variables when evaluating the contributions of a specific variable or set of variables. Marginal effect  Forecast/Predict the values of the dependent variable.  Use regression results as inputs to additional computations: Optimal pricing, promotion, time to launch a product….
  • 6. Exercise 1: Box Office Revenue Prediction (see JMP file Box Office) D3M
  • 7.
  • 8. Box Office Prediction. Suppose you are helping Warner Bros. develop a model for forecasting box office revenues for their new movie The Watchman. In the file “BoxOffice.csv” you are provided the opening week revenues (in millions of $) for various past movies along with several predictor variables:
       Variable              Description of the Variable
       Opening_Week_Revenue  Opening week revenue in millions of $
       # of Theaters         Number of movie theaters in which each movie was initially released
       Overall Rating        Critic rating for each movie (a higher number implies more favorable ratings)
       Genre                 1 for Action, 2 for Comedy, 3 for Kids, and 4 for Other
  • 9. Data
       Movie                                               Opening_Week_Revenue  Num_Theaters  Overall_Rating  Genre
       The Dark Knight                                     158.4                 4366          82              1
       Iron Man                                            98.6                  4105          79              1
       Indiana Jones and the Kingdom of the Crystal Skull  100.1                 4260          65              1
       Hancock                                             62.6                  3965          49              1
       Quantum of Solace                                   67.5                  3451          58              1
       The Incredible Hulk                                 55.4                  3505          61              1
       Wanted                                              50.9                  3175          64              1
       Get Smart                                           38.7                  3911          54              1
       The Mummy: Tomb of the Dragon Emperor               40.5                  3760          31              1
       Journey to the Center of the Earth                  21                    2811          57              1
       Eagle Eye                                           29.2                  3510          43              1
       10,000 B.C.                                         35.9                  3410          34              1
       Valkyrie                                            21                    2711          56              1
       Jumper                                              27.4                  3428          35              1
       Cloverfield                                         40.1                  3411          64              1
       The Day the Earth Stood Still (2008)                30.5                  3560          40              1
       Hellboy II: The Golden Army                         34.5                  3204          78              1
       Spider-Man 3                                        151.1                 4252          59              1
       Transformers                                        70.5                  4011          61              1
       Pirates of the Caribbean: At World's End            114.7                 4362          50              1
  • 10. Objective. Develop a regression model for “Opening week Revenues” with all other variables as predictors. Interpret your parameters. Prediction: the attributes for the movie “Watchman” are Theaters = 3611, Rating = 57, Action = 1. Given this information, what are the predicted first week revenues for the new movie Watchman?
  • 14. Regression: Forecasting Box-office Revenues. You need to convert the “Genre” variable into a series of dummy variables. This is a nominal variable (i.e. categories such as 1 = Action, 2 = Comedy, ...). Adding this variable directly into the regression does not teach us anything, because the numeric codes are arbitrary (our coding could just as well have been 1 = Comedy, 2 = Action, ...). In addition, note that the total number of dummy variables we include/need is 1 less than the number of categories. The left-out category is absorbed in the intercept. It does not matter which category you leave out: all included dummy variables are interpreted relative to the one you leave out. For example, suppose we leave out “Action” and include dummy variables for “Comedy”, “Kids” and “Other”. The output of this regression:
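The dummy coding described on this slide can be sketched in a few lines of Python. This is only an illustration: the helper name `make_dummies` and the sample codes are made up, and JMP performs this step internally when Genre is declared nominal.

```python
# Genre codes as defined on the slide: 1 = Action, 2 = Comedy, 3 = Kids, 4 = Other.
GENRE_NAMES = {1: "Action", 2: "Comedy", 3: "Kids", 4: "Other"}


def make_dummies(genre_codes, base="Action"):
    """Return one dict of 0/1 indicators per movie, omitting the base category.

    Hypothetical helper for illustration; the base category gets all zeros
    and is absorbed in the regression intercept.
    """
    included = [name for name in GENRE_NAMES.values() if name != base]
    rows = []
    for code in genre_codes:
        name = GENRE_NAMES[code]
        rows.append({col: int(name == col) for col in included})
    return rows


# Made-up sample of genre codes.
codes = [1, 2, 3, 4, 2]
dummies = make_dummies(codes, base="Action")
for code, row in zip(codes, dummies):
    print(GENRE_NAMES[code], row)
```

Four categories give three dummy columns; an Action movie is the row where all three are zero.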
  • 15. Regression with Genre Dummy Variables Only. We left out “Action” as the base. Compare the intercept with the average for Action: they are the same number. Just looking at the means, we see that “Kids” movies generate 56.66 - 45.10 = 11.56 ($mn) less than Action. This difference, with a negative sign (-11.56), is the coefficient for ‘Kids’ in the regression.
  • 16. Output from JMP. Note: in the JMP output, go to the red triangle and then select Estimates - Indicator Function Parameterization to get “dummy” variable output. What is the interpretation of Action here?
  • 17. Leave out Comedy this time. We left out “Comedy” this time, so Comedy is now the intercept. See that Action is 24.68 more than Comedy. Compare this to the -24.68 coefficient on Comedy in the previous regression. None of the model fit statistics change. The coefficients get adjusted based on the left-out category (Comedy in this case).
  • 18. Add All Predictors. The regression has OWR (opening week revenue) as the dependent variable and # of Theaters, Ratings and Genre as predictors. # of Theaters: the coefficient gives the change in OWR (in $mn) for each additional theater, holding rating and genre fixed. Overall_Rating: each additional point in overall rating increases OWR by $.278mn. Genre (Kids): compared to “Other”, kids movies generate 17.53 less in OWR after controlling for the effect of # of Theaters and Ratings.
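Why the coefficients flip sign when the base changes can be checked with group means alone, because a regression on genre dummies only reproduces each group's mean. The sketch below uses made-up revenues (not the BoxOffice.csv data) and a hypothetical helper `dummy_coefficients`.

```python
from statistics import mean

# Made-up opening-week revenues ($mn) by genre, for illustration only.
revenue = {
    "Action": [60.0, 50.0, 70.0],
    "Comedy": [30.0, 40.0],
    "Kids": [45.0, 55.0, 50.0],
}


def dummy_coefficients(groups, base):
    """Intercept and dummy coefficients of a dummy-only regression.

    With only genre dummies as predictors, least squares fits each group's
    mean: the intercept is the base-group mean and each coefficient is that
    group's mean minus the base-group mean.
    """
    base_mean = mean(groups[base])
    coefs = {g: mean(v) - base_mean for g, v in groups.items() if g != base}
    return base_mean, coefs


b0_action, coef_action_base = dummy_coefficients(revenue, base="Action")
b0_comedy, coef_comedy_base = dummy_coefficients(revenue, base="Comedy")

print("Base Action:", b0_action, coef_action_base)
print("Base Comedy:", b0_comedy, coef_comedy_base)
```

The Comedy coefficient with Action as base and the Action coefficient with Comedy as base are equal in size and opposite in sign, and the fitted value for every genre is unchanged.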
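Written out, the fitted model is a prediction equation. In the sketch below the symbols $b_0$, $b_{\text{Theaters}}$, $b_{\text{Action}}$ and $b_{\text{Comedy}}$ are placeholders for estimates in the JMP output that the transcript does not show; only $b_{\text{Rating}} = 0.278$ and $b_{\text{Kids}} = -17.53$ are stated on the slide. “Other” is taken as the left-out genre, as implied by the Kids interpretation.

```latex
\widehat{\text{OWR}} = b_0
  + b_{\text{Theaters}} \cdot \text{Theaters}
  + b_{\text{Rating}} \cdot \text{Rating}
  + b_{\text{Action}} D_{\text{Action}}
  + b_{\text{Comedy}} D_{\text{Comedy}}
  + b_{\text{Kids}} D_{\text{Kids}}

% Watchman: Theaters = 3611, Rating = 57, Action = 1 (so D_Comedy = D_Kids = 0)
\widehat{\text{OWR}}_{\text{Watchman}} = b_0 + 3611\, b_{\text{Theaters}} + 57\, b_{\text{Rating}} + b_{\text{Action}}
```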
  • 19. Objective. Develop a regression model for “Opening week Revenues” with all other variables as predictors. Interpret your parameters. Prediction: the attributes for the movie “Watchman” are Theaters = 3611, Rating = 57, Action = 1. Given this information, what are the predicted first week revenues for the new movie Watchman?
  • 20. Exercise 2: Impact of Southwest
  • 21. Context: Southwest & the Wright Amendment. Click on the article or google “Southwest Wright Amendment” to get context.
  • 22. Impact of Southwest Airlines on Price. Suppose you are representing Southwest and want to claim that the presence of SW in a market is good for consumers because it lowers the fares. For analysis, you are provided data on fares from approximately 600 “city-pairs”, with the variables listed on the slide. Objective: analyze the impact of Southwest presence on the average fares.
  • 25. Compare Mean Fare by SW. NOTE: if you square the t-ratio of 6.71 (6.71 × 6.71) you get the F-ratio of 45.03. The rounded value gives 45.02; the small gap comes from rounding the t-ratio.
  • 26. Basic intuition of regression-based models. Conceptually, fares do not depend only on the presence of Southwest; other factors matter (in our example: competition, distance). Analyze the relationship between these variables and fares. In analyzing output with a single predictor, note the correspondence between regression output and ANOVA (t-test): we get the same output from regression as from a t-test or ANOVA. The more important point is to understand the workings of a “dummy” variable in regression.
  • 28. What happens when we treat “# of other airlines” as a nominal vs. a continuous variable?
  • 29. Conceptual & Practical Tip: “Recoding Variable”. Collapse # of other airlines from 6 categories to 4. The grouping is arbitrary, based on the distribution of the data.
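The t-squared-equals-F relationship for two groups can be verified numerically. The sketch below uses made-up fares, not the city-pair data, and assumes the pooled (equal-variance) form of the t-test, which is the version that matches one-way ANOVA exactly.

```python
from math import sqrt
from statistics import mean

# Made-up fares ($) for markets with and without Southwest; illustration only.
with_sw = [110.0, 130.0, 120.0, 140.0]
without_sw = [170.0, 190.0, 160.0, 200.0, 180.0]


def sum_sq_dev(values):
    """Sum of squared deviations from the group's own mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def pooled_t(a, b):
    """Two-sample t-ratio assuming equal variances (pooled)."""
    df = len(a) + len(b) - 2
    pooled_var = (sum_sq_dev(a) + sum_sq_dev(b)) / df
    se = sqrt(pooled_var * (1 / len(a) + 1 / len(b)))
    return (mean(a) - mean(b)) / se


def anova_f(a, b):
    """One-way ANOVA F-ratio for two groups."""
    grand = mean(a + b)
    between = len(a) * (mean(a) - grand) ** 2 + len(b) * (mean(b) - grand) ** 2
    within = sum_sq_dev(a) + sum_sq_dev(b)
    df_between, df_within = 1, len(a) + len(b) - 2
    return (between / df_between) / (within / df_within)


t = pooled_t(with_sw, without_sw)
f = anova_f(with_sw, without_sw)
print(round(t ** 2, 6), round(f, 6))  # the two numbers agree
```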
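A recoding like this is a simple mapping from counts to labels. The transcript does not give the grouping actually used, so the cut-off below (0, 1, 2 and "3+") and the assumption that the six original categories are the counts 0 to 5 are purely hypothetical.

```python
def collapse_airlines(n_other_airlines):
    """Map a count of other airlines to one of four labels.

    Hypothetical cut-off: counts of 3 or more are pooled into "3+".
    """
    if n_other_airlines >= 3:
        return "3+"
    return str(n_other_airlines)


# Assumed original categories: 0 through 5 other airlines.
counts = [0, 1, 2, 3, 4, 5]
labels = [collapse_airlines(n) for n in counts]
print(labels)
```

In practice the cut-off would be chosen after looking at how many markets fall in each original category, so that no collapsed group is too thin.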
  • 30. Framing this as a Regression Problem
  • 31. Regression of Fares on Southwest. Understand how the dummy variable is coded.
  • 32. Understand Output. Rsquare: of the total variation in fares, 41.6% is explained by our model. Distance is the most important predictor and Southwest is the least important.
  • 33. Interpretation of Coefficients. Southwest: after controlling for distance and competition (# of airlines), the absence of Southwest in the market increases fares by approximately $49. Distance: increasing distance by 100 miles increases the fare by $21.5. # of airlines: increasing the number of airlines serving the market by 1 reduces the fare by approximately $41.
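These coefficients can be combined to compare two markets. The sketch below uses the approximate values stated on the slide; the intercept is not given in the transcript, so only differences in predicted fare are computed, and the function name `fare_difference` is made up for illustration.

```python
# Approximate coefficients as stated on the slide (dollars).
SW_ABSENT_EFFECT = 49.0      # fare increase when Southwest is absent
PER_100_MILES = 21.5         # fare increase per extra 100 miles
PER_EXTRA_AIRLINE = -41.0    # fare change per additional airline


def fare_difference(extra_miles, extra_airlines, southwest_leaves):
    """Predicted change in fare between two otherwise identical markets."""
    change = PER_100_MILES * extra_miles / 100
    change += PER_EXTRA_AIRLINE * extra_airlines
    if southwest_leaves:
        change += SW_ABSENT_EFFECT
    return change


# 500 more miles and 2 more airlines: 5 * 21.5 - 2 * 41 = 25.5
print(fare_difference(500, 2, False))
# Same market, but Southwest is absent: 49.0
print(fare_difference(0, 0, True))
```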
  • 34. How does the software compute the parameters? Least Squares Principle: choose the β’s so that the sum of the squared prediction errors, $SSQ = \sum_{m=1}^{M} (\text{Fare}_m - \beta_0 - \beta_1 \text{SW}_m - \beta_2 \text{Dist}_m - \beta_3 \text{Comp}_m)^2$, is as small as possible. OK, but what does that mean? Open the file SSQ_Intuition.xls.
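The least squares principle can be made concrete with a small pure-Python sketch: one function evaluates the sum of squared errors for any candidate β’s, and another finds the minimising β’s by solving the normal equations. The data rows are made up (distance is in hundreds of miles), and solving the normal equations by Gaussian elimination is one standard route, not necessarily what JMP or the spreadsheet does internally.

```python
def ssq(beta, rows):
    """Sum of squared prediction errors for coefficients beta.

    Each row is (fare, southwest, distance, competitors).
    """
    b0, b1, b2, b3 = beta
    return sum((fare - (b0 + b1 * sw + b2 * dist + b3 * comp)) ** 2
               for fare, sw, dist, comp in rows)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def least_squares(rows):
    """Minimise ssq by solving the normal equations (X'X) beta = X'y."""
    xs = [[1.0, sw, dist, comp] for _, sw, dist, comp in rows]
    ys = [fare for fare, *_ in rows]
    k = 4
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in zip(xs, ys)) for i in range(k)]
    return solve(xtx, xty)


# Made-up markets: (fare $, Southwest present 1/0, distance in 100 miles, # airlines).
rows = [
    (120.0, 1, 5.0, 2), (210.0, 0, 8.0, 1), (150.0, 1, 9.0, 3),
    (260.0, 0, 12.0, 0), (180.0, 0, 6.0, 2), (140.0, 1, 7.0, 1),
    (230.0, 0, 11.0, 2),
]

beta = least_squares(rows)
print("beta:", [round(b, 3) for b in beta])
print("SSQ at beta:", round(ssq(beta, rows), 3))
```

Nudging any single coefficient away from the fitted value and re-evaluating `ssq` gives a larger number, which is the same experiment the spreadsheet invites you to do by hand.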
  • 35. Average Fare by # of Airlines Split by Presence of Southwest (Interactions—for later)
  • 36. Conclusion. T-test and ANOVA are both used to compare means across different groups: the t-test for 2 groups and ANOVA for many groups. We can always convert the question to a regression problem using dummy variables. The advantage of regression is that it is straightforward to control for any number of other variables that might impact the outcome. From now on, we will focus on regression analysis.
  • 37. Regression: Key Points. Regression is a widely used research tool. (1) Determine whether the independent variables explain a significant variation in the dependent variable: whether a relationship exists. (2) Determine how much of the variation in the dependent variable can be explained by the independent variables: strength of the relationship. (3) Control for other independent variables when evaluating the contributions of a specific variable or set of variables: marginal effect. (4) Forecast/predict the values of the dependent variable. (5) Use regression results as inputs to additional computations: optimal pricing, promotion, time to launch a product, etc.
