Statistics Report
John Worth
In an effort to determine the best possible method to predict the total gross revenue of
major Hollywood movies in theaters in the United States, a study was conducted based on
observations of 52 Hollywood movies released in 2012. The independent variables
chosen to explain total gross were opening weekend performance, the season in which the film was
released, the estimated budget of the film, and the number of screens the film was initially shown
on in the USA. Based on the data observed, I can say with 95% confidence that the mean total
gross of a major Hollywood movie released in the USA is between $97 million and $164 million.
That said, there is reason to believe that the performance of films is explained by at least one
of the independent variables analyzed.
Prior to any statistical analysis, predictions were made about the relationship between each
independent variable and the dependent variable, total USA gross. Opening weekend
performance, estimated budget, and number of screens shown in the USA were each expected to
have a positive relationship with total gross (as each increases, so does total gross). Of the
seasons, summer films were expected to generate the highest total gross. A regression
model of the entire data set can be seen in the attached appendix. It is believed a more accurate
model can be formed.
A correlation check indicated that multicollinearity was not present in the model, so an F test
was conducted for overall significance. Overall significance was found to be
present. Next, t tests were conducted for the individual significance of each independent
variable. From these tests, season released was found to be insignificant and was
removed. A partial F test was conducted to confirm that the reduced model was preferable,
which it was. When the observations for “Taken 2” (see data) were entered into this
model, the predicted total gross was $121.12M. It is believed that a more accurate model could still be formed.
VIF tests were then conducted, and the VIF for opening weekend was found to be dangerously
high (11.11), so it was subsequently removed. The new model includes only estimated budget and
screens shown in the USA as independent variables. The next test determined whether
there was overall significance in this model, and the results showed that there was. Tests for
individual significance of the two remaining independent variables were conducted
afterwards. Estimated budget was still found to have individual significance. Screens shown in
the USA, however, no longer held individual significance and was consequently removed as
well. The model now contains only estimated budget as the independent variable explaining
total USA gross. Now that a bivariate model has been formed, it must be checked for any
violations of the assumptions of linear regression.
Scatter plots, residual plots, and Durbin-Watson tests were used to check these assumptions.
Visual inspection indicated a possible violation of the assumption that the relationship between
estimated budget and total gross is linear. The appropriate polynomial prescription was applied
to determine whether or not this would improve the model: a new variable, budget squared, was
added and a regression was run. This new model increased the percent of variability in the
dependent variable explained by the equation and reduced the standard error. A test of overall
significance showed that the model was significant overall. However, in the tests of
individual significance, neither variable was significant, which resulted in the removal of the
polynomial variable. When the “Taken 2” budget of $45M is entered into the bivariate model, the
predicted total gross is 34.89+1.13(45)=$85.72M (computed with unrounded coefficients). Based on
the findings of the tests conducted, this is believed to be the
best model to predict total USA gross for major Hollywood movies.
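The “Taken 2” point estimate can be reproduced with a minimal Python sketch. The function and variable names are illustrative assumptions; the coefficients are the unrounded values from the Estimated Budget regression output in the appendix, and the budget and actual gross come from the data set.

```python
# Bivariate model: Total Gross = intercept + slope * Estimated Budget (all in millions).
INTERCEPT = 34.89425938      # from the Estimated Budget regression output
BUDGET_SLOPE = 1.129544142   # from the Estimated Budget regression output


def predict_gross(budget_millions: float) -> float:
    """Return the predicted total USA gross (in millions) for a given budget."""
    return INTERCEPT + BUDGET_SLOPE * budget_millions


taken2_budget = 45.0    # "Taken 2" estimated budget, from the data set
taken2_actual = 139.0   # "Taken 2" total USA gross, from the data set

taken2_predicted = predict_gross(taken2_budget)
taken2_residual = taken2_actual - taken2_predicted  # actual minus predicted

print(f"predicted: {taken2_predicted:.2f}M, residual: {taken2_residual:.2f}M")
```

The residual shows how far the single-variable prediction falls from the observed gross for this one film.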
Appendices
                                Opening Weekend   Estimated Budget   Screens Shown
                                (in millions)     (in millions)      in USA
Opening Weekend (in millions)   1
Estimated Budget (in millions)  0.60686209        1
Screens Shown in USA            0.604160598       0.597862186        1
Total USA Gross (in millions)
Mean 130.9815
Standard Error 16.64957
Median 98
Mode 126
Standard Deviation 120.0618
Sample Variance 14414.83
Kurtosis 5.217573
Skewness 2.043474
Range 622.96
Minimum 0.04
Maximum 623
Sum 6811.04
Count 52
Confidence Level(95.0%) 33.42542
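The 95% confidence interval for mean total gross follows directly from these descriptive statistics. The sketch below is a minimal illustration using only the values reported above; the variable names are assumptions, and the critical t value is backed out from the reported half-width rather than looked up.

```python
import math

n = 52                 # Count
mean = 130.9815        # Mean total USA gross (in millions)
std_dev = 120.0618     # Standard Deviation
margin = 33.42542      # "Confidence Level(95.0%)" half-width reported above

# Standard error of the mean: s / sqrt(n).
std_error = std_dev / math.sqrt(n)

# Critical t value implied by the reported half-width (margin = t * SE).
implied_t = margin / std_error

# Interval for the MEAN gross, not for an individual film.
lower, upper = mean - margin, mean + margin

print(f"SE = {std_error:.4f}, implied t = {implied_t:.4f}")
print(f"95% CI for mean gross: {lower:.2f}M to {upper:.2f}M")
```

Because the interval is built from the standard error of the mean, it describes the average gross across films rather than the range a single release is likely to fall in.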
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.95803782
R Square 0.917836465
Adjusted R Square 0.906881327
Standard Error 36.63727831
Observations 52
ANOVA
df SS MS F Significance F
Regression 6 674753.4466 112458.9078 83.78136928 9.29657E-23
Residual 45 60403.0573 1342.290162
Total 51 735156.5039
Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 90.0% Upper 90.0%
Intercept 109.995316 24.08266359 4.567406575 3.82446E-05 61.49034168 158.5002904 69.55023111 150.440401
Opening Weekend (in millions) 2.729583269 0.166160103 16.42742881 1.2431E-20 2.394919641 3.064246896 2.450529439 3.008637098
Fall -23.82059761 16.58196578 -1.436536411 0.157766127 -57.21839107 9.577195857 -51.66880516 4.027609941
Summer -28.54918988 16.92093063 -1.687211567 0.09848373 -62.6296936 5.531313848 -56.96666428 -0.131715467
Spring -40.67183284 17.60329404 -2.310467162 0.025502793 -76.12668703 -5.21697865 -70.23528706 -11.10837862
Estimated Budget (in millions) 0.352391577 0.102653087 3.432839562 0.001292254 0.145637647 0.559145506 0.179993171 0.524789982
Screens Shown in USA -0.028689901 0.007449963 -3.851012523 0.00036959 -0.043694897 -0.013684905 -0.041201574 -0.016178229
Full Regression F Test
H0: All β=0 α=0.05 F Critical=2.34
Ha: At least one β≠0 I reject H0 if F>2.34
F=83.78 83.78>2.34 I REJECT H0, THERE IS OVERALL SIGNIFICANCE.
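The F statistic, R Square, and Adjusted R Square in the full-model output can all be recomputed from the ANOVA sums of squares. The following is a minimal sketch with assumed variable names; the critical value 2.34 is the table value used in this report.

```python
n, k = 52, 6                      # observations, independent variables in the full model
ss_regression = 674753.4466       # SS Regression from the ANOVA table
ss_residual = 60403.0573          # SS Residual from the ANOVA table
ss_total = ss_regression + ss_residual

# Mean squares: each sum of squares divided by its degrees of freedom.
ms_regression = ss_regression / k
ms_residual = ss_residual / (n - k - 1)

f_stat = ms_regression / ms_residual
r_squared = ss_regression / ss_total
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

F_CRITICAL = 2.34                 # table value used in the report
reject_h0 = f_stat > F_CRITICAL   # True means overall significance

print(f"F = {f_stat:.2f}, R^2 = {r_squared:.4f}, adj R^2 = {adj_r_squared:.4f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```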
Individual t Tests
Opening Weekend:
H0: β=0 α/2=0.025 t Critical=2.021
Ha: β≠0 I reject H0 if t>2.021 or t<-2.021
t=16.43 16.43>2.021 I REJECT H0, THERE IS INDIVIDUAL SIGNIFICANCE.
Estimated Budget:
H0: β=0 α/2=0.025 t Critical=2.021
Ha: β≠0 I reject H0 if t>2.021 or t<-2.021
t=3.43 3.43>2.021 I REJECT H0, THERE IS INDIVIDUAL SIGNIFICANCE.
Screens Shown in USA:
H0: β=0 α/2=0.025 t Critical=2.021
Ha: β≠0 I reject H0 if t>2.021 or t<-2.021
t=-3.85 -3.85<-2.021 I REJECT H0, THERE IS INDIVIDUAL SIGNIFICANCE.
Seasons:
Fall:
H0: β=0 α/2=0.025 t Critical=2.021
Ha: β≠0 I reject H0 if t>2.021 or t<-2.021
t=-1.44 -2.021<-1.44<2.021 I DO NOT REJECT H0, THERE IS NO INDIVIDUAL SIGNIFICANCE.
QUALITATIVE VARIABLE AS A WHOLE IS INSIGNIFICANT
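Each t statistic is the coefficient divided by its standard error, so the individual tests can be sketched for every row of the full-model output at once. The dictionary layout and names below are assumptions; the numbers are copied from the coefficient table, and 2.021 is the critical value used in this report. Whether the three season dummies matter jointly is a separate question, which the partial F test addresses.

```python
T_CRITICAL = 2.021  # two-tailed critical value used in the report

# (coefficient, standard error) pairs from the full-model output.
full_model = {
    "Opening Weekend": (2.729583269, 0.166160103),
    "Fall": (-23.82059761, 16.58196578),
    "Summer": (-28.54918988, 16.92093063),
    "Spring": (-40.67183284, 17.60329404),
    "Estimated Budget": (0.352391577, 0.102653087),
    "Screens Shown in USA": (-0.028689901, 0.007449963),
}

# t = coefficient / standard error for each variable.
t_stats = {name: coef / se for name, (coef, se) in full_model.items()}

# Two-tailed rule: reject H0 when |t| exceeds the critical value.
significant = {name: abs(t) > T_CRITICAL for name, t in t_stats.items()}

for name, t in t_stats.items():
    verdict = "reject H0" if significant[name] else "do not reject H0"
    print(f"{name}: t = {t:.2f} -> {verdict}")
```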
Correlation Test
Multicollinearity does not exist.
Reduced Regression Model
Partial F Test
H0: β(Fall)=β(Summer)=β(Spring)=0 α=0.05 F Critical=2.84
Ha: At least one of these β≠0 I reject H0 if F>2.84
SSEr=67751.94 SSEf=60403.06 K=6 L=3 MSEf=1342.29
[(67751.94-60403.06)/(6-3)]/1342.29=1.82
1.82<2.84 I DO NOT REJECT H0. SUGGEST REDUCED MODEL.
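The partial F statistic compares the residual sum of squares of the reduced model (seasons dropped) with that of the full model. Below is a minimal sketch of that calculation; the function name and argument names are assumptions, and the inputs are the Residual SS and MS values from the two regression outputs.

```python
def partial_f(sse_reduced: float, sse_full: float,
              k_full: int, k_reduced: int, mse_full: float) -> float:
    """Partial F: extra error per dropped variable, relative to the full-model MSE."""
    return ((sse_reduced - sse_full) / (k_full - k_reduced)) / mse_full


F_CRITICAL = 2.84  # table value used in the report

f_partial = partial_f(
    sse_reduced=67751.93939,  # Residual SS, model without the season dummies
    sse_full=60403.0573,      # Residual SS, full model
    k_full=6,
    k_reduced=3,
    mse_full=1342.290162,     # Residual MS, full model
)
keep_reduced_model = f_partial < F_CRITICAL

print(f"partial F = {f_partial:.2f}, keep reduced model: {keep_reduced_model}")
```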
Total Gross=83.84+2.69(Opening Weekend)+0.34(Estimated Budget)-0.03(Screens)
Point Estimate (Based on final observation)
=83.84+2.69(49)+0.34(45)-0.03(3661)
=121.12
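The point estimate above uses coefficients rounded to two decimals. As a hedged illustration (names are assumptions), the sketch below evaluates the same equation with both the rounded coefficients and the unrounded ones from the reduced-model output; because the screens value is in the thousands, rounding its coefficient shifts the estimate noticeably.

```python
# "Taken 2" inputs from the data set.
opening_weekend, budget, screens = 49.0, 45.0, 3661.0


def estimate(intercept: float, b_open: float, b_budget: float, b_screens: float) -> float:
    """Total gross (in millions) from the three-variable reduced model."""
    return intercept + b_open * opening_weekend + b_budget * budget + b_screens * screens


# Coefficients rounded to two decimals, as written in the equation above.
rounded_estimate = estimate(83.84, 2.69, 0.34, -0.03)

# Unrounded coefficients from the reduced-model regression output.
unrounded_estimate = estimate(83.84209661, 2.690003668, 0.343073737, -0.02784698)

print(f"rounded coefficients:   {rounded_estimate:.2f}M")
print(f"unrounded coefficients: {unrounded_estimate:.2f}M")
```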
                                Total USA Gross   Opening Weekend   Estimated Budget   Screens Shown
                                (in millions)     (in millions)     (in millions)      in USA
Total USA Gross (in millions)   1
Opening Weekend (in millions)   0.933995579       1
Estimated Budget (in millions)  0.637269456       0.60686209        1
Screens Shown in USA            0.466753369       0.604160598       0.597862186        1
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.952806441
R Square 0.907840114
Adjusted R Square 0.902080121
Standard Error 37.56991798
Observations 52
ANOVA
df SS MS F Significance F
Regression 3 667404.5645 222468.1882 157.6113264 7.55342E-25
Residual 48 67751.93939 1411.498737
Total 51 735156.5039
Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
Intercept 83.84209661 19.02099695 4.407870777 5.852E-05 45.59781902 122.0863742 45.59781902 122.0863742
Opening Weekend (in millions) 2.690003668 0.168337033 15.97986863 7.54681E-21 2.351539379 3.028467957 2.351539379 3.028467957
Estimated Budget (in millions) 0.343073737 0.104961234 3.268575672 0.002001929 0.132035031 0.554112443 0.132035031 0.554112443
Screens Shown in USA -0.02784698 0.007340335 -3.793693302 0.000415794 -0.042605713 -0.013088247 -0.042605713 -0.013088247
Bivariate Regression Models
Opening Weekend:
Estimated Budget:
Screens Shown in USA:
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.637269456
R Square 0.40611236
Adjusted R Square 0.394234607
Standard Error 93.44520973
Observations 52
ANOVA
df SS MS F Significance F
Regression 1 298556.1428 298556.1428 34.19100961 3.77646E-07
Residual 50 436600.361 8732.007221
Total 51 735156.5039
Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
Intercept 34.89425938 20.9274484 1.667391968 0.101688883 -7.139757794 76.92827655 -7.139757794 76.92827655
Estimated Budget (in millions) 1.129544142 0.193173365 5.847307895 3.77646E-07 0.74154402 1.517544264 0.74154402 1.517544264
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.933995579
R Square 0.872347742
Adjusted R Square 0.869794697
Standard Error 43.32306257
Observations 52
ANOVA
df SS MS F Significance F
Regression 1 641312.1164 641312.1164 341.6891161 5.36337E-24
Residual 50 93844.3875 1876.88775
Total 51 735156.5039
Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
Intercept 28.63753479 8.169972732 3.505217915 0.000972696 12.22766161 45.04740797 12.22766161 45.04740797
Opening Weekend (in millions) 2.639380358 0.142786257 18.48483476 5.36337E-24 2.352585721 2.926174995 2.352585721 2.926174995
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.466753369
R Square 0.217858707
Adjusted R Square 0.202215881
Standard Error 107.237704
Observations 52
ANOVA
df SS MS F Significance F
Regression 1 160160.2455 160160.2455 13.9270685 0.000486616
Residual 50 574996.2584 11499.92517
Total 51 735156.5039
Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
Intercept -48.89307659 50.44122658 -0.969307844 0.337057283 -150.2072619 52.42110869 -150.2072619 52.42110869
Screens Shown in USA 0.058006437 0.015543411 3.731898779 0.000486616 0.026786577 0.089226297 0.026786577 0.089226297
VIFs For Reduced Model
Opening Weekend regressed on Total USA Gross, Budget, Screens: VIF=11.11
Budget regressed on Opening Weekend, Total USA Gross, Screens: VIF=2.22
Screens regressed on Budget, Opening Weekend, Total USA Gross: VIF=2.33
Total USA Gross regressed on Budget, Opening Weekend, Screens: VIF=11.11
REMOVE VARIABLE “OPENING WEEKEND”. VIF TOO HIGH.
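For comparison, the conventional definition of VIF regresses each independent variable on the other independent variables only, which is equivalent to reading the diagonal of the inverse of their correlation matrix. The sketch below applies that definition to the correlation matrix of the three predictors given in the appendix. The helper function and names are assumptions, and since the dependent variable is excluded here, the results need not match the figures listed above.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


names = ["Opening Weekend", "Estimated Budget", "Screens Shown in USA"]

# Correlations among the independent variables, from the appendix.
corr = [
    [1.0, 0.60686209, 0.604160598],
    [0.60686209, 1.0, 0.597862186],
    [0.604160598, 0.597862186, 1.0],
]

# VIF_j = 1 / (1 - R_j^2) equals the j-th diagonal entry of the inverse correlation matrix.
inverse = invert(corr)
vifs = {name: inverse[i][i] for i, name in enumerate(names)}

for name, vif in vifs.items():
    print(f"{name}: VIF = {vif:.2f}")
```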
New Reduced Regression Model
Individual t Tests
Budget:
H0: β=0 α/2=0.025 t Critical=2.021
Ha: β≠0 I reject H0 if t>2.021 or t<-2.021
t=4.10 4.10>2.021 I REJECT H0. THERE IS INDIVIDUAL SIGNIFICANCE.
Screens:
H0: β=0 α/2=0.025 t Critical=2.021
Ha: β≠0 I reject H0 if t>2.021 or t<-2.021
t=0.98 -2.021<0.98<2.021 I DO NOT REJECT H0. VARIABLE IS NOT SIGNIFICANT. REMOVE.
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.646186363
R Square 0.417556816
Adjusted R Square 0.393783624
Standard Error 93.47998751
Observations 52
ANOVA
df SS MS F Significance F
Regression 2 306969.6087 153484.8044 17.56418867 1.77301E-06
Residual 49 428186.8952 8738.508065
Total 51 735156.5039
Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
Intercept -4.505973741 45.28395337 -0.099504867 0.921143411 -95.50748506 86.49553758 -95.50748506 86.49553758
Estimated Budget (in millions) 0.988120623 0.241074754 4.098814197 0.00015567 0.503662767 1.472578478 0.503662767 1.472578478
Screens Shown in USA 0.016585523 0.016902866 0.981225493 0.331301683 -0.017382058 0.050553105 -0.017382058 0.050553105
New Reduced Model
Assumptions Check
[Figure: Residual Plot for Estimated Budget (in millions). Horizontal axis: Estimated Budget (in millions), 0 to 300. Vertical axis: Residuals, -500 to 500.]
Original Data Set
Movie Title Total USA Gross (in millions) Opening Weekend (in millions) Fall Summer Spring Estimated Budget (in millions) Screens Shown in USA
The Avengers 623 207 0 0 1 220 4349
Skyfall 304 88 1 0 0 200 3505
The Dark Knight Rises 448 160 0 1 0 250 4404
The Hobbit 303 84 0 0 0 180 4045
Ice Age 161 46 0 1 0 95 3881
Twilight 292 141 1 0 0 120 4070
Amazing Spider Man 262 62 0 1 0 230 4318
Madagascar 3 216 60 0 1 0 145 4258
Men in Black 3 179 54 0 0 1 225 4248
The Hunger Games 408 152 0 0 1 78 4137
This is 40 67 11 0 0 0 35 2913
Argo 136 19 1 0 0 44.5 3232
Ted 218 54 0 1 0 50 3239
21 Jump Street 138 36 0 0 1 42 3121
Prometheus 126 51 0 1 0 51 3396
Dictator 59 17 0 0 1 65 3008
Safe House 126 40 0 0 0 85 3119
The Bourne Legacy 113 38 0 1 0 125 3745
Django 162 30 0 0 0 100 3010
Rise of the Guardians 103 23 1 0 0 145 3653
Paranormal Activity 4 53 29 1 0 0 5 3412
Looper 66 20 1 0 0 30 2992
Dark Shadows 79 29 0 0 1 150 3755
Snow White and the Huntsman 155 56 0 1 0 170 3773
Dredd 13 6 1 0 0 35 2506
Step up Revolution 35 11 0 1 0 33 2567
Silver Linings Playbook 132 0.4 1 0 0 21 16
Wreck-It Ralph 189 49 1 0 0 165 3752
Cloud Atlas 27 9 1 0 0 102 2008
Les Miserables 148 28 0 0 0 61 2808
Cabin in the Woods 42 14 0 0 1 30 2811
Magic Mike 113 39 0 1 0 7 2930
Lincoln 182 0.9 1 0 0 65 11
Jack Reacher 80 15 0 0 0 60 3352
Flight 93 4 1 0 0 31 1884
Savages 47 16 0 1 0 45 2628
End of Watch 41 13 1 0 0 7 2730
Hotel Transylvania 148 42 1 0 0 85 3349
Expendables 2 85 28 0 1 0 92 3316
LOL 0.04 0.04 0 0 1 11 105
American Reunion 56 21 0 0 1 50 3192
Total Recall 58 25 0 1 0 125 3601
Abraham Lincoln Vampire Slayer 37 16 0 1 0 69 3108
Red Dawn 44 14 1 0 0 65 2725
Project X 54 21 0 0 1 12 3055
Battleship 65 25 0 0 1 209 3690
Chronicle 64 22 0 0 0 12 2907
Here Comes the Boom 45 11 1 0 0 42 3014
The Watch 34 12 0 1 0 68 3168
The Chernobyl Diaries 18 7 0 0 1 1 2433
Alex Cross 25 11 1 0 0 35 2339
Taken 2 139 49 1 0 0 45 3661
