Modeling GPA of NC State students using multiple linear regression
2015
ST 495 FINAL PROJECT
OLUSHEYE ERUJA
NORTH CAROLINA STATE UNIVERSITY
Data Description:
The dataset contains the GPA (grade point average), Study hours (hours spent studying per week),
Working hours (hours spent working per week) and Sleep hours (hours spent sleeping per week) for 30
North Carolina State University students who work while attending college. The variable “GPA” is the
dependent variable, while the variables “Study”, “Working” and “Sleep” are the independent variables,
or predictors.
Objectives:
We wish to determine whether study hours, working hours and sleep hours are useful in predicting the
GPA of students. We also wish to find the 95% confidence interval for the regression coefficient of the
“Study” predictor. We therefore model the GPA of students using multiple linear regression (MLR).
Methods:
We use multiple linear regression to analyze the data. We start with scatter plots with fitted
regression lines to check the linear relationship between the dependent variable and each predictor.
We then use PROC CORR to check the correlation between all the variables (both x and y variables).
Next we check the assumptions of regression: i) linearity between the y variable and the predictors,
ii) constant variance of the residuals, iii) normality of the residuals. We then find the confidence
interval for the slope of the predictor “Study” (I chose “Study” because it is the only x variable that
has a strong positive correlation with GPA). Finally, we use MLR to model the GPA of students using
hours of study, hours of working and hours of sleeping as predictors.
Results:
[Figure: scatter plot of GPA against Study with fitted regression line]
[Figure: scatter plot of GPA against Working with fitted regression line]
[Figure: scatter plot of GPA against Sleep with fitted regression line]
Pearson Correlation Coefficients, N = 30
Prob > |r| under H0: Rho=0

            GPA         Study       Working     Sleep
GPA         1.00000     0.88962    -0.91516    -0.60165
                        <.0001      <.0001      0.0004
Study       0.88962     1.00000    -0.83247    -0.49682
            <.0001                  <.0001      0.0052
Working    -0.91516    -0.83247     1.00000     0.44021
            <.0001      <.0001                  0.0149
Sleep      -0.60165    -0.49682     0.44021     1.00000
            0.0004      0.0052      0.0149
From the scatter plots and correlation coefficients in the output above, there appears to be a
strong positive linear relationship between the dependent variable “GPA” and “Study” (r = 0.88962).
There appears to be a strong negative linear relationship between “GPA” and “Working” (r = -0.91516),
and lastly there appears to be a medium-to-high negative linear relationship between “GPA” and
“Sleep” (r = -0.60165). Among the predictors, “Study” is negatively correlated with both “Working”
(r = -0.83247) and “Sleep” (r = -0.49682), while “Working” and “Sleep” are positively correlated with
each other (r = 0.44021).
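The correlation coefficients discussed above can be rechecked by hand. The following is a minimal Python sketch (not the SAS used in this report) of the sample Pearson correlation, applied to the rows of the DATA table in the appendix. The names `ROWS`, `pearson_r` and `correlation_matrix` are illustrative assumptions, and the p-values that PROC CORR prints are not reproduced here.

```python
import math

# (GPA, Study, Working, Sleep) rows copied from the DATA table in the appendix.
ROWS = [
    (3.8, 50, 10, 46), (2.34, 25, 48, 53), (2.409, 28, 40, 56),
    (2.84, 30, 40, 50), (4, 55, 20, 40), (3.125, 34, 25, 54),
    (3.67, 30, 15, 40), (2.25, 25, 50, 60), (2, 18, 51, 80),
    (3.51, 32, 20, 56), (2, 15, 58, 40), (3.999, 55, 18, 38),
    (2.46, 29, 38, 56), (2.7, 25, 25, 56), (3.6, 35, 30, 46),
    (2.9, 30, 40, 45), (3, 35, 20, 59), (4, 60, 10, 40),
    (3.85, 55, 10, 45), (2.19, 23, 46, 80), (2.049, 20, 50, 70),
    (3.78, 40, 15, 45), (2.001, 24, 40, 80), (2.96, 30, 35, 76),
    (3.25, 32, 20, 56), (3.9, 48, 10, 56), (2.432, 22, 42, 42),
    (3, 30, 30, 43), (2.98, 30, 36, 56), (2.87, 35, 30, 60),
]
NAMES = ["GPA", "Study", "Working", "Sleep"]
COLUMNS = {name: [row[i] for row in ROWS] for i, name in enumerate(NAMES)}


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix():
    """Pairwise correlations between all four variables, keyed by name."""
    return {a: {b: pearson_r(COLUMNS[a], COLUMNS[b]) for b in NAMES} for a in NAMES}


if __name__ == "__main__":
    matrix = correlation_matrix()
    for a in NAMES:
        print(a.ljust(8), " ".join(f"{matrix[a][b]:8.5f}" for b in NAMES))
```

Running the script prints a 4 x 4 matrix that can be compared entry by entry with the PROC CORR output; the sign of each off-diagonal entry is what the paragraph above interprets.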
Analysis of Variance

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               3          12.48269        4.16090      97.70    <.0001
Error              26           1.10734        0.04259
Corrected Total    29          13.59003

Root MSE          0.20637    R-Square    0.9185
Dependent Mean    2.99550    Adj R-Sq    0.9091
Coeff Var         6.88944
Parameter Estimates

Variable     Label        DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Intercept    Intercept     1               3.73197           0.40635       9.18      <.0001
Study        Study         1               0.01935           0.00606       3.19      0.0036
Working      Working       1              -0.02649           0.00486      -5.45      <.0001
Sleep        Sleep         1              -0.01048           0.00352      -2.98      0.0062
[Figure: Fit Diagnostics for GPA. Panels: Residual vs. Predicted Value, RStudent vs. Predicted Value,
RStudent vs. Leverage, Residual vs. Quantile, GPA vs. Predicted Value, Cook's D vs. Observation,
histogram of Residual (Percent), and Fit–Mean and Residual vs. Proportion Less. Inset statistics:
Observations 30, Parameters 4, Error DF 26, MSE 0.0426, R-Square 0.9185, Adj R-Square 0.9091.]
Parameter Estimates

Variable     Label        DF    Parameter Estimate    Standard Error    t Value    Pr > |t|    95% Confidence Limits
Intercept    Intercept     1               3.73197           0.40635       9.18      <.0001     2.89671     4.56724
Study        Study         1               0.01935           0.00606       3.19      0.0036     0.00690     0.03179
Working      Working       1              -0.02649           0.00486      -5.45      <.0001    -0.03647    -0.01651
Sleep        Sleep         1              -0.01048           0.00352      -2.98      0.0062    -0.01771    -0.00325
From the residual plots, i.e. “Residual vs Predicted Value”, “Residual vs Study”, “Residual vs
Working” and “Residual vs Sleep”, we observe that the points are scattered randomly in a
horizontal band about zero, so we conclude that the residuals have constant variance. We also observe
that the points in the “Residual vs Quantile” plot lie close to the reference line, so we conclude that
the residuals are approximately normally distributed. The variables have a linear relationship, the
residuals have constant variance and the residuals are approximately normally distributed; therefore
none of the assumptions for MLR seems to be violated.
Using the output generated by PROC REG, we test whether Study hours, Working hours and Sleeping
hours are useful for predicting the GPA of students (H0: B1 = B2 = B3 = 0 vs. HA: at least one of
B1, B2, B3 is not equal to 0). The F-value is 97.70 and the p-value is “< 0.0001”. Since the p-value is
less than the significance level of 0.05, we have enough evidence to reject the null hypothesis and
conclude that the partial slope of at least one of Study, Working and Sleep is significantly different
from 0. So we conclude that the overall model is statistically significant.
[Figure: Residual by Regressors for GPA. Panels: Residual vs. Study, Residual vs. Working,
Residual vs. Sleep]
The R^2 value indicates that 91.85% of the variation in the GPA of students can be explained by study
hours, working hours and sleeping hours.
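The F-value and R^2 quoted above follow directly from the sums of squares in the Analysis of Variance table. As a rough check, here is a small Python sketch (not part of the original SAS analysis) that recomputes the summary statistics from the reported sums of squares and degrees of freedom. The function name `anova_summary` is an illustrative assumption.

```python
import math

# Values copied from the Analysis of Variance output in this report.
SS_MODEL, DF_MODEL = 12.48269, 3
SS_ERROR, DF_ERROR = 1.10734, 26
SS_TOTAL, DF_TOTAL = 13.59003, 29
DEPENDENT_MEAN = 2.99550


def anova_summary():
    """Recompute the fit statistics from the ANOVA sums of squares."""
    ms_model = SS_MODEL / DF_MODEL          # mean square for the model
    ms_error = SS_ERROR / DF_ERROR          # mean square error (MSE)
    r_square = SS_MODEL / SS_TOTAL          # share of variation explained
    root_mse = math.sqrt(ms_error)
    return {
        "F Value": ms_model / ms_error,
        "R-Square": r_square,
        "Adj R-Sq": 1 - (1 - r_square) * DF_TOTAL / DF_ERROR,
        "Root MSE": root_mse,
        "Coeff Var": 100 * root_mse / DEPENDENT_MEAN,
    }


if __name__ == "__main__":
    for name, value in anova_summary().items():
        print(f"{name:10s} {value:.5f}")
```

Up to rounding in the inputs, the recomputed values agree with the reported F Value (97.70), R-Square (0.9185), Adj R-Sq (0.9091), Root MSE (0.20637) and Coeff Var (6.88944).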
The t-values are used to test the significance of the individual predictors: H0: B1 = 0, H0: B2 = 0,
H0: B3 = 0 (each at the 0.05 significance level). The predictor “Study” has a t-value of 3.19 and a
p-value of 0.0036, which is less than 0.05, so we reject the null hypothesis and conclude that “Study”
is a statistically significant predictor and B1 is not equal to 0. The predictor “Working” has a
t-value of -5.45 and a p-value of “<0.0001”, which is less than 0.05, so we reject the null hypothesis
for this predictor and conclude that it is a statistically significant predictor. The “Sleep” predictor
has a t-value of -2.98 and a p-value of 0.0062, which is also less than 0.05, so we reject the null
hypothesis (H0: B3 = 0) and conclude that it is a statistically significant predictor.
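Each t-value is the parameter estimate divided by its standard error. The Python sketch below (an illustration, not the report's SAS code) recomputes the t-values from the Parameter Estimates table and compares their absolute values with a critical value. The constant `T_CRITICAL` is an assumption taken from a standard t table (two-sided 5% level, 26 error degrees of freedom); SAS reports exact p-values instead.

```python
# (estimate, standard error) pairs from the Parameter Estimates table.
ESTIMATES = {
    "Intercept": (3.73197, 0.40635),
    "Study": (0.01935, 0.00606),
    "Working": (-0.02649, 0.00486),
    "Sleep": (-0.01048, 0.00352),
}

# Assumed two-sided 5% critical value of Student's t with 26 degrees of
# freedom, read from a standard t table.
T_CRITICAL = 2.056


def t_statistics():
    """t = estimate / standard error for every row of the table."""
    return {name: est / se for name, (est, se) in ESTIMATES.items()}


def significant_at_5_percent():
    """True where |t| exceeds the critical value, i.e. H0: B = 0 is rejected."""
    return {name: abs(t) > T_CRITICAL for name, t in t_statistics().items()}


if __name__ == "__main__":
    decisions = significant_at_5_percent()
    for name, t in t_statistics().items():
        print(f"{name:10s} t = {t:6.2f}  reject H0: {decisions[name]}")
```

Comparing |t| with the critical value gives the same accept/reject decision as comparing the p-value with 0.05, which is the rule used in the paragraph above.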
B1 (partial slope parameter for Study) is 0.01935. This can be interpreted as: for every 1 hr increase
in Study hours, the GPA is expected to increase by 0.01935 points, holding Working and Sleep hours fixed.
B2 (partial slope parameter for Working) is -0.02649. This can be interpreted as: for every 1 hr increase
in Working hours, the GPA is expected to decrease by 0.02649 points, holding Study and Sleep hours fixed.
B3 (partial slope parameter for Sleep) is -0.01048. This can be interpreted as: for every 1 hr increase
in sleeping hours, the GPA is expected to decrease by 0.01048 points, holding Study and Working hours fixed.
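To make the interpretation of the partial slopes concrete, the following Python sketch evaluates the fitted equation GPA = 3.73197 + 0.01935*Study - 0.02649*Working - 0.01048*Sleep. The student profile used (30 study hours, 30 working hours, 50 sleep hours per week) is a hypothetical example, not a row from the data, and `predict_gpa` is an illustrative name.

```python
# Parameter estimates from the fitted model in this report.
COEFFICIENTS = {
    "Intercept": 3.73197,
    "Study": 0.01935,
    "Working": -0.02649,
    "Sleep": -0.01048,
}


def predict_gpa(study, working, sleep):
    """Predicted GPA for the given weekly hours, using the fitted equation."""
    return (
        COEFFICIENTS["Intercept"]
        + COEFFICIENTS["Study"] * study
        + COEFFICIENTS["Working"] * working
        + COEFFICIENTS["Sleep"] * sleep
    )


if __name__ == "__main__":
    # Hypothetical student: 30 h study, 30 h work, 50 h sleep per week.
    base = predict_gpa(30, 30, 50)
    one_more_study_hour = predict_gpa(31, 30, 50)
    print(f"predicted GPA:               {base:.5f}")
    print(f"change for +1 h of studying: {one_more_study_hour - base:.5f}")
```

The difference between the two predictions equals B1, because Working and Sleep are held fixed while Study increases by one hour. The equation is only meaningful within the range of hours observed in the data; nothing in it caps the prediction at 4.0.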
The 95% confidence interval for B1 (Study parameter) is (0.00690, 0.03179), which can be interpreted as
“We are 95% confident that the true slope of the parameter is between 0.00690 and 0.03179” or “We are
95% confident that for every 1 hr increase in Study hours, the GPA will increase by between 0.00690 and
0.03179 points on average, given fixed values of Working and sleeping hours.”
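The confidence limits are the estimate plus or minus a t critical value times the standard error. The Python sketch below illustrates this for the Study slope. The critical value 2.0555 (the 97.5th percentile of Student's t with 26 degrees of freedom) is an assumption taken from t tables rather than from the SAS output, and small differences from the reported limits are expected because the inputs are rounded.

```python
# 97.5th percentile of Student's t with 26 error degrees of freedom,
# assumed from a t table (SAS computes this internally for the CLB option).
T_CRITICAL_95 = 2.0555


def confidence_interval(estimate, std_error, t_critical=T_CRITICAL_95):
    """Two-sided confidence interval: estimate -/+ t_critical * std_error."""
    margin = t_critical * std_error
    return estimate - margin, estimate + margin


if __name__ == "__main__":
    # Estimate and standard error of the Study slope from the report.
    lower, upper = confidence_interval(0.01935, 0.00606)
    print(f"95% CI for B1 (Study): ({lower:.5f}, {upper:.5f})")
```

Because both limits are positive, the interval excludes 0, which is consistent with rejecting H0: B1 = 0 at the 0.05 level.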
Summary:
In conclusion, we modeled the GPA of 30 NC State students using MLR with their study hours, working
hours and sleeping hours as predictors. No MLR assumption about the residuals was violated. All the
predictors are statistically significant according to the t-tests conducted. The overall model is
statistically significant according to the F-value and its p-value, which means at least one of B1, B2
and B3 is not equal to 0. The results suggest that it is advisable for students to work and sleep less
and study more to have a high GPA.
Appendix:
/* Import the spreadsheet into the data set "students". */
proc import out=students
    datafile="/folders/myshortcuts/myfolder/sasfinalproject (1).xlsx"
    dbms=xlsx;
    getnames=yes;
run;

/* Examine the linear relationship between GPA and Study hours per week. */
proc sgplot data=students;
    scatter x=Study y=GPA;
    reg x=Study y=GPA;
run;

/* Examine the linear relationship between GPA and Working hours. */
proc sgplot data=students;
    scatter x=Working y=GPA;
    reg x=Working y=GPA;
run;

/* Examine the linear relationship between GPA and Sleep hours. */
proc sgplot data=students;
    scatter x=Sleep y=GPA;
    reg x=Sleep y=GPA;
run;

/* Examine the correlation between all variables. */
proc corr data=students;
    var GPA Study Working Sleep;
run;

/* Fit the multiple regression model. */
proc reg data=students;
    model GPA = Study Working Sleep;
run;
quit;

/* Refit with confidence limits: CLB for the parameter estimates,
   CLM for the mean response, CLI for individual predictions. */
proc reg data=students;
    model GPA = Study Working Sleep / clb clm cli;
run;
quit;
DATA:
GPA Study Working Sleep
3.8 50 10 46
2.34 25 48 53
2.409 28 40 56
2.84 30 40 50
4 55 20 40
3.125 34 25 54
3.67 30 15 40
2.25 25 50 60
2 18 51 80
3.51 32 20 56
2 15 58 40
3.999 55 18 38
2.46 29 38 56
2.7 25 25 56
3.6 35 30 46
2.9 30 40 45
3 35 20 59
4 60 10 40
3.85 55 10 45
2.19 23 46 80
2.049 20 50 70
3.78 40 15 45
2.001 24 40 80
2.96 30 35 76
3.25 32 20 56
3.9 48 10 56
2.432 22 42 42
3 30 30 43
2.98 30 36 56
2.87 35 30 60
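
For readers without SAS, the model can also be refit from the data above. The following Python sketch uses ordinary least squares via the normal equations (X'X)b = X'y, solved with Gaussian elimination. It is an illustration under stated assumptions, not the PROC REG implementation: the helper names `solve`, `fit_ols` and `residuals` are hypothetical, and no standard errors or p-values are computed.

```python
# (GPA, Study, Working, Sleep) rows copied from the DATA table above.
ROWS = [
    (3.8, 50, 10, 46), (2.34, 25, 48, 53), (2.409, 28, 40, 56),
    (2.84, 30, 40, 50), (4, 55, 20, 40), (3.125, 34, 25, 54),
    (3.67, 30, 15, 40), (2.25, 25, 50, 60), (2, 18, 51, 80),
    (3.51, 32, 20, 56), (2, 15, 58, 40), (3.999, 55, 18, 38),
    (2.46, 29, 38, 56), (2.7, 25, 25, 56), (3.6, 35, 30, 46),
    (2.9, 30, 40, 45), (3, 35, 20, 59), (4, 60, 10, 40),
    (3.85, 55, 10, 45), (2.19, 23, 46, 80), (2.049, 20, 50, 70),
    (3.78, 40, 15, 45), (2.001, 24, 40, 80), (2.96, 30, 35, 76),
    (3.25, 32, 20, 56), (3.9, 48, 10, 56), (2.432, 22, 42, 42),
    (3, 30, 30, 43), (2.98, 30, 36, 56), (2.87, 35, 30, 60),
]


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        known = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x


def fit_ols(rows):
    """Return [intercept, b_study, b_working, b_sleep] from the normal equations."""
    design = [[1.0, s, w, z] for _, s, w, z in rows]
    y = [g for g, *_ in rows]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


def residuals(rows, beta):
    """Observed GPA minus fitted GPA for every row."""
    return [g - (beta[0] + beta[1] * s + beta[2] * w + beta[3] * z)
            for g, s, w, z in rows]


if __name__ == "__main__":
    beta = fit_ols(ROWS)
    for name, value in zip(["Intercept", "Study", "Working", "Sleep"], beta):
        print(f"{name:10s} {value: .5f}")
```

If the data table was transcribed faithfully, the printed coefficients should be close to the parameter estimates reported in the Results section. The residuals returned by `residuals` can be plotted against the fitted values to repeat the constant-variance check.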
