Focus: predicting the academic performance of elementary schools
using the following attributes from the California Department of
Education's API 2000 dataset:
Class size
Enrollment
Poverty
Parent education
Student performance
Teacher credentials
The project aims to construct a mathematical model, using
multiple regression, that estimates academic performance
from a set of predictor variables.
Analysis software used: SAS (Statistical Analysis System)
Variables Used for Analysis
We have 3 independent variables and 1 dependent variable.
We screen variables based on:
Multicollinearity
Heteroscedasticity
Normality test
Dependent Variable
api00
Independent Variables
not_hsg
grad_sch
full
with R-Square 0.7063
Equation for Multiple Regression
Y = 394.23 - 2.82 x1 + 4.21 x2 + 3.27 x3
where
x1 = not_hsg, i.e. parent not high school graduate
x2 = grad_sch, i.e. parent grad school
x3 = full, i.e. pct full credential
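As a minimal sketch of how the fitted equation turns inputs into a prediction, the Python below evaluates it with the unrounded coefficients from the SAS parameter estimates. The function name `predict_api00` and the dictionary `COEF` are illustrative names, not part of the original analysis.

```python
# Coefficients taken from the SAS "Parameter Estimates" output in this deck.
# COEF and predict_api00 are hypothetical names used only for illustration.
COEF = {
    "intercept": 394.23899,
    "not_hsg": -2.82726,   # parent not high school graduate
    "grad_sch": 4.21761,   # parent grad school
    "full": 3.27664,       # pct full credential
}


def predict_api00(not_hsg: float, grad_sch: float, full: float) -> float:
    """Return the predicted api00 score for one school."""
    return (
        COEF["intercept"]
        + COEF["not_hsg"] * not_hsg
        + COEF["grad_sch"] * grad_sch
        + COEF["full"] * full
    )


if __name__ == "__main__":
    # Input values chosen only to exercise the function.
    print(round(predict_api00(not_hsg=35, grad_sch=15, full=100), 2))
```

Using the unrounded coefficients avoids the small drift that the two-decimal equation on the slide would introduce.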
Analysis of Variance
Source            DF    Sum of Squares    Mean Square    F Value    Pr > F
Model              3           5702793        1900931     317.51    <.0001
Error            396           2370879     5987.06896
Corrected Total  399           8073672

Root MSE          77.37615    R-Square    0.7063
Dependent Mean    647.6225    Adj R-Sq    0.7041
Coeff Var         11.94772

Parameter Estimates
Variable    Label                  DF    Parameter Estimate    Standard Error    t Value    Pr > |t|    Variance Inflation
Intercept   Intercept               1             394.23899          25.23765      15.62      <.0001                     0
not_hsg     parent not hsg          1              -2.82726           0.21549     -13.12      <.0001               1.32287
grad_sch    parent grad school      1               4.21761           0.35989      11.72      <.0001               1.27024
full        pct full credential     1               3.27664           0.27704      11.83      <.0001               1.14321
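The summary statistics in the SAS output follow directly from the sums of squares and degrees of freedom. The sketch below recomputes them in Python as a cross-check; the function name `fit_summary` is an illustrative choice, and the inputs are the rounded figures printed in the table, so the last digits may differ slightly from SAS.

```python
import math

# Figures copied from the Analysis of Variance table (rounded as printed).
SS_MODEL = 5702793.0
SS_ERROR = 2370879.0
SS_TOTAL = 8073672.0
DF_MODEL = 3
DF_ERROR = 396
DEPENDENT_MEAN = 647.6225


def fit_summary() -> dict:
    """Recompute R-square, adjusted R-square, Root MSE and Coeff Var."""
    df_total = DF_MODEL + DF_ERROR
    mse = SS_ERROR / DF_ERROR
    r_square = SS_MODEL / SS_TOTAL
    adj_r_square = 1.0 - mse / (SS_TOTAL / df_total)
    root_mse = math.sqrt(mse)
    coeff_var = 100.0 * root_mse / DEPENDENT_MEAN
    return {
        "r_square": r_square,
        "adj_r_square": adj_r_square,
        "root_mse": root_mse,
        "coeff_var": coeff_var,
    }


if __name__ == "__main__":
    for name, value in fit_summary().items():
        print(f"{name}: {value:.4f}")
```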
[Figure: predicted vs. residual plot]
Main Points from SAS Output:
The F value is 317.51 with a p-value < 0.0001,
so the regression model is significant.
The p-values for the t-statistics of the selected
variables are all < 0.0001, so all the variables
are significant in the model.
The R-square is 0.7063, which means 70.63%
of the total variability in api00 is explained by
not_hsg (parent not high school graduate),
grad_sch (parent grad school) and full (pct full credential).
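The per-variable significance claims can be cross-checked from the parameter table, since each t value is the estimate divided by its standard error. The sketch below does that in Python and also screens the variance inflation factors. The names `PARAMS`, `t_values` and `max_vif` are illustrative, and the cutoff of 10 for the VIF is a common rule of thumb assumed here, not a threshold stated in the original analysis.

```python
# (estimate, standard error, variance inflation) copied from the SAS table.
PARAMS = {
    "Intercept": (394.23899, 25.23765, 0.0),
    "not_hsg": (-2.82726, 0.21549, 1.32287),
    "grad_sch": (4.21761, 0.35989, 1.27024),
    "full": (3.27664, 0.27704, 1.14321),
}

# Assumed rule-of-thumb cutoff for multicollinearity; not from the slides.
VIF_CUTOFF = 10.0


def t_values() -> dict:
    """t statistic for each term: estimate / standard error."""
    return {name: est / se for name, (est, se, _vif) in PARAMS.items()}


def max_vif() -> float:
    """Largest variance inflation factor among the predictors."""
    return max(vif for name, (_e, _s, vif) in PARAMS.items() if name != "Intercept")


if __name__ == "__main__":
    for name, t in t_values().items():
        print(f"{name:10s} t = {t:7.2f}")
    print("multicollinearity concern:", max_vif() >= VIF_CUTOFF)
```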
Explanation of F-test:
The general equation for predicted y is
Y = b0 + b1*x1 + b2*x2 + b3*x3
Removing an independent variable from the model is the same as
restricting its coefficient to zero. The F-test restricts all
slope coefficients at once:
H0: b1 = b2 = b3 = 0
H1: at least one bi is not equal to 0
We call this a test of overall model significance. If we
fail to reject H0, we have no evidence that the model explains
anything. If we reject H0, the model has explained something.
Here P(F > 317.51) < 0.0001, so we reject H0: the model
has explained something.
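For reference, the F value can be reproduced from the Analysis of Variance table. The worked equation below uses the standard overall F statistic, with k = 3 predictors and n = 400 observations (inferred from the corrected total of 399 degrees of freedom); the symbols SSR and SSE for the model and error sums of squares are notation introduced here.

```latex
F = \frac{\mathrm{SSR}/k}{\mathrm{SSE}/(n-k-1)}
  = \frac{5702793/3}{2370879/396}
  = \frac{1900931}{5987.07}
  \approx 317.51
```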
Explanation of Beta Coefficients:
Y = 394.23 - 2.82 x1 + 4.21 x2 + 3.27 x3
implies
i) academic performance and not_hsg are inversely
related, because of the coefficient -2.82, keeping the other
variables fixed.
ii) academic performance increases as grad_sch
increases, because of the positive coefficient, keeping
the other variables fixed.
iii) similarly, the equation indicates a direct
relationship between academic performance and full.
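To make the "keeping other variables fixed" reading concrete, the sketch below computes how much the predicted score moves when only the listed variables change. The helper `predicted_change` and the 10-point shifts are illustrative choices, not values from the original analysis.

```python
# Slope coefficients from the SAS parameter estimates.
SLOPES = {"not_hsg": -2.82726, "grad_sch": 4.21761, "full": 3.27664}


def predicted_change(**deltas: float) -> float:
    """Change in predicted api00 when the named variables shift by the
    given amounts and every unnamed variable stays fixed."""
    return sum(SLOPES[name] * delta for name, delta in deltas.items())


if __name__ == "__main__":
    # Hypothetical 10-point shifts, one variable at a time.
    for name in SLOPES:
        print(name, round(predicted_change(**{name: 10}), 2))
```

Because the model is linear, the intercept cancels out of any such difference, which is why only the slopes are needed.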
[Figures: predicted y (performance) plotted against each predictor, in three panels titled "performance vs not_hsg", "performance vs grad_sch" and "performance vs full"]
Effect of each independent variable selected by the
regression model:
Y = 394.23 - 2.82 x1 + 4.21 x2 + 3.27 x3
Consider 11 different schools, each with a set of 50 students whose
parents have different educational backgrounds. Each school needs a
different percentage of fully credentialed teachers (full) to achieve
the same predicted academic performance score. Once full reaches 100
it cannot rise further, so the predicted score changes instead.

school no.   not_hsg   grad_sch   full     y
 1            0        50         28.95    700
 2            5        45         39.70    700
 3           10        40         50.45    700
 4           15        35         61.20    700
 5           20        30         71.95    700
 6           25        25         82.70    700
 7           30        20         93.45    700
 8           35        15        100       686.21
 9           33        17        100       700.30
10           45         5        100       615.76
11           50         0        100       580.54
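The table can be approximately reproduced by solving the regression equation for full at a target score and capping the answer at 100. The sketch below is one way to do that in Python; `required_full` and `predict` are illustrative names, and small differences in the second decimal against the table are expected from rounding.

```python
# Unrounded coefficients from the SAS parameter estimates.
B0, B_NOT_HSG, B_GRAD_SCH, B_FULL = 394.23899, -2.82726, 4.21761, 3.27664


def predict(not_hsg: float, grad_sch: float, full: float) -> float:
    """Predicted api00 from the fitted equation."""
    return B0 + B_NOT_HSG * not_hsg + B_GRAD_SCH * grad_sch + B_FULL * full


def required_full(not_hsg: float, grad_sch: float, target: float = 700.0) -> float:
    """Value of full needed to hit the target, clamped to the range 0..100."""
    needed = (target - B0 - B_NOT_HSG * not_hsg - B_GRAD_SCH * grad_sch) / B_FULL
    return min(max(needed, 0.0), 100.0)


if __name__ == "__main__":
    # not_hsg values from the table; grad_sch is the rest of the 50 students.
    for not_hsg in (0, 5, 10, 15, 20, 25, 30, 35, 33, 45, 50):
        grad_sch = 50 - not_hsg
        full = required_full(not_hsg, grad_sch)
        y = predict(not_hsg, grad_sch, full)
        print(f"{not_hsg:3d} {grad_sch:3d} {full:7.2f} {y:8.2f}")
```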
In brief, with fully credentialed teachers (full = 100), a predicted score
of 700 can be reached with at most 33 students whose parents are not high
school graduates. If the number of such students decreases, the target is
easier to reach.
On the other hand, the predicted value of y is maximized when the parents of
every student are graduate-school educated and full = 100.
If the number of parents who are not high school graduates increases by 10% of
the total number of students, and the percentage of fully credentialed teachers
also increases by 10%, the following graph shows the change in the predicted value.
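The limit of 33 students can be checked by scanning every possible split of the 50 students at full = 100 and keeping the largest not_hsg count whose predicted score still reaches the target. This is a minimal sketch under the same 50-student setup; the function name `max_not_hsg` is illustrative.

```python
# Unrounded coefficients from the SAS parameter estimates.
B0, B_NOT_HSG, B_GRAD_SCH, B_FULL = 394.23899, -2.82726, 4.21761, 3.27664


def max_not_hsg(total: int = 50, full: float = 100.0, target: float = 700.0):
    """Largest not_hsg count (the rest being grad_sch) that still reaches
    the target predicted score; returns None if no split reaches it."""
    best = None
    for not_hsg in range(total + 1):
        grad_sch = total - not_hsg
        y = B0 + B_NOT_HSG * not_hsg + B_GRAD_SCH * grad_sch + B_FULL * full
        if y >= target:
            best = not_hsg
    return best


if __name__ == "__main__":
    print(max_not_hsg())
```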
[Figure: predicted y and full vs. not_hsg, with the point not_hsg = 45 labeled: full = 100, predicted y = 615.76]
[Figure: cumulative distribution (ECDF) of prediction error %]
The prediction error % is computed as abs(actual - predicted) * 100 / actual.
The chart shows that 75% of cases have less than 15% error and 87.5% have less than 22% error.
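The error formula and the cumulative distribution can be sketched in a few lines of Python. The scores in the example are made up purely to exercise the functions; they are not the schools from the dataset, and the names `abs_pct_error` and `ecdf_at` are illustrative.

```python
def abs_pct_error(actual: float, predicted: float) -> float:
    """Absolute prediction error as a percentage of the actual value."""
    return abs(actual - predicted) * 100.0 / actual


def ecdf_at(errors: list, threshold: float) -> float:
    """Fraction of cases whose error is strictly below the threshold."""
    return sum(e < threshold for e in errors) / len(errors)


if __name__ == "__main__":
    # Hypothetical (actual, predicted) pairs, for illustration only.
    pairs = [(700, 686), (600, 640), (500, 520), (800, 600)]
    errors = [abs_pct_error(a, p) for a, p in pairs]
    print("share of cases under 15% error:", ecdf_at(errors, 15))
```

Reading the empirical CDF at a threshold gives statements of the form "x% of cases have less than t% error", which is how the chart is summarized.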
Conclusion
We are able to predict academic performance.
We have a good R-square of 0.7063, i.e.
70.63% of the variability is explained by the model.
We are also able to interpret the estimates of the model.

Multiple reg presentation
