Estimating the Relationship Between the Student-Teacher
Ratio and Test Scores
Mia Attruia
The Florida Legislature enacted legislation requiring the number of
students in each classroom to be reduced by at least two students per
year beginning in the 2003-04 school year. The reductions were to
continue until the number of students per classroom no longer exceeded
the maximum set by law. The law was enacted on the premise that
students perform better in smaller classrooms. In 2015, opponents are
arguing for repeal of the law on the grounds that smaller class sizes
are too expensive and have not led to higher student performance. In
this report we test the relationship between test scores and class
size using a simple bivariate regression model.
1. Data
The data come from the California Standardized Testing and Reporting
(STAR) dataset, which reports test results for the Stanford 9
standardized test administered to 5th grade students in 420 California
school districts during 1998 and 1999. We randomly select a sub-sample
of 210 of the 420 observations in the dataset for the variables:
testscr: the average of the math and reading scores for students
within the district;
str: the student-teacher ratio, measured as the number of students in
the district divided by the number of full-time equivalent teachers.
The means, standard deviations, and minimum and maximum values of the
210 randomly selected districts are reported in Table 1. The
correlation between testscr and str is corr(testscr, str) =
-0.24994096, which is significantly different from zero with a p-value
of 0.0003.
Table 1: Sample means, standard deviations, and minimum and maximum
values for the randomly selected sub-sample.

Variable   Mean     St. Dev.   Minimum   Maximum
testscr    652.84   18.656     605.55    706.75
str        19.544   2.0443     14.000    25.800
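
The significance of the correlation reported above can be checked from the correlation and the sample size alone. The following is a minimal sketch, assuming the usual t-test of a zero correlation with n - 2 degrees of freedom; the variable names are illustrative and are not taken from the original analysis.

```python
import math

r = -0.24994096   # reported sample correlation between testscr and str
n = 210           # sub-sample size

# t-statistic for H0: correlation = 0, with n - 2 degrees of freedom
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r**2)

print(f"t = {t_stat:.3f} on {n - 2} degrees of freedom")
```

A statistic well beyond roughly 2 in absolute value is consistent with the small p-value quoted in the text. In a bivariate regression the square of this statistic should also match the reported F(1,208).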
2. Regression Model
The regression model is:

$$\text{testscr}_i = \beta_0 + \beta_1\,\text{str}_i + u_i, \qquad i = 1, \dots, 210$$

The model rests on three assumptions. The first assumption means that
the errors $u_i$ are “independently and identically distributed” with
mean 0 and constant variance. The second assumption means that the
regressor str is independent of the error term. The third assumption
means that the data are “well-behaved” in the sense that outliers are
rare.
3. Results
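The estimation results for our random subsample are (standard errors
in parentheses):

$$\widehat{\text{testscr}} = \underset{(12.0393)}{697.414} - \underset{(0.709333)}{2.28095}\,\text{str}$$

$$T = 210, \quad R^2 = 0.062470, \quad F(1,208) = 13.85968, \quad \hat{\sigma} = 18.10727$$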
The results show a negative relationship between test scores and the
student-teacher ratio. If the average class size in a school district
increases by one student, we would predict that average test scores in
the district would fall by about 2.28 points, a change of
100*(-2.28095/652.84) = -0.349% of the average test score.
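
The percentage calculation above can be reproduced in a few lines. This is a minimal sketch that assumes the fitted linear relationship holds over the range considered; the function name `pct_change` and the two-student case are illustrative only.

```python
slope = -2.28095        # estimated coefficient on str
mean_testscr = 652.84   # sample mean of testscr from Table 1

def pct_change(extra_students):
    """Predicted change in testscr as a percent of the sample mean."""
    return 100 * slope * extra_students / mean_testscr

print(round(pct_change(1), 3))   # one additional student per teacher
print(round(pct_change(2), 3))   # hypothetical two-student change
```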
The scatter plot of the data, including the estimated regression line,
is shown in Figure 1 and displays the negative relationship between
the student-teacher ratio and the average test score. The arrow in the
figure points to the sample means

$$(\overline{\text{str}}, \overline{\text{testscr}}) = (19.544,\ 652.84)$$

to confirm that the regression line does indeed go through the sample
means of the data, as required by the least squares method.
Figure 1: Scatter plot of the data along with the
fitted regression line (blue). The arrow points to
the sample means of the data.
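
The property that a least squares line passes through the point of sample means holds for any dataset, not only this one. The following is a minimal sketch on made-up numbers (they are not the STAR observations) that fits a line from sums of deviations and evaluates it at the mean of the regressor.

```python
# Hypothetical data for illustration only; these are not the STAR observations.
x = [15.0, 17.5, 19.0, 20.5, 22.0, 24.0]
y = [668.0, 660.5, 655.0, 649.0, 651.5, 640.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares slope and intercept from sums of deviations
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# The fitted value at the mean of x equals the mean of y (up to rounding)
fitted_at_mean = b0 + b1 * x_bar
print(abs(fitted_at_mean - y_bar) < 1e-9)
```

The result follows directly from the intercept formula: substituting the mean of x into the fitted line cancels the slope term and leaves the mean of y.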
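For a simple bivariate model, we can confirm the estimation results
by:

$$\hat{\beta}_1 = \frac{s_{str,testscr}}{s^2_{str}} = r_{str,testscr}\,\frac{s_{testscr}}{s_{str}} = -0.24994096\,(18.656 / 2.0443) = -2.28092675$$

and,

$$\hat{\beta}_0 = \overline{\text{testscr}} - \hat{\beta}_1\,\overline{\text{str}} = 652.84 - (-2.28095)(19.544) = 697.4188868$$

where $s_{str,testscr}$ is the sample covariance, $r_{str,testscr}$ is
the sample correlation, $s^2_{str}$ is the sample variance, etc.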
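
The same check can be run from the summary statistics in Table 1. This is a minimal sketch using the rounded values printed in this report, so the last digits may differ slightly from the figures above; the variable names are illustrative.

```python
r = -0.24994096                          # sample correlation
s_testscr, s_str = 18.656, 2.0443        # sample standard deviations (Table 1)
mean_testscr, mean_str = 652.84, 19.544  # sample means (Table 1)

# Slope: correlation times the ratio of standard deviations
slope = r * s_testscr / s_str

# Intercept: forces the line through the point of sample means
intercept = mean_testscr - slope * mean_str

print(f"slope = {slope:.5f}, intercept = {intercept:.3f}")
```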
4. Summary
We used a random sub-sample of size 210 from the original 420
observations of the California STAR dataset to estimate a bivariate
regression model of test scores as a linear function of the
student-teacher ratio. We find that a larger class size is associated
with lower test scores, by about 2.28 points per extra student. This
result is statistically significant but, since it only amounts to
about 0.35% of the average test score, it may not be very significant
from a political perspective.
There are factors other than class size that affect test performance,
and all of these other factors have been omitted from our model.
Consequently, these omitted variables are effectively being captured
by the error term. If any of these omitted factors are correlated with
class size, then that would violate assumption (2) of our model. We
need to use caution when recommending policy changes based upon such a
simple model.
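
The concern about omitted factors can be illustrated with a small simulation. This is a sketch on entirely made-up data, not on the STAR sample: the true slope, the omitted factor `z`, and all coefficients are hypothetical and chosen only to show how a regressor correlated with an omitted factor shifts the estimated slope.

```python
import random

random.seed(0)  # fixed seed so the illustration is repeatable

def ols_slope(x, y):
    """Slope from a simple regression of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    return s_xy / s_xx

n = 20000
true_slope = -1.0  # hypothetical true effect of x on y

z = [random.gauss(0, 1) for _ in range(n)]        # omitted factor
x = [0.5 * zi + random.gauss(0, 1) for zi in z]   # regressor correlated with z
y = [true_slope * xi + 2.0 * zi + random.gauss(0, 1)
     for xi, zi in zip(x, z)]

short_slope = ols_slope(x, y)  # regression that leaves z in the error term
print(f"true slope {true_slope}, estimated slope {short_slope:.2f}")
```

With these made-up parameters the estimate settles near -0.2 rather than the true -1, because the error term now contains `z`, which moves together with `x`.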
