The Well-EQUIPped Classroom: Using the Electronic Quality of Inquiry Protocol to evaluate effects of science inquiry professional development on PreK-3 classroom practice
1. Presented by
Gale A. Mentzer, PhD
T. Ryan Duckett, MA
Research and Evaluation, LLC
1811 N. Reynolds Road, Suite 204
Toledo, OH 43615
“Accuracy of observation is the equivalent of accuracy of thinking.” – Wallace Stevens
American Evaluation Association
National Conference
October 29, 2016
Atlanta, GA
2.
3.
4. • Time Usage
• Instruction
• Discourse
• Assessment
• Curriculum
Marshall, J. C., Horton, B., & White, C. (2009). EQUIPping teachers: A protocol to guide and improve inquiry-based instruction. The Science Teacher, 76(4), 46-53.
8. Rasch at a Glance
P{xpi = 1 | Bp, Di} = exp(Bp - Di) / [1 + exp(Bp - Di)]
9. [Wright map schematic: a vertical logit scale from -3.0 to 3.0. Person measures (level of ability) run from lower to higher levels of the trait; item measures (items that target the latent trait) run from easy to endorse to difficult to endorse.]
10.
11. Rasch Process in Four Steps
Step One: Are the items functioning properly?
Step Two: Are the teachers participating reasonably?
Step Three: Are you measuring a single latent trait?
Step Four: How do we use what we know to understand?
13. Item Fit (Reliability = .95)
Item (harder to affirm at top, easier to affirm at bottom) | Difficulty | Fit
Teacher provides depth of content that connects big picture | 2.46 | 0.63
Lesson allowed for student designed investigation | 1.21 | 0.93
Lesson integrated content and student investigation | -0.91 | 0.30
Student organized information in effective ways to communicate their learning | -1.26 | 1.58
Overall assessment of curriculum factor | -1.50 | 1.07
14. Rating Scale
Example of a poorly functioning rating scale
Rating scale for Spring 2015 Curriculum factor
As we all know, matching measures to intended outcomes is the foundation of sound evaluation practice, and yet adequately achieving this crucial step has been elusive at times, particularly when evaluating teacher PD designed to change teaching practice. And, of course, instrument or measurement reliability and validity are necessary precursors to establishing construct validity, interpreting analyses, and making legitimate causal inferences.
A measure must first provide consistent results over time and accurately reflect the construct of concern before inferences drawn from the data can be considered valid. In addition to developing measures that match the intended characteristics or traits, the interpretation or use of the data must also match intentions. Some researchers have claimed, for example, that measuring student achievement to make evaluation inferences about teacher quality is not necessarily an appropriate use of the data, particularly when error rates are not taken into consideration. Additionally, jumping directly to student outcomes without first verifying teacher implementation outcomes weakens cause-and-effect conclusions.
While surveying teachers about the value of PD has been used as a measure of participants’ perception of its usefulness, it is a low-level indicator of effective PD according to Guskey. Assessments may indicate the level of learning achieved as a result of PD, but neither method verifies that teachers actually incorporate newly learned strategies into instruction. If the goal of the evaluation is to determine the extent to which teaching practice changes, then artifacts of teaching (lesson plans, student assignments), interviews with teachers, and observations must be included. And while artifacts and interviews might indicate how a teacher implements new ideas and concepts, they may not indicate how well those new concepts and ideas were implemented. Therefore, an observation using a high-quality performance assessment could provide the best indicator of implementation.
However, the creation and validation of a performance assessment that includes adequate critical indicators can be a lengthy process, and an evaluator may not have the time or resources to develop a high-quality, reliable tool with evidence of validity. In such a case, using an existing, validated tool may be the best solution. But finding an instrument that matches the evaluator’s intent may not be possible, in which case validating the instrument for the new use may provide the solution. The purpose of this study is to show how an instrument designed to provide teachers with formative feedback on inquiry-based instruction can be used to evaluate the extent to which a teacher actually employs inquiry teaching strategies.
The EQUIP, developed by Marshall, Horton, & White, was designed to provide formative feedback to teachers regarding their implementation of inquiry-based instruction. It is based on NGSS and has 5 factors: [listed on the screen]. The creators recommend the tool be used by the teacher (using reflection or a video recording), by a colleague, or by an instructional coach.
Each factor has several constructs, and each construct is measured on a four-point scale: Pre-Inquiry, Developing Inquiry, Proficient Inquiry, Exemplary Inquiry. As you can see, the rubric is quite detailed regarding the hallmarks of each level, and they vary depending upon the construct. Once the constructs have been rated, a summary or overall factor score is assigned using the same four-point scale. This score is not necessarily the “average” of the construct scores because it is up to the observer to weight various constructs depending upon the intent of the lesson.
The tool also includes a time usage section in which the lesson is broken into five-minute segments and scored using the same four-point inquiry proficiency scale, while also noting the lesson structure and student attention levels.
We used the EQUIP to measure the quality of inquiry-based instruction and examined it pre/post a Summer Institute professional development to determine whether the PD actually improved inquiry-based instruction. We did not use the time usage portion of the instrument. Once our team established acceptable inter-rater reliability using previously recorded lessons, we conducted the observations in real time. Now, some prefer video because it allows one to review; however, I prefer real time because I believe video often hides elements of the experience due to the limited focus of the camera. Our team took detailed notes during the lesson and then completed the scoring of the EQUIP rubric immediately after the lesson (not during it), after reviewing their notes. Because the data are ordinal, and because the use of the instrument for evaluation had not been validated, we decided to use the Rasch measurement model (RMM) to more carefully examine how the instrument worked for our purposes.
As Dr. Mentzer said, we used the EQUIP tool in an attempt to measure the level of inquiry-based instruction the teacher demonstrated during a lesson. This “level of inquiry” is a quality of the teacher’s instruction, not a quantity. The 1, 2, 3, and 4 we assign are ordinal labels, not true quantities; they simply represent this “level of inquiry.” So while we can note whether a teacher shows less (pre-inquiry) or more (exemplary) of this trait, we cannot, from the raw ratings alone, determine the distance between categories. And, therefore, we cannot simply sum up the scores for each factor and get a meaningful average.
So how does the Rasch method determine these interval values? Here’s a peek at the mathematical model underlying the Rasch method. Without getting too far astray, it basically says that the probability of an individual getting the best response on an item (or construct, in our example) depends on the difference between that person’s ability and the difficulty of that item.
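To make that concrete, here is a minimal sketch in Python of the probability the dichotomous Rasch model assigns; the ability and difficulty values are hypothetical and only illustrate how the formula behaves.

```python
import math

def rasch_probability(person_ability, item_difficulty):
    """Dichotomous Rasch model: P(x = 1) = exp(B - D) / (1 + exp(B - D)),
    where B is the person's ability and D is the item's difficulty."""
    logit = person_ability - item_difficulty
    return math.exp(logit) / (1 + math.exp(logit))

# When ability equals difficulty, the probability is exactly 0.5;
# a 1-logit advantage raises it to about 0.73.
print(rasch_probability(0.0, 0.0))   # 0.5
print(rasch_probability(1.0, 0.0))   # ~0.731
print(rasch_probability(-1.0, 1.0))  # ~0.119 (item much harder than person)
```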
But the Rasch method goes much further than that and calculates how individuals perform on items to produce a meaningful comparison of participants and items. [Brief explanation of person and item measures; provide math question examples] [Transcend sample dependence; thereby increase reliability]
Through this log-odds calculation, Rasch assigns each person an ability measure and each item a difficulty measure. This is the Rasch person-item map for the S15 Curriculum factor. C1 = Content depth; C2 = Learner centrality; C3 = Integration of content and design; C4 = Organizing and recording information. So our analysis mapped all of the participants for each cycle (62 in S15, 119 in F15, and 99 in S16) in relation to each of the contributing constructs for the four factors mentioned earlier (6 constructs for Inquiry, 6 for Assessment, 6 for Discourse, and 5 for Curriculum, giving us 23 items overall).
Alright, great. We get a nice representation of person ability and item difficulty. But how do we know it is accurate and valid? I would like to take a few minutes now to show you how the Rasch measurement model helped us, as evaluators, obtain an authentic understanding of the level of inquiry-based instruction. Completing the Rasch analysis allowed for a meaningful and reliable assessment of the participants’ “scores” over the three collection cycles, which we could then use to evaluate the overall impact of the NURTURES professional development program. [Briefly mention the roadmap: first items, then people, then unidimensionality, then putting it together]
The first step in ensuring the EQUIP instrument was working properly was to make sure the people using the instrument (us, the evaluators scoring the teachers) were speaking a common language. It was vital that the four evaluators had a common definition of each construct for each of the four contributing factors; i.e., each of us knew what was meant by “learner centrality” for the Curriculum factor and, further, had a solid framework for assigning a level within those constructs. So, as Dr. Mentzer mentioned, we began by establishing inter-rater reliability. Rasch then provides fit statistics that show how consistently the items performed and therefore enhance reliability.
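As an illustration of the kind of agreement check we mean, here is a small sketch comparing two observers on the four-point scale; the ratings are hypothetical, and the specific agreement statistic we used is not detailed here, so exact agreement and weighted kappa are shown only as common options.

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

# Hypothetical ratings from two observers scoring the same lessons on the
# four-point EQUIP scale (1 = Pre-Inquiry ... 4 = Exemplary Inquiry).
rater_a = np.array([2, 3, 3, 1, 4, 2, 3, 2])
rater_b = np.array([2, 3, 2, 1, 4, 2, 3, 3])

exact_agreement = np.mean(rater_a == rater_b)   # share of identical ratings
weighted_kappa = cohen_kappa_score(rater_a, rater_b, weights="quadratic")
print(f"exact agreement = {exact_agreement:.2f}")
print(f"quadratic-weighted kappa = {weighted_kappa:.2f}")
```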
Sticking with the output from the S15 Curriculum factor, we see the difficulty of each item expressed in logits [explain]. Next to that we see how well the items fit the model. [Discuss the acceptable fit range of 0.6-1.4; reasons why there might be underfit or overfit; explain the reliability.]
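For readers who want to see what those fit numbers summarize, here is an illustrative sketch of how infit and outfit mean-square statistics are typically computed from observed ratings, model-expected ratings, and model variances; all of the values below are hypothetical.

```python
import numpy as np

def item_fit(observed, expected, variance):
    """Infit and outfit mean-square statistics for one item.
    observed: ratings given; expected: model-expected ratings;
    variance: model variance of each rating. Values near 1.0 indicate
    good fit; 0.6-1.4 is a common rule-of-thumb range."""
    residual = observed - expected
    z = residual / np.sqrt(variance)                   # standardized residuals
    outfit = np.mean(z ** 2)                           # outlier-sensitive
    infit = np.sum(residual ** 2) / np.sum(variance)   # information-weighted
    return infit, outfit

# Hypothetical values for one construct scored across five teachers.
obs = np.array([3.0, 2.0, 4.0, 1.0, 3.0])
exp = np.array([2.6, 2.1, 3.4, 1.5, 2.9])
var = np.array([0.7, 0.8, 0.6, 0.5, 0.7])
print(item_fit(obs, exp, var))
```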
First is an example of a rating scale that is functioning extremely poorly. The participants cannot meaningfully differentiate between the different levels, hence the overlap. Here is how the rating scale functioned for the S15 curriculum, a thing of beauty. Each level has a range where it is the most probable response. [Briefly explain the x-y fields, etc.]
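The curves behind a plot like this come from the Rasch-Andrich rating scale model; the sketch below, with hypothetical step calibrations, shows how each category's probability is computed so that every level has a region along the person-item difference axis where it is the most probable response.

```python
import math

def category_probabilities(b_minus_d, step_calibrations):
    """Rasch-Andrich rating scale model: probability of each category
    0..K for a given person-item difference and step calibrations
    F_1..F_K (hypothetical values used below)."""
    numerators = [1.0]            # category 0: empty sum, exp(0) = 1
    running = 0.0
    for f in step_calibrations:
        running += b_minus_d - f  # cumulative sum of (B - D - F_j)
        numerators.append(math.exp(running))
    total = sum(numerators)
    return [n / total for n in numerators]

# Four categories (like the EQUIP's 1-4 scale) need three step calibrations.
steps = [-1.5, 0.0, 1.5]
for diff in (-2.0, 0.0, 2.0):
    probs = category_probabilities(diff, steps)
    print(diff, [round(p, 2) for p in probs])
# Each category should be the most probable response somewhere along
# the person-item difference axis (the x-axis of the plotted curves).
```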
The step calibrations fall just beyond the 1.4 – 5.0 logit range, but are very consistent [they have seen 30 logit step scale]. This points to a place for increased rater reliability and a need to review the rating scale to see how it is functioning [possibly collapse categories].
Once we have a grasp on how the rating scale and items are functioning, we perform similar investigations on the participants. Fit statistics here show us whether participants behaved appropriately and took the items seriously. [Describe the measure increase pre and post Summer Institute; what the negative measure score means in the first instance in relation to the items; outstanding reliability, i.e., results would hold with any group of similar levels. Further, these statistics confidently predict that people of similar abilities in future iterations will perform similarly on the observation, thus ensuring repeatability and reliability of the results.]
Strata = (4G + 1) / 3, where G is the separation index.
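For reference, here is a small sketch of how separation and strata follow from a Rasch reliability estimate; the .95 value echoes the item reliability reported earlier and is used here only as an example.

```python
import math

def separation_and_strata(reliability):
    """Separation index G = sqrt(R / (1 - R)) and Wright's strata
    (4G + 1) / 3: the number of statistically distinct levels the
    instrument can resolve."""
    g = math.sqrt(reliability / (1 - reliability))
    return g, (4 * g + 1) / 3

# With a reliability of .95 (the item reliability reported earlier),
# G is about 4.4 and strata about 6.1.
print(separation_and_strata(0.95))
```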
A ruler can only measure one dimension at a time (length or width, etc.). We need to make sure our instrument is measuring only ability in inquiry-based instruction. We want at least 60% of the variance to be accounted for by the person and item measures. This would mean that our items are tapping into a single cohesive trait.
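A common complementary check (not necessarily the one performed here) is a principal component analysis of the standardized residuals: once the Rasch measures are removed, no sizable secondary dimension should remain. A hypothetical sketch:

```python
import numpy as np

def residual_pca_eigenvalues(observed, expected, variance):
    """Eigenvalues of the correlation matrix of standardized Rasch
    residuals (persons x items). A large first eigenvalue (roughly
    more than 2 items' worth of variance) hints at a secondary
    dimension; small ones support unidimensionality."""
    z = (observed - expected) / np.sqrt(variance)   # standardized residuals
    corr = np.corrcoef(z, rowvar=False)             # item-by-item correlations
    return np.sort(np.linalg.eigvalsh(corr))[::-1]

# Hypothetical matrix: 8 teachers by 4 curriculum constructs.
rng = np.random.default_rng(0)
obs = rng.integers(1, 5, size=(8, 4)).astype(float)
exp = np.full((8, 4), 2.5)
var = np.full((8, 4), 0.8)
print(residual_pca_eigenvalues(obs, exp, var))
```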
[Discuss the items that correlate.]
Given that the EQUIP tool accurately captured the target of the NURTURES program (the quality of inquiry science instruction), it remained to be seen whether the NURTURES summer institute and intervention had any statistically significant impact on that quality. Recall that the Spring 2015 observation cycle was a pre-intervention observation period: to establish a baseline comparison, teachers were observed before they received any instruction or assistance from the NURTURES program.
First, a dependent t-test was conducted to test the null hypothesis that there was no statistically significant change in teachers’ EQUIP observation scores after the NURTURES intervention. Cumulative inquiry-based instruction scores were recorded using the EQUIP and then converted to normalized logits using the Rasch model for the entire cohort for each respective semester (Spring 2015 N=62, Fall 2015 N=119). Of those individuals, 59 participated in both sessions.
Table X reports the results of the dependent t-test. The test revealed a significant difference between Spring 2015 scores (M=-.2214, SD=2.62317) and Fall 2015 scores (M=1.7078, SD=2.32189); t(58)=-4.884, p=.0001. Thus, there is a statistically significant relationship between participation in the NURTURES summer institute and cumulative inquiry-based instruction scores, with participants performing 1.93 logits better on average after having completed the summer institute.
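For illustration, a paired-samples test of this kind can be run as below; the data are simulated to roughly echo the reported means and standard deviations and are not the study data.

```python
import numpy as np
from scipy import stats

# Simulated paired logit scores for 59 teachers observed in both the
# Spring 2015 (pre-institute) and Fall 2015 (post-institute) cycles;
# the means/SDs only roughly echo those reported above.
rng = np.random.default_rng(1)
spring_2015 = rng.normal(-0.22, 2.6, size=59)
fall_2015 = spring_2015 + rng.normal(1.93, 2.0, size=59)

# Dependent (paired-samples) t-test on the Rasch person measures.
t_stat, p_value = stats.ttest_rel(fall_2015, spring_2015)
print(f"t({len(spring_2015) - 1}) = {t_stat:.3f}, p = {p_value:.4f}")
print(f"mean gain = {np.mean(fall_2015 - spring_2015):.2f} logits")
```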
Next, to better evaluate the sustained impact of the NURTURES program, a repeated measures ANOVA was conducted (DV: teacher EQUIP score; IV: session – S15, F15, S16). The same steps were taken as in the dependent t-test to obtain a normalized score for cumulative science inquiry-based instruction. This test determines whether scores significantly changed for the group of 40 teachers who participated in all three sessions (n=40, N=280). The research question was stated as: is there a statistically significant change in teacher EQUIP scores before and after participation in the summer instructional institute? H0: μ1 = μ2 = μ3; H1: at least two means are significantly different.
Since the dependent t-test already led us to reject the null hypothesis, this repeated measures ANOVA allows us to determine whether the impact of the NURTURES program was sustained, decreased, or increased at a third observation. Thus, this one-way repeated measures ANOVA was conducted to evaluate the null hypothesis that there is no change in participants’ EQUIP scores when measured before and in two subsequent observations after participation in the summer instructional institute (n=40).
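A one-way repeated measures ANOVA of this design can be sketched as below, using simulated long-format data rather than the study data; note that this routine reports the univariate F test rather than the multivariate Wilks' Lambda statistic reported next.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Simulated long-format data for 40 teachers observed in all three cycles;
# 'score' stands in for the Rasch-scaled EQUIP measure.
rng = np.random.default_rng(2)
rows = []
for teacher in range(40):
    baseline = rng.normal(0.0, 1.5)
    for session, shift in (("S15", -0.2), ("F15", 1.7), ("S16", 1.8)):
        rows.append({"teacher": teacher, "session": session,
                     "score": baseline + rng.normal(shift, 1.0)})
data = pd.DataFrame(rows)

# One-way repeated measures ANOVA: do mean scores differ across sessions?
result = AnovaRM(data, depvar="score", subject="teacher",
                 within=["session"]).fit()
print(result)
```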
The results, shown in Table X, indicated a significant session effect, Wilks’ Lambda = .583, F(2, 38) = 13.579, p < .0001, with an effect size of 42%. Mauchly’s test was not statistically significant, indicating that the assumption of sphericity had been met, χ2(2) = 3.18, p = .204 (Table X). Therefore, we reject the null hypothesis and note that at least two means are significantly different.
Additional post hoc tests were conducted to determine which means differed. Table X shows the estimated means for each observation session (session 1 = Spring 2015, etc.). The pairwise comparisons between sessions bear out what the estimated means appear to convey; namely, the differences in scores between the first and second sessions and between the first and third sessions are statistically significant. Table X shows mean differences of -9.175 and -9.200 for these comparisons, indicating that the pre-intervention scores were lower by over 9 points on average. The table also shows that the relatively small difference in means between Fall 2015 and Spring 2016 is not statistically significant (p = 1.00). This plateauing of scores suggests that the intervention had a lasting effect that did not degrade over time.
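The post hoc comparisons described here amount to pairwise dependent t-tests with an adjustment for multiple comparisons; the exact adjustment used is not specified above, so the sketch below shows Bonferroni as one common choice, run on simulated scores rather than the study data.

```python
from itertools import combinations
import numpy as np
from scipy import stats

def bonferroni_pairwise(scores_by_session):
    """Pairwise dependent t-tests between sessions with a Bonferroni
    adjustment. Input: dict mapping session name to an array of per-teacher
    scores, with the same teacher order in every session."""
    pairs = list(combinations(scores_by_session, 2))
    for a, b in pairs:
        t, p = stats.ttest_rel(scores_by_session[a], scores_by_session[b])
        p_adj = min(p * len(pairs), 1.0)   # Bonferroni-adjusted p value
        print(f"{a} vs {b}: t = {t:.2f}, adjusted p = {p_adj:.3f}")

# Simulated scores for the 40 teachers seen in all three sessions.
rng = np.random.default_rng(3)
base = rng.normal(0.0, 1.5, 40)
bonferroni_pairwise({
    "S15": base + rng.normal(-0.2, 1.0, 40),
    "F15": base + rng.normal(1.7, 1.0, 40),
    "S16": base + rng.normal(1.8, 1.0, 40),
})
```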