This document summarizes a presentation on effect sizes and meta-analysis in education research. It introduces the presenters and explains how the session came about. It then uses an analogy of comparing oil levels in cars to explain problems with using effect sizes to claim one intervention is more effective than another. Key issues discussed are that effect sizes depend on many factors like sample homogeneity and measures used, and don't necessarily indicate relative effectiveness. The presentation argues many meta-analyses in education fail to meet assumptions needed to use effect sizes in this way. It warns against being "razzle-dazzled" into harming students.
researchED Durham 2018
1. Effect sizes and meta-analysis
Adrian Simpson and Gary Jones
24 November, 2018
researchED Durham
2. Introductions
Adrian Simpson
• Professor, School of Education, Durham University
• Research Interests
• Assessment in Higher Education
• Mathematics Education
• Psychology of Reasoning
Gary Jones
• Former senior manager in further education
• Blogger, speaker and author
• www.garyrjones.com
• Evidence-Based School Leadership and Management: A practical guide.
6. Sarah’s Garage
• Comparing your oil today to your oil last week
• Comparing your oil to your neighbour’s oil
• Comparing the average oil in some sports cars with the average oil in hatchbacks
7. The Dipstick Test – pairwise version
Using relative size on one measure (a dipstick) to stand for relative size on another measure (oil volume) requires everything else which impacts on the measures to be equal.
E.g. engine dimensions/shape; oil temperature; dipstick angle; dipstick penetration etc.
8. The dipstick test – group version
Using relative average size on one measure (a dipstick) to stand for relative average size on another measure (oil volume) requires everything else which impacts on the measures to be equally distributed.
E.g. engine dimensions/shape; oil temperature; dipstick angle; oil agitation; dipstick penetration etc.
9. Sarah’s Garage
• Comparing your oil today to your oil last week: likely to be OK, but worth checking temperature, penetration and agitation
• Comparing your oil to your neighbour’s oil: very risky; need to check everything
• Comparing the average oil in some sports cars with the average oil in hatchbacks: absurdly unlikely; need a very convincing check
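To make the analogy concrete, here is a small sketch of my own (the shapes and numbers are illustrative assumptions, not from the slides): the same dipstick reading implies very different oil volumes once the sump geometry changes, which is exactly why depth can stand for volume only when everything else is equal.

```python
import math

def oil_volume(depth_cm, sump):
    """Volume (cm^3) implied by a dipstick depth, for two idealised sump shapes."""
    if sump == "box":   # rectangular sump with a 30 cm x 20 cm base
        return 30 * 20 * depth_cm
    if sump == "cone":  # inverted cone with a 45-degree half-angle (radius equals depth)
        return math.pi * depth_cm ** 3 / 3
    raise ValueError(f"unknown sump shape: {sump}")

# The same 5 cm dipstick reading gives wildly different volumes:
print(oil_volume(5, "box"))             # 3000 cm^3
print(round(oil_volume(5, "cone"), 1))  # 130.9 cm^3
```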
11. Evidence based education
• Larger effect size stands for more effective intervention
• For single studies (IES & EEF Projects)
• For meta-analysis (Schneider & Preckel)
• For meta-synthesis (EEF Toolkit & Hattie)
12. What factors contribute to effect size?
Effect size depends on:
• the choice of measure
• the choice of comparison treatment
• the choice of sample homogeneity
• the choice of intervention treatment
d is sometimes converted to the horribly misleading ‘months’ progress’.
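The effect size the slides discuss is the standardised mean difference, Cohen’s d: the difference in group means divided by the pooled standard deviation. A minimal sketch (the data are invented for illustration):

```python
import statistics

def cohens_d(group_a, group_b):
    """Standardised mean difference: (mean_a - mean_b) / pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled SD from the two sample variances (n-1 denominators)
    pooled_sd = (((n_a - 1) * statistics.variance(group_a) +
                  (n_b - 1) * statistics.variance(group_b)) / (n_a + n_b - 2)) ** 0.5
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd

treatment = [52, 53, 54, 55, 56]
control = [50, 51, 52, 53, 54]
print(round(cohens_d(treatment, control), 2))  # 1.26
```

Because the numerator depends on the comparison treatment and the measure, while the denominator depends on sample homogeneity, the same intervention can yield very different values of d under different designs.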
13. The US What Works Clearinghouse from the Institute for Education Sciences uses “the comparability of effect size estimates across studies … to establish the criterion for substantively important effects for intervention rating purposes” (IES, 2017, E-2), with “An effect size of 0.25 standard deviations or larger … considered to be substantively important” (p.22)
The Expert Mathematician: d = 0.35
Accelerated Math: d = 0
14. Does “The Expert Mathematician is more effective than Accelerated Math” pass the dipstick test?
15. The Dipstick Test – pairwise version
The Expert Mathematician
• Intervention: 196 LOGO-based ‘generative maths’ lessons of 40-120 minutes
• d = 0.35
• Measure: 78-item MCQ including 61 concept & application items
• Sample: Grade 8 suburban middle school pupils
• Comparison: ‘Transition Mathematics’ textbook, 1-3 pages of written explanation followed by 30 questions for each lesson
Accelerated Math
• Intervention: 15-20 minutes each day on maths problems (from Accelerated Math)
• d = 0
• Measure: Delaware Student Testing Program (50 MCQ, 16 short answer, 12 extended)
• Sample: Grade 6 suburban middle school pupils
• Comparison: 15-20 minutes each day on maths problems (from Delaware Math)
16. Schneider & Preckel (2017) Variables Associated With Achievement in Higher Education: A Systematic Review of Meta-Analyses
Small group learning: average d = 0.51
Intelligent tutoring systems: average d = 0.35
17. Does “small group learning is more effective than intelligent tutoring systems” pass the dipstick test?
18. The dipstick test – group version
Group A: Small group learning
• Interventions: small group learning (sometimes for all teaching; sometimes replacing an alternative; sometimes extra)
• d = 0.51
• Measures: 80% teacher-set exam; 20% standardised
• Samples: fairly homogeneous (students on the same university course)
• Comparisons: sometimes individualized; sometimes whole group
Group B: Intelligent tutoring systems
• Interventions: intelligent tutoring systems (sometimes main teaching; sometimes adjunct; wide variety of systems)
• d = 0.35
• Measures: 100% teacher-set
• Samples: fairly homogeneous (students on the same university course)
• Comparisons: sometimes human tutoring; sometimes computer; sometimes lecturing; sometimes not being taught the topic at all
19. What is effect size really?
• A measure of the clarity of the study
• It depends on the whole study design, not just the intervention
• Researchers can (and do) choose to increase it by:
• more passive control treatments
• more homogeneous samples
• more treatment-inherent measures
• But different fields allow different freedoms:
• a passive/zero control may be impossible or unethical in some situations
• some interventions only make sense on wide samples
• some fields tend to use standardized tests
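One of those freedoms, sample homogeneity, can be shown directly: a homogeneous sample shrinks the pooled standard deviation, so the same raw difference produces a much larger d. A sketch with invented numbers:

```python
import statistics

def cohens_d(group_a, group_b):
    """Standardised mean difference: (mean_a - mean_b) / pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_sd = (((n_a - 1) * statistics.variance(group_a) +
                  (n_b - 1) * statistics.variance(group_b)) / (n_a + n_b - 2)) ** 0.5
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd

# Both comparisons have the same 3-point raw difference in means.
narrow_t, narrow_c = [61, 62, 63, 64, 65], [58, 59, 60, 61, 62]  # homogeneous samples
wide_t, wide_c = [50, 56, 63, 70, 76], [47, 53, 60, 67, 73]      # heterogeneous samples
print(round(cohens_d(narrow_t, narrow_c), 2))  # 1.9
print(round(cohens_d(wide_t, wide_c), 2))      # 0.29
```

The intervention’s raw effect is identical in both cases; only the spread of the sample changed.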
20. Do EBE researchers know the Dipstick Test?
“The key assumption is that the research represented in the meta-synthesis is sufficiently evenly distributed by type, population and outcome that any differences which emerge represent differences in the educational themes, rather than differences in the research methods, measurements and target populations.”
(Higgins, 2018, p49)
21. Berk
Berk’s criticism of common meta-analytic practice: “… the mismatch between a meta-analysis model and anything real … [is] ... The requisite assumptions are listed but not defended. A list of assumptions by itself apparently inoculates the meta-analysis against modelling errors”
(Berk, 2007, p264)
“Statistical assumptions are empirical commitments”
(Berk and Freedman, 2003, p235)
22. The dipstick test for Meta-synthesis
• Is it likely that researchers in feedback, phonics, homework, uniform, behaviour, class size etc. use the same distribution of
• comparison treatments
• sample ranges
• measures?
• Why would the world of research be organized to make this so?
23. Passing the dipstick test
• When is relative effect size a measure of relative effectiveness?
• For individual studies: when the comparison treatment, measure and sample range are the same
• For groups of studies: when comparison treatments, measures and sample ranges are distributed in the same way
That is: for current education research, NEVER
24. The final word
“statistical malpractice disguised as statistical razzle-dazzle”
(Berk, 2011, p199)
Let’s not be razzle-dazzled into harming the education of our pupils
25. Further reading
• Simpson, A. (2017) ‘The misdirection of public policy: comparing and combining standardised effect sizes’, Journal of Education Policy, 32(4), pp. 450-466.
• Simpson, A. (2018) ‘Princesses are bigger than elephants: Effect size as a category error in evidence-based education’, British Educational Research Journal, 44(5), pp. 897-913.
• Jones, G. (2018) Evidence-Based School Leadership and Management: A practical guide, SAGE, London.
27. For more information
Adrian Simpson
• adrian.simpson@durham.ac.uk
Gary Jones
• www.garyrjones.com
• @DrGaryJones
• jones.gary@gmail.com
Editor's Notes
3.07 in the clip
Sarah measures the oil in her car. It is at a lower level than it was last week and she takes this as evidence that the amount of oil has reduced.
That seems clear – but there is a sophisticated idea here: a one dimensional measure (depth) standing for a three dimensional one (volume)
Comparison: active/passive/harmful
Measure: closely tied to the mechanism; more general (also noise e.g. multiple choice test vs open answer)
Range: very similar participants vs very different
It tells us how clear the difference is, not how important the intervention is
One thing I find amusing about this is that it comes about 50 pages into a 250-page book. Effectively it is saying that everything hereafter relies on this being true. Oh, and by the way, it isn’t – so you should probably stop reading here.