To everyone reading this presentation: I hope it helps you. The content is taken from a psychological assessment textbook; it is not my own original work.
4. Test-retest Reliability Estimate
• Test-retest method: using the same instrument to measure the same
thing at two points in time.
• Test-retest reliability: an estimate of reliability obtained by correlating
pairs of scores from the same people on two different administrations of
the same test.
• Coefficient of stability: the name given to the test-retest reliability
estimate when the interval between testings is greater than six months.
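As a rough illustration of the correlation step described above, the sketch below computes a Pearson r between two administrations of the same test. The scores and the helper name `pearson_r` are invented for this example and do not come from the textbook.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dev_x = [a - mean_x for a in x]
    dev_y = [b - mean_y for b in y]
    cross = sum(a * b for a, b in zip(dev_x, dev_y))  # sum of cross-products
    ss_x = sum(a * a for a in dev_x)                  # sum of squares, time 1
    ss_y = sum(b * b for b in dev_y)                  # sum of squares, time 2
    if ss_x == 0 or ss_y == 0:
        raise ValueError("scores must vary within each administration")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical scores for the same five people tested at two points in time.
time_1 = [12, 15, 9, 20, 17]
time_2 = [13, 14, 10, 19, 18]

test_retest_r = pearson_r(time_1, time_2)
print(f"Test-retest reliability estimate: r = {test_retest_r:.2f}")
```

Each position in the two lists must belong to the same person; the pairing is what makes the correlation a reliability estimate.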
5. Psychology’s Replicability Crisis
In the mid-2000s, academic scientists became concerned that science
was not being performed rigorously enough to prevent spurious results
from reaching consensus within the scientific community.
In 2015, a group of researchers called the Open Science Collaboration
attempted to redo 100 psychology studies that had already been
peer-reviewed and published in leading journals (Open Science
Collaboration, 2015).
6. Psychology’s Replicability Crisis
• Three major causal factors in the emergence of the replicability crisis:
(1) a general lack of published replication attempts in the
professional literature,
(2) editorial preferences for positive over negative findings, and
(3) questionable research practices on the part of authors of
published studies.
7. Psychology’s Replicability Crisis
• Lack of Published Replication Attempts
Journals have long preferred to publish novel results instead of
replications of previous work. In fact, a recent study found that only
1.07% of the published psychological scientific literature sought to
directly replicate previous work (Makel et al., 2012).
Replication by independent parties provides for confidence in a
finding, reducing the likelihood of experimenter bias and statistical
anomaly. Indeed, had scientists been as focused on replication as
they were on hunting down novel results, the field would likely not
be in crisis now.
8. Psychology’s Replicability Crisis
• Editorial Preference for Positive over Negative Findings
Journals prefer positive over negative findings. “Positive” in this
context does not refer to how upbeat, beneficial, or heart-warming
the study is. Rather, positive refers to whether the study concluded
that an experimental effect existed.
The net result is that scientists, policy-makers, judges, and anyone
else who has occasion to rely on published research may have a
difficult time determining the actual strength and robustness of a
reported finding.
9. Psychology’s Replicability Crisis
• Questionable Research Practices (QRPs)
Included here are questionable scientific practices that do not rise
to the level of fraud but still introduce error into bodies of scientific
evidence.
For example, one variety entails the researcher failing to report all
of the research undertaken in a research program, and then
selectively reporting only the studies that confirm a particular
hypothesis.
10. Psychology’s Replicability Crisis
• Lessons Learned from the Replicability Crisis
The replicability crisis represents an important learning opportunity
for scientists and students. Prior to such replicability issues coming
to light, it was typically assumed that science would simply self-
correct over the long run.
Spurred by the recognition of a crisis of replicability, science is
moving to right both past and potential wrongs. There are now
mechanisms in place for preregistration of experimental designs,
and acceptance of the importance of doing so is growing.
11. Parallel-Forms and Alternate-Forms Reliability
Estimates
If you have ever taken a makeup exam in which the questions
were not all the same as on the test initially given, you have
had experience with different forms of a test. And if you have
ever wondered whether the two forms of the test were really
equivalent, you have wondered about the alternate-forms or
parallel-forms reliability of the test.
12. Parallel-Forms and Alternate-Forms Reliability
Estimates
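• Alternate-forms or parallel-forms coefficient of reliability
(coefficient of equivalence)
The degree of the relationship between various forms of a test can
be evaluated by means of an alternate-forms or parallel-forms
coefficient of reliability, often termed the coefficient of equivalence.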
13. Parallel-Forms and Alternate-Forms Reliability
Estimates
• Parallel forms
Parallel forms of a test exist when, for each form of the test,
the means and the variances of observed test scores are equal.
In theory, the means of scores obtained on parallel forms
correlate equally with the true score.
14. Parallel-Forms and Alternate-Forms Reliability
Estimates
• Parallel forms reliability
An estimate of the extent to which item sampling and
other errors have affected test scores on versions of the
same test when, for each form of the test, the means and
variances of observed test scores are equal.
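The comparison described above can be sketched in a few lines: summarize each form's mean and variance, then correlate the two sets of scores to get a coefficient of equivalence. The data, the function name `compare_forms`, and the use of the sample variance are assumptions made for this illustration. In practice, "equal" means and variances would be judged approximately, not exactly.

```python
from math import sqrt
from statistics import mean, variance


def compare_forms(form_a, form_b):
    """Summarize two forms of a test taken by the same people.

    Returns each form's mean and sample variance plus the Pearson r
    between the forms (the coefficient of equivalence).
    """
    if len(form_a) != len(form_b) or len(form_a) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_a, mean_b = mean(form_a), mean(form_b)
    var_a, var_b = variance(form_a), variance(form_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("scores must vary on each form")
    n = len(form_a)
    # Sample covariance between the paired scores on the two forms.
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(form_a, form_b)) / (n - 1)
    return {
        "mean_a": mean_a,
        "mean_b": mean_b,
        "var_a": var_a,
        "var_b": var_b,
        "r": cov / sqrt(var_a * var_b),
    }


# Hypothetical scores for five people who took both Form A and Form B.
form_a = [24, 30, 18, 27, 21]
form_b = [25, 29, 19, 26, 21]

summary = compare_forms(form_a, form_b)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

In this made-up data the means match but the variances differ, so the two forms would be better described as alternate forms than as strictly parallel forms.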
15. Parallel-Forms and Alternate-Forms Reliability
Estimates
• Alternate forms
Alternate forms are simply different versions of a test that have been
constructed so as to be parallel. Although they do not meet
the requirements for the legitimate designation “parallel,”
alternate forms of a test are typically designed to be
equivalent with respect to variables such as content and
level of difficulty.
16. Parallel-Forms and Alternate-Forms Reliability
Estimates
• Alternate forms reliability
Alternate forms reliability refers to an estimate of the extent to
which these different forms of the same test have been affected by
item sampling error or other error.
17. Parallel-Forms and Alternate-Forms Reliability
Estimates
• Obtaining estimates of alternate-forms reliability and parallel-forms
reliability is similar in two ways to obtaining an estimate of test-retest
reliability:
(1) Two test administrations with the same group are required,
and
(2) test scores may be affected by factors such as motivation,
fatigue, or intervening events such as practice, learning, or
therapy (although not as much as when the same test is
administered twice).
18. Parallel-Forms and Alternate-Forms Reliability
Estimates
• Internal consistency estimate of reliability, also called an estimate of
inter-item consistency
Deriving this type of estimate entails an evaluation of the internal
consistency of the test items.
- Through this, an estimate of the reliability of a test can
be obtained without developing an alternate form of the
test and without having to administer the test twice to
the same people.
19. Split-Half Reliability Estimates
• Three steps to compute a coefficient of split-half reliability
Step 1. Divide the test into equivalent halves.
Step 2. Calculate a Pearson r between scores on the two halves of the
test.
Step 3. Adjust the half-test reliability using the Spearman–Brown
formula (discussed shortly).
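For reference, the Spearman–Brown adjustment named in Step 3 is commonly written as follows for the split-half case. Here $r_{hh}$ is the Pearson r between the two halves and $r_{SB}$ is the estimated reliability of the full-length test; these symbols are this sketch's notation and may differ from the textbook's.

```latex
r_{SB} = \frac{2\, r_{hh}}{1 + r_{hh}}
```

For example, a half-test correlation of $r_{hh} = 0.60$ gives $r_{SB} = 1.20 / 1.60 = 0.75$.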
20. Split-Half Reliability Estimates
• Step 1. Divide the test into equivalent halves.
One acceptable way to split a test is to randomly assign items to one or the
other half of the test.
Another acceptable way to split a test is to assign odd-numbered
items to one half of the test and even-numbered items to the other
half. This method yields an estimate of split-half reliability that is also
referred to as odd-even reliability.
Yet another way to split a test is to divide the test by content so that
each half contains items equivalent with respect to content and
difficulty.
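The first two splitting methods above can be sketched as small helper functions that return the item numbers assigned to each half. The function names and the 10-item test are hypothetical. A content-based split is not shown because it depends on judgments about each item's content and difficulty.

```python
import random


def odd_even_split(n_items):
    """Assign odd-numbered items to one half and even-numbered items to the other."""
    item_numbers = range(1, n_items + 1)  # items are numbered starting at 1
    odd_half = [i for i in item_numbers if i % 2 == 1]
    even_half = [i for i in item_numbers if i % 2 == 0]
    return odd_half, even_half


def random_split(n_items, seed=None):
    """Randomly assign items to two halves of equal (or nearly equal) size."""
    rng = random.Random(seed)  # a seed makes the split reproducible
    item_numbers = list(range(1, n_items + 1))
    rng.shuffle(item_numbers)
    middle = n_items // 2
    return sorted(item_numbers[:middle]), sorted(item_numbers[middle:])


odd_half, even_half = odd_even_split(10)
print("Odd half: ", odd_half)
print("Even half:", even_half)

half_a, half_b = random_split(10, seed=1)
print("Random half A:", half_a)
print("Random half B:", half_b)
```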
21. Split-Half Reliability Estimates
• Step 2. Calculate a Pearson r between scores on the two
halves of the test.
Step 2 in the procedure entails the computation of a Pearson r,
which requires little explanation at this point. However, the
third step requires the use of the Spearman–Brown formula.
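Putting the three steps together, a minimal sketch might look like the following. It assumes an odd-even split, item-level scores stored as one list per person, and the usual split-half form of the Spearman–Brown formula, 2r / (1 + r). The data and all function names are invented for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


def spearman_brown(r_half):
    """Estimate full-length reliability from the half-test correlation."""
    if r_half <= -1:
        raise ValueError("half-test correlation must be greater than -1")
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """item_scores: one list per person, one numeric score per item."""
    # Step 1: odd-even split (index 0 holds item 1, index 1 holds item 2, ...).
    odd_totals = [sum(person[0::2]) for person in item_scores]
    even_totals = [sum(person[1::2]) for person in item_scores]
    # Step 2: Pearson r between scores on the two halves.
    r_half = pearson_r(odd_totals, even_totals)
    # Step 3: Spearman-Brown adjustment to full test length.
    return r_half, spearman_brown(r_half)


# Hypothetical right/wrong (1/0) scores: six people, six items.
item_scores = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 0],
]

r_half, r_full = split_half_reliability(item_scores)
print(f"Half-test correlation:   {r_half:.2f}")
print(f"Spearman-Brown estimate: {r_full:.2f}")
```

Whenever the half-test correlation is positive and below 1, the adjusted estimate is higher than the half-test correlation, because each half is only half as long as the full test.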
22. Summary
• The test-retest measure is appropriate when evaluating the reliability of a test that
purports to measure something that is relatively stable over time, such as a
personality trait.
• Spurred by the recognition of a crisis of replicability, science is moving to right
both past and potential wrongs.
• Having an alternate or parallel form of a test is advantageous to the test user in
several ways.
• A split-half reliability estimate is a useful measure of reliability when it is
impractical or undesirable to assess reliability with two tests or to administer a test
twice.
23. References
• Cohen, R. J., & Swerdlik, M. E. (2018). Psychological Testing and Assessment (9th ed.).
New York.