Seminar in English Research
Second Semester 2020
Outline of Chapters 8 and 9
• Main points in Chapters 8 and 9, with immediate examples
• Expansion, themes, and explanation
• Questions & related videos
VALIDITY & RELIABILITY
• Validity refers to the appropriateness, meaningfulness, correctness, and usefulness of the inferences a researcher makes.
• For example, a test of intelligence should measure intelligence and not something else (such as memory).
• Reliability refers to the consistency of scores or answers from one administration of an instrument to another.
• For example, measurements of people's height and weight are often extremely reliable.
Validity determines whether the research truly measures what it was intended to measure, or how truthful the research results are. In the qualitative paradigm, reliability and validity are conceptualized as trustworthiness, rigor, and quality. These can be achieved by eliminating bias and increasing the truthfulness of a proposition about some social phenomenon, for example through triangulation.
 Validity is the most important idea to consider when preparing or selecting an instrument.
 Validation is the process of collecting and analyzing evidence to support such inferences.
 The validity of a measurement tool is the degree to which the tool measures what it claims to measure.
The Importance of Valid Instruments
• The quality of the instruments used in research is very important, because the conclusions drawn are based on the information obtained through these instruments.
• Researchers follow certain procedures to make sure that the inferences they draw from the data collected are valid and reliable.
• Researchers should keep these two terms, validity and reliability, in mind when preparing data collection instruments.
How can validity be established?
• Quantitative studies:
– measurements, scores, instruments used, research design.
An example: a survey conducted to measure how much time a doctor takes to tend to a patient after the patient walks into the hospital.
• Qualitative studies:
– ways that researchers have devised to establish credibility: member checking, triangulation, thick description, peer reviews, external audits.
– Examples: diary accounts, open-ended questionnaires, documents, participant observation, and ethnography.
Evidence of Validity
 There are three types of evidence a researcher might collect:
 Content-related evidence of validity
 Content and format of the instrument
 An example: Is the test fully representative of what it aims to measure?
 Criterion-related evidence of validity
 The relationship between scores obtained using the instrument and scores obtained using another instrument (a criterion)
 An example: Do the results correspond to those of a different test of the same thing?
 Construct-related evidence of validity
 The psychological construct being measured by the instrument
 An example: Does the test measure the concept it is intended to measure?
Content-related Evidence
 A key element is the adequacy of the sampling of the domain the instrument is supposed to represent.
 The other aspect of content validation is the format of the instrument.
 Attempts to obtain evidence that the items measure what they are supposed to measure typify the process of gathering content-related evidence.
• Content-related evidence of validity: content and format of the instrument
• Examples:
• How appropriate is the content?
• How comprehensive is the content?
• Does the content get at the intended variable?
• How adequately does the sample of items or questions represent the content to be assessed?
• Is the format of the instrument appropriate?
• Content-related evidence, for example:
• The effects of a new listening program on the speaking ability of fifth-graders:
• Adequacy of sampling: how well the items sample the domain (here, speaking ability) that the instrument is supposed to represent.
• Format of the instrument: the clarity of printing, size of type, adequacy of work space, appropriateness of language, clarity of directions, etc.
– How can we obtain content-related evidence of validity?
• Have someone who knows enough about what is being measured act as a competent judge. In other words, the researcher should get more than one judge's opinion about the content and format of the instrument before administering it.
• The researcher evaluates the feedback from the judges and makes the necessary modifications to the instrument.
How to establish Content Validity?
1) Instructional objectives.
An example:
• At the end of the chapter, the student will be able to do the following:
1. Explain what 'stars' are
2. Discuss the types of stars and galaxies in our universe
3. Categorize different constellations by looking at the stars
4. Differentiate between our star, the sun, and all other stars
2) Table of Specifications.
(An example table of specifications appeared on the slide.)
Criterion-related Evidence
 A criterion is a second test presumed to measure the same variable.
Criterion-related evidence of validity: the relationship between the scores obtained by an instrument and the scores obtained by another instrument (a criterion).
• Examples:
How strong is this relationship?
How well do such scores estimate present or future performance of a certain type?
• When a correlation coefficient is used to describe the relationship between a set of scores obtained by the same group of individuals on a particular instrument (the predictor) and their scores on some criterion measure (the criterion), it is called a validity coefficient.
• Expectancy table: (See Table 8.1, p. 164)
– Criterion-related evidence:
• Compare performance on one instrument with performance on some other, independent criterion. For example, academic ability scores on the instrument can be compared with students' grade point averages; high scores on the instrument should correspond to high grade point averages.
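For concreteness, here is a minimal sketch (not from the chapter) of computing a validity coefficient in Python: the Pearson correlation between predictor scores and a criterion measure. The aptitude scores and GPAs below are invented for illustration.

```python
# Validity coefficient sketch: Pearson correlation between an
# instrument (predictor) and an independent criterion.
# All data below are invented for illustration.
from scipy.stats import pearsonr

aptitude_scores = [52, 67, 71, 48, 80, 63, 75, 58]       # predictor
gpas = [2.1, 2.9, 3.2, 1.9, 3.8, 2.7, 3.4, 2.3]          # criterion

r, p_value = pearsonr(aptitude_scores, gpas)
print(f"validity coefficient r = {r:.2f} (p = {p_value:.3f})")
```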
– Two forms of criterion-related validity:
1) Predictive validity: student scores on a science aptitude test administered at the beginning of the semester are compared with end-of-semester grades.
• An example: the predictive validity of a cognitive test for job performance is the correlation between test scores and, for example, supervisor performance ratings.
2) Concurrent validity: instrument data and criterion data are collected at nearly the same time, and the results are compared to obtain evidence of concurrent validity.
• An example: researchers give a group of students a new test designed to measure mathematical aptitude. They then compare the scores with test scores already held by the school, a recognized and reliable measure of mathematical ability.
Construct-related Evidence
 Considered the broadest of the three categories.
 There is no single piece of evidence that satisfies construct-related validity.
 Researchers attempt to collect a variety of types of evidence, including both content-related and criterion-related evidence.
 The more evidence researchers have from different sources, the more confident they become about the interpretation of the instrument.
The nature of the psychological construct or characteristic being measured by the instrument.
Examples:
• How well does a measure of the construct explain differences in the behavior of individuals or their performance on certain tasks?
• A women's studies program may design a cumulative assessment of learning throughout the major. If the questions are written with complicated wording and phrasing, the test can inadvertently become a test of reading comprehension rather than a test of women's studies. It is important that the measure actually assesses the intended construct rather than an extraneous factor.
In obtaining construct-related evidence of validity, there are three steps involved:
1. The variable being measured is clearly defined.
2. Hypotheses, based on a theory underlying the variable, are formed about how people who possess a lot versus a little of the variable will behave in a particular situation.
3. The hypotheses are tested both logically and empirically.
Construct-related Evidence
– Does the test measure the 'human' characteristic(s) it is supposed to measure?
– Verbal reasoning
– Mathematical reasoning
– Musical ability
– Spatial ability
– Mechanical aptitude
– Motivation
*When applied to authentic assessment, each construct is broken down into its component parts.
An example: 'motivation' can be broken down into:
– Interest
– Attention span
– Hours spent
– Assignments undertaken and submitted, etc.
All of these sub-constructs put together measure 'motivation'.
Factors that can lower Validity
• Unclear directions
• Difficult reading vocabulary and sentence structure
• Ambiguity in statements
• Inadequate time limits
• Inappropriate level of difficulty
• Poorly constructed test items
• Test items inappropriate for the outcomes being
measured
• Tests that are too short
• Improper arrangement of items (complex to easy?)
• Identifiable patterns of answers
• Nature of criterion
RELIABILITY
• Reliability refers to the consistency of scores obtained from one administration of an instrument to another, and from one set of items to another.
• If a test measuring typing ability is reliable, we would expect a student who receives a high score the first time to receive a high score the next time he takes the test. The scores may not be identical, but they should be close.
How can reliability be established?
• Quantitative studies?
– Assumption of repeatability
• Qualitative studies?
– Reframe as dependability and
confirmability
ERRORS OF MEASUREMENT
• Whenever people take the same test twice, they will seldom perform exactly the same; that is, their scores or answers will not be identical. This is inevitable, due to a variety of factors such as motivation, energy, anxiety, and a different testing situation. Such factors result in errors of measurement.
The scores obtained from an instrument can be quite reliable, but not valid. For example, a test on the U.S. Constitution may yield highly consistent scores yet be invalid as a measure of success in physical education. If the data are unreliable, they cannot lead to valid inferences. (See Figure 8.2, p. 166)
Validity and Reliability
(Figure: four targets illustrating 'neither valid nor reliable', 'reliable but not valid', 'valid & reliable', and 'fairly valid but not very reliable'.)
Think in terms of 'the purpose of tests' and the 'consistency' with which that purpose is fulfilled/met.
Reliability of Measurement
Reliability Coefficient
• A reliability coefficient can be obtained in three well-known ways:
• Test–retest
Give the same test twice to the same group, with a time interval between administrations.
• Equivalent forms (similar in content, difficulty level, arrangement, type of assessment, etc.)
Give two forms of the test to the same group in close succession.
• Internal consistency methods
For subjectively scored instruments, also check scoring agreement: compute Pearson's product-moment correlation (e.g., in SPSS) and the resulting coefficient of determination.
• The test-retest method involves administering the
same test to the same group after a certain time has
elapsed. A reliability coefficient is then calculated to
indicate the relationship between the two sets of
scores obtained. For most educational research,
stability of scores over a two- to three-month period is
usually viewed as sufficient evidence of test-retest
reliability.
• An example: test on a Monday, then again the
following Monday.
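A minimal sketch of the test-retest computation (scores invented); the same correlation serves the equivalent-forms method described next, with Form A and Form B scores in place of the two administrations.

```python
# Test-retest reliability sketch: correlate scores from two
# administrations of the same test (invented data).
from scipy.stats import pearsonr

monday_scores = [78, 85, 62, 90, 71, 88, 66, 74]
next_monday_scores = [80, 83, 65, 92, 69, 85, 70, 72]

r, _ = pearsonr(monday_scores, next_monday_scores)
print(f"test-retest reliability = {r:.2f}")
```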
Equivalent-forms Method
• When the equivalent-forms method is used, two different but equivalent (parallel or alternate) forms of an instrument are administered to the same group of individuals during the same time period. A high coefficient would indicate strong evidence of reliability – that the two forms are measuring the same thing.
• An example: one set of questions is divided into two equivalent sets ('forms'), where both sets contain questions that measure the same construct, knowledge, or skill.
INTERNAL CONSISTENCY METHODS
• The methods we have seen so far require two administrations or testing sessions. There are several internal-consistency methods of estimating reliability that require only a single administration of an instrument.
» Split-half procedure: this procedure involves scoring two halves (odd items versus even items) of a test separately for each person and then calculating a correlation coefficient for the two sets of scores. The coefficient indicates the degree to which the two halves of the test provide the same results, and hence describes the internal consistency of the test.
An example: if we are interested in the perceived practicality of electric cars and gasoline-powered cars, we could use a split-half method and ask the 'same' question two different ways.
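A minimal sketch of the split-half procedure (responses invented). The Spearman-Brown correction applied at the end, which steps the half-test correlation up to an estimate of full-test reliability, is standard practice though not mentioned on the slide.

```python
# Split-half reliability sketch: odd items vs. even items,
# then the Spearman-Brown correction (invented 0/1 responses).
import numpy as np
from scipy.stats import pearsonr

# rows = examinees, columns = items (1 = correct, 0 = incorrect)
responses = np.array([
    [1, 1, 0, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 0, 1],
    [1, 0, 1, 1, 0, 1, 1, 0],
    [1, 1, 1, 1, 1, 1, 0, 1],
    [0, 0, 0, 1, 0, 0, 1, 0],
    [1, 1, 1, 0, 1, 1, 1, 1],
])

odd_half = responses[:, 0::2].sum(axis=1)    # items 1, 3, 5, 7
even_half = responses[:, 1::2].sum(axis=1)   # items 2, 4, 6, 8

r_half, _ = pearsonr(odd_half, even_half)
r_full = 2 * r_half / (1 + r_half)           # Spearman-Brown correction
print(f"half-test r = {r_half:.2f}, full-test estimate = {r_full:.2f}")
```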
INTERNAL CONSISTENCY METHODS
Kuder-Richardson approaches: the most frequently used methods for determining internal consistency are the Kuder-Richardson approaches, particularly formulas KR20 and KR21.
An example: reliability for a binary test (i.e., one with right-or-wrong answers).
KR21 requires only (1) the number of items on the test, (2) the mean, and (3) the standard deviation, provided the items can be assumed to be of equal difficulty. (See the KR21 formula on p. 167.)
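For reference, the standard KR21 formula the slide points to is KR21 = [K/(K−1)] · [1 − M(K−M)/(K·s²)], where K is the number of items, M the test mean, and s the standard deviation. A minimal sketch with invented summary statistics:

```python
# KR21 sketch: internal consistency from summary statistics alone,
# assuming items of equal difficulty (values below are invented).
def kr21(k: int, mean: float, sd: float) -> float:
    return (k / (k - 1)) * (1 - mean * (k - mean) / (k * sd ** 2))

print(f"KR21 = {kr21(k=50, mean=40.0, sd=6.0):.2f}")  # -> 0.79
```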
Alpha coefficient – another check on the internal consistency of an instrument is to calculate an alpha coefficient, frequently called Cronbach's alpha and symbolized as α.
An example: measuring internal reliability for tests with multiple possible answers.
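A minimal sketch of Cronbach's alpha, α = [K/(K−1)] · (1 − sum of item variances / variance of total scores); the ratings below are invented.

```python
# Cronbach's alpha sketch (invented Likert-style ratings).
import numpy as np

# rows = respondents, columns = items on a 1-5 scale
ratings = np.array([
    [4, 5, 4, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [3, 3, 3, 2],
    [4, 4, 5, 4],
])

k = ratings.shape[1]
item_vars = ratings.var(axis=0, ddof=1)       # variance of each item
total_var = ratings.sum(axis=1).var(ddof=1)   # variance of total scores
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")
```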
INTERNAL CONSISTENCY METHODS
• The standard error of measurement (SEM) is a measure of how much measured test scores are spread around a 'true' score. It is an index that shows the extent to which a measurement would vary under changed circumstances (the amount of measurement error).
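A common formula (not shown on the slide) relates the SEM to the score standard deviation and a reliability coefficient: SEM = SD · √(1 − r). A minimal sketch with invented values:

```python
# Standard error of measurement sketch: SEM = sd * sqrt(1 - r).
import math

def sem(sd: float, reliability: float) -> float:
    return sd * math.sqrt(1 - reliability)

# e.g., sd = 10 score points, reliability = 0.91 (invented values)
print(f"SEM = {sem(10.0, 0.91):.1f} score points")  # -> 3.0
```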
Validity & Reliability Coefficients
• A validity coefficient expresses the relationship between the scores of the same individuals on two different instruments.
• A reliability coefficient expresses the relationship between the scores of the same individuals on the same instrument at two different times, or between two parts of the same instrument.
• Reliability coefficients range from .00 to 1.00, that is, with no negative values.
Conclusion
• Reliability and validity are concepts used to evaluate the quality of research. They indicate how well a method or technique measures something. Reliability is about the consistency of a measure, and validity is about the accuracy of a measure.
• It is important to consider reliability and validity when you are creating your research design, planning your methods, and writing up your results, especially in quantitative research.
(Cartoon:)
M.D.: "You think that the increase in your blood pressure is due to the new class you've been assigned. Is anything else different?"
Teacher: "Well, I have been drinking more lately, and I'm on a new diet. Also, my wife and I haven't been getting along of late."
INTERNAL VALIDITY
• When a study has internal validity, any relationship observed between two or more variables should be unambiguous in meaning, rather than being due to "something else" (an alternative hypothesis).
• Internal validity means defending against sources of bias arising in the research design.
• An example: suppose you ran an experiment to see if mice lost weight when they exercised on a wheel. You used good experimental practices, like random samples, and you used control variables to account for other things that might cause weight loss (change in diet, disease, age, etc.). In other words, you accounted for the confounding variables that might affect your data, so your experiment has high internal validity.
Threats to Internal Validity
• Subject characteristics
• Mortality
• Location
• Instrumentation
• Testing
• History
• Maturation
• Attitude of Subjects
• Regression
• Implementation
Subject characteristics
Participants in a study may have different characteristics, and those differences may affect the results.
Examples: age, intelligence, vocabulary, strength, attitude, fluency, maturity, ethnicity, coordination, speed, socioeconomic status.
Mortality
Loss of subjects from the study, due, for example, to:
– Illness
– Family relocation
– Requirements of other activities
– Being absent during data collection or failing to complete tests
Location
Place in which data is collected, or
an intervention is carried out, may
influence the results.
An example: Studying the behavior of
animals in a zoo may make it easier
to draw valid causal inferences within
that context, but these inferences
may not generalize to the behavior of
animals in the wild.
Instrumentation
Inconsistent use of the measurement instrument.
1. Instrument decay – the instrument changes in some way over the course of the study.
2. Data collector characteristics – characteristics such as gender, age, ethnicity, or language patterns can affect the nature of the data collectors obtain.
3. Data collector bias – the data collector may unconsciously influence the outcome of the data.
An example: two examiners in an instructional experiment administered the post-test with different instructions and procedures.
Testing
• In an experiment in which performance on a logical reasoning test is the dependent variable, a pre-test cues the subjects about the post-test.
• So pre-tests may influence the results of the post-test.
• A group may perform better on a post-test because the pre-test primed them to perform better.
• An example: participants may remember the correct answers, or may be conditioned to know that they are being tested. Repeatedly taking (the same or similar) intelligence tests usually leads to score gains.
History
Occurrence of events that could alter the outcome or results of the study.
Previous – events that occurred prior to the study.
Concurrent – events happening during the study.
An example: what if the children in one group differ from those in the other in their television habits? Perhaps the experimental-group children watch a certain program more frequently than those in the control group do. There is then an effect of the two groups differentially experiencing a relevant event – in this case, a certain program – between the pretest and posttest.
Maturation
Any changes that occur in the subjects during the course of the study that are not part of the study and that might affect the results, such as changes due to aging and experience.
An example: participants may go from being in a good mood to a bad one. Factors such as tiredness, boredom, hunger, and inattention can also occur. These factors can be driven by the research participant or by the experiment.
Attitude of Subjects
• Subjects' opinions and participation can influence the outcome.
• Observing or studying subjects can affect their responses (a.k.a. the Hawthorne effect).
• Subjects receiving the experimental treatment may perform better simply because they are "receiving" treatment.
• Subjects in the control group may perform more poorly than the treatment group for the same reason.
• An example: students may feel that they are being tested too often. This feeling could cause them to tire of having to take another test and not do their best. This change in attitude could bias the results of the study.
Regression
Groups that are chosen because of extreme performance, either high or low, will on average score closer to the mean on subsequent testing, regardless of what transpires during the experiment.
An example: a group of students who score low on a mathematics test are given additional help. Six weeks later they are given another test with similar problems, and their average score has improved. Is this due to the additional help, or to other influences?
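A minimal simulation (entirely invented data) illustrating regression to the mean: with no intervention at all, the lowest scorers on a first test score higher, on average, on a retest.

```python
# Regression-to-the-mean sketch: observed score = true ability
# plus random error. Selecting the bottom 10% on test 1 and
# retesting shows their mean rising with no help given.
import numpy as np

rng = np.random.default_rng(0)
true_ability = rng.normal(70, 8, size=1000)
test1 = true_ability + rng.normal(0, 5, size=1000)   # first test
test2 = true_ability + rng.normal(0, 5, size=1000)   # retest, fresh errors

low = test1 < np.percentile(test1, 10)               # bottom 10% on test 1
print(f"test 1 mean (low group): {test1[low].mean():.1f}")
print(f"test 2 mean (low group): {test2[low].mean():.1f}")  # noticeably higher
```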
Implementation
Personal bias in favor of one method or another. A preference for one method may account for better performance by the subjects; how well you implement your methodology affects whether it measures what your research question intended it to measure.
An example: if you were interested in the impact on exam performance of two different teaching methods – students receiving lectures and seminar classes versus students receiving lectures only – you would also want to ensure that the teachers involved in the study had similar educational backgrounds, teaching experience, and so forth.
Methods to Minimize Threats
 Standardize the conditions under which the research study is carried out; this helps minimize threats from history and instrumentation.
 Obtain as much information as possible about the participants in the research study; this minimizes threats from mortality and selection.
 Obtain as much information as possible about the procedural details of the research study – for example, where and when the study occurs – minimizing threats from location and instrumentation.
 Choose an appropriate research design, which can help control most other threats.
An Example: Is the test appropriate to the population?
• What is the composition of the test-taking population?
• To what extent can the assessment be administered without encumbrance to all members of the population?
• Is there a translated, adapted, or accommodated version of the test?
• Are there recommendations for alternative testing procedures?
• Has the planned accommodation been assessed in terms of its impact on the validity and reliability of test scores?
Conclusion
• Internal validity refers to how well a piece of research allows you to choose among alternative explanations of something. A research study with high internal validity lets you choose one explanation over another with great confidence, because it avoids (many possible) confounds.
• The more valid and reliable the research instruments are, the more likely one is to draw the appropriate conclusions from the collected data and solve the research problem in a credible fashion.
How do I go about establishing and ensuring validity & reliability in my own test papers?
What do you think…?
• Forced-choice assessment forms are high in reliability but weak in validity (true/false)
• Performance-based assessment forms are high in both validity and reliability (true/false)
• A test item is said to be unreliable when most students answer the item wrongly (true/false)
• When a test contains items that do not represent the content covered during instruction, it is known as an unreliable test (true/false)
• Test items that do not successfully measure the intended learning outcomes (objectives) are invalid items (true/false)
• Assessments that do not represent student learning well enough are definitely invalid and unreliable (true/false)
• A valid test can sometimes be unreliable (true/false)
– If a test is valid, it is reliable! (a by-product)
Related Videos
(See the YouTube links in the Bibliography below.)
Bibliography
Fraenkel, J. R., & Wallen, N. E. (2002). Validity & reliability; Internal validity. In How to design and evaluate research in education (5th ed.).
Martin, Wendy (1997). Single group threats to internal validity. Retrieved October 15, 2006, from http://www.socialresearchmethods.net/tutorial/Martin/intval1.html
https://www.youtube.com/watch?v=KuT2n1w0Ixc
https://www.youtube.com/watch?v=F6LGa8jsdjo
https://www.youtube.com/watch?v=2fK1ClycBTM
3/23/2020 63

More Related Content

What's hot

Test Reliability and Validity
Test Reliability and ValidityTest Reliability and Validity
Test Reliability and ValidityBrian Ebie
 
Test standardization and norming
Test standardization and normingTest standardization and norming
Test standardization and normingHannah Grace Gilo
 
Validity & reliability seminar
Validity & reliability seminarValidity & reliability seminar
Validity & reliability seminarmrikara185
 
Presentation validity
Presentation validityPresentation validity
Presentation validityAshMusavi
 
VALIDITY
VALIDITYVALIDITY
VALIDITYANCYBS
 
Validity in Research
Validity in ResearchValidity in Research
Validity in ResearchEcem Ekinci
 
Validity and reliability in assessment.
Validity and reliability in assessment. Validity and reliability in assessment.
Validity and reliability in assessment. Tarek Tawfik Amin
 
Reliability and validity
Reliability and validityReliability and validity
Reliability and validityKaimrc_Rss_Jd
 
Validity, its types, measurement & factors.
Validity, its types, measurement & factors.Validity, its types, measurement & factors.
Validity, its types, measurement & factors.Maheen Iftikhar
 
validity and reliability
validity and reliabilityvalidity and reliability
validity and reliabilityaffera mujahid
 
reliablity and validity in social sciences research
reliablity and validity  in social sciences researchreliablity and validity  in social sciences research
reliablity and validity in social sciences researchSourabh Sharma
 
What is Reliability and its Types?
What is Reliability and its Types? What is Reliability and its Types?
What is Reliability and its Types? Dr. Amjad Ali Arain
 

What's hot (20)

Test Reliability and Validity
Test Reliability and ValidityTest Reliability and Validity
Test Reliability and Validity
 
Test standardization and norming
Test standardization and normingTest standardization and norming
Test standardization and norming
 
Validity & reliability seminar
Validity & reliability seminarValidity & reliability seminar
Validity & reliability seminar
 
Presentation validity
Presentation validityPresentation validity
Presentation validity
 
VALIDITY
VALIDITYVALIDITY
VALIDITY
 
01 validity and its type
01 validity and its type01 validity and its type
01 validity and its type
 
Validity in Research
Validity in ResearchValidity in Research
Validity in Research
 
Validity and reliability in assessment.
Validity and reliability in assessment. Validity and reliability in assessment.
Validity and reliability in assessment.
 
Reliability
ReliabilityReliability
Reliability
 
Validity & reliability
Validity & reliabilityValidity & reliability
Validity & reliability
 
Validity & Reliability
Validity & ReliabilityValidity & Reliability
Validity & Reliability
 
Reliability and validity
Reliability and validityReliability and validity
Reliability and validity
 
Validity in Assessment
Validity in AssessmentValidity in Assessment
Validity in Assessment
 
Reliability and validity
Reliability and  validityReliability and  validity
Reliability and validity
 
Validity
ValidityValidity
Validity
 
Validity, its types, measurement & factors.
Validity, its types, measurement & factors.Validity, its types, measurement & factors.
Validity, its types, measurement & factors.
 
Standardization
StandardizationStandardization
Standardization
 
validity and reliability
validity and reliabilityvalidity and reliability
validity and reliability
 
reliablity and validity in social sciences research
reliablity and validity  in social sciences researchreliablity and validity  in social sciences research
reliablity and validity in social sciences research
 
What is Reliability and its Types?
What is Reliability and its Types? What is Reliability and its Types?
What is Reliability and its Types?
 

Similar to Validity, reliability & Internal validity in Researches

Validity of test
Validity of testValidity of test
Validity of testSarat Rout
 
The validity of Assessment.pptx
The validity of Assessment.pptxThe validity of Assessment.pptx
The validity of Assessment.pptxNurulKhusna13
 
RCH 8301, Quantitative Research Methods 1 Course L
  RCH 8301, Quantitative Research Methods 1 Course L  RCH 8301, Quantitative Research Methods 1 Course L
RCH 8301, Quantitative Research Methods 1 Course LVannaJoy20
 
Session 2 2018
Session 2 2018Session 2 2018
Session 2 2018Sue Hines
 
Research instruments
Research instrumentsResearch instruments
Research instrumentsJihan Zayed
 
JC-16-23June2021-rel-val.pptx
JC-16-23June2021-rel-val.pptxJC-16-23June2021-rel-val.pptx
JC-16-23June2021-rel-val.pptxsaurami
 
research-instruments (1).pptx
research-instruments (1).pptxresearch-instruments (1).pptx
research-instruments (1).pptxJCronus
 
Validity.pptx
Validity.pptxValidity.pptx
Validity.pptxrupasi13
 
MAAM DAGUDAG -RESEARCH REPORT 2019.pptx
MAAM DAGUDAG -RESEARCH REPORT 2019.pptxMAAM DAGUDAG -RESEARCH REPORT 2019.pptx
MAAM DAGUDAG -RESEARCH REPORT 2019.pptxRODELAZARES3
 
1 Assessing the Validity of Inferences Made from Assess.docx
1  Assessing the Validity of Inferences Made from Assess.docx1  Assessing the Validity of Inferences Made from Assess.docx
1 Assessing the Validity of Inferences Made from Assess.docxoswald1horne84988
 
Week 1-2 -INTRODUCTION TO QUANTITATIVE RESEARCH.pptx
Week 1-2 -INTRODUCTION TO QUANTITATIVE RESEARCH.pptxWeek 1-2 -INTRODUCTION TO QUANTITATIVE RESEARCH.pptx
Week 1-2 -INTRODUCTION TO QUANTITATIVE RESEARCH.pptxChristineTorrepenida1
 

Similar to Validity, reliability & Internal validity in Researches (20)

Validity and Reliability.pdf
Validity and Reliability.pdfValidity and Reliability.pdf
Validity and Reliability.pdf
 
Validity and Reliability.pdf
Validity and Reliability.pdfValidity and Reliability.pdf
Validity and Reliability.pdf
 
Validity of test
Validity of testValidity of test
Validity of test
 
The validity of Assessment.pptx
The validity of Assessment.pptxThe validity of Assessment.pptx
The validity of Assessment.pptx
 
MBA-12-02
MBA-12-02MBA-12-02
MBA-12-02
 
RCH 8301, Quantitative Research Methods 1 Course L
  RCH 8301, Quantitative Research Methods 1 Course L  RCH 8301, Quantitative Research Methods 1 Course L
RCH 8301, Quantitative Research Methods 1 Course L
 
Session 2 2018
Session 2 2018Session 2 2018
Session 2 2018
 
RM-3 SCY.pdf
RM-3 SCY.pdfRM-3 SCY.pdf
RM-3 SCY.pdf
 
Research instruments
Research instrumentsResearch instruments
Research instruments
 
Criteria in social research
Criteria in social researchCriteria in social research
Criteria in social research
 
JC-16-23June2021-rel-val.pptx
JC-16-23June2021-rel-val.pptxJC-16-23June2021-rel-val.pptx
JC-16-23June2021-rel-val.pptx
 
research-instruments (1).pptx
research-instruments (1).pptxresearch-instruments (1).pptx
research-instruments (1).pptx
 
Validity.pptx
Validity.pptxValidity.pptx
Validity.pptx
 
MAAM DAGUDAG -RESEARCH REPORT 2019.pptx
MAAM DAGUDAG -RESEARCH REPORT 2019.pptxMAAM DAGUDAG -RESEARCH REPORT 2019.pptx
MAAM DAGUDAG -RESEARCH REPORT 2019.pptx
 
Quantitative Research
Quantitative ResearchQuantitative Research
Quantitative Research
 
Qualitative Research Methods
Qualitative Research MethodsQualitative Research Methods
Qualitative Research Methods
 
1 Assessing the Validity of Inferences Made from Assess.docx
1  Assessing the Validity of Inferences Made from Assess.docx1  Assessing the Validity of Inferences Made from Assess.docx
1 Assessing the Validity of Inferences Made from Assess.docx
 
Week 1-2 -INTRODUCTION TO QUANTITATIVE RESEARCH.pptx
Week 1-2 -INTRODUCTION TO QUANTITATIVE RESEARCH.pptxWeek 1-2 -INTRODUCTION TO QUANTITATIVE RESEARCH.pptx
Week 1-2 -INTRODUCTION TO QUANTITATIVE RESEARCH.pptx
 
Measurement.pptx
 Measurement.pptx Measurement.pptx
Measurement.pptx
 
Chapter 3 Quantitative Research Designs
Chapter 3 Quantitative Research DesignsChapter 3 Quantitative Research Designs
Chapter 3 Quantitative Research Designs
 

Recently uploaded

Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTiammrhaywood
 
Capitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptxCapitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptxCapitolTechU
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceSamikshaHamane
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
internship ppt on smartinternz platform as salesforce developer
internship ppt on smartinternz platform as salesforce developerinternship ppt on smartinternz platform as salesforce developer
internship ppt on smartinternz platform as salesforce developerunnathinaik
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxOH TEIK BIN
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxiammrhaywood
 
भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,Virag Sontakke
 
Biting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdfBiting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdfadityarao40181
 
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...M56BOOKSTORE PRODUCT/SERVICE
 
Pharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfPharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfMahmoud M. Sallam
 
MARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupMARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupJonathanParaisoCruz
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfSumit Tiwari
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
Final demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxFinal demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxAvyJaneVismanos
 

Recently uploaded (20)

Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 
Capitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptxCapitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptx
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in Pharmacovigilance
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
internship ppt on smartinternz platform as salesforce developer
internship ppt on smartinternz platform as salesforce developerinternship ppt on smartinternz platform as salesforce developer
internship ppt on smartinternz platform as salesforce developer
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptx
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,
 
Biting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdfBiting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdf
 
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
 
Pharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfPharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdf
 
ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)
 
MARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupMARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized Group
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
Final demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxFinal demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptx
 

Validity, reliability & Internal validity in Researches

  • 1. Seminar in English research Second Semester 2020 3/23/2020 1
  • 2.
  • 3. Outline of Chapter 8 and 9 • Main Points in Chapter 8 and 9 with immediate Examples. • Expansion, themes and explanation • Questions & Related Videos 3/23/2020 3
  • 4. VALIDITY & RELIABILITY • Validity refers to the appropriateness, meaningfulness, correctness, and usefulness of the inferences which a researcher makes. • An example, a test of intelligence should measure intelligence and not something else (such as memory) • Reliability refers to the consistency of scores or answers from one administration of an instrument to another. • An example, measurements of people's height and weight are often extremely reliable. 3/23/2020 4
  • 5. 3/23/2020 5 Validity: determines whether the research truly measures that which it was intended to or how truthful the research results are. Reliability and validity are conceptualized as trustworthiness, rigor and quality in qualitative paradigm. That can be achieved by eliminating bias and increasing the researcher’s trustfulness of a proposition about some social phenomena using triangulation.
  • 7.  It is the most important idea to consider when preparing or selecting an instrument.  Validation is the process of collecting and analyzing evidences to support such inferences.  The validity of a measurement tool is the degree to which the tool measures what it claims to measure. 3/23/2020 7
  • 8. The Importance of Valid Instruments • The quality of instruments used in research is very important because conclusions drawn are based on the information obtained by these instruments. • Researchers follow certain procedures to make sure that the inferences they draw based on the data collected are valid and reliable. • Researchers should keep in mind these two terms, validity and reliability, in preparing data collection instruments.3/23/2020 8
  • 9. How can validity be established? • Quantitative studies: – measurements, scores, instruments used, research design. An example: The survey conducted to understand the amount of time a doctor takes to tend to a patient when the patient walks into the hospital • Qualitative studies: – ways that researchers have devised to establish credibility: member checking, triangulation, thick description, peer reviews, external audits. – Examples: diary accounts, open-ended questionnaires, documents, participant observation and ethnography. 3/23/2020 9
  • 10. Evidence of Validitity  There are 3 types of evidence a researcher might collect:  Content-related evidence of validity  Content and format of the instrument  An example: Is the test fully representative of what it aims to measure?  Criterion-related evidence of validity  Relationship between scores obtained using the instrument and scores obtained  An example: Do the results correspond to a different test of the same thing?  Construct-related evidence of validity  Psychological construct being measured by the instrument  An example: Does the test measure the concept that it’s intended to measure? 10
  • 11.  A key element is the adequacy of the sampling of the domain that is supposed to represent.  The other aspect of content validation is the format of the instrument.  Attempts to obtain evidence that the items measure what they are supposed to measure typify the process of content-related evidence. Content-related Evidence 3/23/2020 11
  • 12. • Content-related evidence of validity: Content and format of the instrument • Examples: • How appropriate is the content? • How comprehensive is the content? • Does the content get at the intended variable? • How adequately does the sample of items or questions represent the content to be assessed? • Is the format of the instrument appropriate? 3/23/2020 12
  • 13. • Content-related evidence, for example: • The effects of a new listening program on speaking ability of fifth-graders: • Adequacy of sampling: The nature of the psychological construct or characteristic being measured by the instrument. • Format of the instrument: The clarity of printing, size of type, adequacy of work space, appropriateness of language, clarity of directions etc. 3/23/2020 13
  • 14. – How can we obtain content-related evidence of validity? • Have someone who knows enough about what is being measured to be a competent judge. In other words, the researcher should get more than one judge’s opinions about the content and format of the instrument to be applied before administering it. • The researcher evaluates the feedback from the judges and makes necessary modifications in the instrument. 3/23/2020 14
  • 15. How to establish Content Validity? 1) Instructional objectives. An example: • At the end of the chapter, the student will be able to do the following: 1. Explain what ‘stars’ are 2. Discuss the type of stars and galaxies in our universe 3. Categorize different constellations by looking at the stars 4. Differentiate between our stars, the sun, and all other stars 3/23/2020 15
  • 16. 2) Table of Specification. (An example): 3/23/2020 16
  • 17.  A criterion is a second test presumed to measure the same variable. Criterion-related evidence of validity: The relationship between the scores obtained by an instrument and the scores obtained by another instrument (a criterion). • Examples: How strong is this relationship? How well do such scores estimate the present or future performance of a certain type? Criterion-related Evidence 3/23/2020 17
  • 18. • When a correlation coefficient is used to describe the relationship between a set of scores obtained by the same group of individuals on a particular instrument (the predictor) and their scores on some criterion measure (the criterion), it is called a validity coefficient. • Expectancy table: (See Table 8.1, p. 164) – Criterion-related evidence: • Compare performance on one instrument with performance on some other, independent criterion. Academic ability scores of the students on the instrument compared with students’ grade point averages. High scores on the instrument will correspond to high grade point averages. 3/23/2020 18
  • 19. – Two forms of criterion-related validity: 1) Predictive validity: Student scores on a science aptitude test administered at the beginning of the semester are compared with the end-of-the-semester grades. • An example: the validity of a cognitive test for job performance is the correlation between test scores and, for example, supervisor performance ratings. 2) Concurrent validity: Instrument data and criterion data are collected at nearly the same times and the results are compared to obtain evidence of concurrent validity. • An example: Researchers give a group of students a new test, designed to measure mathematical aptitude. They then compare this with the test scores already held by the school, a recognized and reliable judge of mathematical ability. 3/23/2020 19
  • 20.  Considered the broadest of the three categories.  There is no single piece of evidence that satisfies construct-related validity.  Researchers attempt to collect a variety of types of evidence, including both content-related and criterion-related evidence.  The more evidence researchers have from different sources, the more confident they become about the interpretation of the instrument. Construct-related Evidence 3/23/2020 20
  • 21. The nature of the psychological construct or characteristic being measured by the instrument. Examples: • How well does a measure of the construct explain the differences in the behavior of individuals or their performance on certain tasks? • A women’s studies program may design a cumulative assessment of learning throughout the major. The questions are written with complicated wording and phrasing. This can cause the test inadvertently becoming a test of reading comprehension, rather than a test of women’s studies. It is important that the measure is actually assessing the intended construct, rather than an extraneous factor. 3/23/2020 21
  • 22. In obtaining construct-related evidence of validity, there are three steps involved. 1. the variable being measured is clearly defined. 2. hypotheses, based on a theory underlying the variable, are formed about how people who possess a lot versus a little of the variable will behave in a particular situation. 3. the hypotheses are tested both logically and empirically. 3/23/2020 22
  • 23. Construct-related Evidence – Does the test measure the ‘human’ characteristic(s)? it is supposed to measure: – Verbal reasoning – Mathematical reasoning – Musical ability – Spatial ability – Mechanical aptitude – Motivation *Applicable to authentic assessment, each construct is broken down into its component parts An example: ‘motivation’ can be broken down to: – Interest – Attention span – Hours spent – Assignments undertaken and submitted, etc. All of these sub-constructs put together – measure ‘motivation’ 3/23/2020 23
  • 24. Factors that can lower Validity • Unclear directions • Difficult reading vocabulary and sentence structure • Ambiguity in statements • Inadequate time limits • Inappropriate level of difficulty • Poorly constructed test items • Test items inappropriate for the outcomes being measured • Tests that are too short • Improper arrangement of items (complex to easy?) • Identifiable patterns of answers • Nature of criterion 3/23/2020 24
  • 26. RELIABILITY • Reliability refers to the consistency of scores obtained from one administration of an instrument to another and from one set of items to another. • If the test is reliable, we would expect a student who receives a high score on the test for measuring typing ability at first instance, to receive a high score the next time he takes the test. The scores may not be identical, but they should be close. 3/23/2020 26
  • 27. How can reliability be established? • Quantitative studies? – Assumption of repeatability • Qualitative studies? – Reframe as dependability and confirmability 3/23/2020 27
  • 28. ERRORS OF MEASUREMENT • Whenever people take the same test twice, they will seldom perform exactly the same, that is, their scores or answers will not be identical. It is inevitable due to a variety of factors such as motivation, energy, anxiety, a different testing situation etc. Such factors result in errors of measurement. 3/23/2020 28
  • 29. The scores obtained from an instrument can be quite reliable, but not valid. The test on the Constitution of the US versus success in the physical education. If the data are unreliable, they cannot lead to valid inferences. (See Figure 8.2, p.166) 3/23/2020 29
  • 30. Validity and Reliability Neither Valid nor Reliable Reliable but not Valid Valid & Reliable Fairly Valid but not very Reliable Think in terms of ‘the purpose of tests’ and the ‘consistency’ with which the purpose is fulfilled/met
  • 32. Reliability Coefficient • Reliability Coefficient can be measured by three best known ways: • Test – retest Give the same test twice to the same group with any time interval between tests • Equivalent forms (similar in content, difficulty level, arrangement, type of assessment, etc.) Give two forms of the test to the same group in close succession. • The internal Consistency methods (subjective scoring) Calculate percent of exact agreement by using Pearson's product moment and find out the coefficient of determination (SPSS) 3/23/2020 32
  • 33. • The test-retest method involves administering the same test to the same group after a certain time has elapsed. A reliability coefficient is then calculated to indicate the relationship between the two sets of scores obtained. For most educational research, stability of scores over a two- to three-month period is usually viewed as sufficient evidence of test-retest reliability. • An example: test on a Monday, then again the following Monday. 3/23/2020 33
  • 34. Validity & Reliability Coefficients Equivalent-forms method • When the equivalent-forms method is used, two different but equivalent (parallel or alternate) forms of an instrument are administered to the same group of individuals during the same time period. A high coefficient would indicate strong evidence of reliability – that two forms are measuring the same thing. • An example: uses one set of questions divided into two equivalent sets (“forms”), where both sets contain questions that measure the same construct, knowledge or skill 3/23/2020 34
  • 35. INTERNAL CONSISTENCY METHODS • The methods we have seen so far require two administrations or testing sessions. There are several internal-consistency methods of estimating reliability that require only a single administration of an instrument. » Split-half procedure: This procedure involves scoring two halves (odd items versus even items) of a test separately for each person and then calculating a correlation coefficient for the two sets of scores. The coefficient indicates the degree to which two halves of the test provide the same results and hence describes the internal consistency of the test. An example: If we are interested in the perceived practicality of electric cars and gasoline-powdered cars, we could use a split- half method and ask the "same" question two different ways. 3/23/2020 35
  • 36. INTERNAL CONSISTENCY METHODS Kuder-Richardson Approaches: The most frequently used method for determining internal consistency is the Kuder-Richardson approach, particularly formulas KR20 and KR21. An example: reliability for a binary test (i.e. one with right or wrong answers). KR21 requires (1) the number of items on the test, (2) the mean, and (3) the standard deviation if the items on the test are of equal difficulty. (See KR21 formula in P.167) Alpha Coefficient – Another check on the internal consistency of an instrument is to calculate an alpha coefficient frequently called Cronbach alpha symbolized as (). An example: measures internal reliability for tests with multiple possible answers. 3/23/2020 36
  • 37. INTERNAL CONSISTENCY METHODS • The standard error of measurement (SEM) is a measure of how much measured test scores are spread around a “true” score. It’s an index shows the extent to which a measurement would vary under changed circumstances (the amount of measurement error) 3/23/2020 37
  • 38. Validity & Reliability Coefficients • A validity coefficient expresses the relationship between scores of the same individuals on two different instruments. • A reliability coefficient expresses the relationship between the scores of the same individuals on the same instrument at two different times or between two parts of the same instrument. • Reliability coefficients must range from .00 to 1.00, that is, with no negative values.3/23/2020 38
  • 39. Conclusion • Reliability and validity are concepts used to evaluate the quality of research. They indicate how well a method, technique measures something. Reliability is about the consistency of a measure, and validity is about the accuracy of a measure. • It’s important to consider reliability and validity when you are creating your research design, planning your methods, and writing up your results, especially in quantitative research. 3/23/2020 39
  • 41. 3/23/2020 41 “Well, I have been drinking more lately, and I’m on a new diet. Also my wife and I haven’t been getting along of late.” “You think that the increase in your blood pressure is due to the new class you’ve been assigned. Is anything else different?” M.D. Teacher
  • 42. INTERNAL VALIDITY • When a study has internal validity, it means that any relationship observed between two or more variables should be unambiguous as to what it means rather than being due to “something else” – (alternative hypothesis). • Defending against sources of bias arising in research design. • An example: let’s suppose you ran an experiment to see if mice lost weight when they exercised on a wheel. You used good experimental practices, like random samples, and you used control variables to account for other things that might cause weight loss (change in diet, disease, age etc.). In other words, you accounted for the confounding variables that might affect your data and your experiment has high validity. 3/23/2020 42
  • 43. Threats to Internal Validity • Subject characteristics • Mortality • Location • Instrumentation • Testing • History • Maturation • Attitude of Subjects • Regression • Implementation 3/23/2020 43
• 44. Subject characteristics Participants in a study may have different characteristics, and those differences may affect the results. Examples: age, intelligence, vocabulary, strength, attitude, fluency, maturity, ethnicity, coordination, speed, socioeconomic status. 3/23/2020 44
• 45. Mortality Loss of subjects from the study. Examples: illness, family relocation, requirements of other activities, being absent during data collection, or failing to complete tests. 3/23/2020 45
  • 46. Location Place in which data is collected, or an intervention is carried out, may influence the results. An example: Studying the behavior of animals in a zoo may make it easier to draw valid causal inferences within that context, but these inferences may not generalize to the behavior of animals in the wild. 3/23/2020 46
• 47. Instrumentation Inconsistent use of the measurement instrument. 1. Instrument Decay – the instrument changes in some way. 2. Data Collector Characteristics – characteristics of the data collectors, such as gender, age, ethnicity, and language patterns, affect the nature of the data they obtain. 3. Data Collector Bias – data collectors unconsciously influence the outcome of the data. An example: Two examiners for an instructional experiment administered the post-test with different instructions and procedures. 3/23/2020 47
• 48. Testing • In an experiment in which performance on a logical reasoning test is the dependent variable, a pre-test cues the subjects about the post-test. • So pre-tests may influence the result of the post-test. • A group may perform better on a post-test because the pre-test primed them to perform better. • An example: Participants may remember the correct answers or may be conditioned to know that they are being tested. Repeatedly taking (the same or similar) intelligence tests usually leads to score gains. 3/23/2020 48
• 49. History Occurrence of events that could alter the outcome or the results of the study. Previous – events that occurred prior to the study. Concurrent – events happening during the study. An example: What if the children in one group differ from those in the other in their television habits? Perhaps the experimental group children watch a certain program more frequently than the control group children do. In that case the two groups differentially experience a relevant event (the program) between the pretest and the posttest, and that difference, rather than the treatment, may explain the result. 3/23/2020 49
• 50. Maturation Any changes that occur in the subjects during the course of the study that are not part of the study and that might affect the results, such as changes due to aging and experience. An example: Participants' mood can shift from good to bad over the course of a study. Factors such as tiredness, boredom, hunger and inattention can also arise. These factors can be driven by the research participant or by the experiment itself. 3/23/2020 50
• 51. Attitude of Subjects • Subjects' opinions and participation can influence the outcome. • Observing or studying subjects can affect their responses, a.k.a. the Hawthorne effect. • Subjects receiving the experimental treatment may perform better simply because they are "receiving" treatment. • Subjects in the control group may perform more poorly than the treatment group for the same reason. • An example: Students may feel that they are being tested too often. This feeling could cause them to tire of taking yet another test and not do their best. This change in attitude could bias the results of the study. 3/23/2020 51
• 52. Regression Groups that are chosen because of extreme performance, either high or low, will on average score closer to the mean on subsequent testing, regardless of what transpires during the experiment. An example: A group of students who score low on a mathematics test are given additional help. Six weeks later they are given another test with similar problems, and their average score has improved. Is it due to the additional help or to other influences, such as regression toward the mean? The simulation below illustrates the effect. 3/23/2020 52
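A small simulation of regression toward the mean, with hypothetical numbers: students selected for low scores tend to score closer to the group mean on a retest even with no intervention, because part of any observed score is random error:

import numpy as np

rng = np.random.default_rng(1)
true_ability = rng.normal(50, 10, size=1000)        # stable "true" scores
test1 = true_ability + rng.normal(0, 5, size=1000)  # observed = true + error
test2 = true_ability + rng.normal(0, 5, size=1000)  # fresh error on the retest

low_scorers = test1 < np.percentile(test1, 20)      # select the bottom 20%
print(test1[low_scorers].mean())  # well below 50
print(test2[low_scorers].mean())  # noticeably closer to 50, with no help given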
• 53. Implementation Personal bias in favor of one method or another. Preference for the method may account for better performance by the subjects; that is, how well you implement your methodology determines whether it measures what your research question intended. An example: If you were interested in the impact of two different teaching methods (students receiving lectures plus seminar classes versus students receiving lectures only) on the exam performance of students, you would also want to ensure that the teachers involved in the study had a similar educational background, teaching experience, and so forth. 3/23/2020 53
• 54. Methods to Minimize Threats  Standardizing the conditions under which the research study is carried out helps minimize threats from history and instrumentation.  Obtaining as much information as possible about the participants in the research study minimizes threats from mortality and subject characteristics.  Obtaining as much information as possible about the procedural details of the research study, for example where and when the study occurs, minimizes threats from history and instrumentation.  Choosing an appropriate research design can help control most other threats. 3/23/2020 54
• 55. An Example: Is the test appropriate to the population? • What is the composition of the test-taking population? • To what extent can the assessment be administered without encumbrance to all members of the population? • Is there a translated version, adapted version or accommodated version of the test? • Are there recommendations for alternative testing procedures? • Has the planned accommodation been assessed in terms of its impact on the validity and reliability of test scores? 3/23/2020 55
• 56. Conclusion • Internal validity refers to how well a piece of research allows you to choose among alternative explanations of something. A research study with high internal validity lets you choose one explanation over another with a lot of confidence, because it avoids many possible confounds. • The more valid and reliable research instruments are, the more likely one is to draw the appropriate conclusions from the collected data and solve the research problem in a credible fashion. 3/23/2020 56
• 57. How do I go about establishing and ensuring validity & reliability in my own test papers? 3/23/2020 57
• 58. What do you think…? • Forced-choice assessment forms are high in reliability, but weak in validity (true/false) • Performance-based assessment forms are high in both validity and reliability (true/false) • A test item is said to be unreliable when most students answered the item wrongly (true/false) • When a test contains items that do not represent the content covered during instruction, it is known as an unreliable test (true/false) • Test items that do not successfully measure the intended learning outcomes (objectives) are invalid items (true/false) • Assessments that do not represent student learning well enough are definitely invalid and unreliable (true/false) • A valid test can sometimes be unreliable (true/false) – If a test is valid, it is reliable! (by-product) 3/23/2020 58
• 62. Bibliography Fraenkel, J. R., & Wallen, N. E. (2002). Validity & reliability; internal validity. In How to design and evaluate research in education (5th ed.). Martin, Wendy (1997). Single group threats to internal validity. Retrieved October 15, 2006 from http://www.socialresearchmethods.net/tutorial/Martin/intval1.html https://www.youtube.com/watch?v=KuT2n1w0Ixc https://www.youtube.com/watch?v=F6LGa8jsdjo https://www.youtube.com/watch?v=2fK1ClycBTM 3/23/2020 62