Chapter 8 compilation
  • Planned ignorance: the data collector is kept unaware of the hypothesis and of any other information regarding the research.

    1. VALIDITY AND RELIABILITY
    2. Validity refers to the appropriateness, meaningfulness, correctness, and usefulness of the inferences a researcher makes. Reliability refers to the consistency of scores or answers from one administration of an instrument to another, and from one set of items to another.
    3. Validation is the process of collecting and analyzing evidence to support inferences. The term validity refers to the degree to which evidence supports any inferences a researcher makes based on the data he/she collects using a particular instrument. Validation applies not to the instrument itself but to the inferences a researcher makes from it.
    4. An appropriate inference is one that is relevant (related to the purposes of the study). A meaningful inference says something about the meaning of the information (such as test scores) obtained through the use of an instrument. Validity depends on the amount and type of evidence that supports the interpretations researchers wish to make of the data they have collected.
    5. Types of evidence of validity:
        1) Content-related evidence of validity
        2) Criterion-related evidence of validity
        3) Construct-related evidence of validity
    6. Content-related evidence refers to the content and format of the instrument. The content and format must be consistent with the definition of the variable and the sample of subjects to be measured. One key element in this kind of evidence is the adequacy of the sampling.
    7. Content validation is a matter of determining whether the content of the instrument is an adequate sample of the domain of content it is supposed to represent. The other aspect of content validation has to do with the format of the instrument, such as clarity of printing and appropriateness of language. Valid results cannot be obtained if adequate questions are presented in an inappropriate format (such as giving a test written in English to children whose English is minimal).
    8. A common way to do this is to have someone look at the content and format of the instrument and judge whether or not it is appropriate. However, the qualifications of the judges are always an important consideration, and the judges must keep in mind the characteristics of the intended sample.
    9. Criterion: a second test or other assessment procedure presumed to measure the same variable. Researchers usually compare performance on one instrument with performance on the criterion.
    10. Two forms of criterion-related evidence:
        • Predictive: a time interval is allowed to elapse between administration of the instrument and obtaining the criterion score.
        • Concurrent: data from the instrument and from the criterion are collected at nearly the same time.
    11. The correlation coefficient is a key index for both forms of criterion-related evidence. It is symbolized by the letter r and indicates the degree of relationship between the scores individuals obtain on the two instruments. The relationship can be positive or negative, and r ranges from -1.00 to +1.00. If r is .00, there is no relationship.
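As a rough illustration of the correlation coefficient described on this slide, the sketch below computes Pearson's r for two lists of scores. The function name `pearson_r` and the sample scores are hypothetical and do not come from the slides.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from each mean.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when one set of scores has no variance")
    return cov / sqrt(ss_x * ss_y)

# Hypothetical scores of five people on the instrument and on the criterion.
instrument = [10, 12, 15, 18, 20]
criterion = [20, 24, 30, 36, 40]  # exactly double the instrument scores
print(round(pearson_r(instrument, criterion), 2))
```

Because the criterion scores here are a perfect linear function of the instrument scores, r comes out at the positive end of its range; reversing one list would give the negative end.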
    12. Construct-related evidence:
        • More typically associated with research studies than with testing.
        • Multiple sources are used to collect evidence.
        • A combination of observations, surveys, focus groups, and other measures is used to identify how much of the trait being measured is possessed by the person observed.
    13. Obtaining construct-related evidence:
        • The variable being measured is clearly defined.
        • Hypotheses are formed about how people who possess a lot versus a little of the variable will behave in a particular situation.
        • The hypotheses are tested both logically and empirically.
    14. Reliability refers to the consistency of the scores obtained. If scores are completely inconsistent for a person, they provide no useful information. The distinction between reliability and validity is shown in Figure 8.2. If the data are unreliable, they cannot be valid; if the data are valid, they must also be reliable.
    15. Many factors lead to errors of measurement, such as:
        1) Differences in motivation
        2) Energy
        3) Anxiety
        4) Different testing situations
        A reliability coefficient expresses the relationship between scores of the same individuals on the same instrument at two different times, or on two parts of the same instrument.
    16. Methods of obtaining a reliability coefficient:
        1) Test-retest method
        2) Equivalent-forms method
        3) Internal-consistency methods
    17. Test-retest method:
        • Test-retest reliability is a measure of reliability obtained by administering the same test twice over a period of time to a group of individuals. The scores from Time 1 and Time 2 can then be correlated in order to evaluate the test for stability over time.
        • A reliability coefficient is then calculated to indicate the relationship between the two sets of scores obtained.
        • Stability of scores over a two- to three-month period is usually viewed as sufficient evidence of test-retest reliability for most educational research.
    18. In reporting test-retest reliability coefficients, the time interval between the two testings should always be reported. As an example, a test designed to assess student learning in psychology could be given to a group of students twice, with the second administration perhaps coming a week after the first. The obtained correlation coefficient would indicate the stability of the scores.
    19. The shorter the time gap, the higher the correlation; the longer the time gap, the lower the correlation. This is because the two observations are related over time: the closer together in time they are, the more similar the factors that contribute to error. Since this correlation is the test-retest estimate of reliability, you can obtain considerably different estimates depending on the interval.
    20. Two different but equivalent forms of an instrument are administered to the same group of people. These are known as 'alternate' or 'parallel' forms: they contain the same content but are constructed differently. A high coefficient indicates strong evidence of reliability, and a low coefficient indicates weak evidence. This method can be combined with test-retest, in which case the coefficient also covers consistency over time.
    21. In the split-half procedure, a correlation coefficient is calculated from the scores on two halves of the test (usually the odd vs. the even items). The internal consistency of the test is described by the relationship between the two halves. The reliability of the whole test is calculated using the Spearman-Brown prophecy formula (pg. 156). The reliability of the test may be increased by adding more items to it, provided they are similar to the original items.
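The Spearman-Brown step on this slide can be sketched as follows. The function uses the standard general form of the formula, r_new = n·r / (1 + (n − 1)·r), which reduces to 2r / (1 + r) for a full test built from two halves. The name `spearman_brown`, its `length_factor` parameter, and the sample correlation are hypothetical, not taken from the slides or from pg. 156.

```python
def spearman_brown(r, length_factor=2.0):
    """Projected reliability when test length is multiplied by length_factor.

    r is the correlation for the shorter test (for split-half, the
    correlation between the two halves); length_factor=2 gives the
    estimate for the full-length test.
    """
    if not -1.0 < r <= 1.0:
        raise ValueError("r must be greater than -1 and at most 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    denominator = 1.0 + (length_factor - 1.0) * r
    if denominator <= 0:
        raise ValueError("formula is undefined for this r and length_factor")
    return length_factor * r / denominator

r_halves = 0.60  # hypothetical odd-even correlation
print(round(spearman_brown(r_halves), 2))       # full test: 0.75
print(round(spearman_brown(r_halves, 3.0), 2))  # three times the half length: 0.82
```

The second call illustrates the slide's last point: under the formula's assumptions, adding comparable items raises the projected reliability.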
    22. The most frequently applied method for determining internal consistency. It needs three pieces of information:
        • Number of items on the test (K)
        • The mean (M)
        • Standard deviation of the test (SD)
        Example: pg. 157
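The three inputs listed on this slide (K, M, SD) match the Kuder-Richardson formula 21 (KR21). The slide does not name the formula, so that identification is an assumption. A minimal sketch of KR21, with hypothetical values rather than the pg. 157 example:

```python
def kr21(k, mean, sd):
    """Kuder-Richardson formula 21 reliability estimate.

    k: number of items, mean: mean total score, sd: standard deviation
    of total scores. Assumes items are scored right/wrong (1 or 0).
    """
    if k < 2:
        raise ValueError("need at least two items")
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (k / (k - 1)) * (1 - (mean * (k - mean)) / (k * sd ** 2))

# Hypothetical test: 50 items, mean score 40, standard deviation 4.
print(round(kr21(50, 40, 4), 2))  # 0.51
```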
    23. The alpha coefficient, also known as Cronbach's alpha, is a general form of the KR20 formula. It is used to calculate the reliability of items that are not scored right vs. wrong (more than one answer is possible).
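A minimal sketch of the alpha coefficient, using the standard formula alpha = k/(k − 1) · (1 − sum of item variances / variance of total scores). The function name `cronbach_alpha` and the rating data are hypothetical, and sample variances are used throughout as a stated choice.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a table with one row per respondent
    and one column per item."""
    if len(scores) < 2:
        raise ValueError("need at least two respondents")
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("every row needs the same number (>= 2) of items")
    # Variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical ratings (1-5) from four respondents on three items.
ratings = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
]
print(round(cronbach_alpha(ratings), 2))
```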
    24. Standard error of measurement:
        • An index that shows the extent to which a measurement would vary under changed circumstances.
        • Hence, there are many possible standard errors for a given score. e.g., for IQ tests, the standard error of measurement over a one-year period with different content is 5 points; over a 10-year period it is 8 points.
        • The standard error of measurement is doubled when computing the range within which the second score is expected to fall.
        • This is done so that we can be 95% sure that our estimates are correct.
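The doubling rule on this slide amounts to simple arithmetic: the expected range is the first score plus or minus two standard errors of measurement. The sketch below uses the slide's 5-point and 8-point figures; the function name `expected_range` and the first score of 110 are hypothetical.

```python
def expected_range(score, sem, multiplier=2):
    """Range within which a second score is expected to fall,
    using score +/- multiplier * standard error of measurement."""
    if sem < 0:
        raise ValueError("standard error of measurement cannot be negative")
    margin = multiplier * sem
    return (score - margin, score + margin)

first_score = 110  # hypothetical IQ score
print(expected_range(first_score, 5))  # one-year interval: (100, 120)
print(expected_range(first_score, 8))  # ten-year interval: (94, 126)
```

The wider band for the ten-year interval reflects the larger standard error reported for that interval.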
    25. Scoring agreement:
        • Instruments that use direct observation are highly vulnerable to observer differences.
        • Researchers are obliged to investigate and report the degree of scoring agreement.
        • Agreement is enhanced by training the observers and increasing the number of observation periods.
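One simple way to report the degree of scoring agreement is the proportion of observations on which two observers give the same code. This is only one possible index, and the slides do not say which one is intended. The function name `percent_agreement` and the observation codes are hypothetical.

```python
def percent_agreement(ratings_a, ratings_b):
    """Proportion of observations coded identically by two observers."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    matches = sum(1 for a, b in zip(ratings_a, ratings_b) if a == b)
    return matches / len(ratings_a)

# Hypothetical codes from two observers over four observation periods.
observer_a = ["on-task", "off-task", "on-task", "on-task"]
observer_b = ["on-task", "on-task", "on-task", "on-task"]
print(percent_agreement(observer_a, observer_b))  # 0.75
```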
    26. All researchers should ensure that any inferences they draw from data obtained through the use of an instrument are appropriate, credible, and backed up by evidence.
    27. Internal validity refers to the degree to which observed differences on the dependent variable are directly related to the independent variable, not to some other (uncontrolled) variable. When a study has internal validity, any relationship observed between two or more variables is unambiguous as to what it means, rather than being due to "something else".
    28. The "something else" may be any one (or more) of a number of factors, such as the age or the ability of the subjects, the conditions under which the study is conducted, or the types of materials used. If these factors are not in some way controlled or accounted for, the researcher can never be sure that they are not the reason for any observed results. In qualitative research, a study is said to have good internal validity if the alternative explanations (the "something else") have been systematically ruled out.
    29. Regardless of whether the study is qualitative or quantitative, if these "rival hypotheses" are not controlled or accounted for in some way, the researcher can never be sure that they are not the reason for any observed results.
    30. Threats to internal validity:
        1) Subject characteristics
        2) Loss of subjects (mortality)
        3) Location
        4) Instrumentation
        5) Data collector characteristics
    31. Threats to internal validity (continued):
        6) Testing
        7) History
        8) Maturation
        9) Attitude of subjects
        10) Regression
        11) Implementation
    32. Selection bias happens when the selection of people for a study results in the individuals (or groups) differing from one another in unintended ways that are related to the variables to be studied. Some subject characteristics that might affect the results of a study include:
        1) Age
        2) Strength
        3) Maturity
        4) Gender, ethnicity
    33. Subject characteristics (continued):
        5) Coordination
        6) Speed
        7) Intelligence
        8) Vocabulary
        9) Attitude
        10) Reading ability
        11) Fluency
        12) Manual dexterity
        13) Socioeconomic status
        14) Religious beliefs
        15) Political beliefs
    34. Mortality threat refers to the possibility that results are due to the fact that subjects who are, for whatever reason, "lost" to a study may differ from those who remain, so that their absence has an important effect on the results of the study. Subjects may be lost for reasons such as illness, family relocation, or the requirements of other activities, or they may simply drop out of the study.
    35. Loss of subjects not only limits generalizability but also introduces bias if the subjects who are lost would have responded differently from those from whom data were obtained. One attempt to address the problem of mortality is to provide evidence that the subjects lost were similar to those remaining on pertinent characteristics such as age, gender, ethnicity, pretest scores, or other variables that might be related to the study outcomes. The best solution to this threat is to prevent or minimize the loss of subjects.
    36. Experimental mortality is also known as the loss of subjects. Example: a Web-based instruction project entitled Eruditio started with 161 subjects, and only 95 of them completed the entire module. Those who stayed in the project all the way to the end may have been more motivated to learn and thus achieved higher performance.
    37. A location threat arises when the particular locations in which an intervention is carried out create alternative explanations for results. The best method of controlling this problem is to hold location constant, that is, to keep it the same for all participants.
    38. The particular location in which data are collected, or in which an intervention is carried out, may create alternative explanations for results. This is called a location threat. Example: classrooms in which students are taught may have more or fewer resources, workstations, or lighting, or teachers who may skew the results inadvertently. The location in which tests are administered may affect responses. Parent assessments of their children may differ when done at home rather than at school, or when done individually rather than in groups.
    39. Student performance on tests may be lower if tests are given in noisy or poorly lit rooms. Observations of student interaction may be affected by the physical arrangement of certain classrooms. The best method of control for a location threat is to hold location constant, that is, to keep it the same for all participants.
    40. Instrumentation is the process by which instruments and procedures are used to collect data in a study. Instrumentation can create problems if the nature of the instrument (including the scoring procedure) is changed in some way; this is known as instrument decay. It is often the case when the instrument permits different interpretations of results (as in essay tests) or is especially long or difficult to score, thereby resulting in fatigue of the scorer.
    41. Instruments are devices used by researchers to collect information. Examples are: questionnaires, surveys, tests, observation, participation, studies… Instrument decay can be a problem if the nature of the instrument is changed over time. This may be due to fatigue or repetition on the part of the person administering the test, taking the test, or correcting the test.
    42. Fatigue often happens when a researcher scores a number of tests one after the other; he/she becomes tired and scores the tests differently (for example, more rigorously at first and more generously later). Data collector characteristics are an inevitable part of most instrumentation and can affect results. The individual who collects the data may affect the results unintentionally.
    43. Example: people may be more willing to be interviewed by females than by males. Other characteristics could be language patterns, ethnicity, age, size… Also, individuals may present information differently, researchers may collect data differently, or counselors may use different tactics when presenting orally. These threats, known as the implementer effect, need to be controlled for as much as possible.
    44. The characteristics of the data gatherers may affect the data.
        • e.g., a female data collector may elicit more of a confession than a male one.
        Primary ways to control the threat:
        • Use the same data collector(s) throughout.
        • Analyze the data separately for each collector.
        • Ensure that each collector is used equally with each group.
    45. The data collectors or the scorers may unconsciously distort the data, for example:
        • Allowing more time in the exam
        • Interviewers asking leading questions
        Techniques to handle data collector bias:
        • Standardize all procedures, which requires some training.
        • Ensure that data collectors lack the information they would need to distort the data (also known as planned ignorance).
    46. Testing: the use of any instrumentation. Testing threat: the subjects have already "practiced" for the post-test using the pre-test given to them beforehand. A pre-test is sometimes regarded as practice, making the subjects alert to or aware of the questions.
    47. One or more unanticipated and unplanned events occur during the course of the study that can affect the result/outcome, for example:
        • Construction noise
        • Death of an eminent person
        The researcher must be alert to any events or occurrences during the study.
    48. During an intervention, change may happen through the passing of time rather than through the intervention itself. This is a serious threat to pre-post studies and to studies that span a number of years. The best way to overcome it is to have a good comparison group in the study.
    49. Attitude of subjects:
        • Hawthorne effect: a positive effect resulting from the increased attention and recognition given to subjects. e.g., productivity increased when changes were made in physical working conditions (such as an increase in the number of breaks).
        • Opposite effect: a negative effect in which control-group subjects become demoralized or resentful and hence perform more poorly than the treatment group. e.g., productivity decreased when the control group received no treatment at all.
        • Remedy: provide the control or comparison group(s) with a special treatment comparable to that received by the experimental group.
    50. A regression threat may be present whenever change is studied in a group that is extremely low or high in its preintervention performance. e.g., a class of students of markedly low ability is given special help. Six months later, their average score on a test involving similar problems has improved, but not necessarily because of the special help.
    51. Implementation threat:
        • An experimental group may be treated in ways that are unintended and not necessarily part of the method, yet which give it an advantage of one sort or another.
        • Different individuals implementing different methods may produce different outcomes.
        • Some individuals have a personal bias in favor of one method over the other.
    52. Ways to control the implementation threat:
        • Evaluate the individuals who implement each method on pertinent (relevant) characteristics, and then try to equate the treatment groups on these dimensions.
        • Require that each method be taught by all teachers in the study.
    53. Some teachers may have different abilities to implement the different methods.
        • Use several different individuals to implement each method, thereby reducing the chances of an advantage to either method.
        • Allow individuals to choose the method they wish to implement.
        • Have all methods used by all implementers, but with their preferences known beforehand.
    54. General techniques for controlling threats:
        • Standardizing the conditions under which the study occurs (location, instrumentation, subject attitude, and implementation threats).
        • Obtaining and using more information on the subjects of the study (subject characteristics, mortality, maturation, and regression threats).
        • Obtaining and using more information on the details of the study (location, instrumentation, history, subject attitude, and implementation threats).
        • Choosing an appropriate design.
    55. External validity is the degree to which results are generalizable, or applicable, to groups and environments outside the research setting. The extent to which the results of a study can be generalized determines the external validity of the study. External validity is related to generalizing. That's the major thing you need to keep in mind.
    56. A study that has a large, randomly selected sample or a carefully matched sample is said to have external validity. Recall that validity refers to the approximate truth of propositions, inferences, or conclusions, so external validity refers to the approximate truth of conclusions that involve generalizations. In simpler words, external validity is the degree to which the conclusions in your study would hold for other persons in other places and at other times.
    57. Population refers to any set of people or events from which the sample is selected and to which the study results will generalize. Population generalizability refers to the degree to which a sample represents the population of interest. If the results of a study apply only to the group being studied, and if that group is fairly small or narrowly defined, the usefulness of any findings is seriously limited.
    58. This is why trying to find a representative sample is so important: researchers usually want the results of an investigation to be as widely applicable as possible. Representativeness refers only to the essential, or relevant, characteristics of a population. In science there are two major approaches to providing evidence for a generalization.
    59. The first approach is the sampling model. In the sampling model, you start by identifying the population you would like to generalize to. Then, you draw a fair sample from that population and conduct your research with the sample. Finally, because the sample is representative of the population, you can automatically generalize your results back to the population.
    60. However, there are several problems with this approach. First, perhaps you don't know at the time of your study who you might ultimately like to generalize to. Second, you may not be easily able to draw a fair or representative sample. Third, it's impossible to sample across all times that you might like to generalize to (like next year).
    61. A non-random sample reduces the external validity of the study. Much medical research is done on the patients one sees in the clinic; this is a non-random sample that is not representative of a larger population. The results may not generalize, but this is not a fatal flaw in the study. A study with a non-random sample still identifies true facts about the sample, and perhaps those findings will be true for others as well. It is best to define your population first, and then obtain a random sample.
    62. The sample size required depends on the requirements of the study and the size of the population. As a rule, the bigger the better. If the sample is too small, the performance of a few individuals can have a big effect on the data and render the data less representative of the population.
    63. Researchers should describe the sample as thoroughly as possible (in detail: age, gender, ethnicity, and others) so that interested others can judge for themselves the degree to which the findings apply. Replication: the study is repeated using different groups of subjects in different situations.
    64. Why random samples have not been used:
        • Educational researchers may be unaware of the hazards involved in generalizing when one does not have a random sample.
        • It may simply not be feasible for a researcher to invest the time, money, or other resources needed to obtain a random sample.
    65. Ecological generalizability: the degree to which the results of a study can be extended to other settings or conditions. The researcher must ensure that the important aspects match in order to generalize the findings from one study to another. What holds true for one subject, material, condition, or time will not necessarily hold true for another. Hence, researchers must be careful in generalizing findings from one study to another.
    66. THANK YOU
