VALIDITY IN RESEARCH
ELT-845 Research Methods in ELT
ECEM EKİNCİ
What is Validity?
 Validity is defined as the extent to which a
concept is accurately measured in a study.
 It denotes the extent to which an instrument is
measuring what it is supposed to measure.
 As a process, validation involves collecting and
analyzing data to assess the accuracy of an
instrument.
 The concept of validity applies to both whole
studies (often called inference validity) and the
measurement of individual variables (often called
construct validity).
Types of Validity
 Construct Validity
   Translation Validity: Face, Content
   Criterion Validity: Predictive, Concurrent, Convergent, Discriminant
 Inference Validity
   Internal Validity
   External Validity
   Ecological Validity
Inference Validity
 Internal validity (causality)
Refers to whether claimed conclusions, especially relating
to causality, are consistent with research results (e.g.,
statistical results) and research design (e.g., presence of
appropriate control variables, use of appropriate
methodology).
 If we suggest that x causes y, we should be sure that it is x
that is responsible for variation in y and not something else
that is producing an apparent causal relationship.
 External validity (generalizability)
External validity is the extent to which the results of a study
can be generalized from a sample to a population.
 A study that is externally valid helps obtain population
generalizability, or the degree to which a sample represents
the population.
A carpenter, a school teacher, and a scientist were
traveling by train through Scotland when they saw a
black sheep through the window of the train. "Aha,"
said the carpenter with a smile, "I see that Scottish
sheep are black." "Hmm," said the school teacher, "You
mean that some Scottish sheep are black." "No," said
the scientist, "All we know is that there is at least one
sheep in Scotland, and that at least one side of that one
sheep is black."
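One common way to support generalizing from a sample to a population is to draw a probability sample, such as a simple random sample, from a known sampling frame. The following is a minimal sketch of that idea in Python. The function name `simple_random_sample`, the 500 student ID numbers, and the sample size of 50 are all hypothetical and chosen only for illustration.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct cases from the population, each equally likely to be chosen."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(population, n)


# Hypothetical sampling frame: ID numbers of 500 enrolled students.
population_ids = list(range(1, 501))

# Draw 50 students without replacement.
sample_ids = simple_random_sample(population_ids, 50, seed=42)
print(len(sample_ids), "students selected")
```

Because every case in the frame has the same chance of selection, a sample drawn this way gives a stronger basis for claiming that results apply to the whole population than a convenience sample does.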
 Ecological validity
It is concerned with the question of whether social scientific
findings are applicable to people’s everyday, natural social
settings.
 Do our instruments capture the daily life conditions, opinions,
values, attitudes, and knowledge base of those we study as
expressed in their natural habitat?
Construct Validity (Measurement)
 Construct validity refers to whether you can draw inferences about
test scores related to the concept being studied.
1. Translation Validity
Face validity is concerned with whether it seems like we measure
what we claim. Here we look at how valid a measure appears on
the surface and make subjective judgments based on that.
Content validity. Do all of the elements of the measure seem
connected in the right direction to the concept? Do the measures
(questions, observation logs, etc.) accurately assess what you
want to know? E.g., achievement tests.
2. Criterion Validity: How well the measure relates to other measures and
characteristics.
 Predictive validity. Ability to predict future events. E.g., a high self-efficacy score
related to performing a task should predict the likelihood of a participant completing the
task.
 Concurrent validity. Ability to discriminate between groups. For example, if you are
testing math, engineers should do better on the test than poets.
 Convergent validity shows that an instrument is highly correlated with instruments
measuring similar variables.
 Discriminant validity shows that an instrument is poorly correlated with instruments
that measure different variables. In this case, for example, there should be a low correlation
between an instrument that measures motivation and one that measures self-efficacy.
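Convergent and discriminant validity are usually checked by correlating scores from different instruments. The sketch below, written under stated assumptions, computes Pearson's r between a hypothetical new motivation scale and two other instruments. All scores are made up for illustration, and the variable names are not from any real study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for the same six learners (invented data).
new_motivation = [12, 15, 9, 20, 17, 11]          # instrument being validated
established_motivation = [14, 16, 10, 21, 18, 12]  # measures a similar variable
self_efficacy = [30, 26, 28, 28, 25, 25]           # measures a different variable

convergent_r = pearson_r(new_motivation, established_motivation)
discriminant_r = pearson_r(new_motivation, self_efficacy)

print(f"convergent r   = {convergent_r:.2f}")   # expected to be high
print(f"discriminant r = {discriminant_r:.2f}")  # expected to be near zero
```

With this invented data, the new scale correlates strongly with the established motivation scale and only weakly with the self-efficacy scale, which is the pattern the two definitions above describe. With real data, six cases would be far too few to draw conclusions.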
How can validity be improved?
The validity of the research findings is influenced by a range of different
factors including choice of sample, researcher bias and design of the
research tools.
Quantitative research
 Appropriate statistical analysis of the data
 Design of research tools
 Sample selection
 Sample size
Qualitative research
 Avoiding researcher bias
 Design of research tools
 Sample selection
 The use of triangulation
 Member checking, peer reviews, external audits
 Member checking, also known as participant or respondent
validation, is a technique for exploring the credibility of results.
Data or results are returned to participants to check for accuracy and
resonance with their experiences.
 Peer review is the evaluation of work by one or more people with
similar competences as the producers of the work (peers). It
functions as a form of self-regulation by qualified members of a
profession within the relevant field. Peer review methods are used
to maintain quality standards, improve performance, and provide
credibility.
 Thick description is a term used to characterize the process of
paying attention to contextual detail in observing and interpreting
social meaning when conducting qualitative research.
 Validity should be viewed as a continuum: it is possible to improve the
validity of the findings within a study, but 100% validity can never
be achieved.
 The chosen methodology needs to be appropriate for the research
questions being investigated and this will then impact on your choice of
research methods.
 The design of the instruments used for data collection is critical in
ensuring a high level of validity.
 It is important to be aware of the potential for researcher bias to impact
on the design of the instruments.
AN EVALUATION OF PHD ELT PROGRAMS
IN TURKEY
PhD, Hacettepe University, 2015
Hülya Küçükoğlu
 This current study aims to evaluate the PhD ELT programs
in Turkey in terms of program descriptions, program
content, and atmosphere in the department, as well as
departmental support and program resources. The
participants of the study were 116 students enrolled in PhD
ELT programs and graduates (from 11 universities) who had
already graduated from those programs. The researcher
made use of both qualitative and quantitative data.
Qualitative data were collected through open-ended survey
questions. Quantitative data were collected through
questionnaires (Nunan, 1992, p. 143). The researcher used
multiple data collection instruments to increase the
reliability of the evaluation data.
The importance of collecting multiple data sources is
emphasized by a number of researchers. Patton (1990,
pp. 244-246) asserted that the use of multiple data
sources such as interviews, observations and
document analysis enables the researcher or evaluator
to validate and cross-check findings. He also indicated
that a multimethod, triangulation approach increases
both the validity and the reliability of evaluation data
(1990, p. 245). In his study Patton further claimed that
the evaluator can build on the strengths of each type
of data collection while minimizing the weaknesses of
any single approach.
GENERAL LEARNING STRATEGIES: IDENTIFICATION, TRANSFER TO
LANGUAGE LEARNING AND EFFECT ON LANGUAGE ACHIEVEMENT
PhD, University of Southampton, 2016
Nahum Samperio Sánchez
 The purpose of this study is to identify the
general learning strategies that beginner
learners of English have in their repertoire, the
transfer of such strategies to language learning
and the predictive value they have in language
achievement. It is also intended to discover the
effect that the teaching of infrequently used
general learning strategies has on learners’
language achievement, and to identify possible
differences in strategy types and frequency of
strategy use between low and high strategy users as
well as between high and low achievers among beginner
English language learners.
 This study followed a mixed-methods research
methodology by collecting numerical data by means of
a 51-item general strategies questionnaire
(Martinez-Guerrero 2004) applied in two administrations. The
sample consisted of 118 beginner English language
learners in a language center at a northern Mexican
university. Data were analyzed with SPSS and Excel.
The qualitative data were collected through
twenty individual semi-structured interviews;
furthermore, three one-hour-forty-minute strategy
instruction sessions were included as the treatment.
Bryman (2006) suggests that triangulation gives greater
validity to the traditional view and that quantitative and
qualitative research could be combined to triangulate
findings so that they are mutually corroborated.
TEACHER PERCEPTIONS ON THE USE OF TECHNOLOGY WITH
ENGLISH LANGUAGE LEARNERS
PhD, Liberty University, 2018
H. R. Harvil
 The purpose of this transcendental
phenomenological study was to understand
general education teachers’ perceptions
regarding their use of technology with
students who qualify for English Language
Learner services in an urban Georgia school
district. The self-efficacy theory originated
by Bandura was used to examine 17
teachers’ experiences of using technology
as possible personal preference or as being
influenced by environmental factors.
 I used triangulation between three methods of
data collection. Participants were interviewed,
answered a questionnaire, and participated in a
focus group. I analyzed all data and investigated
patterns (Creswell, 2013; Moustakas, 1994;
Saldana, 2013). I also utilized member checks to
check the data. I asked the members about their
responses and my analysis of the data to
establish credibility. I addressed dependability
using thick, descriptive data, noting details in the
data, and becoming familiar with the data
through repeated self-analysis (Guba & Lincoln,
1989; Lincoln & Guba, 1985). I used a
dependability audit trail and an outside source
checked the data.
TEACHER IMPLEMENTATION OF MOBILE LEARNING INITIATIVE
AT A SIXTH GRADE SCHOOL: A PHENOMENOLOGICAL STUDY
PhD, Liberty University, 2015
Tina Hemphill Fitts
 The purpose of this qualitative,
phenomenological study was to investigate
teachers’ perceptions of and experiences with
the adoption and implementation process as
teachers at Tops Sixth Grade School transition
from a one-to-one laptop initiative to a
school-wide adoption of mobile learning.
Participants included eight sixth-grade
teachers whose students utilized a variety of
mobile devices across multiple platforms to
enhance learning. Using phenomenological
reduction to analyze data gathered through in-
depth, semi-structured interviews,
observations, and an online focus group,
the study revealed several overarching lessons. [...] the
participants’ experiences and the researcher’s
interpretation (Schwandt, 2007), was achieved with
member checking, triangulation of data, and peer
debriefing. An audit trail was established as I
kept “clear documentation of all research decisions and
activities” (Creswell & Miller). I utilized three suggested
activities for establishing a clear audit trail: recording all
research activities and data analysis procedures, and
developing a data collection chronology. By providing
thick, rich descriptions, I provided readers of the
research with the opportunity to determine if the study’s
results can be generalized to other populations (Lincoln
& Guba, 1986).
(Re) Constructing Identities: South African Domestic Workers, English
Language Learning, and Power
PhD, University of Minnesota, 2018
Anna Kaiper
 This dissertation aims to ameliorate this dearth of
research while simultaneously broadening global
conceptions of adult language learners by
focusing on the English language learning of
older, Black, female, South African domestic
workers. Utilizing Critical Ethnographic Narrative
Analysis (CENA), in which I draw from the
histories, narratives, and HERstories of 28 female
domestic workers over a three-year span, I explore
the complex reasons and motivations for South
African domestic workers to learn English in a
multilingual linguascape. For my research, I
employed several forms of triangulation. First, by
utilizing narratives, observations, and field notes, I
was able to elicit data in different spaces, in
different times, from a multitude of voices.
 Long-term participant observations not only
provide more in-depth data about specific
situations and events than any other method, but
also allow the researcher to confirm her
observations and inferences. Respondent
validation is done by continuously checking in
with participants on what they are saying by using
techniques such as stating, "so what I'm hearing
is...," and then checking in with them at a later
date regarding their previous interviews. This is
executed as a way of ruling out possibilities of
researchers’ misinterpretation of data, as well as
proving an invaluable tool to the researcher in
identifying her own biases and misperceptions.

Editor's Notes

  • #5 Inference validity refers to the validity of a research design as a whole. 
  • #6 Three strategies for strengthening external validity: (1) Sampling. Select cases from a known population via a probability sample (e.g., a simple random sample). This provides a strong basis for claiming the results apply to the population as a whole. (2) Representativeness. Show the similarities between the cases you studied and the population you wish your results to be applied to, then argue that the correlations you found in your study will also be similar. (3) Replication. Repeat the study in multiple settings. Use meta statistics to evaluate the results across studies. Although journal reviewers don't always agree, consistent results across many settings with small samples are more powerful than a large sample from a single setting.
  • #8 Subjective evaluation of whether a measure matches the construct it is meant to measure. Do the questions in the survey make sense "on the face of it" for measuring what you are trying to measure?
  • #13 Hacettepe University, Istanbul University, Boğaziçi University, Ankara University, Gazi University, Çanakkale University, Yeditepe University, Çukurova University, Anadolu University, Atatürk University, and Dokuz Eylül University.