Copernicus Institute of Sustainable Development
(I Can’t Get No) Saturation: A Simulation and
Guidelines for Minimum Sample Sizes in
Qualitative Research
Frank van Rijnsoever
f.j.vanrijnsoever@uu.nl
A random conversation…
• Question: How many interviews do I
need to do?
• Answer: It depends…
• Question: Depends on what?
• Answer: It depends on who you ask.
• Answer: But since you asked me, I will
give you my version of events.
Introduction (1)
• Inductive qualitative research
➢ Is becoming more popular (Bluhm, Harman,
Lee, & Mitchell, 2011)
• Innovation policy, transition studies
• Useful for exploring new concepts, theories,
and processes of change in an in-depth
manner, among other things…
➢ Increased attention to methodology
(Suddaby, 2006)
➢ Sample size is a debated topic.
• Laborious process, so avoid oversampling.
• Typical recommended sizes: 15 - 25.
• Few rules (Patton, 1990), except ‘experience’ and
‘judgement of the researcher’ (Sandelowski, 1995).
Introduction (2)
Aim
• “this paper explores the sample size that is required to reach
theoretical saturation in various scenarios and to use these insights
to formulate guidelines about purposive sampling.”
Simulation
• Insights into the mechanisms behind purposive sampling
Contributions
• Theoretical basis for sample size
• Guidelines for practitioners
My way of thinking
Theoretical concepts
• A population is the “universe of units of analysis” from which a sample
can be drawn.
• Does not have to be the same as the unit from which information is
gathered.
• Population size = N
• Codes emerge from information sources that are part of a population.
• Informants for interviews, existing documents, etc.
• Denoted as i
• At each sampling step an information source is sampled from the
population.
• Part of an iterative process that includes data collection, analysis,
and interpretation
• Number of sampling steps = n
Theoretical concepts
• Codes represent information.
➢ “Tags” or “labels” on unique pieces of information (Bryman, 2013), e.g. concepts,
properties, relationships between other codes.
➢ Each code represents only one piece of information; there are no synonyms.
➢ Denoted as c
• Theoretical saturation is reached when each code in the population is
observed at least once. Two factors influence the number of sampling
steps towards theoretical saturation: the number of codes and the mean
probability of observing codes.
➢ The number of sampling steps needed to reach theoretical saturation is denoted as $n_s$
• Purposive sampling implies informed estimation of these factors:
➢ The complexity of the research question,
➢ The likelihood of an information source actually containing the code,
➢ The willingness and ability of the source to let the code be uncovered, and
➢ The ability of the researcher to observe the code.
Theoretical concepts
• In this paper I test the number of sampling steps required for
saturation based on three typical theoretical ‘sampling
scenarios’:
➢ Random chance: random sampling.
➢ Minimal information: each sampling step yields an information
source with at least one new code.
➢ Maximal information: each sampling step yields an information
source with the largest possible number of new codes.
• I simulate hypothetical populations in which I vary the
number of codes (k) and the mean probability of observing
codes (𝜱𝒄).
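The three scenarios can be read as selection rules over a 0/1 matrix of information sources by codes. The sketch below is one possible reading, not the simulation code behind these slides: the function name `steps_to_saturation`, sampling without replacement, the random tie-breaking in the minimal scenario, and the toy population are all assumptions made for illustration.

```python
import random

def steps_to_saturation(population, scenario, rng):
    """Count sampling steps until every code present in the population is seen.

    population: list of 0/1 lists, one row per information source (hypothetical data).
    scenario: "random", "minimal" or "maximal".
    """
    k = len(population[0])
    # Codes that occur at least once in the population define saturation.
    target = {c for c in range(k) if any(row[c] for row in population)}
    remaining = list(range(len(population)))  # assumption: sampling without replacement
    seen = set()
    steps = 0
    while seen != target:
        unseen = target - seen
        # Number of not-yet-observed codes each remaining source would add.
        gain = {i: sum(population[i][c] for c in unseen) for i in remaining}
        if scenario == "random":
            pick = rng.choice(remaining)
        elif scenario == "minimal":
            # Any source that adds at least one new code.
            pick = rng.choice([i for i in remaining if gain[i] >= 1])
        elif scenario == "maximal":
            # The source that adds the largest number of new codes.
            pick = max(remaining, key=gain.get)
        else:
            raise ValueError(f"unknown scenario: {scenario}")
        remaining.remove(pick)
        seen |= {c for c in target if population[pick][c]}
        steps += 1
    return steps

# Toy population: 200 sources, 10 codes, each present with probability 0.3.
rng = random.Random(1)
toy = [[1 if rng.random() < 0.3 else 0 for _ in range(10)] for _ in range(200)]
for scenario in ("random", "minimal", "maximal"):
    print(scenario, steps_to_saturation(toy, scenario, rng))
```

In the minimal and maximal scenarios every step adds at least one new code, so the number of steps can never exceed the number of codes; the random scenario has no such bound.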
Some mathematical notation
• Codes are stored in a vector of 0s and 1s of length k. Information sources are
denoted by i.
• $c_i = (c_{i1}, c_{i2}, \ldots, c_{ik})$, for example: (0,1,1,1,0,0,1)
• Whether a code is present is represented by a random Bernoulli trial with probability Φ.
The probabilities of all codes together form a vector $\Phi_c$ of length k.
• The probability that theoretical saturation is reached ($p_n$) based on random
chance is given by
$$p_n = \prod_{c=1}^{k} \left(1 - (1 - \Phi_{ck})^{n}\right)$$
where n is the number of sampling steps.
• If all values of $\Phi_c$ are the same ($\Phi_k$), then this becomes:
• $$p_n = \left(1 - (1 - \Phi_k)^{n}\right)^{k}$$
• When $n_s$ is the number of sampling steps to reach theoretical saturation given $\Phi_k$,
k and $p_n$, this can be rewritten to:
• $$n_s = \frac{\ln\left(1 - \sqrt[k]{p_n}\right)}{\ln(1 - \Phi_k)}$$
• If we add a minimum number of repetitive codes (ν) the formulas become:
• $$p_n = \left(\left(1 - (1 - \Phi_k)^{n}\right)^{k}\right)^{\nu} \quad\text{and}\quad n_s = \frac{\ln\left(1 - \sqrt[k\nu]{p_n}\right)}{\ln(1 - \Phi_k)}$$
• Only under very specific assumptions can we calculate theoretical saturation.
• Useful for calibrating my simulation!
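The closed-form expressions for the equal-probability case are easy to evaluate numerically. The sketch below simply transcribes them; the function names `saturation_probability` and `required_steps`, and the example values k = 20 and Φ = 0.2, are hypothetical choices for illustration and do not come from the slides.

```python
import math

def saturation_probability(n, phi, k, v=1):
    """p_n = ((1 - (1 - phi)^n)^k)^v for k codes that share probability phi."""
    return (1.0 - (1.0 - phi) ** n) ** (k * v)

def required_steps(p_n, phi, k, v=1):
    """n_s = ln(1 - p_n^(1/(k*v))) / ln(1 - phi); real-valued, round up for a sample size."""
    return math.log(1.0 - p_n ** (1.0 / (k * v))) / math.log(1.0 - phi)

# Example values (assumed): 20 codes, each observed with probability 0.2,
# and a 0.95 probability of reaching saturation under random chance.
n_s = required_steps(p_n=0.95, phi=0.2, k=20)
print(math.ceil(n_s))  # 27

# Plugging n_s back into the first formula recovers the requested certainty.
print(round(saturation_probability(n_s, phi=0.2, k=20), 6))  # 0.95
```

Because the two functions are inverses of each other, the round trip is a quick sanity check when calibrating a simulation against the formula.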
Methods
• The distribution of probabilities in vector
𝜱𝒄 can be represented by the beta
distribution.
➢ $E[\Phi_c] = \frac{\alpha}{\alpha+\beta}$ → 𝜱𝒄
➢ $\mathrm{Var}[\Phi_c] = \frac{\alpha\beta}{(\alpha+\beta)^{2}(\alpha+\beta+1)}$
• Input for simulations
➢ Simulate hypothetical populations
• N by k matrices with values 0 and 1
➢ Systematically vary α, β and k
• α and β each take the values 1, 2, 3, …, 10
• k = 1, 11, 21, 31, …, 101
• N = 5000
• 10 × 10 × 11 = 1100 hypothetical populations
➢ For all three scenarios
➢ Set $p_n$ to 0.95 (probability of reaching $n_s$)
• 500 trials per population
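The simulation design above can be sketched in a few functions. This is a rough illustration under stated assumptions rather than the original simulation: the helper names (`make_population`, `random_chance_steps`, `steps_at_certainty`) are hypothetical, only the random-chance scenario is shown, sources are sampled without replacement, and the usage example uses a smaller N and fewer trials than the slides to keep the run short.

```python
import math
import random

def make_population(alpha, beta, k, n_sources, rng):
    """Draw one probability per code from Beta(alpha, beta), then fill an N-by-k 0/1 matrix."""
    phi = [rng.betavariate(alpha, beta) for _ in range(k)]
    return [[1 if rng.random() < p else 0 for p in phi] for _ in range(n_sources)]

def random_chance_steps(population, rng):
    """Sample sources in random order until all codes present in the population are seen."""
    k = len(population[0])
    target = {c for c in range(k) if any(row[c] for row in population)}
    if not target:
        return 0  # nothing to observe
    order = list(range(len(population)))
    rng.shuffle(order)  # assumption: sampling without replacement
    seen = set()
    for step, i in enumerate(order, start=1):
        seen.update(c for c in target if population[i][c])
        if seen == target:
            return step
    return len(order)  # not reached: the full population contains every target code

def steps_at_certainty(population, trials, p_n, rng):
    """Smallest n such that at least a fraction p_n of the trials saturated within n steps."""
    results = sorted(random_chance_steps(population, rng) for _ in range(trials))
    index = max(0, math.ceil(p_n * trials) - 1)
    return results[index]

# Parameter grid from the slide: alpha and beta in 1..10, k in 1, 11, ..., 101.
grid = [(a, b, k) for a in range(1, 11) for b in range(1, 11) for k in range(1, 102, 10)]
print(len(grid))  # 1100

# One grid cell with reduced N and trial count (the slides use N = 5000 and 500 trials).
rng = random.Random(42)
population = make_population(alpha=2, beta=5, k=21, n_sources=500, rng=rng)
print(steps_at_certainty(population, trials=200, p_n=0.95, rng=rng))
```

Running every grid cell for all three scenarios at full size would take considerably longer in pure Python; the reduced settings here are only meant to show the structure of one cell.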
Scenarios
Results: sample size at $n_s$
Main findings
• 𝜱𝒄 is more important than k for reaching theoretical
saturation.
• Purposive sampling typically requires fewer than 50
sampling steps. A common value is around 20. This
is the same range as in the literature.
• Little difference between minimal and maximal
information.
➢ Minimal information gives more repetitive codes.
➢ Trade-off between efficiency and repetition.
Guidelines for purposive sampling
1. Identify a population of information sources, and
sub-populations.
2. Estimate the number of codes per sub-population.
3. Estimate the mean probability of a code being observed.
4. Set a degree of certainty to reach theoretical saturation.
5. Assess which scenario is most applicable to each sub-population.
6. Choose a fitting sampling strategy.
7. Account for these steps when reporting the research.
In general: working under the assumptions of minimal
information seems reasonable.
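Steps 1 to 4 can be turned into a small planning calculation. The sketch below is an illustration only: the sub-population names and their estimates are invented, the function `sample_size` is hypothetical, and it uses the closed-form random-chance formula with equal code probabilities, so it does not cover the minimal or maximal information scenarios.

```python
import math

def sample_size(k, phi, certainty):
    """Random-chance n_s, rounded up, for k codes that share observation probability phi."""
    return math.ceil(math.log(1.0 - certainty ** (1.0 / k)) / math.log(1.0 - phi))

# Steps 1-3: hypothetical sub-populations with estimated number of codes (k)
# and mean probability of observing a code (phi). All values are made up.
subpopulations = {
    "policy makers": {"k": 20, "phi": 0.20},
    "entrepreneurs": {"k": 30, "phi": 0.30},
}

# Step 4: desired certainty of reaching theoretical saturation.
certainty = 0.95

plan = {name: sample_size(est["k"], est["phi"], certainty)
        for name, est in subpopulations.items()}
for name, n in plan.items():
    print(f"{name}: {n} sampling steps")
```

Recording the estimates and the resulting numbers in this form also supports step 7, since the assumptions behind the chosen sample size can be reported alongside the results.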
Limitations
• Not empirical
➢ Not possible.
➢ Not required.
• Mechanistic approach
➢ But in line with the assumptions of
qualitative research.
➢ Everyone is free to apply the results as he or
she wishes.
➢ Mixture of scenarios is possible.
• Not all possibilities are simulated
➢ But enough variation to capture plausible
conditions.
Questions?
F.J.vanrijnsoever@uu.nl
