SlideShare a Scribd company logo
1 of 27
Analysis of Data
with Missing
Values
Chakra Budhathoki, PhD
02/17/2022
Nursing Office for Research Administration
Outline
1. Background
2. Extent of missingness
3. Pattern of missingness
4. Strategies in dealing with missing data
5. Examples
6. Prevention of missing data
7. Summary
Background
• Missing data very common in research studies
• Best solution? Avoid them!!
• Not taught in many statistical courses
• Handling missing data
• Reporting of missing data
Background Cont.
• Preventing missing data
• Study designs: (1) longitudinal vs. cross-sectional,
(2) randomized vs. observational studies
• Missingness is generally an outcome in some pilot
or feasibility studies
Missing Data: Cross-sectional
ID Var_1 Var_2 Var_3 .. .. .. .. Var_k
1 x x x x x x x x
2 x . x x x x x x
3 x . . . . . . .
..
..
..
n x x x x x x x x
Missing Data: Cross-sectional, Scales
ID SS1_1 SS1_2 SS1_3 SS2_1 SS2_2 .. .. SSx_nk
1 x x x x x x x x
2 x . x x x x x x
3 x x x . x x . x
..
..
..
n x x x x x x x x
Missing Data: Longitudinal
ID T1_Var_
1
T2_Var_
1
T3_Var_
1
T1_Var_
2
T2_Var_
2
T3_Var_
2
.. ..
1 x x x x x x x x
2 x x . x x x x x
3 x . x x . x x x
..
..
..
n x x x x x x x x
Coding Missing Data
• Often coded as values that are not possible, e.g.
999, -999
• If coded that way, make sure to specify them as
missing in data analysis
• Sometimes such coding scheme developed to list
different reasons of missing data
• If they are not important, safer to leave blank or
enter “.”
Extent of Missing Data
• <1%, <5%, 10% or higher?
• By item/variable or by subject?
• Most values missing in one or a few variables?
• Missing values in one or a few primary variables?
• Missing values in one or a few secondary
variables?
• Few values missing in several variables?
Pattern of Missing Data
• Item-level missingness
• Subject-level missingness
• Missing in outcome or predictor variable?
• Missing in continuous or categorical variable?
• Designed trials or designed surveys?
• Unstructured surveys?
Type of Missing Data
1. Missing completely at random
(MCAR)
2. Missing at random (MAR)
3. Missing not at random (MNAR)
Missing Completely at Random
(MCAR)
• Reasons: Lab error, road accident, bad weather,
residential move, family emergency, inadvertently
skipping questions
• Example: income and age, prob of missing data on
income does not depend on income and age, i.e.
participants of all ages likely to report income
• Little’s MCAR test
• Non-significant test => MCAR
Missing at Random (MAR)
• Also called ignorable missingness
• Probability of missingness on Y does not depend
on Y itself after controlling for other variables
• Example: prob of missingness on income depends
on age (older more likely to report than younger),
but participants within each group equally likely to
report income, i.e. prob of missingness on income
unrelated within an age group
Missing Not at Random (MNAR)
• Also called nonignorable missingness
• Missingness is not MCAR or MAR
• Probability of missingness on Y depends on values
of Y itself, e.g. people with higher income do not
report income (even after controlling for other
factors)
• No statistical tests for MAR and MNAR, but can
run some sensitivity analyses
Missing Data Patterns: Example (Polit,
2010)
# Follow-Up Missing data reason Cotinine M, ng/mL
(All Subjects)
Cotinine M
(90 Subjects)
No missing 50 men
50 women
- 185.0 -
MCAR 45 men
45 women
Lab error, road accident, bad
weather, residential move, family
emergency
185.0 185.5
MAR 40 men
50 women
Male dropouts lost interest, men
smoked >women, dropout
unrelated to cotinine level
185.0 175.0
MNAR 40 men
50 women
Male dropouts resumed heavy
smoking, embarrassed to
continue, dropout related to
cotinine level
185.0 165.0
Handling Missing Data
• Ignoring missing data:
1. Pairwise deletion, e.g. bivariate correlation, also called
available-case analysis
2. Listwise or casewise deletion, e.g. multiple regression,
also called complete-case analysis
• Ignoring missing data, but using all available data:
GEE, mixed models, survival analysis
• Imputation: (1) single-imputation, (2) multiple imputation
• Sensitivity analysis, e.g. worst outcome
Single Imputation
• Imputation using a central tendency measure:
Continuous-> Mean, Ordinal-> median, Nominal->
mode
• Subgroup imputation
• LOCF
• Regression
• Maximum likelihood
• Expectation maximization (EM) algorithm
Multiple Imputation, Newgard and Haukoos
(2007)
An Example of Sensitivity Analysis, Polit
(2010)
Listwise
Deletion
Mean
Imputation
Sub-group
Mean
Regression
Imputation
EM
N 1753 1904 1904 1904 1904
Mean, Imp
(N=151)
- 45.50 45.52
(m=46.36, f=44.7)
46.43 46.38
SD, Imp - 0.00 0.84 5.95 5.93
Mean, Full 45.50 45.50 45.50 45.57 45.57
SD, Full 13.83 13.27 13.28 13.37 13.49
R2 0.185 0.170 0.171 0.198 0.197
Example: Mean Imputation
Example: Mode Imputation
Example: LOCF
Example: MI
Considerations to Decrease Missing
Data
• Some attrition unavoidable
• Take attrition or drop-out into account in estimating sample
size
• Analyze all randomized subjects in RCTs
• Try to increase response rate in surveys
• Ask questions that decrease refusal rate, e.g. exact income
vs. income categories
• Logistical support for clinic visits if ethical
Considerations to Decrease Missing Data
Cont.
• Study design
• Reduce drop outs
• One may discontinue assigned treatment, but try to
keep them in the study follow-ups
• They can switch treatment, or completely
discontinue
• Communication
• Training
Summary
• Missing data are common
• Data analysis plan should specify how missing data would
be handled
• Better study designs
• Account for expected attrition in sample size estimation
• Better data analyses: imputation is common
• GEE, mixed models, survival analyses do not need
imputation
References
Allison, P. D. (2002). Missing data. Sage publications.
Newgard, C. D., & Haukoos, J. S. (2007). Advanced
Statistics: Missing Data in Clinical Research-Part 2: Multiple
Imputation. Academic Emergency Medicine, 14: 669-678.
Polit, D. (2010). Statistics and data analysis for nursing
research. Pearson.

More Related Content

What's hot

INFERENTIAL STATISTICS: AN INTRODUCTION
INFERENTIAL STATISTICS: AN INTRODUCTIONINFERENTIAL STATISTICS: AN INTRODUCTION
INFERENTIAL STATISTICS: AN INTRODUCTIONJohn Labrador
 
Logistic regression
Logistic regressionLogistic regression
Logistic regressionDrZahid Khan
 
Application of ordinal logistic regression in the study of students’ performance
Application of ordinal logistic regression in the study of students’ performanceApplication of ordinal logistic regression in the study of students’ performance
Application of ordinal logistic regression in the study of students’ performanceAlexander Decker
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regressionAbdelaziz Tayoun
 
Binary OR Binomial logistic regression
Binary OR Binomial logistic regression Binary OR Binomial logistic regression
Binary OR Binomial logistic regression Dr Athar Khan
 
Probability Distributions
Probability Distributions Probability Distributions
Probability Distributions Anthony J. Evans
 
Correlation - Biostatistics
Correlation - BiostatisticsCorrelation - Biostatistics
Correlation - BiostatisticsFahmida Swati
 
Data analysis 1
Data analysis 1Data analysis 1
Data analysis 1Bùi Trâm
 
Inferential Statistics
Inferential StatisticsInferential Statistics
Inferential Statisticsewhite00
 
Imputation techniques for missing data in clinical trials
Imputation techniques for missing data in clinical trialsImputation techniques for missing data in clinical trials
Imputation techniques for missing data in clinical trialsNitin George
 
Logistic regression (blyth 2006) (simplified)
Logistic regression (blyth 2006) (simplified)Logistic regression (blyth 2006) (simplified)
Logistic regression (blyth 2006) (simplified)MikeBlyth
 
Logistic regression with SPSS examples
Logistic regression with SPSS examplesLogistic regression with SPSS examples
Logistic regression with SPSS examplesGaurav Kamboj
 
hypothesis testing
hypothesis testinghypothesis testing
hypothesis testingilona50
 
Correlation- an introduction and application of spearman rank correlation by...
Correlation- an introduction and application of spearman rank correlation  by...Correlation- an introduction and application of spearman rank correlation  by...
Correlation- an introduction and application of spearman rank correlation by...Gunjan Verma
 

What's hot (20)

INFERENTIAL STATISTICS: AN INTRODUCTION
INFERENTIAL STATISTICS: AN INTRODUCTIONINFERENTIAL STATISTICS: AN INTRODUCTION
INFERENTIAL STATISTICS: AN INTRODUCTION
 
Logistic regression
Logistic regressionLogistic regression
Logistic regression
 
Application of ordinal logistic regression in the study of students’ performance
Application of ordinal logistic regression in the study of students’ performanceApplication of ordinal logistic regression in the study of students’ performance
Application of ordinal logistic regression in the study of students’ performance
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regression
 
MANOVA SPSS
MANOVA SPSSMANOVA SPSS
MANOVA SPSS
 
Correlations using SPSS
Correlations using SPSSCorrelations using SPSS
Correlations using SPSS
 
Binary OR Binomial logistic regression
Binary OR Binomial logistic regression Binary OR Binomial logistic regression
Binary OR Binomial logistic regression
 
Probability Distributions
Probability Distributions Probability Distributions
Probability Distributions
 
Binary Logistic Regression
Binary Logistic RegressionBinary Logistic Regression
Binary Logistic Regression
 
Correlation - Biostatistics
Correlation - BiostatisticsCorrelation - Biostatistics
Correlation - Biostatistics
 
Data analysis 1
Data analysis 1Data analysis 1
Data analysis 1
 
Inferential Statistics
Inferential StatisticsInferential Statistics
Inferential Statistics
 
Imputation techniques for missing data in clinical trials
Imputation techniques for missing data in clinical trialsImputation techniques for missing data in clinical trials
Imputation techniques for missing data in clinical trials
 
Logistic regression (blyth 2006) (simplified)
Logistic regression (blyth 2006) (simplified)Logistic regression (blyth 2006) (simplified)
Logistic regression (blyth 2006) (simplified)
 
Logistic regression with SPSS examples
Logistic regression with SPSS examplesLogistic regression with SPSS examples
Logistic regression with SPSS examples
 
Correlation.pdf
Correlation.pdfCorrelation.pdf
Correlation.pdf
 
hypothesis testing
hypothesis testinghypothesis testing
hypothesis testing
 
Path analysis
Path analysisPath analysis
Path analysis
 
Correlation- an introduction and application of spearman rank correlation by...
Correlation- an introduction and application of spearman rank correlation  by...Correlation- an introduction and application of spearman rank correlation  by...
Correlation- an introduction and application of spearman rank correlation by...
 
Correlation
CorrelationCorrelation
Correlation
 

Similar to Analysis-of-data-with-missing-values.pptx

2010 smg training_cardiff_day1_session3_higgins
2010 smg training_cardiff_day1_session3_higgins2010 smg training_cardiff_day1_session3_higgins
2010 smg training_cardiff_day1_session3_higginsrgveroniki
 
Biostatistics Workshop: Missing Data
Biostatistics Workshop: Missing DataBiostatistics Workshop: Missing Data
Biostatistics Workshop: Missing DataHopkinsCFAR
 
Statistics for DP Biology IA
Statistics for DP Biology IAStatistics for DP Biology IA
Statistics for DP Biology IAVeronika Garga
 
Stat 3203 -sampling errors and non-sampling errors
Stat 3203 -sampling errors  and non-sampling errorsStat 3203 -sampling errors  and non-sampling errors
Stat 3203 -sampling errors and non-sampling errorsKhulna University
 
Stat11t Chapter1
Stat11t Chapter1Stat11t Chapter1
Stat11t Chapter1gueste87a4f
 
introduction to statistical theory
introduction to statistical theoryintroduction to statistical theory
introduction to statistical theoryUnsa Shakir
 
6.hypothesistesting.06
6.hypothesistesting.066.hypothesistesting.06
6.hypothesistesting.06MehwishFahad
 
Population sampling RSS6 2014
Population sampling RSS6 2014Population sampling RSS6 2014
Population sampling RSS6 2014RSS6
 
Spss basic Dr Marwa Zalat
Spss basic Dr Marwa ZalatSpss basic Dr Marwa Zalat
Spss basic Dr Marwa ZalatMarwa Zalat
 
Analysing & interpreting data.ppt
Analysing & interpreting data.pptAnalysing & interpreting data.ppt
Analysing & interpreting data.pptmanaswidebbarma1
 
Research 101: Quantitative Data Preparation
Research 101: Quantitative Data PreparationResearch 101: Quantitative Data Preparation
Research 101: Quantitative Data PreparationHarold Gamero
 
5 numerical descriptive statitics
5 numerical descriptive statitics5 numerical descriptive statitics
5 numerical descriptive statiticsPenny Jiang
 
Week 2 measures of disease occurence
Week 2  measures of disease occurenceWeek 2  measures of disease occurence
Week 2 measures of disease occurenceHamdi Alhakimi
 
Statistical Approaches to Missing Data
Statistical Approaches to Missing DataStatistical Approaches to Missing Data
Statistical Approaches to Missing DataDataCards
 
Epidemiological Analysis Workshop By Dr Suzanne Campbell
Epidemiological Analysis Workshop By Dr Suzanne Campbell Epidemiological Analysis Workshop By Dr Suzanne Campbell
Epidemiological Analysis Workshop By Dr Suzanne Campbell COUNTDOWN on NTDs
 

Similar to Analysis-of-data-with-missing-values.pptx (20)

2010 smg training_cardiff_day1_session3_higgins
2010 smg training_cardiff_day1_session3_higgins2010 smg training_cardiff_day1_session3_higgins
2010 smg training_cardiff_day1_session3_higgins
 
Biostatistics Workshop: Missing Data
Biostatistics Workshop: Missing DataBiostatistics Workshop: Missing Data
Biostatistics Workshop: Missing Data
 
1.2 types of data
1.2 types of data1.2 types of data
1.2 types of data
 
Statistics for DP Biology IA
Statistics for DP Biology IAStatistics for DP Biology IA
Statistics for DP Biology IA
 
Stat 3203 -sampling errors and non-sampling errors
Stat 3203 -sampling errors  and non-sampling errorsStat 3203 -sampling errors  and non-sampling errors
Stat 3203 -sampling errors and non-sampling errors
 
Stat11t chapter1
Stat11t chapter1Stat11t chapter1
Stat11t chapter1
 
Stat11t Chapter1
Stat11t Chapter1Stat11t Chapter1
Stat11t Chapter1
 
introduction to statistical theory
introduction to statistical theoryintroduction to statistical theory
introduction to statistical theory
 
6.hypothesistesting.06
6.hypothesistesting.066.hypothesistesting.06
6.hypothesistesting.06
 
Population sampling RSS6 2014
Population sampling RSS6 2014Population sampling RSS6 2014
Population sampling RSS6 2014
 
Spss basic Dr Marwa Zalat
Spss basic Dr Marwa ZalatSpss basic Dr Marwa Zalat
Spss basic Dr Marwa Zalat
 
Analysing & interpreting data.ppt
Analysing & interpreting data.pptAnalysing & interpreting data.ppt
Analysing & interpreting data.ppt
 
Research 101: Quantitative Data Preparation
Research 101: Quantitative Data PreparationResearch 101: Quantitative Data Preparation
Research 101: Quantitative Data Preparation
 
5 numerical descriptive statitics
5 numerical descriptive statitics5 numerical descriptive statitics
5 numerical descriptive statitics
 
Sampling brm chap-4
Sampling brm chap-4Sampling brm chap-4
Sampling brm chap-4
 
Week 2 measures of disease occurence
Week 2  measures of disease occurenceWeek 2  measures of disease occurence
Week 2 measures of disease occurence
 
Statistical Approaches to Missing Data
Statistical Approaches to Missing DataStatistical Approaches to Missing Data
Statistical Approaches to Missing Data
 
Biostatistics
BiostatisticsBiostatistics
Biostatistics
 
Session1b.ppt
Session1b.pptSession1b.ppt
Session1b.ppt
 
Epidemiological Analysis Workshop By Dr Suzanne Campbell
Epidemiological Analysis Workshop By Dr Suzanne Campbell Epidemiological Analysis Workshop By Dr Suzanne Campbell
Epidemiological Analysis Workshop By Dr Suzanne Campbell
 

Recently uploaded

AIM of Education-Teachers Training-2024.ppt
AIM of Education-Teachers Training-2024.pptAIM of Education-Teachers Training-2024.ppt
AIM of Education-Teachers Training-2024.pptNishitharanjan Rout
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxheathfieldcps1
 
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Pooja Bhuva
 
OSCM Unit 2_Operations Processes & Systems
OSCM Unit 2_Operations Processes & SystemsOSCM Unit 2_Operations Processes & Systems
OSCM Unit 2_Operations Processes & SystemsSandeep D Chaudhary
 
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...Pooja Bhuva
 
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxOn_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxPooja Bhuva
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024Elizabeth Walsh
 
Spellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPSSpellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPSAnaAcapella
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfDr Vijay Vishwakarma
 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxPooja Bhuva
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...Nguyen Thanh Tu Collection
 
Wellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxWellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxJisc
 
Model Attribute _rec_name in the Odoo 17
Model Attribute _rec_name in the Odoo 17Model Attribute _rec_name in the Odoo 17
Model Attribute _rec_name in the Odoo 17Celine George
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17Celine George
 
Introduction to TechSoup’s Digital Marketing Services and Use Cases
Introduction to TechSoup’s Digital Marketing  Services and Use CasesIntroduction to TechSoup’s Digital Marketing  Services and Use Cases
Introduction to TechSoup’s Digital Marketing Services and Use CasesTechSoup
 
dusjagr & nano talk on open tools for agriculture research and learning
dusjagr & nano talk on open tools for agriculture research and learningdusjagr & nano talk on open tools for agriculture research and learning
dusjagr & nano talk on open tools for agriculture research and learningMarc Dusseiller Dusjagr
 
What is 3 Way Matching Process in Odoo 17.pptx
What is 3 Way Matching Process in Odoo 17.pptxWhat is 3 Way Matching Process in Odoo 17.pptx
What is 3 Way Matching Process in Odoo 17.pptxCeline George
 
Play hard learn harder: The Serious Business of Play
Play hard learn harder:  The Serious Business of PlayPlay hard learn harder:  The Serious Business of Play
Play hard learn harder: The Serious Business of PlayPooky Knightsmith
 
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptxCOMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptxannathomasp01
 

Recently uploaded (20)

AIM of Education-Teachers Training-2024.ppt
AIM of Education-Teachers Training-2024.pptAIM of Education-Teachers Training-2024.ppt
AIM of Education-Teachers Training-2024.ppt
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
 
OSCM Unit 2_Operations Processes & Systems
OSCM Unit 2_Operations Processes & SystemsOSCM Unit 2_Operations Processes & Systems
OSCM Unit 2_Operations Processes & Systems
 
OS-operating systems- ch05 (CPU Scheduling) ...
OS-operating systems- ch05 (CPU Scheduling) ...OS-operating systems- ch05 (CPU Scheduling) ...
OS-operating systems- ch05 (CPU Scheduling) ...
 
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
 
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxOn_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024
 
Spellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPSSpellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPS
 
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
 
Interdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptxInterdisciplinary_Insights_Data_Collection_Methods.pptx
Interdisciplinary_Insights_Data_Collection_Methods.pptx
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
Wellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxWellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptx
 
Model Attribute _rec_name in the Odoo 17
Model Attribute _rec_name in the Odoo 17Model Attribute _rec_name in the Odoo 17
Model Attribute _rec_name in the Odoo 17
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
Introduction to TechSoup’s Digital Marketing Services and Use Cases
Introduction to TechSoup’s Digital Marketing  Services and Use CasesIntroduction to TechSoup’s Digital Marketing  Services and Use Cases
Introduction to TechSoup’s Digital Marketing Services and Use Cases
 
dusjagr & nano talk on open tools for agriculture research and learning
dusjagr & nano talk on open tools for agriculture research and learningdusjagr & nano talk on open tools for agriculture research and learning
dusjagr & nano talk on open tools for agriculture research and learning
 
What is 3 Way Matching Process in Odoo 17.pptx
What is 3 Way Matching Process in Odoo 17.pptxWhat is 3 Way Matching Process in Odoo 17.pptx
What is 3 Way Matching Process in Odoo 17.pptx
 
Play hard learn harder: The Serious Business of Play
Play hard learn harder:  The Serious Business of PlayPlay hard learn harder:  The Serious Business of Play
Play hard learn harder: The Serious Business of Play
 
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptxCOMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
 

Analysis-of-data-with-missing-values.pptx

  • 1. Analysis of Data with Missing Values Chakra Budhathoki, PhD 02/17/2022 Nursing Office for Research Administration
  • 2. Outline 1. Background 2. Extent of missingness 3. Pattern of missingness 4. Strategies in dealing with missing data 5. Examples 6. Prevention of missing data 7. Summary
  • 3. Background • Missing data very common in research studies • Best solution? Avoid them!! • Not taught in many statistical courses • Handling missing data • Reporting of missing data
  • 4. Background Cont. • Preventing missing data • Study designs: (1) longitudinal vs. cross-sectional, (2) randomized vs. observational studies • Missingness is generally an outcome in some pilot or feasibility studies
  • 5. Missing Data: Cross-sectional ID Var_1 Var_2 Var_3 .. .. .. .. Var_k 1 x x x x x x x x 2 x . x x x x x x 3 x . . . . . . . .. .. .. n x x x x x x x x
  • 6. Missing Data: Cross-sectional, Scales ID SS1_1 SS1_2 SS1_3 SS2_1 SS2_2 .. .. SSx_nk 1 x x x x x x x x 2 x . x x x x x x 3 x x x . x x . x .. .. .. n x x x x x x x x
  • 7. Missing Data: Longitudinal ID T1_Var_ 1 T2_Var_ 1 T3_Var_ 1 T1_Var_ 2 T2_Var_ 2 T3_Var_ 2 .. .. 1 x x x x x x x x 2 x x . x x x x x 3 x . x x . x x x .. .. .. n x x x x x x x x
  • 8. Coding Missing Data • Often coded as values that are not possible, e.g. 999, -999 • If coded that way, make sure to specify them as missing in data analysis • Sometimes such coding scheme developed to list different reasons of missing data • If they are not important, safer to leave blank or enter “.”
  • 9. Extent of Missing Data • <1%, <5%, 10% or higher? • By item/variable or by subject? • Most values missing in one or a few variables? • Missing values in one or a few primary variables? • Missing values in one or a few secondary variables? • Few values missing in several variables?
  • 10. Pattern of Missing Data • Item-level missingness • Subject-level missingness • Missing in outcome or predictor variable? • Missing in continuous or categorical variable? • Designed trials or designed surveys? • Unstructured surveys?
  • 11. Type of Missing Data 1. Missing completely at random (MCAR) 2. Missing at random (MAR) 3. Missing not at random (MNAR)
  • 12. Missing Completely at Random (MCAR) • Reasons: Lab error, road accident, bad weather, residential move, family emergency, inadvertently skipping questions • Example: income and age, prob of missing data on income does not depend on income and age, i.e. participants of all ages likely to report income • Little’s MCAR test • Non-significant test => MCAR
  • 13. Missing at Random (MAR) • Also called ignorable missingness • Probability of missingness on Y does not depend on Y itself after controlling for other variables • Example: prob of missingness on income depends on age (older more likely to report than younger), but participants within each group equally likely to report income, i.e. prob of missingness on income unrelated within an age group
  • 14. Missing Not at Random (MNAR) • Also called nonignorable missingness • Missingness is not MCAR or MAR • Probability of missingness on Y depends on values of Y itself, e.g. people with higher income do not report income (even after controlling for other factors) • No statistical tests for MAR and MNAR, but can run some sensitivity analyses
  • 15. Missing Data Patterns: Example (Polit, 2010) # Follow-Up Missing data reason Cotinine M, ng/mL (All Subjects) Cotinine M (90 Subjects) No missing 50 men 50 women - 185.0 - MCAR 45 men 45 women Lab error, road accident, bad weather, residential move, family emergency 185.0 185.5 MAR 40 men 50 women Male dropouts lost interest, men smoked >women, dropout unrelated to cotinine level 185.0 175.0 MNAR 40 men 50 women Male dropouts resumed heavy smoking, embarrassed to continue, dropout related to cotinine level 185.0 165.0
  • 16. Handling Missing Data • Ignoring missing data: 1. Pairwise deletion, e.g. bivariate correlation, also called available-case analysis 2. Listwise or casewise deletion, e.g. multiple regression, also called complete-case analysis • Ignoring missing data, but using all available data: GEE, mixed models, survival analysis • Imputation: (1) single-imputation, (2) multiple imputation • Sensitivity analysis, e.g. worst outcome
  • 17. Single Imputation • Imputation using a central tendency measure: Continuous-> Mean, Ordinal-> median, Nominal-> mode • Subgroup imputation • LOCF • Regression • Maximum likelihood • Expectation maximization (EM) algorithm
  • 18. Multiple Imputation, Newgard and Haukoos (2007)
  • 19. An Example of Sensitivity Analysis, Polit (2010) Listwise Deletion Mean Imputation Sub-group Mean Regression Imputation EM N 1753 1904 1904 1904 1904 Mean, Imp (N=151) - 45.50 45.52 (m=46.36, f=44.7) 46.43 46.38 SD, Imp - 0.00 0.84 5.95 5.93 Mean, Full 45.50 45.50 45.50 45.57 45.57 SD, Full 13.83 13.27 13.28 13.37 13.49 R2 0.185 0.170 0.171 0.198 0.197
  • 24. Considerations to Decrease Missing Data • Some attrition unavoidable • Take attrition or drop-out into account in estimating sample size • Analyze all randomized subjects in RCTs • Try to increase response rate in surveys • Ask questions that decrease refusal rate, e.g. exact income vs. income categories • Logistical support for clinic visits if ethical
  • 25. Considerations to Decrease Missing Data Cont. • Study design • Reduce drop outs • One may discontinue assigned treatment, but try to keep them in the study follow-ups • They can switch treatment, or completely discontinue • Communication • Training
  • 26. Summary • Missing data are common • Data analysis plan should specify how missing data would be handled • Better study designs • Account for expected attrition in sample size estimation • Better data analyses: imputation is common • GEE, mixed models, survival analyses do not need imputation
  • 27. References Allison, P. D. (2002). Missing data. Sage publications. Newgard, C. D., & Haukoos, J. S. (2007). Advanced Statistics: Missing Data in Clinical Research-Part 2: Multiple Imputation. Academic Emergency Medicine, 14: 669-678. Polit, D. (2010). Statistics and data analysis for nursing research. Pearson.