SlideShare a Scribd company logo
Missing data
• Dealing with missing data, has been always a challenge in data analysis context.
• We need methods in missing data analysis that:
▫ Minimize the bias
▫ Maximize use of available information, and
▫ Get good estimates of uncertainty e.g., p-value, confidence interval, etc.
Rec No Variable n
1 Unit non-response Unobserved/Latent
variable
2
3 Missing data
4 Item-non-
response
Assumptions (MCAR, MAR, NMAR)
• When values are missing completely at random (MCAR), the probability of
missingness is unrelated to the values of any variable
• Data are missing at random (MAR) if missingness is unrelated to the value that
is missing, but is related to the values of other variables
• E.g., a question about typical number of hours spent browsing the internet
might be missing more often for married than unmarried participants; BUT
among the married subset, missingness is completely random—Not related
to how many hours the person browses
• Data are missing not at random (MNAR) if missingness is related to the value
that is missing, and often to the values of other variables as well
• E.g. missing values are more prevalent among those who typically browse
more than among those who browse less
• Deletion methods : Delete cases or variables that are missing
▫ Listwise methods
▫ Pairwise deletion
▫ Variable deletion
• Imputation methods : Substitution methods
▫ Single imputation
🞄 Mean imputation
🞄 Conditional mean imputation
🞄 Case mean imputation
🞄 Regression imputation
🞄 Last observation carried forward
🞄 Worst case imputation
🞄 Best case imputation
🞄 EM imputation
▫ Multiple imputation
Methods of handling missing data
List wise deletion
• A good method when the proportion of
missing data is less than 15%.
• Advantages:
▫ It can be used for any type of statistical
analysis.
▫ No special computations are required.
▫ The parameters estimations are
unbiased.
▫ The standard errors are appropriate
compare to original data.
• Disadvantages:
▫ May remove a considerable fraction of
data
Pair wise deletion
• Pairwise deletion involves dropping cases
with missing values on an analysis-by-
analysis basis
• Advantages:
▫ Using all available non-missing data
• Disadvantages:
▫ Estimated standard errors and test
statistics are biased
Variable deletion
• Variable deletion involves dropping
variables with missing values on an case -
by-case basis
• Advantages:
▫ Makes sense when lot of missing values in
a variable and if the variable is of relatively
less importance
• Disadvantages:
▫ Loss of information regarding the variable
Mean imputation
• Replace missing values with the mean of
that variable Case Var1 Var2 Var3
1 9 8 8
2 7.44 7 6
3 8 5 6
4 7 4 5
5 9 5 7
6 8 8 9
7 6 7 6
8 5 9 7
9 7 8 ?
10 8 8 7
Conditional Mean imputation
• Replace missing values with value of the
variable mean for a relevant subgroup
Case Var1 Sex Var2 Var3
1 9 F 8 8
2 8.25 F 7 6
3 8 F 5 6
4 7 F 4 5
5 9 F 5 7
6 8 M 8 9
7 6 M 7 6
8 5 M 9 7
9 7 M 8 ?
10 8 M 8 7
Case Mean imputation
• Replace missing values using information
from other variables for the same case to
impute the missing value
Case Var1 Var2 Var3
1 9 8 8
2 6.50 7 6
3 8 5 6
4 7 4 5
5 9 5 7
6 8 8 9
7 6 7 6
8 5 9 7
9 7 8 ?
10 8 8 7
• Imputes the missing value as a
value on the same outcome the
most recent time it was observed
• Variants :
• Average of T1 and T2
Last observation carried forward
Case T1 T2 T3
1 9 8 8
2 ? 7 6
3 8 5 6
4 7 4 5
5 9 5 7
6 8 8 9
7 6 7 6
8 5 9 7
9 7 8 8
10 8 8 7
• Use interpolation to fill in missing
values
• Useful for longitudinal datasets
Interpolation
• Worst case replaces a missing value with the worst case scenario for a
categorical outcome
• Best case replaces a missing value with the best case scenario for a
categorical outcome
Worst case and Best case imputation
• Substitute best missing values using
a ML imputation
• In the E-step, expected values are
calculated based on all complete
data points
• In the M-step, the procedure
imputes the expected values from
the E-step and then maximizes the
likelihood function to obtain new
parameter estimates
Expectation-Maximization
• Multiple imputation is quickly
becoming the “gold standard”
approach to handling missing
values
• Computationally complex
Multiple imputation

More Related Content

Similar to missingdatahandling-160923201313.pptx

Performance Measurement for Machine Leaning.pptx
Performance Measurement for Machine Leaning.pptxPerformance Measurement for Machine Leaning.pptx
Performance Measurement for Machine Leaning.pptx
toneve4907
 
KNOLX_Data_preprocessing
KNOLX_Data_preprocessingKNOLX_Data_preprocessing
KNOLX_Data_preprocessing
Knoldus Inc.
 
Database ppt.pptx
Database ppt.pptxDatabase ppt.pptx
Database ppt.pptx
AASTHAJAJOO
 
dimension reduction.ppt
dimension reduction.pptdimension reduction.ppt
dimension reduction.ppt
Deadpool120050
 
Kevin Swingler: Introduction to Data Mining
Kevin Swingler: Introduction to Data MiningKevin Swingler: Introduction to Data Mining
Kevin Swingler: Introduction to Data Mining
Library and Information Science Research Coalition
 
DataMiningOverview_Galambos_2015_06_04.pptx
DataMiningOverview_Galambos_2015_06_04.pptxDataMiningOverview_Galambos_2015_06_04.pptx
DataMiningOverview_Galambos_2015_06_04.pptx
Akash527744
 
5 numerical descriptive statitics
5 numerical descriptive statitics5 numerical descriptive statitics
5 numerical descriptive statitics
Penny Jiang
 
Dimensionality Reduction.pptx
Dimensionality Reduction.pptxDimensionality Reduction.pptx
Dimensionality Reduction.pptx
PriyadharshiniG41
 
Hm306 week 4
Hm306 week 4Hm306 week 4
Hm306 week 4
BHUOnlineDepartment
 
Hm306 week 4
Hm306 week 4Hm306 week 4
Hm306 week 4
BealCollegeOnline
 
2010 smg training_cardiff_day1_session3_higgins
2010 smg training_cardiff_day1_session3_higgins2010 smg training_cardiff_day1_session3_higgins
2010 smg training_cardiff_day1_session3_higgins
rgveroniki
 
Data Preparation and Processing
Data Preparation and ProcessingData Preparation and Processing
Data Preparation and Processing
Mehul Gondaliya
 
Data analysis
Data analysisData analysis
Data analysis
SANTHANAM V
 
Environmental statistics
Environmental statisticsEnvironmental statistics
Environmental statistics
Georgios Ath. Kounis
 
Anomaly detection : QuantUniversity Workshop
Anomaly detection : QuantUniversity Workshop Anomaly detection : QuantUniversity Workshop
Anomaly detection : QuantUniversity Workshop
QuantUniversity
 
Performance OR Capacity #CMGimPACt2016
Performance OR Capacity #CMGimPACt2016 Performance OR Capacity #CMGimPACt2016
Performance OR Capacity #CMGimPACt2016
Alex Gilgur
 
chapter12.ppt
chapter12.pptchapter12.ppt
chapter12.ppt
EndrisHEbrahim
 
Forecasting Examples
Forecasting ExamplesForecasting Examples
Forecasting Examples
Muhammad Imran
 
Data Transformation – Standardization & Normalization PPM.pptx
Data Transformation – Standardization & Normalization PPM.pptxData Transformation – Standardization & Normalization PPM.pptx
Data Transformation – Standardization & Normalization PPM.pptx
ssuser5cdaa93
 
Statistics for UX Professionals - Jessica Cameron
Statistics for UX Professionals - Jessica CameronStatistics for UX Professionals - Jessica Cameron
Statistics for UX Professionals - Jessica Cameron
User Vision
 

Similar to missingdatahandling-160923201313.pptx (20)

Performance Measurement for Machine Leaning.pptx
Performance Measurement for Machine Leaning.pptxPerformance Measurement for Machine Leaning.pptx
Performance Measurement for Machine Leaning.pptx
 
KNOLX_Data_preprocessing
KNOLX_Data_preprocessingKNOLX_Data_preprocessing
KNOLX_Data_preprocessing
 
Database ppt.pptx
Database ppt.pptxDatabase ppt.pptx
Database ppt.pptx
 
dimension reduction.ppt
dimension reduction.pptdimension reduction.ppt
dimension reduction.ppt
 
Kevin Swingler: Introduction to Data Mining
Kevin Swingler: Introduction to Data MiningKevin Swingler: Introduction to Data Mining
Kevin Swingler: Introduction to Data Mining
 
DataMiningOverview_Galambos_2015_06_04.pptx
DataMiningOverview_Galambos_2015_06_04.pptxDataMiningOverview_Galambos_2015_06_04.pptx
DataMiningOverview_Galambos_2015_06_04.pptx
 
5 numerical descriptive statitics
5 numerical descriptive statitics5 numerical descriptive statitics
5 numerical descriptive statitics
 
Dimensionality Reduction.pptx
Dimensionality Reduction.pptxDimensionality Reduction.pptx
Dimensionality Reduction.pptx
 
Hm306 week 4
Hm306 week 4Hm306 week 4
Hm306 week 4
 
Hm306 week 4
Hm306 week 4Hm306 week 4
Hm306 week 4
 
2010 smg training_cardiff_day1_session3_higgins
2010 smg training_cardiff_day1_session3_higgins2010 smg training_cardiff_day1_session3_higgins
2010 smg training_cardiff_day1_session3_higgins
 
Data Preparation and Processing
Data Preparation and ProcessingData Preparation and Processing
Data Preparation and Processing
 
Data analysis
Data analysisData analysis
Data analysis
 
Environmental statistics
Environmental statisticsEnvironmental statistics
Environmental statistics
 
Anomaly detection : QuantUniversity Workshop
Anomaly detection : QuantUniversity Workshop Anomaly detection : QuantUniversity Workshop
Anomaly detection : QuantUniversity Workshop
 
Performance OR Capacity #CMGimPACt2016
Performance OR Capacity #CMGimPACt2016 Performance OR Capacity #CMGimPACt2016
Performance OR Capacity #CMGimPACt2016
 
chapter12.ppt
chapter12.pptchapter12.ppt
chapter12.ppt
 
Forecasting Examples
Forecasting ExamplesForecasting Examples
Forecasting Examples
 
Data Transformation – Standardization & Normalization PPM.pptx
Data Transformation – Standardization & Normalization PPM.pptxData Transformation – Standardization & Normalization PPM.pptx
Data Transformation – Standardization & Normalization PPM.pptx
 
Statistics for UX Professionals - Jessica Cameron
Statistics for UX Professionals - Jessica CameronStatistics for UX Professionals - Jessica Cameron
Statistics for UX Professionals - Jessica Cameron
 

Recently uploaded

Challenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more importantChallenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more important
Sm321
 
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
nuttdpt
 
Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......
Sachin Paul
 
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
Walaa Eldin Moustafa
 
Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024
ElizabethGarrettChri
 
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
nuttdpt
 
Experts live - Improving user adoption with AI
Experts live - Improving user adoption with AIExperts live - Improving user adoption with AI
Experts live - Improving user adoption with AI
jitskeb
 
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataPredictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Kiwi Creative
 
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
xclpvhuk
 
Intelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicineIntelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicine
AndrzejJarynowski
 
Global Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headedGlobal Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headed
vikram sood
 
Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...
Bill641377
 
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
v7oacc3l
 
原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理
原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理
原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理
wyddcwye1
 
University of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma TranscriptUniversity of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma Transcript
soxrziqu
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
Roger Valdez
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
javier ramirez
 
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
Timothy Spann
 
End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
Lars Albertsson
 
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
ihavuls
 

Recently uploaded (20)

Challenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more importantChallenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more important
 
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
一比一原版(UCSB文凭证书)圣芭芭拉分校毕业证如何办理
 
Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......
 
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data LakeViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake
 
Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024
 
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
 
Experts live - Improving user adoption with AI
Experts live - Improving user adoption with AIExperts live - Improving user adoption with AI
Experts live - Improving user adoption with AI
 
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataPredictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
 
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
 
Intelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicineIntelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicine
 
Global Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headedGlobal Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headed
 
Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...Population Growth in Bataan: The effects of population growth around rural pl...
Population Growth in Bataan: The effects of population growth around rural pl...
 
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
在线办理(英国UCA毕业证书)创意艺术大学毕业证在读证明一模一样
 
原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理
原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理
原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理
 
University of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma TranscriptUniversity of New South Wales degree offer diploma Transcript
University of New South Wales degree offer diploma Transcript
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
 
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
 
End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
 
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
原版制作(unimelb毕业证书)墨尔本大学毕业证Offer一模一样
 

missingdatahandling-160923201313.pptx

  • 1. Missing data • Dealing with missing data, has been always a challenge in data analysis context. • We need methods in missing data analysis that: ▫ Minimize the bias ▫ Maximize use of available information, and ▫ Get good estimates of uncertainty e.g., p-value, confidence interval, etc. Rec No Variable n 1 Unit non-response Unobserved/Latent variable 2 3 Missing data 4 Item-non- response
  • 2. Assumptions (MCAR, MAR, NMAR) • When values are missing completely at random (MCAR), the probability of missingness is unrelated to the values of any variable • Data are missing at random (MAR) if missingness is unrelated to the value that is missing, but is related to the values of other variables • E.g., a question about typical number of hours spent browsing the internet might be missing more often for married than unmarried participants; BUT among the married subset, missingness is completely random—Not related to how many hours the person browses • Data are missing not at random (MNAR) if missingness is related to the value that is missing, and often to the values of other variables as well • E.g. missing values are more prevalent among those who typically browse more than among those who browse less
  • 3.
  • 4. • Deletion methods : Delete cases or variables that are missing ▫ Listwise methods ▫ Pairwise deletion ▫ Variable deletion • Imputation methods : Substitution methods ▫ Single imputation 🞄 Mean imputation 🞄 Conditional mean imputation 🞄 Case mean imputation 🞄 Regression imputation 🞄 Last observation carried forward 🞄 Worst case imputation 🞄 Best case imputation 🞄 EM imputation ▫ Multiple imputation Methods of handling missing data
  • 5. List wise deletion • A good method when the proportion of missing data is less than 15%. • Advantages: ▫ It can be used for any type of statistical analysis. ▫ No special computations are required. ▫ The parameters estimations are unbiased. ▫ The standard errors are appropriate compare to original data. • Disadvantages: ▫ May remove a considerable fraction of data
  • 6. Pair wise deletion • Pairwise deletion involves dropping cases with missing values on an analysis-by- analysis basis • Advantages: ▫ Using all available non-missing data • Disadvantages: ▫ Estimated standard errors and test statistics are biased
  • 7. Variable deletion • Variable deletion involves dropping variables with missing values on an case - by-case basis • Advantages: ▫ Makes sense when lot of missing values in a variable and if the variable is of relatively less importance • Disadvantages: ▫ Loss of information regarding the variable
  • 8. Mean imputation • Replace missing values with the mean of that variable Case Var1 Var2 Var3 1 9 8 8 2 7.44 7 6 3 8 5 6 4 7 4 5 5 9 5 7 6 8 8 9 7 6 7 6 8 5 9 7 9 7 8 ? 10 8 8 7
  • 9. Conditional Mean imputation • Replace missing values with value of the variable mean for a relevant subgroup Case Var1 Sex Var2 Var3 1 9 F 8 8 2 8.25 F 7 6 3 8 F 5 6 4 7 F 4 5 5 9 F 5 7 6 8 M 8 9 7 6 M 7 6 8 5 M 9 7 9 7 M 8 ? 10 8 M 8 7
  • 10. Case Mean imputation • Replace missing values using information from other variables for the same case to impute the missing value Case Var1 Var2 Var3 1 9 8 8 2 6.50 7 6 3 8 5 6 4 7 4 5 5 9 5 7 6 8 8 9 7 6 7 6 8 5 9 7 9 7 8 ? 10 8 8 7
  • 11. • Imputes the missing value as a value on the same outcome the most recent time it was observed • Variants : • Average of T1 and T2 Last observation carried forward Case T1 T2 T3 1 9 8 8 2 ? 7 6 3 8 5 6 4 7 4 5 5 9 5 7 6 8 8 9 7 6 7 6 8 5 9 7 9 7 8 8 10 8 8 7
  • 12. • Use interpolation to fill in missing values • Useful for longitudinal datasets Interpolation
  • 13. • Worst case replaces a missing value with the worst case scenario for a categorical outcome • Best case replaces a missing value with the best case scenario for a categorical outcome Worst case and Best case imputation
  • 14. • Substitute best missing values using a ML imputation • In the E-step, expected values are calculated based on all complete data points • In the M-step, the procedure imputes the expected values from the E-step and then maximizes the likelihood function to obtain new parameter estimates Expectation-Maximization
  • 15. • Multiple imputation is quickly becoming the “gold standard” approach to handling missing values • Computationally complex Multiple imputation