A Survey on Missing Information Strategies and
Imputation Methods in Healthcare
Presented By
Saroj Kumar Pandey
Department of Information Technology
National Institute of Technology, Raipur (C.G.)
Outline
Introduction
Strategies of missing data
Techniques for managing the missing information
Supporting tools
Conclusion
References
Introduction
Missing information is common in many existing research data sets and can
significantly affect the results.
In healthcare data, the problem arises when values are missing for one or more
risk factors.
Missing information causes several problems:
The absence of data reduces statistical power.
Lost data can bias the estimation of parameters.
It reduces the representativeness of the samples.
It complicates the analysis of the study.
Levels of missing information
Strategies of missing data
Missing completely at random (MCAR)
No-No condition: missingness depends neither on the missing value itself nor on
the observed variables.
Example: A blood pressure measurement is missing because an automatic
sphygmomanometer broke down.
Missing at random (MAR)
No-Yes condition: missingness does not depend on the missing value itself but
does depend on observed variables.
Example: Missing blood pressure measurements may be lower than the measured
ones because younger people may be more likely to have missing blood pressure
measurements.
Missing not at random (MNAR)
Yes-Yes condition: missingness depends on the missing value itself and may also
depend on observed variables.
Example: If the treatment under study is not effective in reducing blood pressure,
subjects may drop out.
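The three mechanisms can be illustrated with a small simulation. This is only a sketch under invented assumptions: the age range, the blood pressure model, and the missingness probabilities are hypothetical numbers chosen for illustration, not values from the survey.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
age = rng.uniform(20, 80, n)                    # fully observed covariate (hypothetical)
bp = 100 + 0.5 * age + rng.normal(0, 10, n)     # simulated systolic blood pressure

# MCAR: every reading has the same 30% chance of being missing.
mcar = rng.random(n) < 0.3
# MAR: younger subjects are more likely to be missing (depends on observed age only).
mar = rng.random(n) < np.where(age < 40, 0.6, 0.1)
# MNAR: high readings are more likely to be missing (depends on the value itself).
mnar = rng.random(n) < np.where(bp > 130, 0.6, 0.1)

def observed_mean(mask):
    """Mean blood pressure over the readings that are not masked as missing."""
    return bp[~mask].mean()

full_mean = bp.mean()
for name, mask in [("MCAR", mcar), ("MAR", mar), ("MNAR", mnar)]:
    print(f"{name}: observed mean {observed_mean(mask):.1f} vs full mean {full_mean:.1f}")
```

In this setup the observed mean stays close to the full mean under MCAR, is pushed upward under MAR (younger subjects with lower readings are missing more often), and is pushed downward under MNAR.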
Techniques for managing the missing information
List-wise deletion
o Decreases statistical power.
o May introduce bias in parameter estimates.
o Default option in many statistical packages.
Pair-wise deletion
o Preserves more information than list-wise deletion.
o Interpretation becomes difficult.
o May lead to mathematically inconsistent correlations.
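The difference between the two deletion strategies can be sketched with pandas. The data frame below is a made-up toy example; the column names sbp, dbp, and age are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "sbp": [120.0, np.nan, 135.0, 128.0, 142.0, np.nan],
    "dbp": [80.0, 85.0, np.nan, 82.0, 90.0, 88.0],
    "age": [34.0, 51.0, 47.0, np.nan, 63.0, 58.0],
})

# List-wise deletion: keep only rows with no missing value at all.
listwise = df.dropna()

# Pair-wise deletion: DataFrame.corr excludes missing values pair by pair,
# so each correlation uses every row where both columns are observed.
pairwise_corr = df.corr()

# Number of rows contributing to each pairwise correlation.
observed = df.notna().astype(int)
pairwise_n = observed.T @ observed

print(len(listwise), "complete rows")
print(pairwise_n)
```

Because each correlation may be computed on a different subset of rows, the resulting matrix is not guaranteed to be internally consistent, which is the inconsistency the slide warns about.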
Cont…
Mean imputation
o Replaces each missing value with the sample mean of that variable.
o The oldest and most widely used method.
Regression imputation
o Estimates missing data from other variables in the data set.
o Better than list-wise and pair-wise deletion.
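A minimal sketch of mean imputation and regression imputation, assuming a single missing blood pressure value and age as the only predictor. The numbers are invented so that the result is easy to check by hand.

```python
import numpy as np

age = np.array([30.0, 40.0, 50.0, 60.0, 70.0, 45.0])
sbp = np.array([115.0, 120.0, 125.0, 130.0, 135.0, np.nan])  # last value missing

missing = np.isnan(sbp)

# Mean imputation: fill the gap with the mean of the observed values.
mean_imputed = sbp.copy()
mean_imputed[missing] = np.nanmean(sbp)

# Regression imputation: fit sbp ~ age on the complete cases,
# then predict the missing value from the subject's age.
slope, intercept = np.polyfit(age[~missing], sbp[~missing], deg=1)
reg_imputed = sbp.copy()
reg_imputed[missing] = intercept + slope * age[missing]

print(mean_imputed[-1], reg_imputed[-1])
```

Here the observed values lie exactly on the line sbp = 100 + 0.5 * age, so regression imputation gives about 122.5 for the 45-year-old subject, while mean imputation gives 125 regardless of age.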
Sample mean used in mean imputation: $\bar{x} = \frac{\sum (f x)}{N}$
Cont…
Last observation carried forward
o Replaces every missing value with the last observed value from the same subject.
o Easy to understand and to communicate between statisticians and clinicians, or
between a sponsor and the researcher.
Maximum likelihood imputation
o Assumes that the observed data are a sample drawn from a multivariate normal
distribution, an assumption that is relatively easy to understand.
o Parameters are estimated from the available data; the missing data are then estimated
from those parameters.
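Last observation carried forward can be sketched with a grouped forward fill in pandas. The visit table and its column names are hypothetical.

```python
import numpy as np
import pandas as pd

visits = pd.DataFrame({
    "subject": ["A", "A", "A", "B", "B", "B"],
    "visit":   [1, 2, 3, 1, 2, 3],
    "sbp":     [140.0, 135.0, np.nan, np.nan, 150.0, np.nan],
})

# Order visits within each subject, then carry the last observed value forward.
# Grouping by subject prevents values from leaking from one subject to the next.
visits = visits.sort_values(["subject", "visit"])
visits["sbp_locf"] = visits.groupby("subject")["sbp"].ffill()

print(visits)
```

Subject B's first visit stays missing because there is no earlier observation to carry forward; such leading gaps need a separate rule.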
Cont…
Expectation maximization
o A type of maximum likelihood method that can be used to create a new data set
in which all missing values are imputed with values estimated by maximum
likelihood.
Multiple imputation (MI)
o Replaces each missing value with several plausible values, producing multiple
completed data sets.
o Each completed data set is analyzed in the same way with standard methods,
and the results are combined using simple arithmetic rules.
Figure: expectation step (update variable) and maximization step (update hypothesis).
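For multiple imputation, the merging of results is commonly done with Rubin's rules. The sketch below assumes that is the pooling step meant here; the function name pool_rubin and the example numbers are hypothetical.

```python
import numpy as np

def pool_rubin(estimates, variances):
    """Combine m point estimates and their squared standard errors (Rubin's rules)."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = len(estimates)
    pooled = estimates.mean()                # pooled point estimate
    within = variances.mean()                # average within-imputation variance
    between = estimates.var(ddof=1)          # between-imputation variance
    total = within + (1 + 1 / m) * between   # total variance of the pooled estimate
    return pooled, total

# Three imputed data sets, each giving an estimate and its variance.
est, var = pool_rubin([10.0, 12.0, 11.0], [4.0, 4.0, 4.0])
print(est, var)
```

The between-imputation term is what carries the extra uncertainty caused by the missing values, which a single imputation would ignore.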
Supporting tools
R: Supports numerous libraries such as "norm", "cat", "mix", and "pan" for imputing
data under multivariate normal models, log-linear models, general location models,
and linear mixed models, respectively.
MATLAB: When missing data are present in the data set, the fillmissing function can
fill them with the following methods: 'constant', 'previous', 'next', 'nearest',
'linear', 'spline', 'pchip'.
SAS: PROC MI applies regression methods and propensity scores for imputation.
IVEware: Imputation and Variance Estimation software for sequential regression
multiple imputation (SRMI).
MICE: Multiple imputation tool using chained equations; library available in
both S-PLUS and R.
Conclusion
This survey focuses on the levels and mechanisms of missing data and on the
various practices and tools for handling missing data.
Distinguishing what should and should not be imputed is usually not possible
using a single code for every type of missing value.
It is difficult to know whether multiple imputation or full maximum likelihood
estimation is best, but both are superior to the traditional approaches. Both
techniques are best used with large samples.
References
1. J. Luengo, S. García, and F. Herrera, On the choice of the best imputation methods for missing values
considering three groups of classification methods, vol. 32, no. 1. 2012.
2. J. Luengo, J. A. Sáez, and F. Herrera, “Missing data imputation for fuzzy rule-based classification systems,”
Soft Comput., vol. 16, no. 5, pp. 863–881, 2012.
3. R. T. O’Neill and R. Temple, “The prevention and treatment of missing data in clinical trials: an FDA
perspective on the importance of dealing with it.,” Clin. Pharmacol. Ther., vol. 91, no. 3, pp. 550–4, 2012.
4. P. D. Allison, “Missing Data,” vol. 17, no. 4, pp. 372–411, 2008.
5. D. B. Rubin, “Inference and missing data,” Biometrika, vol. 63, no. 3. pp. 581–592, 1976.
6. G. E. A. P. A. Batista and M. C. Monard, “An analysis of four missing data treatment methods for supervised
learning,” Appl. Artif. Intell., vol. 17, no. 5–6, pp. 519–533, 2003.
7. J. D. Dziura, L. A. Post, Q. Zhao, Z. Fu, and P. Peduzzi, “Strategies for dealing with missing data in clinical
trials: from design to analysis.,” Yale J. Biol. Med., vol. 86, no. 3, pp. 343–58, 2013.
8. H. Daniell, “NIH Public Access,” vol. 76, no. October 2009, pp. 211–220, 2012.
9. H. Kang, “The prevention and handling of the missing data,” vol. 64, no. 5, pp. 402–406, 2013.
10. M. Soley-bori, “Dealing with missing data: Key assumptions and methods for applied analysis,” PM931 Dir.
Study Heal. Policy Manag., no. 4, p. 20, 2013.
11. J. Figueredo, P. E. McKnight, K. M. McKnight, and S. Sidani, “Multivariate modeling of missing data
within and across assessment waves.,” Addiction, vol. 95 Suppl 3, no. February, pp. S361–S380, 2000.
12. X. P. Zhu, “Comparison of Four Methods for Handing Missing Data in Longitudinal Data Analysis through a
Simulation Study,” Open J. Stat., vol. 4, no. 4, pp. 933–944, 2014.
Cont…
13. A. N. Baraldi and C. K. Enders, “An introduction to modern missing data analyses,” J. Sch. Psychol., vol. 48,
no. 1, pp. 5–37, 2010.
14. H. Xu, “LOCF Method and Application in Clinical Data Analysis,” Sugi, no. 2, pp. 1–5, 2009.
15. R. M. Hamer and P. M. Simpson, “Last observation carried forward versus mixed models in the analysis of
psychiatric clinical trials (American Journal of Psychiatry (2009) 166, (639-641)),” Am. J. Psychiatry, vol.
166, no. 8, p. 942, 2009.
16. P. D. Allison, “Handling Missing Data by Maximum Likelihood,” SAS Glob. Forum 2012 Stat. Data Anal.,
pp. 1–21, 2012.
17. Y. Dong and C.-Y. J. Peng, “Principled missing data methods for researchers.,” Springerplus, vol. 2, no. 1, p.
222, 2013.
18. A. P. Dempster, N. M. Laird, and D. B. Rubin, “Maximum Likelihood from Incomplete Data via the EM
Algorithm,” J. R. Stat. Soc. Ser. B, vol. 39, no. 1, pp. 1–38, 1977.
19. L. M. Collins, J. L. Schafer, and C. M. Kam, “A comparison of inclusive and restrictive strategies in modern
missing data procedures.,” Psychol. Methods, vol. 6, no. 4, pp. 330–51, 2001.
20. E.-L. Silva-Ramírez, R. Pino-Mejías, M. López-Coello, and M.-D. Cubiles-de-la-Vega, “Missing value
imputation on missing completely at random data using multilayer perceptrons.,” Neural networks, vol. 24,
no. 1, pp. 121–129, 2011.
21. R. Rosato et al., “Missing data imputation in longitudinal trial of endometrial cancer patients,”
Qual. Life Res., vol. 25, 2016.
22. B. K. Beaulieu-Jones and J. H. Moore, “Missing data imputation in the electronic health record using
deeply learned autoencoders,” Pacific Symposium on Biocomputing, 2017.
23. Y. Zeng et al., “A Study of Missing Data Imputation in Predictive Modeling of a Wood-Composite
Manufacturing Process,” J. Qual. Technol., vol. 48, no. 3, p. 284, 2016.
Thank you

More Related Content

What's hot

September Journal Club -Aishwarya
September Journal Club -AishwaryaSeptember Journal Club -Aishwarya
September Journal Club -AishwaryaRSG Luxembourg
 
ANALYZING DATA BY BY SELLIGER AND SHAOAMY (1989)
ANALYZING DATA BY BY SELLIGER AND SHAOAMY (1989)ANALYZING DATA BY BY SELLIGER AND SHAOAMY (1989)
ANALYZING DATA BY BY SELLIGER AND SHAOAMY (1989)Lenis Beatriz Marquez Vidal
 
9-Meta Analysis/ Systematic Review
9-Meta Analysis/ Systematic Review9-Meta Analysis/ Systematic Review
9-Meta Analysis/ Systematic ReviewResearchGuru
 
Basics of Educational Statistics (Sampling and Types)
Basics of Educational Statistics (Sampling and Types)Basics of Educational Statistics (Sampling and Types)
Basics of Educational Statistics (Sampling and Types)HennaAnsari
 
Assumptions about parametric and non parametric tests
Assumptions about parametric and non parametric testsAssumptions about parametric and non parametric tests
Assumptions about parametric and non parametric testsBarath Babu Kumar
 
Tutorial parametric tests
Tutorial   parametric testsTutorial   parametric tests
Tutorial parametric testsKen Plummer
 
How to handle discrepancies while you collect data for systemic review – pubrica
How to handle discrepancies while you collect data for systemic review – pubricaHow to handle discrepancies while you collect data for systemic review – pubrica
How to handle discrepancies while you collect data for systemic review – pubricaPubrica
 
Analytical Chemistry and Statistics in Exposure Science
Analytical Chemistry and Statistics in Exposure ScienceAnalytical Chemistry and Statistics in Exposure Science
Analytical Chemistry and Statistics in Exposure ScienceLarry Michael
 
Regression shrinkage: better answers to causal questions
Regression shrinkage: better answers to causal questionsRegression shrinkage: better answers to causal questions
Regression shrinkage: better answers to causal questionsMaarten van Smeden
 
Statistical Approaches to Missing Data
Statistical Approaches to Missing DataStatistical Approaches to Missing Data
Statistical Approaches to Missing DataDataCards
 
Research methodology and biostatistics
Research methodology and biostatisticsResearch methodology and biostatistics
Research methodology and biostatisticsMedical Ultrasound
 
Common statistical tools used in research and their uses
Common statistical tools used in research and their usesCommon statistical tools used in research and their uses
Common statistical tools used in research and their usesNorhac Kali
 
Glossary
GlossaryGlossary
Glossaryasfawm
 

What's hot (17)

September Journal Club -Aishwarya
September Journal Club -AishwaryaSeptember Journal Club -Aishwarya
September Journal Club -Aishwarya
 
ANALYZING DATA BY BY SELLIGER AND SHAOAMY (1989)
ANALYZING DATA BY BY SELLIGER AND SHAOAMY (1989)ANALYZING DATA BY BY SELLIGER AND SHAOAMY (1989)
ANALYZING DATA BY BY SELLIGER AND SHAOAMY (1989)
 
9-Meta Analysis/ Systematic Review
9-Meta Analysis/ Systematic Review9-Meta Analysis/ Systematic Review
9-Meta Analysis/ Systematic Review
 
Basics of Educational Statistics (Sampling and Types)
Basics of Educational Statistics (Sampling and Types)Basics of Educational Statistics (Sampling and Types)
Basics of Educational Statistics (Sampling and Types)
 
Analyzing data
Analyzing dataAnalyzing data
Analyzing data
 
Assumptions about parametric and non parametric tests
Assumptions about parametric and non parametric testsAssumptions about parametric and non parametric tests
Assumptions about parametric and non parametric tests
 
Multiple imputation of missing data
Multiple imputation of missing dataMultiple imputation of missing data
Multiple imputation of missing data
 
Tutorial parametric tests
Tutorial   parametric testsTutorial   parametric tests
Tutorial parametric tests
 
Systematic review and meta analysis applications in medication safety 2
Systematic review and meta analysis applications in medication safety 2Systematic review and meta analysis applications in medication safety 2
Systematic review and meta analysis applications in medication safety 2
 
How to handle discrepancies while you collect data for systemic review – pubrica
How to handle discrepancies while you collect data for systemic review – pubricaHow to handle discrepancies while you collect data for systemic review – pubrica
How to handle discrepancies while you collect data for systemic review – pubrica
 
Statistics and data analysis
Statistics  and data analysisStatistics  and data analysis
Statistics and data analysis
 
Analytical Chemistry and Statistics in Exposure Science
Analytical Chemistry and Statistics in Exposure ScienceAnalytical Chemistry and Statistics in Exposure Science
Analytical Chemistry and Statistics in Exposure Science
 
Regression shrinkage: better answers to causal questions
Regression shrinkage: better answers to causal questionsRegression shrinkage: better answers to causal questions
Regression shrinkage: better answers to causal questions
 
Statistical Approaches to Missing Data
Statistical Approaches to Missing DataStatistical Approaches to Missing Data
Statistical Approaches to Missing Data
 
Research methodology and biostatistics
Research methodology and biostatisticsResearch methodology and biostatistics
Research methodology and biostatistics
 
Common statistical tools used in research and their uses
Common statistical tools used in research and their usesCommon statistical tools used in research and their uses
Common statistical tools used in research and their uses
 
Glossary
GlossaryGlossary
Glossary
 

Similar to A survey on missing information strategies and imputation methods in healthcare

Chapter 19Basic Quantitative Data AnalysisData Cleaning.docx
Chapter 19Basic Quantitative Data AnalysisData Cleaning.docxChapter 19Basic Quantitative Data AnalysisData Cleaning.docx
Chapter 19Basic Quantitative Data AnalysisData Cleaning.docxketurahhazelhurst
 
Capstone poster gail_falcione (1)
Capstone poster gail_falcione (1)Capstone poster gail_falcione (1)
Capstone poster gail_falcione (1)Gail Falcione
 
Data Presentation & Analysis.pptx
Data Presentation & Analysis.pptxData Presentation & Analysis.pptx
Data Presentation & Analysis.pptxheencomm
 
Research EDU821-1.pptx
Research EDU821-1.pptxResearch EDU821-1.pptx
Research EDU821-1.pptxSalmaNiazi2
 
Level of Measurement, Frequency Distribution,Stem & Leaf
Level of Measurement, Frequency Distribution,Stem & Leaf   Level of Measurement, Frequency Distribution,Stem & Leaf
Level of Measurement, Frequency Distribution,Stem & Leaf Qasim Raza
 
Towards reducing the
Towards reducing theTowards reducing the
Towards reducing theIJDKP
 
Evaluation of Logistic Regression and Neural Network Model With Sensitivity A...
Evaluation of Logistic Regression and Neural Network Model With Sensitivity A...Evaluation of Logistic Regression and Neural Network Model With Sensitivity A...
Evaluation of Logistic Regression and Neural Network Model With Sensitivity A...CSCJournals
 
analysing_data_using_spss.pdf
analysing_data_using_spss.pdfanalysing_data_using_spss.pdf
analysing_data_using_spss.pdfDrAnilKannur1
 
Analysis Of Data Using SPSS
Analysis Of Data Using SPSSAnalysis Of Data Using SPSS
Analysis Of Data Using SPSSBrittany Brown
 
A Magnified Application of Deficient Data Using Bolzano Classifier
A Magnified Application of Deficient Data Using Bolzano ClassifierA Magnified Application of Deficient Data Using Bolzano Classifier
A Magnified Application of Deficient Data Using Bolzano Classifierjournal ijrtem
 
Machine learning to solve bioinformatics problems
Machine learning to solve bioinformatics problemsMachine learning to solve bioinformatics problems
Machine learning to solve bioinformatics problemsJunaidAKG
 
Heart Diseases Diagnosis Using Data Mining Techniques
Heart Diseases Diagnosis Using Data Mining TechniquesHeart Diseases Diagnosis Using Data Mining Techniques
Heart Diseases Diagnosis Using Data Mining Techniquespaperpublications3
 
IDENTIFICATION OF OUTLIERS IN OXAZOLINES AND OXAZOLES HIGH DIMENSION MOLECULA...
IDENTIFICATION OF OUTLIERS IN OXAZOLINES AND OXAZOLES HIGH DIMENSION MOLECULA...IDENTIFICATION OF OUTLIERS IN OXAZOLINES AND OXAZOLES HIGH DIMENSION MOLECULA...
IDENTIFICATION OF OUTLIERS IN OXAZOLINES AND OXAZOLES HIGH DIMENSION MOLECULA...IJDKP
 
HEALTH PREDICTION ANALYSIS USING DATA MINING
HEALTH PREDICTION ANALYSIS USING DATA  MININGHEALTH PREDICTION ANALYSIS USING DATA  MINING
HEALTH PREDICTION ANALYSIS USING DATA MININGAshish Salve
 
Es credit scoring_2020
Es credit scoring_2020Es credit scoring_2020
Es credit scoring_2020Eero Siljander
 
Talk on reproducibility in EEG research
Talk on reproducibility in EEG researchTalk on reproducibility in EEG research
Talk on reproducibility in EEG researchDorothy Bishop
 
Biostats2019 5
Biostats2019 5Biostats2019 5
Biostats2019 5daforerog
 

Similar to A survey on missing information strategies and imputation methods in healthcare (20)

Chapter 19Basic Quantitative Data AnalysisData Cleaning.docx
Chapter 19Basic Quantitative Data AnalysisData Cleaning.docxChapter 19Basic Quantitative Data AnalysisData Cleaning.docx
Chapter 19Basic Quantitative Data AnalysisData Cleaning.docx
 
Capstone poster gail_falcione (1)
Capstone poster gail_falcione (1)Capstone poster gail_falcione (1)
Capstone poster gail_falcione (1)
 
Data Presentation & Analysis.pptx
Data Presentation & Analysis.pptxData Presentation & Analysis.pptx
Data Presentation & Analysis.pptx
 
Research EDU821-1.pptx
Research EDU821-1.pptxResearch EDU821-1.pptx
Research EDU821-1.pptx
 
Nursing research design
Nursing research designNursing research design
Nursing research design
 
Level of Measurement, Frequency Distribution,Stem & Leaf
Level of Measurement, Frequency Distribution,Stem & Leaf   Level of Measurement, Frequency Distribution,Stem & Leaf
Level of Measurement, Frequency Distribution,Stem & Leaf
 
Towards reducing the
Towards reducing theTowards reducing the
Towards reducing the
 
Evaluation of Logistic Regression and Neural Network Model With Sensitivity A...
Evaluation of Logistic Regression and Neural Network Model With Sensitivity A...Evaluation of Logistic Regression and Neural Network Model With Sensitivity A...
Evaluation of Logistic Regression and Neural Network Model With Sensitivity A...
 
analysing_data_using_spss.pdf
analysing_data_using_spss.pdfanalysing_data_using_spss.pdf
analysing_data_using_spss.pdf
 
analysing_data_using_spss.pdf
analysing_data_using_spss.pdfanalysing_data_using_spss.pdf
analysing_data_using_spss.pdf
 
Analysis Of Data Using SPSS
Analysis Of Data Using SPSSAnalysis Of Data Using SPSS
Analysis Of Data Using SPSS
 
A Magnified Application of Deficient Data Using Bolzano Classifier
A Magnified Application of Deficient Data Using Bolzano ClassifierA Magnified Application of Deficient Data Using Bolzano Classifier
A Magnified Application of Deficient Data Using Bolzano Classifier
 
Machine learning to solve bioinformatics problems
Machine learning to solve bioinformatics problemsMachine learning to solve bioinformatics problems
Machine learning to solve bioinformatics problems
 
Heart Diseases Diagnosis Using Data Mining Techniques
Heart Diseases Diagnosis Using Data Mining TechniquesHeart Diseases Diagnosis Using Data Mining Techniques
Heart Diseases Diagnosis Using Data Mining Techniques
 
IDENTIFICATION OF OUTLIERS IN OXAZOLINES AND OXAZOLES HIGH DIMENSION MOLECULA...
IDENTIFICATION OF OUTLIERS IN OXAZOLINES AND OXAZOLES HIGH DIMENSION MOLECULA...IDENTIFICATION OF OUTLIERS IN OXAZOLINES AND OXAZOLES HIGH DIMENSION MOLECULA...
IDENTIFICATION OF OUTLIERS IN OXAZOLINES AND OXAZOLES HIGH DIMENSION MOLECULA...
 
Basic concept of statistics
Basic concept of statisticsBasic concept of statistics
Basic concept of statistics
 
HEALTH PREDICTION ANALYSIS USING DATA MINING
HEALTH PREDICTION ANALYSIS USING DATA  MININGHEALTH PREDICTION ANALYSIS USING DATA  MINING
HEALTH PREDICTION ANALYSIS USING DATA MINING
 
Es credit scoring_2020
Es credit scoring_2020Es credit scoring_2020
Es credit scoring_2020
 
Talk on reproducibility in EEG research
Talk on reproducibility in EEG researchTalk on reproducibility in EEG research
Talk on reproducibility in EEG research
 
Biostats2019 5
Biostats2019 5Biostats2019 5
Biostats2019 5
 

Recently uploaded

Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystSamantha Rae Coolbeth
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...Suhani Kapoor
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiSuhani Kapoor
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptxthyngster
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...Suhani Kapoor
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...Suhani Kapoor
 

Recently uploaded (20)

Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data Analyst
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
VIP High Class Call Girls Bikaner Anushka 8250192130 Independent Escort Servi...
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
Decoding Loan Approval: Predictive Modeling in Action
Decoding Loan Approval: Predictive Modeling in ActionDecoding Loan Approval: Predictive Modeling in Action
Decoding Loan Approval: Predictive Modeling in Action
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 

A survey on missing information strategies and imputation methods in healthcare

  • 1. A Survey on Missing Information Strategies and Imputation Methods in Healthcare Presented By Saroj Kumar Pandey Department of Information Technology National Institute of Technology, Raipur (C.G.)
  • 2. Introduction Strategies of missing data Techniques for managing the missing information Supporting tools Conclusion References Outline…
  • 3. Introduction Issue of missing information is generally basic in many existing exploration informational index and can significantly affect the outcomes. An issue in healthcare framework happens when information are absent in at least one spots chance elements. Missing information shows different issues Absence of data reduce the statistical power. Lost data can cause bias in the estimation of parameters. Reduce the representativeness of the samples. Complicate the analysis of the study.
  • 4. Levels of missing information
  • 5. Strategies of missing data Missing completely at random (MCAR) No –No condition Example: Blood pressure measurement is missing because of break down of an automatic sphygmomanometer. Missing at random (MAR) No-Yes condition Example : Missing blood pressure measurement may be lower than measured blood pressure because younger people may have more likely to have missing blood pressure measurement. Missing not at random (MNAR) Yes-Yes condition Examples: Suppose the study is not effective for reducing the blood pressure, there may be a chance of subjects drop out.
  • 6. Techniques for managing the missing information List-wise deletion o Decrease statistical power. o May introduce bias in parameter. o Default option in many statistical package. Pair-wise deletion o Preserve great deal of information than list wise deletion. o Interpretation become difficult. o May lead mathematically inconsistent correlation.
  • 7. Cont… Mean imputation o Involves replacing missing value with the value of the sample mean for that variable. o The oldest most widely used method. Regression imputation o Estimate missing data based on other variables in the data set. o Better than list-wise and pair wise deletion . N xf x   )(
  • 8. Cont… Last observation carried forward o Replaces every missing value with the last observed value from the same subject. o Easy to understand and communicate between the statisticians and clinicians or between a sponsor and the researcher. Maximum likelihood imputation o The assumption that the observed data are a sample drawn from a multivariate normal distribution is relatively easy to understand. o Parameters are estimated using the available data, the missing data are estimated based on the parameters which have just been estimated.
  • 9. Cont… Expectation maximization o Type of the maximum likelihood method that can be used to create a new data set, in which all missing values are imputed with values estimated by the maximum likelihood methods. Multiple Imputations(MI) o Multiple imputation technique is used to replacing missing data value when a data set having more than one missing data. o Every imputed information is examined in the same manner by standard information techniques, and the results are merged using the simple mathematics Expectation step Update variable Maximization step Update hypothesis
  • 10. Supporting tools
    R (RStudio): Supports numerous libraries such as "norm", "cat", "mix", and "pan" for imputing data under multivariate normal models, log-linear models, general location models, and linear mixed models, respectively.
    MATLAB: When missing data are present in the data set, you can fill the missing values with one of the following methods: 'constant', 'previous', 'next', 'nearest', 'linear', 'spline', 'pchip'.
    SAS: PROC MI applies regression methods and propensity scores for imputation.
    IVEware: Imputation and Variance Estimation software for SRMI.
    MICE: Multiple imputation tool using chained equations; the library is available in both S-Plus and R.
  • 11. Conclusion
    This survey focuses on the levels and mechanisms of missing data and on the techniques and tools for managing it.
    Distinguishing what should and should not be imputed is usually not possible when a single code is used for every type of missing value.
    It is difficult to know whether multiple imputation or full maximum likelihood estimation is best, but both are superior to the traditional approaches.
    Both techniques are best used with large samples.