mindcraft.ai
Data Imputation and Restoration using Reverse ML
Data imputation heals spoiled data
Dataset models the world only partially
Input, Transformation, Interpretation
Difference between 0 and NULL
(no item, no info, not available, no input)
Impute or Remove
Types of Item Non-Response
Missing at Random (MAR)
Missing Completely at Random (MCAR)
Missing not at Random (MNAR)
Deletion only for MCAR and, with care, MAR (never MNAR)
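The three mechanisms can be simulated on toy data to see why they matter. The sketch below is illustrative only: the `age` and `income` columns, the 30% rate and the helper names are assumptions, not taken from the slides.

```python
import random

def make_masks(age, income, rate=0.3, seed=0):
    """Return MCAR, MAR and MNAR masks for `income` (True = missing)."""
    rng = random.Random(seed)
    age_cut = sorted(age)[len(age) // 2]            # median age
    income_cut = sorted(income)[len(income) // 2]   # median income
    # MCAR: every value has the same chance of being missing.
    mcar = [rng.random() < rate for _ in income]
    # MAR: missingness depends only on another, fully observed variable.
    mar = [rng.random() < (2 * rate if a >= age_cut else 0.0) for a in age]
    # MNAR: missingness depends on the hidden value itself (high incomes).
    mnar = [rng.random() < (2 * rate if v >= income_cut else 0.0) for v in income]
    return mcar, mar, mnar

def observed_mean(values, mask):
    """Mean of the values that survive deletion."""
    kept = [v for v, m in zip(values, mask) if not m]
    return sum(kept) / len(kept)

# Hypothetical toy data: income is independent of age here.
data_rng = random.Random(42)
age = [data_rng.uniform(20, 60) for _ in range(10_000)]
income = [data_rng.gauss(50, 10) for _ in range(10_000)]

mcar, mar, mnar = make_masks(age, income)
full_mean = sum(income) / len(income)
print("full mean      ", round(full_mean, 2))
print("after MCAR drop", round(observed_mean(income, mcar), 2))
print("after MNAR drop", round(observed_mean(income, mnar), 2))
```

Under MCAR, deleting incomplete rows leaves the mean roughly unchanged. Under MNAR the surviving rows are systematically lower, so deletion biases the estimate.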
Types of Imputation
Univariate imputation: Impute values using only the target variable itself (e.g., the mean).
Multivariate imputation: Impute values based on other variables (e.g., linear regression, LR).
Single imputation: Impute any missing values within the dataset only once to create a single imputed dataset.
Multiple imputation: Impute the same missing values within the dataset multiple times (e.g., MICE).
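The univariate/multivariate distinction can be sketched on a five-point toy dataset. The function names and the data are hypothetical, chosen so that the arithmetic is easy to follow.

```python
def mean_impute(y):
    """Univariate: fill gaps with the mean of the observed values of y."""
    observed = [v for v in y if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in y]

def regression_impute(x, y):
    """Multivariate: fill gaps in y from a least-squares line fitted on x."""
    pairs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    n = len(pairs)
    x_mean = sum(xi for xi, _ in pairs) / n
    y_mean = sum(yi for _, yi in pairs) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in pairs)
    sxx = sum((xi - x_mean) ** 2 for xi, _ in pairs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return [intercept + slope * xi if yi is None else yi
            for xi, yi in zip(x, y)]

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, None]       # toy data following y = 2x

print(mean_impute(y)[-1])            # 5.0  (ignores x entirely)
print(regression_impute(x, y)[-1])   # 10.0 (uses the relation to x)
```

Both are single imputations: each produces exactly one filled-in dataset.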
Imputation methods - Simple and Out-of-the-Box
Remove Data
- multivariate missing?
Deductive Investigation
Zero, Constant
Random (uniform, normal)
Imputation methods - Basic
Mean, Median, Mode:
- reduces variance
- ignores correlation between variables
- NULL category
Linear regression (LR) or any other regression, including neural networks (NN)
- problem in multivariate
KNN, Fuzzy Clustering
- sensitive to outliers
- heavy computation
Reference: https://towardsdatascience.com/6-different-ways-to-compensate-for-missing-values-data-imputation-with-examples-6022d9ca0779
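The variance reduction caused by mean imputation is easy to check by hand. This is a minimal illustration with made-up numbers: four observed values and two gaps filled with the observed mean.

```python
from statistics import mean, pvariance

observed = [2.0, 4.0, 6.0, 8.0]      # hypothetical observed values
n_missing = 2                        # two gaps to fill

# Every gap receives the same value, so no new spread is added.
imputed = observed + [mean(observed)] * n_missing

var_before = pvariance(observed)     # 20 / 4 = 5.0
var_after = pvariance(imputed)       # 20 / 6, about 3.33

print(var_before, round(var_after, 2))
```

The sum of squared deviations stays at 20 while the sample grows from 4 to 6 points, so the variance shrinks in proportion to the share of imputed values.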
Imputation methods - MICE
Multivariate Imputation by Chained Equations
Multiple Regressions
Predictive Mean Matching
Generate values from predictive distributions
Uncertainty and MCMC
References: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3074241/
https://towardsdatascience.com/how-to-handle-missing-data-8646b18db0d4
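The chained-equations idea can be sketched for two columns in plain Python. This is a deliberately simplified illustration, not a full MICE implementation: it draws only residual noise around the fitted line (proper MICE also draws the regression parameters, or uses predictive mean matching), and all names and data are hypothetical.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Least-squares line ys ~ xs; returns slope, intercept, residual std."""
    xm, ym = mean(xs), mean(ys)
    sxx = sum((x - xm) ** 2 for x in xs)
    slope = sum((x - xm) * (y - ym) for x, y in zip(xs, ys)) / sxx
    intercept = ym - slope * xm
    sse = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, (sse / max(len(xs) - 2, 1)) ** 0.5

def chained_imputation(a, b, n_iter=10, seed=0):
    """One imputed copy of columns a and b (None marks a missing value)."""
    rng = random.Random(seed)
    cols = [list(a), list(b)]
    missing = [[i for i, v in enumerate(c) if v is None] for c in cols]
    for c, miss in zip(cols, missing):        # initial guess: column mean
        fill = mean(v for v in c if v is not None)
        for i in miss:
            c[i] = fill
    for _ in range(n_iter):
        for t in (0, 1):                      # regress each column on the other
            target, other = cols[t], cols[1 - t]
            rows = [i for i in range(len(target)) if i not in missing[t]]
            slope, intercept, sd = fit_line([other[i] for i in rows],
                                            [target[i] for i in rows])
            for i in missing[t]:              # draw from the predictive distribution
                target[i] = intercept + slope * other[i] + rng.gauss(0, sd)
    return cols

def multiple_imputation(a, b, m=5):
    """m imputed datasets that differ only in their random draws."""
    return [chained_imputation(a, b, seed=s) for s in range(m)]

# Hypothetical toy data: b is roughly 2a + 1 with small noise.
noise = random.Random(1)
a_true = [float(i) for i in range(1, 11)]
b_true = [2 * x + 1 + noise.gauss(0, 0.2) for x in a_true]
a, b = list(a_true), list(b_true)
a[2] = None                                   # true value 3.0
b[7] = None                                   # true value near 17

datasets = multiple_imputation(a, b)
pooled_a2 = mean(d[0][2] for d in datasets)
pooled_b7 = mean(d[1][7] for d in datasets)
print(round(pooled_a2, 2), round(pooled_b7, 2))
```

The spread of the imputed values across the `m` datasets is what carries the uncertainty into later analysis.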
Imputation methods - Time Series
Last Observation Carried Forward (LOCF)
Next Observation Carried Backward (NOCB)
Interpolation (Linear, RNN)
Seasonal Adjustment + Interpolation
Interpolation -> Extrapolation -> Predictive Models
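The three simplest time-series fills can be written in a few lines each. The helper names and the toy series are hypothetical, and `None` marks a missing observation.

```python
def locf(series):
    """Last Observation Carried Forward; leading gaps stay None."""
    out, last = [], None
    for v in series:
        if v is not None:
            last = v
        out.append(last)
    return out

def nocb(series):
    """Next Observation Carried Backward; trailing gaps stay None."""
    return locf(series[::-1])[::-1]

def linear_interpolate(series):
    """Fill interior gaps on a straight line between neighbouring observations."""
    out = list(series)
    known = [i for i, v in enumerate(out) if v is not None]
    for i, j in zip(known, known[1:]):
        for k in range(i + 1, j):
            out[k] = out[i] + (out[j] - out[i]) * (k - i) / (j - i)
    return out

series = [1.0, None, None, 4.0, None, 6.0]
print(locf(series))                # [1.0, 1.0, 1.0, 4.0, 4.0, 6.0]
print(nocb(series))                # [1.0, 4.0, 4.0, 4.0, 6.0, 6.0]
print(linear_interpolate(series))  # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
```

None of the three can fill a gap past the last observation without extrapolating, which is where predictive models come in.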
Imputation methods - Cleaning
AutoEncoder
Works only for a limited amount of missing data
Reference: https://github.com/andy-bosyi/articles/blob/master/AutoEncoder-MNIST-clean.ipynb
Imputation methods - Generative Networks
VAE
GAIN
MisGAN
VIGAN
CollaGAN
Reference: https://towardsdatascience.com/gans-and-missing-data-imputation-815a0cbc4ece
Reverse ML - Training AutoEncoder
Add Dropout
Regularization:
Reference: https://github.com/andy-bosyi/articles/blob/master/ReverseTrainedAutoEncoder-MNIST.ipynb
Reverse ML - Direct AE Application
Original Data
Missing 36%
Restored by AE
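A corruption step like "Missing 36%" could be simulated as follows. This is only a sketch under stated assumptions: the slides do not say how the pixels were removed or how restoration quality was scored, so the random zeroing, the error measure on hidden pixels and every name here are hypothetical. The synthetic `image` merely stands in for a flattened 28x28 MNIST digit.

```python
import random

def mask_pixels(pixels, missing_rate=0.36, seed=0):
    """Zero out a fixed share of pixels; return the corrupted copy and the mask."""
    rng = random.Random(seed)
    n_missing = round(missing_rate * len(pixels))
    hidden = set(rng.sample(range(len(pixels)), n_missing))
    corrupted = [0.0 if i in hidden else p for i, p in enumerate(pixels)]
    mask = [i in hidden for i in range(len(pixels))]   # True = hidden pixel
    return corrupted, mask

def restoration_error(original, restored, mask):
    """Mean squared error measured only on the pixels that were hidden."""
    errs = [(o - r) ** 2 for o, r, m in zip(original, restored, mask) if m]
    return sum(errs) / len(errs)

# Synthetic stand-in for a flattened 28x28 image with values in [0, 1).
image = [((i * 7) % 10) / 10 for i in range(28 * 28)]

corrupted, mask = mask_pixels(image)
print(sum(mask), "of", len(image), "pixels hidden")    # 282 of 784
print(restoration_error(image, corrupted, mask))       # error with no restoration
```

Any restoration model can then be scored by passing its output as `restored`. A perfect restoration gives an error of 0.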
Reverse ML - Result on AE Input
Original Data
Missing 36%
Restored by RTAE (reverse-trained autoencoder) as Input
Reverse ML - Result on AE Output
Original Data
Missing 36%
Restored by RTAE as Output
Reverse ML - Results and Conclusion
AE: Acc = 90.56%
RTAE: Acc = 96.22%
Better accuracy than classical methods
Requires more computational resources
Stable compared with generative models
Scalability
Reference: https://github.com/andy-bosyi/articles/blob/master/ReverseTrainedAutoEncoder-MNIST.ipynb
This is MindCraft
Decision-making Engines for Data-driven Businesses, especially:
- Document and Web Page Classification and Capturing (NLP, CNN, CV, NER)
- Price Prediction (DNN, Regression, Prognosis)
- Command Centers for IoT systems (RNN, Time Series, Anomaly Detection)
- Computer Vision and Object Detection
- Data Analysis and Generation
Andy Bosyi: Data Imputation using Reverse ML
