MisGAN
Learning from Incomplete Data with Generative Adversarial Networks
Steven Cheng-Xian Li
University of Massachusetts Amherst
Jihoo Kim
datartist@hanyang.ac.kr
Dept. of Computer and Software, Hanyang University
ICLR’19
Abstract
GANs provide an effective way to model complex distributions.
However, typical GANs require fully-observed data during training.
In this paper, we present a GAN-based framework for learning from complex, high-dimensional incomplete data.
The proposed framework learns a complete data generator
along with a mask generator that models the missing data distribution.
We evaluate the proposed framework under the MCAR assumption.
1. Introduction
Unlike likelihood-based methods, GANs are implicit probabilistic models
which represent a probability distribution through a generator
that learns to directly produce samples from the desired distribution.
GANs have been shown to be very successful in a range of applications
- Generating photorealistic images (2018)
- Image inpainting (2016, 2017)
Training GANs normally requires access to a large collection of fully-observed data.
However, it is not always possible to obtain a large amount of fully-observed data.
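The generative process for incompletely observed data (Little & Rubin, 2014):
$x \sim p_\theta(x), \quad m \sim p_\phi(m \mid x)$
- $x$: a complete data vector
- $m$: a binary mask that determines which entries in $x$ to reveal
- $\theta$: the unknown parameters of the data distribution
- $\phi$: the unknown parameters of the mask distribution
- $x_{obs}$: the observed elements of $x$
- $x_{mis}$: the missing elements of $x$ according to the mask $m$
The unknown parameters are estimated by maximizing the following marginal likelihood:
$p(x_{obs}, m) = \int p_\theta(x_{obs}, x_{mis}) \, p_\phi(m \mid x_{obs}, x_{mis}) \, dx_{mis}$
Little & Rubin (2014) characterize the missing data mechanism $p_\phi(m \mid x_{obs}, x_{mis})$
in terms of independence between the complete data $x = [x_{obs}, x_{mis}]$ and the masks $m$:
① Missing completely at random (MCAR): $p_\phi(m \mid x) = p_\phi(m)$
② Missing at random (MAR): $p_\phi(m \mid x) = p_\phi(m \mid x_{obs})$
③ Not missing at random (NMAR): $m$ depends on $x_{mis}$, and possibly also on $x_{obs}$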
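To make the three missing data mechanisms of Little & Rubin concrete, here is a minimal NumPy sketch that samples masks under each of them. The function names and the particular dependence on the data (a logistic function of the values) are illustrative assumptions, not taken from the paper; 1 means observed and 0 means missing.

```python
import numpy as np

rng = np.random.default_rng(0)

def mcar_mask(x, p_obs=0.5):
    """MCAR: every entry is observed with a fixed probability, independent of x."""
    return (rng.random(x.shape) < p_obs).astype(int)

def mar_mask(x, always_observed=0):
    """MAR (illustrative): one column is always revealed, and the probability of
    revealing the other columns depends only on that observed column."""
    p_obs = 1.0 / (1.0 + np.exp(-x[:, [always_observed]]))  # shape (N, 1)
    m = (rng.random(x.shape) < p_obs).astype(int)
    m[:, always_observed] = 1
    return m

def nmar_mask(x):
    """NMAR (illustrative): an entry is more likely to be hidden when its own
    value is large, so missingness depends on the missing values themselves."""
    p_obs = 1.0 / (1.0 + np.exp(x))
    return (rng.random(x.shape) < p_obs).astype(int)

x = rng.normal(size=(1000, 3))
for name, m in [("MCAR", mcar_mask(x)), ("MAR", mar_mask(x)), ("NMAR", nmar_mask(x))]:
    print(name, "observed fraction per column:", m.mean(axis=0).round(2))
```

Under the NMAR sketch the observed values are biased toward small numbers, which is why this mechanism cannot simply be ignored when estimating the data distribution.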
Most work on incomplete data assumes MCAR or MAR since under these assumptions
$p(x_{obs}, m)$ can be factorized into $p_\theta(x_{obs}) \, p_\phi(m \mid x_{obs})$.
→ The missing data mechanism can be ignored when learning the data generating model
while yielding correct estimates for θ.
When $p_\theta(x)$ does not admit efficient marginalization over $x_{mis}$, estimation of θ is usually
performed by maximizing a variational lower bound.
The primary contribution of this paper is the development of a GAN-based framework for
learning high-dimensional data distributions in the presence of incomplete observations.
Our framework introduces an auxiliary GAN for learning a mask distribution to model
the missingness.
The masks are used to “mask” generated complete data by filling the indicated missing
entries with a constant value.
The complete data generator is trained so that the resulting masked data are
indistinguishable from real incomplete data that are masked similarly.
Our framework builds on the ideas of AmbientGAN (2018).
AmbientGAN modifies the discriminator of a GAN to distinguish corrupted real samples
from corrupted generated samples under a range of corruption processes.
Missing data can be seen as a special type of corruption.
AmbientGAN assumes the measurement process is known, or parameterized by only a few parameters,
which is not the case in general missing data problems.
We provide empirical evidence that the proposed framework is able to effectively learn
complex, high-dimensional data distributions from highly incomplete data.
We further show how the architecture can be used to generate high-quality imputations.
2. MisGAN: A GAN for Missing Data
Incomplete data: $\mathcal{D} = \{(x_i, m_i)\}_{i=1,\dots,N}$
- $x_i$: a partially-observed data vector
- $m_i$: a corresponding mask
- $m_{i,d} = 1$: $x_{i,d}$ is observed.
- $m_{i,d} = 0$: $x_{i,d}$ is missing and contains an arbitrary value that we should ignore.
It leads to a cleaner description of the proposed MisGAN.
It suggests how MisGAN can be implemented efficiently.
Instead of …
Two key ideas…
1. We explicitly model the missing data process using a mask generator.
Since the masks in the incomplete dataset are fully observed, we can estimate their distribution.
2. We train the complete data generator adversarially by masking its outputs using generated
masks, and comparing to real incomplete data that are similarly masked.
Masking operator $f_\tau(x, m) = x \odot m + \tau \bar{m}$, with $\bar{m} = 1 - m$, fills in missing entries with a constant value $\tau$.
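A minimal NumPy sketch of such a masking operator, assuming the convention that a mask value of 1 means observed and 0 means missing. The function name `mask_data` and the example values are hypothetical.

```python
import numpy as np

def mask_data(x, m, tau=0.0):
    """Keep entries where m == 1 and fill entries where m == 0 with the constant tau."""
    return x * m + tau * (1 - m)

x = np.array([[0.2, 0.9, 0.5],
              [0.7, 0.1, 0.3]])
m = np.array([[1, 0, 1],
              [0, 0, 1]])

print(mask_data(x, m, tau=0.0))   # missing entries become 0.0
print(mask_data(x, m, tau=0.5))   # missing entries become 0.5
```

Because the output has the same shape as a complete data vector, it can be passed to a discriminator designed for fully-observed data.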
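We use two generator-discriminator pairs: $(G_m, D_m)$ for the masks and $(G_x, D_x)$ for the data.
We focus on MCAR, where the two generators are independent of each other
and have their own noise distributions $p_\varepsilon$ and $p_z$.
Loss function for the masks (real mask vs. fake mask):
$L_m(D_m, G_m) = \mathbb{E}_{(x,m) \sim p_{\mathcal{D}}}[D_m(m)] - \mathbb{E}_{\varepsilon \sim p_\varepsilon}[D_m(G_m(\varepsilon))]$
Loss function for the data (real data vs. fake data), where $f_\tau$ is the masking operator above:
$L_x(D_x, G_x, G_m) = \mathbb{E}_{(x,m) \sim p_{\mathcal{D}}}[D_x(f_\tau(x, m))] - \mathbb{E}_{\varepsilon \sim p_\varepsilon, z \sim p_z}[D_x(f_\tau(G_x(z), G_m(\varepsilon)))]$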
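We optimize the generators and the discriminators according to the following objectives,
where $L_m$ is the loss function for the masks and $L_x$ is the loss function for the data:
$\min_{G_x} \max_{D_x \in \mathcal{F}_x} L_x(D_x, G_x, G_m)$
$\min_{G_m} \max_{D_m \in \mathcal{F}_m} L_m(D_m, G_m) + \alpha L_x(D_x, G_x, G_m)$
The losses above follow the Wasserstein GAN formulation (Arjovsky, 2017),
so $\mathcal{F}_x$ and $\mathcal{F}_m$ restrict $D_x$ and $D_m$ to 1-Lipschitz functions.
$\alpha$ is a coefficient. We find that choosing a small value
such as $\alpha = 0.2$ improves performance.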
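The sketch below shows how the two Wasserstein-style losses and the coefficient α fit together on one batch. It is an illustration under stated assumptions, not the authors' implementation: the four "networks" are hypothetical fixed random linear maps, the dataset is synthetic, and no Lipschitz constraint or gradient step is included.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4  # toy data dimension

def f_tau(x, m, tau=0.0):
    """Masking operator: keep observed entries, fill the rest with tau."""
    return x * m + tau * (1 - m)

# Hypothetical stand-ins for the generators and discriminators (critics).
W_gx, W_gm = rng.normal(size=(2, n)), rng.normal(size=(2, n))
w_dx, w_dm = rng.normal(size=n), rng.normal(size=n)

def G_x(z):
    return np.tanh(z @ W_gx)                    # complete data generator

def G_m(eps):
    return 1.0 / (1.0 + np.exp(-(eps @ W_gm)))  # relaxed masks in [0, 1]

def D_x(v):
    return v @ w_dx                             # data critic score

def D_m(m):
    return m @ w_dm                             # mask critic score

# Synthetic "real" incomplete data: values plus fully observed binary masks.
x_real = rng.normal(size=(256, n))
m_real = (rng.random((256, n)) < 0.5).astype(float)

# Independent noise sources for the two generators (MCAR setting).
z, eps = rng.normal(size=(256, 2)), rng.normal(size=(256, 2))
x_fake, m_fake = G_x(z), G_m(eps)

# Critic score on real samples minus critic score on generated samples.
loss_m = D_m(m_real).mean() - D_m(m_fake).mean()
loss_x = D_x(f_tau(x_real, m_real)).mean() - D_x(f_tau(x_fake, m_fake)).mean()

alpha = 0.2
generator_objective_m = loss_m + alpha * loss_x  # what the mask generator minimizes
print(loss_m, loss_x, generator_objective_m)
```

In an actual training loop each discriminator would ascend its loss and each generator would descend its objective in alternation, with the Lipschitz constraint enforced by whatever mechanism the chosen WGAN variant uses.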
Wasserstein GAN (Arjovsky, 2017) Facebook AI Research
KL-Divergence and JS-Divergence
Wasserstein GAN (WGAN) proposes a new cost function
using Wasserstein distance that has a smoother gradient everywhere.
Arjovsky et al. (2017) analyze the training problems of GANs mathematically.
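A small numerical illustration of why the Wasserstein distance gives a smoother training signal than the JS divergence. It compares two point masses on a 1-D grid as one of them moves away from the other. The helper names and the grid are assumptions chosen for the example.

```python
import numpy as np

def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions."""
    mid = 0.5 * (p + q)

    def kl(a, b):
        support = a > 0
        return np.sum(a[support] * np.log(a[support] / b[support]))

    return 0.5 * kl(p, mid) + 0.5 * kl(q, mid)

def wasserstein_1d(p, q, grid):
    """Wasserstein-1 distance on a sorted 1-D grid, via the gap between the CDFs."""
    cdf_gap = np.abs(np.cumsum(p) - np.cumsum(q))
    return np.sum(cdf_gap[:-1] * np.diff(grid))

grid = np.linspace(0.0, 1.0, 11)
p = np.zeros_like(grid)
p[0] = 1.0                      # point mass at 0

for k in [1, 5, 10]:
    q = np.zeros_like(grid)
    q[k] = 1.0                  # point mass at grid[k]
    print(grid[k], js_divergence(p, q), wasserstein_1d(p, q, grid))
```

For every shift the JS divergence stays at log 2 because the supports never overlap, while the Wasserstein distance grows with the shift, so only the latter tells a generator in which direction to move.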
The data discriminator takes as input the masked samples as if the data are fully-observed.
This allows us to use any existing architecture designed for complete data.
The masks are binary. Discrete data generating processes have zero gradient almost everywhere.
To carry out gradient-based training for GANs, we relax the output of the mask generator $G_m$ from $\{0,1\}^n$ to $[0,1]^n$.
Note that the discriminator in MisGAN is unaware of which entries are missing in the masked input samples,
and does not even need to know which value is used for masking. (The next section gives a theoretical analysis.)
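One common way to relax binary outputs to the unit interval, sketched here as an assumption rather than as the slides' stated method, is a sigmoid with a temperature parameter: lowering the temperature pushes the outputs toward 0 or 1 while keeping them differentiable. The function name and the temperature values are hypothetical.

```python
import numpy as np

def relaxed_mask(logits, temperature=0.5):
    """Sigmoid with temperature; a lower temperature saturates outputs toward 0 or 1."""
    return 1.0 / (1.0 + np.exp(-logits / temperature))

logits = np.array([-2.0, -0.5, 0.5, 2.0])
for t in [1.0, 0.5, 0.1]:
    print(t, relaxed_mask(logits, t).round(3))
```

The relaxed masks can be used directly in the masking operator, since it only multiplies and adds.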
3. Theoretical Results
Two important questions
Q1. Does the choice of the filled-in value affect the ability to recover the data distribution?
Q2. Does information about the location of missing values affect the ability to recover the data distribution?
4. Missing Data Imputation
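We show how to impute missing data according to $p(x_{mis} \mid x_{obs})$
by equipping MisGAN with an imputer $G_i$ accompanied by a corresponding discriminator $D_i$.
The imputer also takes a random vector $\omega$ drawn from a noise distribution $p_\omega$.
Each $\mathcal{F}$ below restricts the corresponding discriminator to 1-Lipschitz functions.
Loss function for the masks, with $\alpha = 0.2$:
$\min_{G_m} \max_{D_m \in \mathcal{F}_m} L_m(D_m, G_m) + \alpha L_x(D_x, G_x, G_m)$
This encourages the generated masks to match the distribution of the real masks
and the masked generated complete samples to match masked real data.
Loss function for the data, with $\beta = 0.1$:
$\min_{G_x} \max_{D_x \in \mathcal{F}_x} L_x(D_x, G_x, G_m) + \beta L_i(D_i, G_i, G_x)$
This encourages the generated complete data to match the distribution of the imputed real data,
in addition to having the masked generated data match the masked real data.
Loss function for the imputer:
$\min_{G_i} \max_{D_i \in \mathcal{F}_i} L_i(D_i, G_i, G_x)$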
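A minimal sketch of how an imputer can be wrapped so that it only ever changes the missing entries. The composition used here (noise in the missing slots on the way in, observed values restored on the way out) is an assumption for illustration, and `toy_imputer` is a hypothetical stand-in for a trained network.

```python
import numpy as np

rng = np.random.default_rng(0)

def impute(x, m, imputer, omega):
    """Fill only the missing entries; observed entries of x pass through unchanged."""
    inputs = x * m + omega * (1 - m)    # hide the missing values behind noise
    proposal = imputer(inputs, m)       # the network proposes a full vector
    return x * m + proposal * (1 - m)   # keep the observed part intact

def toy_imputer(inputs, m):
    """Hypothetical placeholder network: predicts the per-row mean of its input."""
    return np.broadcast_to(inputs.mean(axis=1, keepdims=True), inputs.shape)

x = rng.random((2, 5))
m = np.array([[1, 1, 0, 0, 1],
              [0, 1, 1, 1, 0]])
omega = rng.random(x.shape)             # one noise draw per example
x_hat = impute(x, m, toy_imputer, omega)

print(np.allclose(x_hat[m == 1], x[m == 1]))  # observed entries are untouched
```

Drawing a different `omega` for the same incomplete input yields a different completion, which is how a single imputer can produce a variety of imputed results.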
We can also train a stand-alone imputer using only the imputer objective
with a pre-trained data generator $G_x$.
Moreover, it is also possible to train the imputer to target a different missing distribution
with a pre-trained data generator alone, without access to the original (incomplete) training data.
5. Experiments
Data
- MNIST: 28x28 handwritten digit images
- CIFAR-10: 32x32 color images from 10 classes
- CelebA: 64x64 face images (202,599)
The range of pixel values is rescaled to [0, 1].
Missing data distributions
- Square observation: all pixels are missing except for a square occurring at a random location on the image
- Dropout: each pixel is independently missing according to a Bernoulli distribution
- Variable-size rectangular observation: all pixels are missing except for a rectangular observed region (width and height are drawn from 25% to 75% of the image length)
Evaluation metric
- FID (Heusel, 2017)
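The three missing data distributions can be sketched as mask samplers in NumPy. The function names, the uniform sampling of rectangle sizes, and the example arguments (a 14-pixel square, a 50% missing rate) are assumptions made for the illustration; 1 marks an observed pixel.

```python
import numpy as np

rng = np.random.default_rng(0)

def square_observation(size, side):
    """Observe only a side x side square placed at a random location."""
    m = np.zeros((size, size), dtype=int)
    r, c = rng.integers(0, size - side + 1, size=2)
    m[r:r + side, c:c + side] = 1
    return m

def dropout(size, missing_rate):
    """Each pixel is missing independently with probability missing_rate."""
    return (rng.random((size, size)) >= missing_rate).astype(int)

def variable_rectangle(size):
    """Observe a rectangle whose height and width are 25% to 75% of the image length."""
    lo, hi = int(0.25 * size), int(0.75 * size)
    h, w = rng.integers(lo, hi + 1, size=2)
    r = rng.integers(0, size - h + 1)
    c = rng.integers(0, size - w + 1)
    m = np.zeros((size, size), dtype=int)
    m[r:r + h, c:c + w] = 1
    return m

for name, m in [("square", square_observation(28, 14)),
                ("dropout", dropout(28, 0.5)),
                ("rectangle", variable_rectangle(28))]:
    print(name, "observed pixels:", m.sum())
```

All three samplers ignore the image content, so they are instances of the MCAR setting.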
5.1 Empirical Study of MisGAN on MNIST
1. Architectures
- Conv-MisGAN: MisGAN with convolutional networks, DCGAN (Radford, 2015)
- FC-MisGAN: MisGAN with fully connected networks
2. Baseline
- ConvAC: the generative convolutional arithmetic circuit (Sharir, 2016)
  → capable of learning from large-scale incomplete data
3. Results
- Figure 3, Figure 4, Figure 5, Figure 6 (next slides)
[Figure: training samples, generated data samples, and generated mask samples for Conv-MisGAN and FC-MisGAN]
MisGAN outperforms ConvAC.
[Figure: data samples and mask samples generated by Conv-MisGAN, and data samples generated by MisGAN, for square and variable-size observation]
4. Ablation study
The mask discriminator in MisGAN is important for learning the correct distribution.
The figure shows two failure cases of AmbientGAN, which is essentially equivalent to a MisGAN without the mask discriminator.
[Figure: generated data samples and generated mask samples for each of the two failure cases]
5. Missing data imputation
Inside of box → observed pixels
Outside of box → generated pixels
Each row → same incomplete input
The imputer can produce a variety of different imputed results.
5.2 Quantitative Evaluation
We focus on evaluating MisGAN on the missing data imputation task.
1. Baselines
- zero/mean imputation
- matrix factorization
- GAIN (Generative Adversarial Imputation Network)
2. Evaluation of imputation
- FID between the imputed data and the original fully-observed data
3. Architecture
- For MNIST → fully-connected imputer network
- For CIFAR-10 and CelebA → five-layer U-Net architecture (Ronneberger, 2015)
4. Results
- Next slides...
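FID compares two sets of feature vectors by fitting a Gaussian to each and computing the Fréchet distance between them. The sketch below covers only that last step in NumPy and assumes the features have already been extracted (FID proper uses activations of a pretrained Inception network, which is omitted here). Computing the trace of the matrix square root from eigenvalues is a choice made for this example to avoid extra dependencies.

```python
import numpy as np

def frechet_distance(feats_a, feats_b):
    """Fréchet distance between Gaussians fitted to two sets of feature vectors."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the eigenvalues
    # of cov_a @ cov_b, which are real and non-negative for PSD covariances.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return np.sum((mu_a - mu_b) ** 2) + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt

rng = np.random.default_rng(0)
original = rng.normal(size=(500, 8))   # stand-in features of fully-observed data
shifted = original + 3.0               # stand-in features of poorly imputed data

print(frechet_distance(original, original))
print(frechet_distance(original, shifted))
```

A lower value means the imputed data are statistically closer to the fully-observed data in feature space.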
MisGAN consistently outperforms other methods in all cases, especially under high missing rates.
Training MisGAN is more stable than training GAIN.
6. Discussion and Future Work
This work presents and evaluates a highly flexible framework for learning
standard GAN data generators in the presence of missing data.
We only focus on the MCAR case in this work.
MisGAN can be easily extended to both the MAR and NMAR cases.
We have tried the modified architecture and it showed similar results.
This suggests that the extra dependencies may not adversely affect learnability.
We leave the formal evaluation of this modified framework for future work.

 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptx
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 

[Paper Review] MisGAN: Learning from Incomplete Data with Generative Adversarial Networks (ICLR'19)

  • 1. MisGAN: Learning from Incomplete Data with Generative Adversarial Networks (ICLR’19). Steven Cheng-Xian Li, University of Massachusetts Amherst. Jihoo Kim (datartist@hanyang.ac.kr), Dept. of Computer and Software, Hanyang University.
  • 2. Abstract. GANs provide an effective way to model complex distributions, but typical GANs require fully observed data during training. In this paper, we present a GAN-based framework for learning from complex, high-dimensional incomplete data. The proposed framework learns a complete data generator along with a mask generator that models the missing data distribution. We evaluate the proposed framework under the MCAR assumption.
  • 3. 1. Introduction. Unlike likelihood-based methods, GANs are implicit probabilistic models, which represent a probability distribution through a generator that learns to directly produce samples from the desired distribution. GANs have been shown to be very successful in a range of applications: generating photorealistic images (2018) and image inpainting (2016, 2017). Training GANs normally requires access to a large collection of fully observed data. However, it is not always possible to obtain a large amount of fully observed data.
  • 4. 1. Introduction. The generative process for incompletely observed data (Little & Rubin, 2014) involves: x, a complete data vector; m, a binary mask that determines which entries in x to reveal; θ, the unknown parameters of the data distribution; the unknown parameters of the mask distribution; the observed elements of x; and the missing elements of x according to the mask m.
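The generative process on slide 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Gaussian data distribution, the independent Bernoulli mask (an MCAR-style mask that ignores x), and the names `sample_complete`, `sample_mask`, and `split_by_mask` are all hypothetical and do not come from the slides.

```python
import random


def sample_complete(rng, n=6):
    # Hypothetical data distribution: n i.i.d. standard normal entries.
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def sample_mask(rng, n=6, keep_prob=0.5):
    # Hypothetical mask distribution: each entry is revealed independently
    # of x with probability keep_prob (1 = observed, 0 = missing).
    return [1 if rng.random() < keep_prob else 0 for _ in range(n)]


def split_by_mask(x, m):
    # Split x into its observed and missing entries, keyed by position.
    x_obs = {i: v for i, (v, b) in enumerate(zip(x, m)) if b == 1}
    x_mis = {i: v for i, (v, b) in enumerate(zip(x, m)) if b == 0}
    return x_obs, x_mis


rng = random.Random(0)
x = sample_complete(rng)          # complete data vector
m = sample_mask(rng)              # binary mask
x_obs, x_mis = split_by_mask(x, m)
```

In a real incomplete dataset only `x_obs` and `m` would be available; `x_mis` is kept here just to make the split explicit.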
  • 5. 1. Introduction. The unknown parameters are estimated by maximizing the marginal likelihood of the observed elements and the mask. Little & Rubin (2014) characterize the missing data mechanism in terms of independence between the complete data x and the masks m: ① MCAR (missing completely at random), ② MAR (missing at random), ③ NMAR (not missing at random).
  • 6. 1. Introduction. Most work on incomplete data assumes MCAR or MAR, since under these assumptions the marginal likelihood can be factorized into a term for the observed data and a term for the mask. → The missing data mechanism can be ignored when learning the data generating model while yielding correct estimates for θ. When the data model does not admit efficient marginalization over the missing entries, estimation of θ is usually performed by maximizing a variational lower bound.
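The equations on slides 5 and 6 were images and are not part of this transcript. In the standard notation of Little & Rubin, which is assumed here (x_obs and x_mis for the observed and missing parts of x, φ for the parameters of the mask distribution), the marginal likelihood and its factorization are commonly written as:

```latex
\begin{align}
p(x_{\mathrm{obs}}, m)
  &= \int p_\theta(x_{\mathrm{obs}}, x_{\mathrm{mis}})\,
          p_\phi(m \mid x_{\mathrm{obs}}, x_{\mathrm{mis}})\, dx_{\mathrm{mis}} \\
\text{MCAR:}\quad p_\phi(m \mid x) &= p_\phi(m) \\
\text{MAR:}\quad p_\phi(m \mid x) &= p_\phi(m \mid x_{\mathrm{obs}}) \\
\text{MCAR or MAR:}\quad p(x_{\mathrm{obs}}, m)
  &= p_\phi(m \mid x_{\mathrm{obs}}) \int p_\theta(x_{\mathrm{obs}}, x_{\mathrm{mis}})\, dx_{\mathrm{mis}}
   = p_\phi(m \mid x_{\mathrm{obs}})\, p_\theta(x_{\mathrm{obs}})
\end{align}
```

Under NMAR the mask depends on x_mis, so the mask term cannot be pulled out of the integral. In the factorized form only the second factor depends on θ, which is why the missing data mechanism can be ignored when estimating θ.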
  • 7. 1. Introduction The primary contribution of this paper is the development of a GAN-based framework for learning high-dimensional data distributions in the presence of incomplete observations. Our framework introduces an auxiliary GAN for learning a mask distribution to model the missingness. The masks are used to “mask” generated complete data by filling the indicated missing entries with a constant value. The complete data generator is trained so that the resulting masked data are indistinguishable from real incomplete data that are masked similarly.
  • 8. 1. Introduction. Our framework builds on the ideas of AmbientGAN (2018). AmbientGAN modifies the discriminator of a GAN to distinguish corrupted real samples from corrupted generated samples under a range of corruption processes. Missing data can be seen as a special type of corruption. However, AmbientGAN assumes the measurement process is known or parameterized by only a few parameters, which is not the case in general missing data problems.
  • 9. 1. Introduction. We provide empirical evidence that the proposed framework is able to effectively learn complex, high-dimensional data distributions from highly incomplete data. We further show how the architecture can be used to generate high-quality imputations.
  • 10. 2. MisGAN: A GAN for Missing Data. Incomplete data are represented as pairs of a partially-observed data vector and a corresponding mask. A mask entry of 1 means the corresponding data entry is observed; a mask entry of 0 means it is missing and contains an arbitrary value that we should ignore. This representation leads to a cleaner description of the proposed MisGAN and suggests how MisGAN can be implemented efficiently. Instead of …
  • 11. 2. MisGAN: A GAN for Missing Data. Two key ideas: 1. We explicitly model the missing data process using a mask generator. Since the masks in the incomplete dataset are fully observed, we can estimate their distribution. 2. We train the complete data generator adversarially by masking its outputs using generated masks and a masking operator, and comparing to real incomplete data that are similarly masked. The masking operator fills in missing entries with a constant value.
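The masking operator on slide 11 can be sketched as an elementwise blend of the data with a constant. The name `mask_fill` and the parameter name `tau` are assumptions made for this illustration; the slide only states that missing entries are filled with a constant value.

```python
def mask_fill(x, m, tau=0.0):
    """Keep x where m is 1 and write the constant tau where m is 0."""
    if len(x) != len(m):
        raise ValueError("x and m must have the same length")
    # Elementwise x * m + tau * (1 - m); this also accepts relaxed
    # masks with entries between 0 and 1.
    return [xi * mi + tau * (1 - mi) for xi, mi in zip(x, m)]


masked_zero = mask_fill([0.3, 0.9, 0.5], [1, 0, 1], tau=0.0)  # [0.3, 0.0, 0.5]
masked_half = mask_fill([0.3, 0.9, 0.5], [1, 0, 1], tau=0.5)  # [0.3, 0.5, 0.5]
```

Whatever value sat in the missing slot (0.9 here) never reaches the output, which matches the idea that missing entries hold arbitrary values that should be ignored.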
  • 12. 2. MisGAN: A GAN for Missing Data. We use two generator-discriminator pairs, one for the masks and one for the data. We focus on MCAR, where the two generators are independent of each other and have their own noise distributions. Loss function for the masks (real mask vs. fake mask). Loss function for the data (real data vs. fake data).
  • 13. 2. MisGAN: A GAN for Missing Data. We optimize the generators and the discriminators according to objectives built from the loss function for the masks and the loss function for the data. These losses follow the Wasserstein GAN formulation (Arjovsky, 2017). We find that choosing a small value for the coefficient α, such as α = 0.2, improves performance.
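The loss equations on slides 12 and 13 were images and did not survive in this transcript. A sketch in the Wasserstein style the slide names is given below. All notation is assumed: G_x and D_x are the data generator and discriminator, G_m and D_m the mask generator and discriminator, f_τ the masking operator with fill value τ, p_D the distribution of the incomplete dataset, and z and ε the two noise variables.

```latex
\begin{align}
\mathcal{L}_m(D_m, G_m)
  &= \mathbb{E}_{(x,m)\sim p_{\mathcal{D}}}\big[D_m(m)\big]
   - \mathbb{E}_{\varepsilon\sim p_\varepsilon}\big[D_m(G_m(\varepsilon))\big] \\
\mathcal{L}_x(D_x, G_x, G_m)
  &= \mathbb{E}_{(x,m)\sim p_{\mathcal{D}}}\big[D_x(f_\tau(x, m))\big]
   - \mathbb{E}_{\varepsilon\sim p_\varepsilon,\, z\sim p_z}\big[D_x(f_\tau(G_x(z), G_m(\varepsilon)))\big] \\
&\min_{G_x}\max_{D_x}\ \mathcal{L}_x(D_x, G_x, G_m) \\
&\min_{G_m}\max_{D_m}\ \mathcal{L}_m(D_m, G_m) + \alpha\,\mathcal{L}_x(D_x, G_x, G_m)
\end{align}
```

In this form the coefficient α controls how strongly the data loss feeds back into the mask generator; α = 0 would train the mask GAN on the masks alone.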
  • 14. Wasserstein GAN (Arjovsky, 2017) Facebook AI Research KL-Divergence and JS-Divergence
  • 15. Wasserstein GAN (Arjovsky, 2017), Facebook AI Research. Wasserstein GAN (WGAN) proposes a new cost function based on the Wasserstein distance, which has a smoother gradient everywhere. Arjovsky et al. (2017) analyze the GAN training problem mathematically.
  • 16. Wasserstein GAN (Arjovsky, 2017) Facebook AI Research
  • 17. 2. MisGAN: A GAN for Missing Data. Note that: The data discriminator takes the masked samples as input as if the data were fully observed. This allows us to use any existing architecture designed for complete data. The masks are binary, and discrete data generating processes have zero gradient almost everywhere. To carry out gradient-based training for GANs, we relax the output of the mask generator from binary values to continuous values in [0, 1]. The discriminator in MisGAN is unaware of which entries are missing in the masked input samples, and does not even need to know which value is used for masking (theoretical analysis in the next section).
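One common way to relax a binary output, shown here purely as an assumed illustration since the slide does not say which relaxation is used, is a sigmoid with a temperature. The function names and the temperature value 0.5 are hypothetical.

```python
import math


def _sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def relaxed_mask(logits, temperature=0.5):
    """Map real-valued mask logits to soft mask values between 0 and 1.

    Lower temperatures push the outputs closer to 0 or 1 while keeping
    the mapping differentiable in the logits.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    return [_sigmoid(z / temperature) for z in logits]


soft = relaxed_mask([-3.0, 0.0, 3.0])
sharp = relaxed_mask([-3.0, 0.0, 3.0], temperature=0.1)
```

The soft masks can be passed straight into an elementwise masking step, so gradients flow from the data discriminator back to the mask generator.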
  • 18. 2. MisGAN: A GAN for Missing Data
  • 19. 3. Theoretical Results. Two important questions: Q1. Does the choice of the filled-in value affect the ability to recover the data distribution? Q2. Does information about the location of missing values affect the ability to recover the data distribution?
  • 24. 4. Missing Data Imputation. We show how to impute missing data according to the conditional distribution of the missing entries given the observed ones, by equipping MisGAN with an imputer accompanied by a corresponding discriminator. Three losses are used: the loss function for the masks, the loss function for the data, and the loss function for the imputer, which has its own noise distribution. Coefficients: α = 0.2, β = 0.1. The mask objective encourages the generated masks to match the distribution of the real masks and the masked generated complete samples to match masked real data. The data objective encourages the generated complete data to match the distribution of the imputed real data, in addition to having the masked generated data match the masked real data.
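A minimal sketch of how an imputer of this kind can be wired up is shown below. It assumes, beyond what the slide states, that the imputer keeps observed entries untouched, feeds noise into the missing slots of its input, and only uses the network output at missing positions. `impute` and `toy_net` are hypothetical names, and `toy_net` is a stand-in for a trained network.

```python
import random


def impute(x, m, imputer_net, rng):
    """Return x with missing entries (m == 0) replaced by network outputs."""
    if len(x) != len(m):
        raise ValueError("x and m must have the same length")
    # Fill missing input slots with noise so the arbitrary values stored
    # in x at those positions are never seen by the network.
    noise = [rng.random() for _ in x]
    net_in = [xi * mi + wi * (1 - mi) for xi, mi, wi in zip(x, m, noise)]
    net_out = imputer_net(net_in)
    # Observed entries pass through; missing entries come from the network.
    return [xi * mi + oi * (1 - mi) for xi, mi, oi in zip(x, m, net_out)]


def toy_net(v):
    # Stand-in for a trained imputer network: predicts the input mean.
    mean = sum(v) / len(v)
    return [mean] * len(v)


rng = random.Random(0)
x = [0.2, 99.0, 0.8]   # 99.0 is an arbitrary placeholder at a missing slot
m = [1, 0, 1]
x_hat = impute(x, m, toy_net, rng)
```

Because the noise is resampled on every call, repeated calls on the same incomplete input can yield different completions.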
  • 25. 4. Missing Data Imputation. We can also train a stand-alone imputer using only the imputer loss with a pre-trained data generator. Moreover, it is also possible to train the imputer to target a different missing distribution with a pre-trained data generator alone, without access to the original (incomplete) training data.
  • 26. 4. Missing Data Imputation
  • 27. 5. Experiments. Data: MNIST (28x28 handwritten digit images); CIFAR-10 (32x32 color images from 10 classes); CelebA (64x64 face images; 202,599). The range of pixel values is rescaled. Missing data distributions: square observation (all pixels are missing except for a square occurring at a random location on the image); dropout (each pixel is independently missing according to a Bernoulli distribution); variable-size rectangular observation (all pixels are missing except for a rectangular observed region whose width and height are drawn from 25% to 75% of the image length). Evaluation metric: FID (Heusel, 2017).
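The three missing data distributions on slide 27 can be sketched as mask samplers, where 1 marks an observed pixel and 0 a missing one. The function names are hypothetical, and drawing positions and side lengths uniformly at random is an assumption; the slide does not specify the sampling details.

```python
import random


def square_mask(size, block, rng):
    """Observe only a block x block square at a random location."""
    if not 0 < block <= size:
        raise ValueError("block must be between 1 and size")
    top = rng.randint(0, size - block)    # randint is inclusive on both ends
    left = rng.randint(0, size - block)
    return [[1 if top <= r < top + block and left <= c < left + block else 0
             for c in range(size)] for r in range(size)]


def dropout_mask(size, missing_rate, rng):
    """Drop each pixel independently with probability missing_rate."""
    return [[0 if rng.random() < missing_rate else 1 for _ in range(size)]
            for _ in range(size)]


def rectangle_mask(size, rng, lo=0.25, hi=0.75):
    """Observe only a rectangle whose sides span lo..hi of the image length."""
    h = rng.randint(round(lo * size), round(hi * size))
    w = rng.randint(round(lo * size), round(hi * size))
    top = rng.randint(0, size - h)
    left = rng.randint(0, size - w)
    return [[1 if top <= r < top + h and left <= c < left + w else 0
             for c in range(size)] for r in range(size)]


rng = random.Random(0)
sq = square_mask(28, 14, rng)
dr = dropout_mask(28, 0.9, rng)
rect = rectangle_mask(28, rng)
```

All three samplers ignore the image content, so they produce MCAR masks in the sense used on the earlier slides.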
  • 28. 5. Experiments. 5.1 Empirical Study of MisGAN on MNIST. 1. Architectures: Conv-MisGAN, MisGAN with convolutional networks (DCGAN; Radford, 2015); FC-MisGAN, MisGAN with fully connected networks. 2. Baseline: ConvAC, the generative convolutional arithmetic circuit (Sharir, 2016), which is capable of learning from large-scale incomplete data. 3. Results: Figures 3, 4, 5, and 6 (next slides).
  • 29. 5. Experiments. 5.1 Empirical Study of MisGAN on MNIST. [Figure: training samples, generated data samples, and generated mask samples for Conv-MisGAN and FC-MisGAN.]
  • 30. 5. Experiments. 5.1 Empirical Study of MisGAN on MNIST. MisGAN outperforms ConvAC. [Figure panels: data samples generated by Conv-MisGAN; mask samples generated by Conv-MisGAN; data samples generated by MisGAN; missing patterns: variable-size and square.]
  • 31. 5. Experiments. 5.1 Empirical Study of MisGAN on MNIST. 4. Ablation study: We point out that the mask discriminator in MisGAN is important for learning the correct distribution. The figure shows two failure cases of AmbientGAN, which is essentially equivalent to a MisGAN without the mask discriminator. [Figure panels: generated data samples and generated mask samples for each case (rescaled).]
  • 32. 5. Experiments. 5.1 Empirical Study of MisGAN on MNIST. 5. Missing data imputation: inside of box → observed pixels; outside of box → generated pixels; each row → same incomplete input. The imputer can produce a variety of different imputed results.
  • 33. 5. Experiments. 5.2 Quantitative Evaluation. We focus on evaluating MisGAN on the missing data imputation task. 1. Baselines: zero/mean imputation, matrix factorization, GAIN (Generative Adversarial Imputation Network). 2. Evaluation of imputation: FID between the imputed data and the original fully-observed data. 3. Architecture: for MNIST, a fully-connected imputer network; for CIFAR-10 and CelebA, a five-layer U-Net architecture (Ronneberger, 2015). 4. Results: next slides.
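The two simplest baselines on slide 33, zero and mean imputation, can be sketched as follows. Computing the mean per feature over the observed entries is an assumption about how the baseline is defined, and the function names are hypothetical.

```python
def zero_impute(rows, masks):
    """Replace every missing entry (mask 0) with 0.0."""
    return [[v if b else 0.0 for v, b in zip(row, mask)]
            for row, mask in zip(rows, masks)]


def mean_impute(rows, masks):
    """Replace missing entries with the per-feature mean of observed values."""
    n_features = len(rows[0])
    means = []
    for j in range(n_features):
        observed = [row[j] for row, mask in zip(rows, masks) if mask[j]]
        # Fall back to 0.0 when a feature is never observed.
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[v if b else means[j] for j, (v, b) in enumerate(zip(row, mask))]
            for row, mask in zip(rows, masks)]


rows = [[1.0, 9.0], [3.0, 5.0], [0.0, 7.0]]
masks = [[1, 0], [1, 1], [0, 1]]
zeros = zero_impute(rows, masks)   # [[1.0, 0.0], [3.0, 5.0], [0.0, 7.0]]
means = mean_impute(rows, masks)   # [[1.0, 6.0], [3.0, 5.0], [2.0, 7.0]]
```

Both baselines are deterministic, so they always return the same completion for a given incomplete input.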
  • 34. 5. Experiments 5.2 Quantitative Evaluation MisGAN consistently outperforms other methods in all cases, especially under high missing rates. Training MisGAN is more stable than training GAIN.
  • 35. 6. Discussion and Future Work. This work presents and evaluates a highly flexible framework for learning standard GAN data generators in the presence of missing data. We only focus on the MCAR case in this work. MisGAN can be easily extended to both the MAR and NMAR cases. We have tried the modified architecture and it showed similar results. This suggests that the extra dependencies may not adversely affect learnability. We leave the formal evaluation of this modified framework for future work.