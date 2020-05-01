
  1. Why Not to Use Zero Imputation? Correcting Sparsity Bias in Training Neural Networks. Joonyoung Yi, Juhyuk Lee, Kwang Joon Kim, Sung Ju Hwang, Eunho Yang. ICLR 2020. Machine Learning & Intelligence Laboratory
  2. The Value and Problem of Zero Imputation ● Missing data is widespread in machine learning (e.g., recommendation, electronic medical records, IoT sensor datasets). ● Zero imputation: the simplest and most intuitive way to handle missing data. ● However, many previous studies have reported that zero imputation has an adverse effect on model performance [Hazan et al., 2015; Luo et al., 2018; Smieja et al., 2018]. ● We identify the Variable Sparsity Problem (VSP) as the cause of the performance degradation under zero imputation.
  3. Variable Sparsity Problem & Sparsity Normalization ● Variable Sparsity Problem (VSP): the output of a neural network varies greatly with the number of missing entries in the input (Figure (a), lower left). ● Sparsity Normalization (SN): makes the expected output independent of the input sparsity level (Figure (b), lower right). ● With SN, as more features are known for a particular instance, the variance of the prediction for that instance decreases.
  4. Experiment Results ● Collaborative filtering datasets. ○ State-of-the-art results among neural-network-based collaborative filtering models. ● National Health Insurance Service (NHIS) dataset. ○ Despite its simplicity, SN performs better than or comparably to other, more complex techniques.
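
The Variable Sparsity Problem and Sparsity Normalization from slide 3 can be illustrated with a small NumPy sketch. The slides do not spell out the normalization formula, so the form used here (dividing the zero-imputed input by the number of observed entries and multiplying by a constant `k`) is an assumption for illustration. The helper names `zero_impute`, `sparsity_normalize`, and `mean_linear_output` are hypothetical, and a single fixed linear layer stands in for a neural network.

```python
import numpy as np


def zero_impute(x, mask):
    """Replace missing entries (mask == 0) with zeros."""
    return np.where(mask.astype(bool), x, 0.0)


def sparsity_normalize(x, mask, k=1.0):
    """Zero-impute, then rescale each row by k / (number of observed entries).

    Assumed form of SN for illustration; k is a hypothetical scaling constant.
    """
    observed = mask.sum(axis=1, keepdims=True)
    observed = np.maximum(observed, 1)  # guard against fully missing rows
    return zero_impute(x, mask) * (k / observed)


def mean_linear_output(transform, n_observed, d=100, n=2000, seed=0):
    """Average output of a fixed linear layer when n_observed of d features are known."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.5, 1.5, size=(n, d))  # i.i.d. positive features with mean 1
    mask = np.zeros((n, d))
    mask[:, :n_observed] = 1.0              # features are i.i.d., so which ones are kept is irrelevant
    w = np.ones(d)                          # fixed weights standing in for a trained layer
    return float((transform(x, mask) @ w).mean())


if __name__ == "__main__":
    for n_obs in (10, 50, 90):
        zi = mean_linear_output(zero_impute, n_obs)
        sn = mean_linear_output(lambda x, m: sparsity_normalize(x, m, k=100.0), n_obs)
        print(f"observed={n_obs:3d}  zero-imputed={zi:8.2f}  normalized={sn:8.2f}")
```

Under these assumptions, the zero-imputed output grows roughly in proportion to the number of observed features, while the normalized output stays near the same level regardless of how many features are missing.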

