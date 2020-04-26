
  1. Arizona State University. Crowdsourcing via Tensor Augmentation and Completion (TAC). Presenter: Yao Zhou. Joint work with: Dr. Jingrui He.
  2. Roadmap: Background; Related work; Crowdsourcing based on TAC; Experimental results; Conclusion.
  3. Crowdsourcing in machine learning. Training a supervised machine learning model requires training labels. Many crowdsourcing platforms provide services for collecting label information.
  4. An example of crowdsourcing: Lynx (wildcat) and Tabby (domestic cat).
  5. Key problem of crowdsourcing. Pros: low cost, since collecting large amounts of labels is economical. Cons: low quality, since labels collected from the (non-expert) crowd are noisy; and missing labels, since some workers are not willing to label all of the items. Key question: how can the true labels be inferred from a large number of labels collected from the crowd?
  6. Some related work. MV: Majority Voting, a simple baseline. DS-EM [Dawid and Skene, 1979]: infers each worker's ability matrix and the true labels; two-coin model for a binary labelling task. GLAD [Whitehill et al., 2009]: infers worker ability, item difficulty, and item true labels simultaneously. DS-MF [Liu et al., 2012]: employs variational Bayesian inference using a mean-field algorithm. MMCE [Zhou et al., 2012]: employs the minimax entropy principle to infer worker ability, item difficulty, and true labels at the same time. None of these methods utilizes the structural information of the labels.
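The majority voting baseline listed above can be sketched in a few lines. This is a minimal illustration, not code from the talk: the function name `majority_vote`, the item-major input layout, and the use of `None` for missing labels are all assumptions made for the example.

```python
from collections import Counter

def majority_vote(votes_per_item):
    """Return the most frequent label for each item.

    votes_per_item: list of lists; votes_per_item[j] holds the labels that
    workers gave to item j, with None marking a missing label (assumed encoding).
    Items with no labels at all get None. Ties are broken by first occurrence.
    """
    inferred = []
    for votes in votes_per_item:
        counts = Counter(v for v in votes if v is not None)
        inferred.append(counts.most_common(1)[0][0] if counts else None)
    return inferred

# Three items labelled by three workers; None means the worker skipped the item.
votes = [
    ["lynx", "tabby", "lynx"],
    [None, "tabby", None],
    [None, None, None],
]
result = majority_vote(votes)
print(result)
```

Majority voting treats every worker as equally reliable, which is why it serves only as a baseline for the methods that model worker ability.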
  7. Roadmap: Background; Related work; Crowdsourcing based on TAC; Experimental results; Conclusion.
  8. Tensor augmentation. Notation: workers i = 1, 2, …, Nw; items j = 1, 2, …, Ni; classes k = 1, …, Nc. The labels of the crowd are re-organized as a three-way tensor T, and an index set is generated from the workers' labelling decisions. The ground truth layer is an extra tensor slice of size Ni × Nc, augmented onto the tensor along the worker dimension. [Figure: tensor T with axes # of items Ni and # of workers Nw, plus one ground truth layer.]
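The construction on this slide can be sketched with NumPy. The slide does not state how labels are encoded, so the one-hot encoding, the `-1` marker for missing labels, the boolean `observed` mask standing in for the index set, and the name `build_augmented_tensor` are all assumptions made for illustration.

```python
import numpy as np

def build_augmented_tensor(labels, n_classes):
    """Build a (Nw + 1) x Ni x Nc tensor from a worker-by-item label matrix.

    labels: integer array of shape (Nw, Ni); -1 marks a missing label
            (assumed encoding, not taken from the slides).
    Returns the tensor T and a boolean mask of observed entries.
    The last slice along the worker axis is the (empty) ground truth layer.
    """
    n_workers, n_items = labels.shape
    T = np.zeros((n_workers + 1, n_items, n_classes))
    observed = np.zeros(T.shape, dtype=bool)
    w, i = np.nonzero(labels >= 0)      # (worker, item) pairs that were labelled
    T[w, i, labels[w, i]] = 1.0         # one-hot encode each given label
    observed[w, i, :] = True            # the whole class fibre is known
    return T, observed

# Two workers, three items, three classes.
labels = np.array([[0, 1, -1],
                   [0, -1, 2]])
T, observed = build_augmented_tensor(labels, n_classes=3)
print(T.shape)
```

In this sketch the ground truth layer `T[-1]` starts out entirely unobserved; completing the tensor is what fills it in.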
  9. Tensor augmentation and completion (TAC). Main principle of TAC: rank minimization is NP-hard, so it is replaced by its tightest convex envelope, trace norm minimization. Goal of TAC: complete the augmented tensor.
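The relaxation named on this slide can be written out for a generic completion problem. This is a sketch only: the symbol Ω for the set of observed entries is an assumption, and the actual TAC objective, with its regularization term, is not reproduced here.

```latex
\min_{\mathcal{X}} \ \operatorname{rank}(\mathcal{X})
\quad \text{s.t.} \quad \mathcal{X}_{\Omega} = \mathcal{T}_{\Omega}
\qquad \longrightarrow \qquad
\min_{\mathcal{X}} \ \|\mathcal{X}\|_{*}
\quad \text{s.t.} \quad \mathcal{X}_{\Omega} = \mathcal{T}_{\Omega}
```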
  10. Tensor augmentation and completion (TAC). The trace norm of an n-way tensor is defined following [Liu et al., 2012], where X(l) denotes the unfolding of the tensor X along mode l; the reverse operation is fold. Example: a tensor X ∈ R^(3×2×2) has unfolded matrices X(1) ∈ R^(3×4), X(2) ∈ R^(2×6), and X(3) ∈ R^(2×6). Reference: Ji Liu et al. Tensor completion for estimating missing values in visual data. TPAMI, 2012.
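The unfold and fold operations, and the shapes quoted on this slide, can be checked with a short NumPy sketch. The helper names are hypothetical. The trace norm function follows the weighted-sum form used by Liu et al. (a weighted sum of the nuclear norms of the unfoldings); the uniform weights are an assumption, and the column ordering of the unfolding is one self-consistent choice among several conventions.

```python
import numpy as np

def unfold(X, mode):
    """Mode-`mode` unfolding: rows index the chosen mode, columns all the rest."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def fold(M, mode, shape):
    """Inverse of `unfold` for a tensor whose full shape is `shape`."""
    rest = [s for axis, s in enumerate(shape) if axis != mode]
    return np.moveaxis(M.reshape([shape[mode]] + rest), 0, mode)

def tensor_trace_norm(X, alphas=None):
    """Weighted sum of nuclear norms of all unfoldings (uniform weights assumed)."""
    if alphas is None:
        alphas = [1.0 / X.ndim] * X.ndim
    return sum(a * np.linalg.norm(unfold(X, l), ord="nuc")
               for l, a in enumerate(alphas))

X = np.arange(12, dtype=float).reshape(3, 2, 2)
shapes = [unfold(X, l).shape for l in range(X.ndim)]
roundtrip_ok = all(np.array_equal(fold(unfold(X, l), l, X.shape), X)
                   for l in range(X.ndim))
print(shapes, roundtrip_ok)
```

Note that the code indexes modes from 0, while the slide writes X(1), X(2), X(3).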
  11. Tensor augmentation and completion (TAC). Relaxed objective of TAC with regularization (annotated parts of the objective: the regularization term, the index of the ground truth layer, and the intermediate relaxed matrices). Solution: Block Coordinate Descent (BCD) over four blocks of variables.
  12. Updating M_l. The sub-problem has a closed-form solution, proved by [Cai et al., 2010]: Singular Value Thresholding. Reference: Jian-Feng Cai et al. A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization, 2010.
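Singular value thresholding itself is short to write down. The sketch below implements the standard operator from Cai et al.: take the SVD, shrink each singular value by τ, clip at zero, and recombine. The function name `svt` and the example threshold are assumptions; the value of τ used in TAC is not given in this transcript.

```python
import numpy as np

def svt(Y, tau):
    """Singular value thresholding: soft-threshold the singular values of Y by tau.

    This is the minimizer of tau * ||M||_* + 0.5 * ||M - Y||_F^2 over M.
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)   # shrink, then clip at zero
    return (U * s_shrunk) @ Vt            # rescale columns of U, recombine

# A matrix with singular values 3.0, 1.0, 0.2 and an illustrative threshold.
Y = np.diag([3.0, 1.0, 0.2])
M = svt(Y, tau=0.5)
print(np.round(M, 3))
```

Singular values below the threshold are set to zero, so the operator lowers the rank of its input, which is what makes it useful for trace norm minimization.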
  13. Updating X. Two formulations: prior guided ground truth inference (PG-TAC), which uses prior statistics; and relaxed simplex ground truth inference (RS-TAC), which uses a slack variable.
  14. Updating X: prior guided ground truth inference (PG-TAC). Solution: the elements of tensor X can be divided into three sets {C1, C2, C3}, with a separate update for the elements of set C1, set C2, and set C3. [Figure: tensor T with the ground truth layer, marking the elements of sets C1, C2, and C3.]
  15. Updating X: relaxed simplex ground truth inference (RS-TAC). Solution: the elements of tensor X can be divided into three sets {C1, C2, C3}, with a separate update for the elements of set C1, set C2, and set C3. The update for the elements of set C3 is different from PG-TAC. [Figure: tensor T with the ground truth layer, marking the elements of sets C1, C2, and C3.]
  16. Roadmap: Background; Related work; Crowdsourcing based on TAC; Experimental results; Conclusion.
  17. Experimental results: synthetic data set. Notation: # of workers Nw; # of items Ni; # of classes Nc; probability of not giving labels q. Initial configuration: Nw = 50, Ni = 400, Nc = 4, q = 0.7. Four configurations are evaluated. [Figure: result plots for the configurations; lower is better.]
  18. Experimental results: real-world data sets. References: Dengyong Zhou et al. Learning from the wisdom of crowds by minimax entropy. NIPS, 2012. Dengyong Zhou et al. Regularized minimax conditional entropy for crowdsourcing. CoRR, 2015. Hu Han et al. Demographic estimation from face images: Human vs. machine performance. TPAMI, 2015. Rion Snow et al. Cheap and fast—but is it good?: Evaluating non-expert annotations for natural language tasks. EMNLP, 2008.
  19. Experimental results: real-world data set results. References: Qiang Liu et al. Variational inference for crowdsourcing. NIPS, 2012. Dengyong Zhou et al. Learning from the wisdom of crowds by minimax entropy. NIPS, 2012. A. P. Dawid et al. Maximum likelihood estimation of observer error-rates using the EM algorithm. Applied Statistics, 1979. Jacob Whitehill et al. Whose vote should count more: Optimal integration of labels from labelers of unknown expertise. NIPS, 2009.
  20. Conclusion. Two novel methods, PG-TAC and RS-TAC: augment the data tensor with a ground truth layer; utilize the structural information of crowd labels; infer the true labels of items in binary and multi-class settings. Experimental results: six real data sets; outperform state-of-the-art methods.
  21. Thank you! Questions?

