
Human Computation

Slides by Prof. Hisashi Kashima, presented at the Kyoto University x RIKEN AIP x ExaWizards Machine Learning Seminar #1.



  1. Human Computation
     Hisashi Kashima
     Kyoto University / RIKEN AIP Human Computation Team
     KYOTO UNIVERSITY, DEPARTMENT OF INTELLIGENCE SCIENCE AND TECHNOLOGY
     Human-AI collaboration & skill development using AI
  2. Machine learning: A core technology underlying the 3rd AI boom
     – Many successes of "Artificial Intelligence": a Q&A machine beating quiz champions; a Go program surpassing top players
     – The current AI boom owes much to machine learning, especially deep learning
  3. Various applications of machine learning: From online shopping to system monitoring
     – Marketing: recommendation, sentiment analysis, web ads optimization
     – Finance: credit risk estimation, fraud detection
     – Science: biology, materials science
     – Web: search, spam filtering, social media
     – Healthcare: medical diagnosis
     – Multimedia: image/voice understanding
     – System monitoring: fault detection
  4. Human computation: Collaborative problem solving by machines and humans
     – Recent significant advances in AI technologies: deep learning beating the best humans; AI seen as a possible threat to mankind
     – Yet AI still lags behind humans in flexibility and creativity, i.e., in abstract, open-ended, or context-dependent tasks
     – Human computation solves hard problems by combining artificial intelligence and human intelligence
  5. Rise of crowdsourcing: On-demand access to a massive online workforce
     – Crowdsourcing: outsourcing human-intelligence tasks to a large group of unspecified people via the Internet
     – Emergence of online crowd-labor marketplaces (e.g., Mechanical Turk, oDesk, Lancers, …), which serve as platforms for human computation
     [Figure: a requester posts tasks to a crowdsourcing marketplace; crowd workers execute them]
  6. Research focus of the AIP Human Computation Team: Human-AI collaboration & skill development using AI
     1. Human-AI collaborative problem solving: crowd-powered data analysis; problem solving with crowds
     2. Human skill development using AI: learning analytics; data science education
  7. Big challenge in big data analytics: The manpower bottleneck
     – Automatic data analysis techniques (e.g., machine learning) are often considered the main components of data analytics
     – However, data analysis is heavily labor-intensive: manual processing dominates a large part of the data analysis process
     [Figure: the data analysis process: business understanding, data collection/computerization, data cleansing/curation, modeling/visualization, evaluation/interpretation]
  8. Crowdsourcing big data analysis: Crowdsourced execution of the data analysis process
     – Use the power of crowds with diverse knowledge and skills to execute the labor-intensive data analysis process
     – Steps executed by crowds: business understanding; data collection/computerization; data cleansing/curation; modeling/visualization; evaluation/interpretation
     Kashima et al. Crowdsourcing for Big Data Analytics. PAKDD 2015 Tutorial.
  9. Human-in-the-loop machine learning: Feature generation with crowds
     – Use crowdsourcing to define and extract useful features that are hard for machines to find
     – Iterative feature generation with boosting [AAAI 18]: feature definition with crowds, feature labeling with crowds, training classifiers, finding difficult data instances, then repeat
  10. Data science education: An educational data analysis competition platform
     – Data science education through competition-style analysis of real data, positioned between existing educational programs given as classwork and data analysis competitions with monetary rewards (e.g., Kaggle)
     – An ecosystem connecting data providers (clients such as companies and governments) and learners (data scientists), with analysis certificates, data science courses, and corporate training programs
     http://universityofbigdata.net
  11. Technical challenge: Quality control of human computation
     – Quality control is essential: crowd workers have different levels of skill and motivation, and are sometimes malicious, so the quality of crowdsourcing results is uneven
     – Introducing redundancy is the key: (1) assign the same task to multiple workers; (2) aggregate the answers, with majority voting/averaging as a typical solution
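The redundancy-plus-aggregation recipe above can be sketched in a few lines of Python (a minimal illustration; the function name and data layout are my own):

```python
from collections import Counter

def majority_vote(answers_per_task):
    """Aggregate redundant worker answers by majority voting.

    answers_per_task: dict mapping task id -> list of worker answers.
    Returns a dict mapping task id -> most frequent answer.
    """
    return {task: Counter(votes).most_common(1)[0][0]
            for task, votes in answers_per_task.items()}

# Three workers label two tasks; the majority answer wins each task.
answers = {"t1": ["cat", "cat", "dog"], "t2": ["dog", "dog", "cat"]}
print(majority_vote(answers))  # -> {'t1': 'cat', 't2': 'dog'}
```

For numeric answers, averaging (or the median, which is more robust to malicious workers) plays the same role as the vote.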
  12. Statistical quality control: Generative models of worker answers
     – A statistical model of the generative story from the unobserved true answer to the observed worker answers, where worker reliability and task difficulty are unknown
     – The true answer is inferred from the worker answers
  13. Example of a statistical quality control model: The GLAD model
     – Whitehill et al. used the Rasch model from item response theory to model the probability of a correct answer
     – The true answer is inferred using the EM algorithm
     Whitehill et al. Whose Vote Should Count More: Optimal Integration of Labels from Labelers of Unknown Expertise. NIPS 2009.
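The correct-answer probability in Whitehill et al.'s GLAD model is (in my paraphrase of the paper's notation) a logistic function of worker ability $\alpha_i$ and inverse task difficulty $\beta_j$:

```latex
P(l_{ij} = z_j \mid \alpha_i, \beta_j) = \frac{1}{1 + e^{-\alpha_i \beta_j}}
```

Here $l_{ij}$ is worker $i$'s label for task $j$ and $z_j$ is the unobserved true label; EM alternates between computing a posterior over each $z_j$ and maximizing the likelihood over the $\alpha_i$ and $\beta_j$.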
  14. Quality control of complex crowd work: Generative models of complex crowd work
     – Quality control of complex crowd answers: hierarchical classification [ICDM 15]; ranking [PAKDD 14, AAAI 17]
     – Quality estimation of general crowd work [KDD 13]
  15. Hyper questions: Quality control for difficult questions
     – How can we find the answers to questions requiring expert knowledge?
     – Hyper questions [CIKM 17]: experts lose in simple majority voting, and "hyper questions" boost expert opinions
  16. Hyper questions: Using multi-question sets lets experts win in majority voting
     – Observation: experts are more likely to agree with each other on sets of questions than non-experts are
     – Idea: form random multi-question sets (hyper questions) and take a majority vote (or apply another statistical aggregation) over the hyper questions
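The intuition above can be sketched as follows (my own toy illustration of the idea, not the CIKM 17 paper's exact procedure; all names and the sampling scheme are assumptions):

```python
import random
from collections import Counter

def hyper_majority_vote(worker_answers, k=2, n_sets=100, seed=0):
    """Toy sketch of hyper-question aggregation.

    worker_answers: dict worker -> (dict question -> answer).
    Repeatedly samples a random k-question set; each worker's joint answer
    tuple on the set is one vote, and the winning tuple's components are
    credited back to the individual questions.
    """
    rng = random.Random(seed)
    questions = sorted(next(iter(worker_answers.values())).keys())
    votes = {q: Counter() for q in questions}
    for _ in range(n_sets):
        qs = rng.sample(questions, k)
        # One vote per worker: their joint answer tuple on this question set.
        joint = Counter(tuple(ans[q] for q in qs)
                        for ans in worker_answers.values())
        winner = joint.most_common(1)[0][0]
        for q, a in zip(qs, winner):
            votes[q][a] += 1
    return {q: votes[q].most_common(1)[0][0] for q in questions}

# Two experts agree on everything; three non-experts all answer "b" on q1
# but scatter on q2, so their joint tuples split while the experts' do not.
ans = {
    "e1": {"q1": "a", "q2": "a"},
    "e2": {"q1": "a", "q2": "a"},
    "n1": {"q1": "b", "q2": "b"},
    "n2": {"q1": "b", "q2": "c"},
    "n3": {"q1": "b", "q2": "d"},
}
print(hyper_majority_vote(ans, k=2, n_sets=20))  # -> {'q1': 'a', 'q2': 'a'}
```

Per-question majority voting would answer "b" on q1 (3 votes against the experts' 2), but on the joint tuples the experts' agreement gives them the plurality, so the expert answer is recovered.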
  17. Hyper questions: Effective for difficult questions
     – Hyper majority voting outperforms simple majority voting on difficult questions
     – Combination with more sophisticated generative models also works well
  18. Human computation for general problems: Finding solutions to open problems with crowds
     – Problem finding, ideation, and organization with crowds
     – Idea generation: creating ideas with crowds
     – Idea organization: grouping and prioritizing ideas with crowds
  19. The idea organization problem: Clustering and ranking
     – Making decisions based on ideas, designs, and other artifacts
     – Clustering for making sense of the ideas; ranking for prioritizing them
     – Fully automated clustering and ranking are difficult
  20. Crowd idea organization: Simultaneous clustering and ranking with crowds
     – Instead of fully automated clustering and ranking, we resort to a human-machine hybrid approach
     – Aggregating pairwise judgments by crowds: pairwise similarity comparisons (are two objects similar?) and pairwise preference comparisons (which of two objects is preferred?)
     – Research question: can we improve both results by using both types of queries?
  21. Proposed method: Ranking and embedding help each other
     – Ranking helps embedding: if the competencies of two items differ, the items are not similar (but not vice versa)
     – Embedding helps ranking: if two items are similar, their competencies should be close (but not always vice versa)
     – Perform ranking and clustering (embedding) in a single formulation
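One generic way to couple the two signals in a single formulation (an illustrative sketch of the idea, not necessarily the objective used in the paper) is to embed each item as $x_i \in \mathbb{R}^d$ with competency score $s_i = w^\top x_i$, fitting crowd similarity judgments with embedding distances and crowd preference judgments with a Bradley-Terry-style term:

```latex
\min_{X,\, w}\;
\sum_{(i,j) \in S} \ell_{\mathrm{sim}}\!\left(y_{ij},\, \lVert x_i - x_j \rVert\right)
\;-\; \sum_{(i,j) \in P} \log \sigma\!\left(w^\top x_i - w^\top x_j\right)
```

where $S$ are the pairwise similarity judgments $y_{ij}$, $P$ the pairs judged "$i$ preferred over $j$", and $\sigma$ the logistic function. Because both terms share $X$, preference data pushes dissimilar items apart in the embedding, and similarity data pulls the scores of similar items together, matching the two bullets above.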
  22. Result: Ranking and embedding help each other
     – The proposed method (SCARPA [IJCAI 18]) successfully identifies the clusters as well as the priority direction
     [Figure: symbol size indicates priority; symbol shape indicates similarity]
