What Can Machine Learning & Crowdsourcing
Do for You?
Exploring New Tools for Scalable Data Processing
Matt Lease
School of Information, University of Texas at Austin
ml@utexas.edu - @mattlease
Slides:
slideshare.net/mattlease
What’s an Information School?
“The place where people & technology meet”
~ Wobbrock et al., 2009
“iSchools” now exist at 65 universities around the world
www.ischools.org
• Machine Learning (AI) lets us automate many
useful tasks, e.g., natural language processing (NLP)
• Crowdsourcing enables new levels of efficiency &
scalability in data collection & processing
• Human Computation lets us build next-generation
applications today, with capabilities beyond AI
Roadmap
Motivation: Applications
Automatic/Hybrid Fact Checking
• http://fcweb.pythonanywhere.com
– Nguyen et al., AAAI 2018
MemeBrowser
• http://odyssey.ischool.utexas.edu/mb/
– Ryu et al., HyperText 2012
Dating Biographies without Time Mentions
• Kumar et al., CIKM 2011
Plato (428-348 B.C.) Lincoln (1809-1865)
Transcription & Copy-Editing
• Spontaneous speech is often disfluent, with repetitions,
corrections, and vocalized space-fillers
• Lease, Charniak, and Johnson, 2005
• Zhou, Baskov, and Lease, 2013 (& Zhou’s Thesis)
S1: Uh first um i need to know uh how do you feel about uh about
sending uh an elderly uh family member to a nursing home
S2: Well of course it's you know it's one of the last few things in the
world you'd ever want to do you know unless it's just you know really
you know uh for their uh you know for their own good
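A minimal rule-based sketch of the kind of clean-up this slide illustrates, assuming a small hand-picked filler list and a single-word repetition rule. It is not the method of the papers cited above, and the names FILLERS, REPEAT, and clean_disfluencies are hypothetical. Multi-word restarts such as "for their ... for their" are left untouched, and legitimate repeats (e.g. "that that") would be collapsed incorrectly.

```python
import re

# Assumed filler inventory: filled pauses plus one discourse marker.
FILLERS = re.compile(r"\b(?:uh|um|you know)\b", flags=re.IGNORECASE)

# A word (apostrophes allowed) immediately repeated one or more times.
REPEAT = re.compile(r"\b([\w']+)(?:\s+\1\b)+", flags=re.IGNORECASE)


def clean_disfluencies(utterance: str) -> str:
    """Remove fillers, then collapse immediate single-word repetitions."""
    text = FILLERS.sub(" ", utterance)
    # Normalize the whitespace left behind by the removed fillers.
    text = re.sub(r"\s+", " ", text).strip()
    # Keep only the first copy of each immediately repeated word.
    return REPEAT.sub(r"\1", text)


if __name__ == "__main__":
    s1 = ("Uh first um i need to know uh how do you feel about uh about "
          "sending uh an elderly uh family member to a nursing home")
    print(clean_disfluencies(s1))
```

Removing fillers first matters here: "about uh about" only becomes an adjacent repetition once the "uh" between the two copies is gone.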
Two Problems
Machine Learning - Supervised
Slide courtesy of Byron Wallace (Northeastern)
AI effectiveness is often limited by training data size
Problem: creating labeled data is expensive!
Banko and Brill (2001)
What do we do when state-of-art AI
still isn’t good enough?
Crowdsourcing
Crowdsourcing
• Jeff Howe. Wired, June 2006.
• Take a job traditionally
performed by a known agent
(often an employee)
• Outsource it to an undefined,
generally large group of
people via an open call
Volunteer Crowd Success Stories
Zooniverse
Amazon Mechanical Turk (MTurk)
• Marketplace for paid crowd work (“micro-tasks”)
– Created in 2005 (remains in “beta” today)
• On-demand, scalable, 24/7 global workforce
• API lets human labor be integrated into software
– “You’ve heard of software-as-a-service. Now this is human-as-a-service.”
Collecting Data from Crowds
2008: MTurk sparks “gold rush” for ML training data
• Information Retrieval: Alonso et al., SIGIR Forum
• Human-Computer Interaction: Kittur et al., CHI
• Computer Vision: Sorokin & Forsyth, CVPR
• NLP: Snow et al., EMNLP
– Annotating human language
– 22,000 labels for only US $26
– Crowd’s consensus labels can
replace traditional expert labels
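One simple way to form a consensus label from several crowd judgments is majority voting, sketched below. The slide does not say which aggregation method was used, so this is an illustration under that assumption; the function name majority_vote, the item IDs, and the labels are hypothetical, and ties are broken alphabetically only to keep the output deterministic.

```python
from collections import Counter
from typing import Dict, List


def majority_vote(labels_by_item: Dict[str, List[str]]) -> Dict[str, str]:
    """Return the most frequent crowd label for each item."""
    consensus = {}
    for item, labels in labels_by_item.items():
        counts = Counter(labels)
        top = max(counts.values())
        # Ties are broken alphabetically (an arbitrary, deterministic choice).
        winners = sorted(label for label, n in counts.items() if n == top)
        consensus[item] = winners[0]
    return consensus


if __name__ == "__main__":
    # Hypothetical crowd labels: each item was judged by several workers.
    crowd = {
        "t1": ["pos", "pos", "neg"],
        "t2": ["neg", "neg", "neg"],
        "t3": ["neg", "pos"],
    }
    print(majority_vote(crowd))
```

Collecting an odd number of labels per item avoids most ties on binary tasks; weighting workers by estimated reliability is a common refinement that this sketch leaves out.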
Human Computation
ACM Queue, May 2006
“Software developers with innovative ideas for
businesses and technologies are constrained by the
limits of artificial intelligence… If software developers
could programmatically access and incorporate human
intelligence into their applications, a whole new class
of innovative businesses and applications would be
possible. This is the goal of Amazon Mechanical Turk…
people are freer to innovate because they can now
imbue software with real human intelligence.”
PlateMate: Counting Calories
Noronha et al., UIST’10
MonoTrans
Translation by Monolingual Speakers + AI
Bederson et al., 2010; Morita & Ishida, 2009
Zensors
Laput et al., CSCW 2015
But Who Protects the Moderators?
Dang et al., HCOMP’18 & CI’18
What about ethics?
• Silberman, Irani, and Ross (2010)
– “How should we… conceptualize the role of these people
who we ask to power our computing?”
• Irani and Silberman (2013)
– “…by hiding workers behind web forms and APIs…
employers see themselves as builders of innovative
technologies, rather than… unconcerned with working
conditions… redirecting focus to the innovation of human
computation as a field of technological achievement.”
• Fort, Adda, and Cohen (2011)
– “…opportunities for our community to deliberately
value ethics above cost savings.”
Summary
• Machine Learning (AI) lets us automate many
useful tasks, e.g., natural language processing (NLP)
• Crowdsourcing enables new levels of efficiency &
scalability in data collection & processing
• Human Computation lets us build next-generation
applications today, with capabilities beyond AI
The Future of Crowd Work
Paper @ CSCW 2013 by
Kittur, Nickerson, Bernstein, Gerber,
Shaw, Zimmerman, Lease, and Horton
Matt Lease - ml@utexas.edu - @mattlease
Thank You!
Slides: slideshare.net/mattlease
Lab: ir.ischool.utexas.edu

More Related Content

What's hot

The Future of work and impact on the technology worker
The Future of work and impact on the technology workerThe Future of work and impact on the technology worker
The Future of work and impact on the technology workerPeter Cosgrove
 
The Impact of Automation & AI in the Workplace
The Impact of Automation & AI in the WorkplaceThe Impact of Automation & AI in the Workplace
The Impact of Automation & AI in the WorkplaceCollabor8now Ltd
 
Data urban service science 20130617 v2
Data urban service science 20130617 v2Data urban service science 20130617 v2
Data urban service science 20130617 v2ISSIP
 
20220103 jim spohrer hicss v9
20220103 jim spohrer hicss v920220103 jim spohrer hicss v9
20220103 jim spohrer hicss v9ISSIP
 
AI & Business - Opportunities & Dangers
AI & Business - Opportunities & DangersAI & Business - Opportunities & Dangers
AI & Business - Opportunities & Dangerswillmurphy
 
People's Interactions with Cognitive Assistants for Enhanced Performance
People's Interactions with Cognitive Assistants for Enhanced PerformancePeople's Interactions with Cognitive Assistants for Enhanced Performance
People's Interactions with Cognitive Assistants for Enhanced PerformanceMd. Abul Kalam Siddike
 
Frontiers sutton spohrer 20150711 v2
Frontiers sutton spohrer 20150711 v2Frontiers sutton spohrer 20150711 v2
Frontiers sutton spohrer 20150711 v2ISSIP
 
20211103 jim spohrer oecd ai_science_productivity_panel v5
20211103 jim spohrer oecd ai_science_productivity_panel v520211103 jim spohrer oecd ai_science_productivity_panel v5
20211103 jim spohrer oecd ai_science_productivity_panel v5ISSIP
 
20210322 jim spohrer eaae deans summit v13
20210322 jim spohrer eaae deans summit v1320210322 jim spohrer eaae deans summit v13
20210322 jim spohrer eaae deans summit v13ISSIP
 
Artificial Intelligence (AI) and Job Loss
Artificial Intelligence (AI) and Job LossArtificial Intelligence (AI) and Job Loss
Artificial Intelligence (AI) and Job LossIkhlaq Sidhu
 
Effects of ai on job market
Effects of ai on job marketEffects of ai on job market
Effects of ai on job marketOmar Ahmed
 
Will robots take our jobs (short version) for Women Techmakers Talk
Will robots take our jobs (short version) for Women Techmakers TalkWill robots take our jobs (short version) for Women Techmakers Talk
Will robots take our jobs (short version) for Women Techmakers TalkAva Meredith
 
Korea day1 keynote 20161013 v6
Korea day1 keynote 20161013 v6Korea day1 keynote 20161013 v6
Korea day1 keynote 20161013 v6ISSIP
 
Smart Machines: Driving the 4th Industrial Revolution?
Smart Machines: Driving the 4th Industrial Revolution?Smart Machines: Driving the 4th Industrial Revolution?
Smart Machines: Driving the 4th Industrial Revolution?Bijilash Babu
 
Applying Machine Learning and Artificial Intelligence to Business
Applying Machine Learning and Artificial Intelligence to BusinessApplying Machine Learning and Artificial Intelligence to Business
Applying Machine Learning and Artificial Intelligence to BusinessRussell Miles
 
20210519 jim spohrer sir rel future_ai v14
20210519 jim spohrer sir rel future_ai v1420210519 jim spohrer sir rel future_ai v14
20210519 jim spohrer sir rel future_ai v14ISSIP
 
How Artificial Intelligence is taking over Human Jobs
How Artificial Intelligence is taking over Human JobsHow Artificial Intelligence is taking over Human Jobs
How Artificial Intelligence is taking over Human JobsShradha Jindal
 
Japan 20200724 v13
Japan 20200724 v13Japan 20200724 v13
Japan 20200724 v13ISSIP
 

What's hot (20)

The Future of work and impact on the technology worker
The Future of work and impact on the technology workerThe Future of work and impact on the technology worker
The Future of work and impact on the technology worker
 
The Impact of Automation & AI in the Workplace
The Impact of Automation & AI in the WorkplaceThe Impact of Automation & AI in the Workplace
The Impact of Automation & AI in the Workplace
 
Data urban service science 20130617 v2
Data urban service science 20130617 v2Data urban service science 20130617 v2
Data urban service science 20130617 v2
 
20220103 jim spohrer hicss v9
20220103 jim spohrer hicss v920220103 jim spohrer hicss v9
20220103 jim spohrer hicss v9
 
Skills Requirements for Future Jobs - 10 Facts
Skills Requirements for Future Jobs - 10 FactsSkills Requirements for Future Jobs - 10 Facts
Skills Requirements for Future Jobs - 10 Facts
 
AI & Business - Opportunities & Dangers
AI & Business - Opportunities & DangersAI & Business - Opportunities & Dangers
AI & Business - Opportunities & Dangers
 
People's Interactions with Cognitive Assistants for Enhanced Performance
People's Interactions with Cognitive Assistants for Enhanced PerformancePeople's Interactions with Cognitive Assistants for Enhanced Performance
People's Interactions with Cognitive Assistants for Enhanced Performance
 
Frontiers sutton spohrer 20150711 v2
Frontiers sutton spohrer 20150711 v2Frontiers sutton spohrer 20150711 v2
Frontiers sutton spohrer 20150711 v2
 
20211103 jim spohrer oecd ai_science_productivity_panel v5
20211103 jim spohrer oecd ai_science_productivity_panel v520211103 jim spohrer oecd ai_science_productivity_panel v5
20211103 jim spohrer oecd ai_science_productivity_panel v5
 
20210322 jim spohrer eaae deans summit v13
20210322 jim spohrer eaae deans summit v1320210322 jim spohrer eaae deans summit v13
20210322 jim spohrer eaae deans summit v13
 
Artificial Intelligence (AI) and Job Loss
Artificial Intelligence (AI) and Job LossArtificial Intelligence (AI) and Job Loss
Artificial Intelligence (AI) and Job Loss
 
Effects of ai on job market
Effects of ai on job marketEffects of ai on job market
Effects of ai on job market
 
Will robots take our jobs (short version) for Women Techmakers Talk
Will robots take our jobs (short version) for Women Techmakers TalkWill robots take our jobs (short version) for Women Techmakers Talk
Will robots take our jobs (short version) for Women Techmakers Talk
 
The impact of AI on work
The impact of AI on workThe impact of AI on work
The impact of AI on work
 
Korea day1 keynote 20161013 v6
Korea day1 keynote 20161013 v6Korea day1 keynote 20161013 v6
Korea day1 keynote 20161013 v6
 
Smart Machines: Driving the 4th Industrial Revolution?
Smart Machines: Driving the 4th Industrial Revolution?Smart Machines: Driving the 4th Industrial Revolution?
Smart Machines: Driving the 4th Industrial Revolution?
 
Applying Machine Learning and Artificial Intelligence to Business
Applying Machine Learning and Artificial Intelligence to BusinessApplying Machine Learning and Artificial Intelligence to Business
Applying Machine Learning and Artificial Intelligence to Business
 
20210519 jim spohrer sir rel future_ai v14
20210519 jim spohrer sir rel future_ai v1420210519 jim spohrer sir rel future_ai v14
20210519 jim spohrer sir rel future_ai v14
 
How Artificial Intelligence is taking over Human Jobs
How Artificial Intelligence is taking over Human JobsHow Artificial Intelligence is taking over Human Jobs
How Artificial Intelligence is taking over Human Jobs
 
Japan 20200724 v13
Japan 20200724 v13Japan 20200724 v13
Japan 20200724 v13
 

Similar to What Can Machine Learning & Crowdsourcing Do for You? Exploring New Tools for Scalable Data Processing

CUbRIK tutorial at ICWE 2013: part 1 Introduction to Human Computation
CUbRIK tutorial at ICWE 2013: part 1 Introduction to Human ComputationCUbRIK tutorial at ICWE 2013: part 1 Introduction to Human Computation
CUbRIK tutorial at ICWE 2013: part 1 Introduction to Human ComputationCUbRIK Project
 
Robotisation of Knowledge and Service Work
Robotisation of Knowledge and Service WorkRobotisation of Knowledge and Service Work
Robotisation of Knowledge and Service WorkDr. Crispin Coombs
 
Rise of Crowd Computing (December 2012)
Rise of Crowd Computing (December 2012)Rise of Crowd Computing (December 2012)
Rise of Crowd Computing (December 2012)Matthew Lease
 
Artificial Intelligence and Machine Learning
Artificial Intelligence and Machine LearningArtificial Intelligence and Machine Learning
Artificial Intelligence and Machine LearningMykola Dobrochynskyy
 
Deep Learning for AI - Yoshua Bengio, Mila
Deep Learning for AI - Yoshua Bengio, MilaDeep Learning for AI - Yoshua Bengio, Mila
Deep Learning for AI - Yoshua Bengio, MilaLucidworks
 
But Who Protects the Moderators?
But Who Protects the Moderators?But Who Protects the Moderators?
But Who Protects the Moderators?Matthew Lease
 
GENERATIVE ARTIFICIAL INTELLIGENCE &DATA ANALYTICS
GENERATIVE ARTIFICIAL INTELLIGENCE &DATA ANALYTICSGENERATIVE ARTIFICIAL INTELLIGENCE &DATA ANALYTICS
GENERATIVE ARTIFICIAL INTELLIGENCE &DATA ANALYTICSNITHYA637064
 
Présentation de Bruno Schroder au 20e #mforum (07/12/2016)
Présentation de Bruno Schroder au 20e #mforum (07/12/2016)Présentation de Bruno Schroder au 20e #mforum (07/12/2016)
Présentation de Bruno Schroder au 20e #mforum (07/12/2016)Agence du Numérique (AdN)
 
Spohrer PHD_ICT_KES 20230316 v10.pptx
Spohrer PHD_ICT_KES 20230316 v10.pptxSpohrer PHD_ICT_KES 20230316 v10.pptx
Spohrer PHD_ICT_KES 20230316 v10.pptxISSIP
 
Seminar 20221027 v4.pptx
Seminar 20221027 v4.pptxSeminar 20221027 v4.pptx
Seminar 20221027 v4.pptxISSIP
 
Humans in the loop: AI in open source and industry
Humans in the loop: AI in open source and industryHumans in the loop: AI in open source and industry
Humans in the loop: AI in open source and industryPaco Nathan
 
AI and ML Series - Introduction to Generative AI and LLMs - Session 1
AI and ML Series - Introduction to Generative AI and LLMs - Session 1AI and ML Series - Introduction to Generative AI and LLMs - Session 1
AI and ML Series - Introduction to Generative AI and LLMs - Session 1DianaGray10
 
Art of artificial intelligence and automation
Art of artificial intelligence and automationArt of artificial intelligence and automation
Art of artificial intelligence and automationLiew Wei Da Andrew
 
Webinar on AI in IoT applications KCG Connect Alumni Digital Series by Rajkumar
Webinar on AI in IoT applications KCG Connect Alumni Digital Series by RajkumarWebinar on AI in IoT applications KCG Connect Alumni Digital Series by Rajkumar
Webinar on AI in IoT applications KCG Connect Alumni Digital Series by RajkumarRajkumar R
 
Metrocon-Rise-Of-Crowd-Computing
Metrocon-Rise-Of-Crowd-ComputingMetrocon-Rise-Of-Crowd-Computing
Metrocon-Rise-Of-Crowd-ComputingMatthew Lease
 
Emotional intelligence and artificial intelligence (A comparative analysis)
Emotional intelligence and artificial intelligence (A comparative analysis)Emotional intelligence and artificial intelligence (A comparative analysis)
Emotional intelligence and artificial intelligence (A comparative analysis)Rumbidzai Faith Matanga
 
Semiconductors 20240320 v14 corrected slides.pptx
Semiconductors 20240320 v14 corrected slides.pptxSemiconductors 20240320 v14 corrected slides.pptx
Semiconductors 20240320 v14 corrected slides.pptxISSIP
 

Similar to What Can Machine Learning & Crowdsourcing Do for You? Exploring New Tools for Scalable Data Processing (20)

CUbRIK tutorial at ICWE 2013: part 1 Introduction to Human Computation
CUbRIK tutorial at ICWE 2013: part 1 Introduction to Human ComputationCUbRIK tutorial at ICWE 2013: part 1 Introduction to Human Computation
CUbRIK tutorial at ICWE 2013: part 1 Introduction to Human Computation
 
Robotisation of Knowledge and Service Work
Robotisation of Knowledge and Service WorkRobotisation of Knowledge and Service Work
Robotisation of Knowledge and Service Work
 
Rise of Crowd Computing (December 2012)
Rise of Crowd Computing (December 2012)Rise of Crowd Computing (December 2012)
Rise of Crowd Computing (December 2012)
 
Artificial Intelligence and Machine Learning
Artificial Intelligence and Machine LearningArtificial Intelligence and Machine Learning
Artificial Intelligence and Machine Learning
 
Deep Learning for AI - Yoshua Bengio, Mila
Deep Learning for AI - Yoshua Bengio, MilaDeep Learning for AI - Yoshua Bengio, Mila
Deep Learning for AI - Yoshua Bengio, Mila
 
But Who Protects the Moderators?
But Who Protects the Moderators?But Who Protects the Moderators?
But Who Protects the Moderators?
 
GENERATIVE ARTIFICIAL INTELLIGENCE &DATA ANALYTICS
GENERATIVE ARTIFICIAL INTELLIGENCE &DATA ANALYTICSGENERATIVE ARTIFICIAL INTELLIGENCE &DATA ANALYTICS
GENERATIVE ARTIFICIAL INTELLIGENCE &DATA ANALYTICS
 
Présentation de Bruno Schroder au 20e #mforum (07/12/2016)
Présentation de Bruno Schroder au 20e #mforum (07/12/2016)Présentation de Bruno Schroder au 20e #mforum (07/12/2016)
Présentation de Bruno Schroder au 20e #mforum (07/12/2016)
 
When AI becomes a data-driven machine, and digital is everywhere!
When AI becomes a data-driven machine, and digital is everywhere!When AI becomes a data-driven machine, and digital is everywhere!
When AI becomes a data-driven machine, and digital is everywhere!
 
Spohrer PHD_ICT_KES 20230316 v10.pptx
Spohrer PHD_ICT_KES 20230316 v10.pptxSpohrer PHD_ICT_KES 20230316 v10.pptx
Spohrer PHD_ICT_KES 20230316 v10.pptx
 
Seminar 20221027 v4.pptx
Seminar 20221027 v4.pptxSeminar 20221027 v4.pptx
Seminar 20221027 v4.pptx
 
Humans in the loop: AI in open source and industry
Humans in the loop: AI in open source and industryHumans in the loop: AI in open source and industry
Humans in the loop: AI in open source and industry
 
NHH 20231105 v6.pptx
NHH 20231105 v6.pptxNHH 20231105 v6.pptx
NHH 20231105 v6.pptx
 
AI and ML Series - Introduction to Generative AI and LLMs - Session 1
AI and ML Series - Introduction to Generative AI and LLMs - Session 1AI and ML Series - Introduction to Generative AI and LLMs - Session 1
AI and ML Series - Introduction to Generative AI and LLMs - Session 1
 
Art of artificial intelligence and automation
Art of artificial intelligence and automationArt of artificial intelligence and automation
Art of artificial intelligence and automation
 
Webinar on AI in IoT applications KCG Connect Alumni Digital Series by Rajkumar
Webinar on AI in IoT applications KCG Connect Alumni Digital Series by RajkumarWebinar on AI in IoT applications KCG Connect Alumni Digital Series by Rajkumar
Webinar on AI in IoT applications KCG Connect Alumni Digital Series by Rajkumar
 
Ai
AiAi
Ai
 
Metrocon-Rise-Of-Crowd-Computing
Metrocon-Rise-Of-Crowd-ComputingMetrocon-Rise-Of-Crowd-Computing
Metrocon-Rise-Of-Crowd-Computing
 
Emotional intelligence and artificial intelligence (A comparative analysis)
Emotional intelligence and artificial intelligence (A comparative analysis)Emotional intelligence and artificial intelligence (A comparative analysis)
Emotional intelligence and artificial intelligence (A comparative analysis)
 
Semiconductors 20240320 v14 corrected slides.pptx
Semiconductors 20240320 v14 corrected slides.pptxSemiconductors 20240320 v14 corrected slides.pptx
Semiconductors 20240320 v14 corrected slides.pptx
 

More from Matthew Lease

Automated Models for Quantifying Centrality of Survey Responses
Automated Models for Quantifying Centrality of Survey ResponsesAutomated Models for Quantifying Centrality of Survey Responses
Automated Models for Quantifying Centrality of Survey ResponsesMatthew Lease
 
Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...
Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...
Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...Matthew Lease
 
Explainable Fact Checking with Humans in-the-loop
Explainable Fact Checking with Humans in-the-loopExplainable Fact Checking with Humans in-the-loop
Explainable Fact Checking with Humans in-the-loopMatthew Lease
 
Adventures in Crowdsourcing : Toward Safer Content Moderation & Better Suppor...
Adventures in Crowdsourcing : Toward Safer Content Moderation & Better Suppor...Adventures in Crowdsourcing : Toward Safer Content Moderation & Better Suppor...
Adventures in Crowdsourcing : Toward Safer Content Moderation & Better Suppor...Matthew Lease
 
AI & Work, with Transparency & the Crowd
AI & Work, with Transparency & the Crowd AI & Work, with Transparency & the Crowd
AI & Work, with Transparency & the Crowd Matthew Lease
 
Designing Human-AI Partnerships to Combat Misinfomation
Designing Human-AI Partnerships to Combat Misinfomation Designing Human-AI Partnerships to Combat Misinfomation
Designing Human-AI Partnerships to Combat Misinfomation Matthew Lease
 
Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...
Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...
Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...Matthew Lease
 
Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact...
Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact...Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact...
Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact...Matthew Lease
 
Mix and Match: Collaborative Expert-Crowd Judging for Building Test Collectio...
Mix and Match: Collaborative Expert-Crowd Judging for Building Test Collectio...Mix and Match: Collaborative Expert-Crowd Judging for Building Test Collectio...
Mix and Match: Collaborative Expert-Crowd Judging for Building Test Collectio...Matthew Lease
 
Fact Checking & Information Retrieval
Fact Checking & Information RetrievalFact Checking & Information Retrieval
Fact Checking & Information RetrievalMatthew Lease
 
Your Behavior Signals Your Reliability: Modeling Crowd Behavioral Traces to E...
Your Behavior Signals Your Reliability: Modeling Crowd Behavioral Traces to E...Your Behavior Signals Your Reliability: Modeling Crowd Behavioral Traces to E...
Your Behavior Signals Your Reliability: Modeling Crowd Behavioral Traces to E...Matthew Lease
 
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & OpportunitiesDeep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & OpportunitiesMatthew Lease
 
Systematic Review is e-Discovery in Doctor’s Clothing
Systematic Review is e-Discovery in Doctor’s ClothingSystematic Review is e-Discovery in Doctor’s Clothing
Systematic Review is e-Discovery in Doctor’s ClothingMatthew Lease
 
The Search for Truth in Objective & Subject Crowdsourcing
The Search for Truth in Objective & Subject CrowdsourcingThe Search for Truth in Objective & Subject Crowdsourcing
The Search for Truth in Objective & Subject CrowdsourcingMatthew Lease
 
Toward Effective and Sustainable Online Crowd Work
Toward Effective and Sustainable Online Crowd WorkToward Effective and Sustainable Online Crowd Work
Toward Effective and Sustainable Online Crowd WorkMatthew Lease
 
Multidimensional Relevance Modeling via Psychometrics & Crowdsourcing: ACM SI...
Multidimensional Relevance Modeling via Psychometrics & Crowdsourcing: ACM SI...Multidimensional Relevance Modeling via Psychometrics & Crowdsourcing: ACM SI...
Multidimensional Relevance Modeling via Psychometrics & Crowdsourcing: ACM SI...Matthew Lease
 
Crowdsourcing: From Aggregation to Search Engine Evaluation
Crowdsourcing: From Aggregation to Search Engine EvaluationCrowdsourcing: From Aggregation to Search Engine Evaluation
Crowdsourcing: From Aggregation to Search Engine EvaluationMatthew Lease
 
Crowdsourcing Transcription Beyond Mechanical Turk
Crowdsourcing Transcription Beyond Mechanical TurkCrowdsourcing Transcription Beyond Mechanical Turk
Crowdsourcing Transcription Beyond Mechanical TurkMatthew Lease
 
Crowdsourcing for Information Retrieval: From Statistics to Ethics
Crowdsourcing for Information Retrieval: From Statistics to EthicsCrowdsourcing for Information Retrieval: From Statistics to Ethics
Crowdsourcing for Information Retrieval: From Statistics to EthicsMatthew Lease
 
Crowdsourcing & ethics: a few thoughts and refences.
Crowdsourcing & ethics: a few thoughts and refences. Crowdsourcing & ethics: a few thoughts and refences.
Crowdsourcing & ethics: a few thoughts and refences. Matthew Lease
 

More from Matthew Lease (20)

Automated Models for Quantifying Centrality of Survey Responses
Automated Models for Quantifying Centrality of Survey ResponsesAutomated Models for Quantifying Centrality of Survey Responses
Automated Models for Quantifying Centrality of Survey Responses
 
Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...
Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...
Key Challenges in Moderating Social Media: Accuracy, Cost, Scalability, and S...
 
Explainable Fact Checking with Humans in-the-loop
Explainable Fact Checking with Humans in-the-loopExplainable Fact Checking with Humans in-the-loop
Explainable Fact Checking with Humans in-the-loop
 
Adventures in Crowdsourcing : Toward Safer Content Moderation & Better Suppor...
Adventures in Crowdsourcing : Toward Safer Content Moderation & Better Suppor...Adventures in Crowdsourcing : Toward Safer Content Moderation & Better Suppor...
Adventures in Crowdsourcing : Toward Safer Content Moderation & Better Suppor...
 
AI & Work, with Transparency & the Crowd
AI & Work, with Transparency & the Crowd AI & Work, with Transparency & the Crowd
AI & Work, with Transparency & the Crowd
 
Designing Human-AI Partnerships to Combat Misinfomation
Designing Human-AI Partnerships to Combat Misinfomation Designing Human-AI Partnerships to Combat Misinfomation
Designing Human-AI Partnerships to Combat Misinfomation
 
Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...
Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...
Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno...
 
Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact...
Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact...Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact...
Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact...
 
Mix and Match: Collaborative Expert-Crowd Judging for Building Test Collectio...
Mix and Match: Collaborative Expert-Crowd Judging for Building Test Collectio...Mix and Match: Collaborative Expert-Crowd Judging for Building Test Collectio...
Mix and Match: Collaborative Expert-Crowd Judging for Building Test Collectio...
 
Fact Checking & Information Retrieval
Fact Checking & Information RetrievalFact Checking & Information Retrieval
Fact Checking & Information Retrieval
 
Your Behavior Signals Your Reliability: Modeling Crowd Behavioral Traces to E...
Your Behavior Signals Your Reliability: Modeling Crowd Behavioral Traces to E...Your Behavior Signals Your Reliability: Modeling Crowd Behavioral Traces to E...
Your Behavior Signals Your Reliability: Modeling Crowd Behavioral Traces to E...
 
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & OpportunitiesDeep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
 
Systematic Review is e-Discovery in Doctor’s Clothing
Systematic Review is e-Discovery in Doctor’s ClothingSystematic Review is e-Discovery in Doctor’s Clothing
Systematic Review is e-Discovery in Doctor’s Clothing
 
The Search for Truth in Objective & Subject Crowdsourcing
The Search for Truth in Objective & Subject CrowdsourcingThe Search for Truth in Objective & Subject Crowdsourcing
The Search for Truth in Objective & Subject Crowdsourcing
 
Toward Effective and Sustainable Online Crowd Work
Toward Effective and Sustainable Online Crowd WorkToward Effective and Sustainable Online Crowd Work
Toward Effective and Sustainable Online Crowd Work
 
Multidimensional Relevance Modeling via Psychometrics & Crowdsourcing: ACM SI...
Multidimensional Relevance Modeling via Psychometrics & Crowdsourcing: ACM SI...Multidimensional Relevance Modeling via Psychometrics & Crowdsourcing: ACM SI...
Multidimensional Relevance Modeling via Psychometrics & Crowdsourcing: ACM SI...
 
Crowdsourcing: From Aggregation to Search Engine Evaluation
Crowdsourcing: From Aggregation to Search Engine EvaluationCrowdsourcing: From Aggregation to Search Engine Evaluation
Crowdsourcing: From Aggregation to Search Engine Evaluation
 
Crowdsourcing Transcription Beyond Mechanical Turk
Crowdsourcing Transcription Beyond Mechanical TurkCrowdsourcing Transcription Beyond Mechanical Turk
Crowdsourcing Transcription Beyond Mechanical Turk
 
Crowdsourcing for Information Retrieval: From Statistics to Ethics
Crowdsourcing for Information Retrieval: From Statistics to EthicsCrowdsourcing for Information Retrieval: From Statistics to Ethics
Crowdsourcing for Information Retrieval: From Statistics to Ethics
 
Crowdsourcing & ethics: a few thoughts and refences.
Crowdsourcing & ethics: a few thoughts and refences. Crowdsourcing & ethics: a few thoughts and refences.
Crowdsourcing & ethics: a few thoughts and refences.
 

What Can Machine Learning & Crowdsourcing Do for You? Exploring New Tools for Scalable Data Processing

  • 1. What Can Machine Learning & Crowdsourcing Do for You? Exploring New Tools for Scalable Data Processing Matt Lease School of Information @mattlease University of Texas at Austin ml@utexas.edu Slides: slideshare.net/mattlease
  • 2. What’s an Information School? “The place where people & technology meet” ~ Wobbrock et al., 2009. “iSchools” now exist at 65 universities around the world: www.ischools.org
  • 3. Roadmap • Machine Learning (AI) lets us automate many useful tasks, e.g., natural language processing (NLP) • Crowdsourcing enables new levels of efficiency & scalability in data collection & processing • Human Computation lets us build next-generation applications today, with capabilities beyond AI
  • 5. Automatic/Hybrid Fact Checking • http://fcweb.pythonanywhere.com – Nguyen et al., AAAI 2018
  • 6. MemeBrowser • http://odyssey.ischool.utexas.edu/mb/ – Ryu et al., HyperText 2012
  • 7. Dating Biographies without Time Mentions • Kumar et al., CIKM 2011 • Plato (428-348 B.C.) Lincoln (1809-1865)
  • 8. Transcription & Copy-Editing • Spontaneous speech is often disfluent, with repetitions, corrections, and vocalized space-fillers • Lease, Charniak, and Johnson, 2005 • Zhou, Baskov, and Lease, 2013 (& Zhou’s Thesis) S1: Uh first um i need to know uh how do you feel about uh about sending uh an elderly uh family member to a nursing home S2: Well of course it's you know it's one of the last few things in the world you'd ever want to do you know unless it's just you know really you know uh for their uh you know for their own good
  • 9. Transcription & Copy-Editing • Spontaneous speech is often disfluent, with repetitions, corrections, and vocalized space-fillers • Lease, Charniak, and Johnson, 2005 • Zhou, Baskov, and Lease, 2013 (& Zhou’s Thesis) S1: Uh first um i need to know uh how do you feel about uh about sending uh an elderly uh family member to a nursing home S2: Well of course it's you know it's one of the last few things in the world you'd ever want to do you know unless it's just you know really you know uh for their uh you know for their own good
  • 11. Machine Learning - Supervised Slide courtesy of Byron Wallace (Northeastern)
  • 12. AI effectiveness is often limited by training data size Problem: creating labeled data is expensive! Banko and Brill (2001)
  • 13. What do we do when state-of-the-art AI still isn’t good enough?
  • 15. Crowdsourcing • Jeff Howe. Wired, June 2006. • Take a job traditionally performed by a known agent (often an employee) • Outsource it to an undefined, generally large group of people via an open call
  • 18. Amazon Mechanical Turk (MTurk) • Marketplace for paid crowd work (“micro-tasks”) – Created in 2005 (remains in “beta” today) • On-demand, scalable, 24/7 global workforce • API lets human labor be integrated into software – “You’ve heard of software-as-a-service. Now this is human-as-a-service.”
  • 19. Collecting Data from Crowds 2008: MTurk sparks “gold rush” for ML training data • Information Retrieval: Alonso et al., SIGIR Forum • Human-Computer Interaction: Kittur et al., CHI • Computer Vision: Sorokin & Forsyth, CVPR • NLP: Snow et al., EMNLP – Annotating human language – 22,000 labels for only US $26 – Crowd’s consensus labels can replace traditional expert labels
  • 22. ACM Queue, May 2006: “Software developers with innovative ideas for businesses and technologies are constrained by the limits of artificial intelligence… If software developers could programmatically access and incorporate human intelligence into their applications, a whole new class of innovative businesses and applications would be possible. This is the goal of Amazon Mechanical Turk… people are freer to innovate because they can now imbue software with real human intelligence.”
  • 23. PlateMate: Counting Calories Noronha et al., UIST’10
  • 24. MonoTrans: Translation by Monolingual Speakers + AI Bederson et al., 2010; Morita & Ishida, 2009
  • 25. Zensors Laput et al., CSCW 2015
  • 26. But Who Protects the Moderators? Dang et al., HCOMP’18 & CI’18
  • 27. What about ethics? • Silberman, Irani, and Ross (2010) – “How should we… conceptualize the role of these people who we ask to power our computing?” • Irani and Silberman (2013) – “…by hiding workers behind web forms and APIs… employers see themselves as builders of innovative technologies, rather than… unconcerned with working conditions… redirecting focus to the innovation of human computation as a field of technological achievement.” • Fort, Adda, and Cohen (2011) – “…opportunities for our community to deliberately value ethics above cost savings.”
  • 28. Summary • Machine Learning (AI) lets us automate many useful tasks, e.g., natural language processing (NLP) • Crowdsourcing enables new levels of efficiency & scalability in data collection & processing • Human Computation lets us build next-generation applications today, with capabilities beyond AI
  • 29. The Future of Crowd Work Paper @ CSCW 2013 by Kittur, Nickerson, Bernstein, Gerber, Shaw, Zimmerman, Lease, and Horton
  • 30. Matt Lease - ml@utexas.edu - @mattlease Thank You! Slides: slideshare.net/mattlease Lab: ir.ischool.utexas.edu
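
Slide 19 notes that a crowd's consensus labels can stand in for expert labels. One simple way to form such a consensus is majority voting over the redundant labels collected for each item. The sketch below is only an illustration of that idea, not the aggregation method used in any of the cited papers; the function names, the `(item_id, worker_id, label)` record layout, and the sample data are all assumptions made for this example.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label, or None if empty or tied for first."""
    counts = Counter(labels).most_common()
    if not counts:
        return None
    # A tie for first place means the crowd has no clear consensus.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


def aggregate(judgments):
    """Group (item_id, worker_id, label) records by item and vote per item."""
    by_item = {}
    for item_id, _worker_id, label in judgments:
        by_item.setdefault(item_id, []).append(label)
    return {item: majority_vote(labels) for item, labels in by_item.items()}


# Hypothetical judgments: three workers label doc1, two label doc2.
judgments = [
    ("doc1", "w1", "relevant"),
    ("doc1", "w2", "relevant"),
    ("doc1", "w3", "not_relevant"),
    ("doc2", "w1", "not_relevant"),
    ("doc2", "w2", "relevant"),
]
consensus = aggregate(judgments)
print(consensus)  # {'doc1': 'relevant', 'doc2': None}
```

Items that come back as `None` have no clear majority; in practice one might collect more labels for them or route them to an expert. Collecting an odd number of labels per item avoids ties for binary tasks.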