Data Excellence:
Better Data for Better AI
ODSC 2020
Lora Aroyo
http://lora-aroyo.org
@laroyo
By Scanned from The Magic of M. C. Escher. (Harry N. Abrams, Inc. ISBN
0-8109-6720-0) by Justin Foote (talk)., Fair use,
https://en.wikipedia.org/w/index.php?curid=3955850
TAKE HOME MESSAGE
data lifecycle - just like in software - is needed to
guide data research & development practices
data is the compass for AI - AI advances where
there is data
data is at the center - AI systems' success
depends on the quality of their data
https://en.wikipedia.org/wiki/Metamorphosis_II
data quality must be addressed in AI practices
- multitude of notions of truth
- necessity for data quality standards
data lifecycle is the backbone for data
excellence tools and practices to stay ahead of
future unintended AI behaviours
The Rise of the Machines
“AI Winter”
lab experiments
Expert Systems
small scale
experiments
The Rise of the Machines
“AI Winter” → “AI Breakthroughs in Games”
IBM Watson Jeopardy
DeepMind AlphaGo
beat the humans
The Rise of the Machines
“AI Winter” → “AI Breakthroughs in Games” → “Real World Tasks”
Health diagnostics
Flu prediction
Weather prediction
Text, Image and Video classification
Text Generation
Text Translation
Conversational AI
support the humans
Mainstream Deployment of AI
“Real World Tasks” deployed in the wild → Unintended behaviors
Microsoft Tay bot
IBM Watson Oncology
Amazon Rekognition
Google Photos
Apple Face ID
Facebook chat bots
Various Speech Assistants
getting computers to “see”
the diversity of data
data quality is essential for
guiding AI away from
unintended behaviours
Data is the compass for AI
The Life of AI Data
“It exists!”
bootstrapping AI with data
Caltech101
LabelMe
Berkeley-3D
https://en.wikipedia.org/wiki/List_of_datasets_for_machine-learning_research
The Life of AI Data
“It exists!” → “It is bigger!”
data hungry AI
ImageNet
SIFT10M
OpenImages
COCO
Web 1T 5-Gram
https://en.wikipedia.org/wiki/List_of_datasets_for_machine-learning_research
The Life of AI Data
“It exists!” → “It is bigger!” → “It is better!”
but before it got better ...
it got worse ...
Unintended Behaviors in AI
Adapted from “AI in the Open World: Discovering Blind Spots of AI”, SafeAI 2020, Ece Kamar
The Life of AI Data
“It exists!” → “It is bigger!” → “It is better!”
but before it got better ...
reactive
data improvement
The Life of AI Data
“It exists!” → “It is bigger!” → “It is better!”
to reach here
we need proactive
data improvement
The Life of AI Data
Alon Halevy, Peter Norvig, and Fernando Pereira. 2009. The Unreasonable Effectiveness of Data. IEEE Intelligent Systems 24, 2 (2009)
In the decade since then, the research community has done a lot
with quantity, but quality has been left behind
In the 90’s we introduced standards
to achieve Software reliability
introduced software engineering lifecycle
- requirements, design and testing
established processes for software maintenance
- version control, sharing, documenting
established software quality metrics & processes
Ben Hutchinson, 2020
Now we need the same for Data
introduce data lifecycle
- requirements, design and testing
establish processes for dataset maintenance
- version control, sharing, documenting
establish data quality metrics & processes
Ben Hutchinson, 2020
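One way to make "version control" for datasets concrete is a content fingerprint that changes whenever any record changes. The sketch below is only an illustration under stated assumptions: records are JSON-serializable dictionaries, and the function name `dataset_fingerprint` and the sample records are hypothetical, not taken from the talk.

```python
import hashlib
import json


def dataset_fingerprint(rows):
    """Return an order-independent SHA-256 fingerprint of a list of records.

    Assumption: every record is a JSON-serializable dict.
    """
    # Serialize each record with sorted keys, then sort the records themselves,
    # so neither key order nor row order affects the result.
    canonical = sorted(json.dumps(r, sort_keys=True, ensure_ascii=False) for r in rows)
    digest = hashlib.sha256()
    for line in canonical:
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


# Hypothetical records, for illustration only.
v1 = [{"id": "img1", "label": "guitar"}, {"id": "img2", "label": "not_guitar"}]
v1_reordered = list(reversed(v1))
v2 = [{"id": "img1", "label": "guitar"}, {"id": "img2", "label": "guitar"}]

print(dataset_fingerprint(v1) == dataset_fingerprint(v1_reordered))  # True: same content
print(dataset_fingerprint(v1) == dataset_fingerprint(v2))            # False: a label changed
```

Recording such a fingerprint next to each trained model would make it possible to tell which version of the data a result was based on.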
Data Quality is not easy ...
data quality problems are typically not
caused by software bugs or just
by human errors
datasets are not easy to debug
data quality is typically a result of:
- how well the dataset
represents the actual task
- how the annotation is done
- whether the quality metrics
are adequate
Data Quality is not only human error
Do these images depict a GUITAR?
it is not easy to give a Y/N answer
for most of our AI tasks
[image grid: example images, each marked ✓ or ✘]
Data Quality should consider context of use
Do these images depict NEW ZEALAND?
it is not easy to give a Y/N answer
for most of our AI tasks
the answer typically depends on
the context, on the task, on the
usage, etc.
[image grid: example images, each marked ✓ or ✘]
Data Quality should include real world diversity
Do these images depict a WEDDING?
it is not easy to give a Y/N answer
for most of our AI tasks
the answer typically depends on
the context, on the task, on the
usage, etc.
disagreement is a signal of
diversity and should be included
in AI training
[image grid: example images, each marked ✓ or ✘]
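The point that disagreement should be included in AI training can be illustrated by keeping the full distribution of annotator votes as a soft label instead of collapsing it into a single Y/N answer. This is a minimal sketch, not the method used in the talk; the function name `soft_label`, the class names and the votes are assumptions made for illustration.

```python
from collections import Counter


def soft_label(votes, classes=("yes", "no")):
    """Turn one item's annotator votes into a probability per class."""
    counts = Counter(votes)
    total = sum(counts[c] for c in classes)
    if total == 0:
        raise ValueError("no votes for the known classes")
    return {c: counts[c] / total for c in classes}


# Hypothetical votes from five annotators on "does this image depict a WEDDING?"
votes = ["yes", "yes", "no", "yes", "no"]
print(soft_label(votes))  # {'yes': 0.6, 'no': 0.4}
```

A majority vote would record this item as a plain "yes"; the soft label keeps the information that two of five annotators saw it differently.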
Data Quality is difficult even with experts
Does the sentence express a TREATS relation between CHLOROQUINE and MALARIA?
- For prevention of malaria, use only in individuals traveling to malarious areas where CHLOROQUINE resistant P. falciparum MALARIA has not been reported.
- Rheumatoid arthritis and MALARIA have been treated with CHLOROQUINE for decades.
- Among 56 subjects reporting to a clinic with symptoms of MALARIA 53 (95%) had ordinarily effective levels of CHLOROQUINE in blood.
[marks shown on the slide: ✓ ✘ ✓]
DISAGREEMENT IS SIGNAL
Variety of sources for disagreement
Model of semantic interpretation
TRIANGLE OF MEANING
“Three Sides of CrowdTruth”, Human Computation Journal, v1, 2014, L. Aroyo, C. Welty
Workshop on “Subjectivity, Ambiguity and Disagreement (SAD) in Crowdsourcing”, The Web Conference 2019, https://sadworkshop.wordpress.com/
Annotator disagreement
is signal, not noise
Annotator disagreement
is indicative of
variation in human
interpretation
Annotator disagreement
is indicative of
ambiguity, vagueness,
similarity, over-generality,
& quality
Three sides of human interpretation
CROWDTRUTH
Disagreement provides
guidance in task analysis:
● items with poor semantics
● items with salient terms
● items difficult to classify
● items that are ambiguous
● subjective annotations
● time-sensitive annotations
● difficult annotation tasks
● mis-translated annotations
● users with/without
specific knowledge
● communities of thought
● spammers
You can’t remove the corners…
“Three Sides of CrowdTruth”, Human Computation Journal, v1, 2014, L. Aroyo, C. Welty
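The list above suggests reading disagreement at two levels: per item (ambiguous or hard-to-classify items) and per annotator (spammers, or people with different knowledge). The sketch below uses two simple stand-in measures, label entropy per item and mean pairwise agreement per annotator. These are not the CrowdTruth metrics from the cited paper; the function names and the sample annotations are hypothetical.

```python
import math
from collections import Counter, defaultdict


def item_entropy(votes):
    """Shannon entropy (in bits) of one item's labels; 0.0 means full agreement."""
    counts = Counter(votes)
    n = len(votes)
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def annotator_agreement(annotations):
    """Average share of co-annotators who gave the same label, per annotator.

    annotations maps item id -> {annotator id: label}.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for labels in annotations.values():
        for annotator, label in labels.items():
            others = [l for a, l in labels.items() if a != annotator]
            if not others:
                continue  # nobody to compare with on this item
            totals[annotator] += sum(l == label for l in others) / len(others)
            counts[annotator] += 1
    return {a: totals[a] / counts[a] for a in counts}


# Hypothetical annotations: four annotators, three items.
annotations = {
    "img1": {"a1": "yes", "a2": "yes", "a3": "yes", "a4": "no"},
    "img2": {"a1": "yes", "a2": "no", "a3": "yes", "a4": "no"},
    "img3": {"a1": "no", "a2": "no", "a3": "no", "a4": "yes"},
}

for item, labels in annotations.items():
    print(item, round(item_entropy(list(labels.values())), 3))
for annotator, score in sorted(annotator_agreement(annotations).items()):
    print(annotator, round(score, 3))
```

In this toy data `img2` has the highest entropy, which would flag it for a closer look, while `a4` agrees with others least often. Low agreement alone does not show that an annotator is a spammer; it may equally indicate a community of thought or specific knowledge, which is why the slide lists these cases side by side.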
THE WORLD IS A SMOOTH SPECTRUM OF TRUTH
One truth: knowledge acquisition typically assumes one
correct interpretation for every example
Experts rule: knowledge is captured from domain experts
One is enough: single expert’s knowledge is sufficient
Disagreement bad: when people disagree, they must not
understand the problem
Detailed explanations help: if examples cause
disagreement - adding instructions should help
Once done, forever valid: knowledge is not updated; new
data not aligned with old
All examples are created equal: triples are triples, one is
not more important than another, they are all either true or
false
… and we force the smoothness into a binary form
7 Myths about Human Annotation
“Truth is a Lie: 7 Myths about Human Annotation”, AI Magazine 2014, L. Aroyo, C. Welty
High Quality Data
represents a phenomenon
accurately and consistently over time
and is replicable, reproducible,
and maintainable over time;
has empirical and explanatory power;
and is collected, stored, and used
responsibly.
Rigorous Evaluation of AI Systems workshop, 2019, Human Computation (HCOMP), http://eval.how/
Evaluating Evaluation for AI Systems workshop, 2020, Association for the Advancement of Artificial Intelligence (AAAI), http://eval.how/aaai-2020/
From Data Quality to Data Excellence
Data Quality is
- a point-estimate of goodness of data
Data Excellence is
- the set of practices and tools that result in
high quality data
How do we achieve Data Excellence?
Maintainability
Well documented datasets with
owners, which follow best practices
for data at any scale.
Reproducibility
Basic and critical regression tests
for datasets which support solid
conclusions for decision making.
Reliability
Datasets which are internally sound
and consistent; factors that affect
the data are addressed or disclosed.
Fidelity
Data which faithfully, accurately, and
comprehensively represents the
captured phenomenon.
Validity
Datasets which explain aspects of
the phenomena that they represent
in terms of external measures.
Utility
Data which adequately and
accurately achieves the intended
product behavior.
1st International Workshop on Data Excellence: http://eval.how/dew2020/
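The "regression tests for datasets" mentioned under Reproducibility could look something like the following sketch, which checks each new dataset version for missing fields, unexpected labels, and drift in label shares against a recorded baseline. The function name `check_dataset`, the field names, the tolerance and the sample rows are all assumptions made for illustration.

```python
def check_dataset(rows, required_fields, allowed_labels, baseline_share, tolerance=0.05):
    """Return a list of human-readable failures; an empty list means all checks passed."""
    failures = []
    for i, row in enumerate(rows):
        # Completeness: every required field must be present and non-empty.
        missing = [f for f in required_fields if row.get(f) in (None, "")]
        if missing:
            failures.append(f"row {i}: missing fields {missing}")
        # Consistency: only labels from the agreed label set are allowed.
        if row.get("label") not in allowed_labels:
            failures.append(f"row {i}: unexpected label {row.get('label')!r}")
    if rows:
        # Drift: label shares should stay close to the recorded baseline.
        for label, expected in baseline_share.items():
            share = sum(r.get("label") == label for r in rows) / len(rows)
            if abs(share - expected) > tolerance:
                failures.append(
                    f"label {label!r}: share {share:.2f} drifted from baseline {expected:.2f}"
                )
    else:
        failures.append("dataset is empty")
    return failures


# Hypothetical dataset version, for illustration only.
rows = [
    {"id": "img1", "label": "guitar"},
    {"id": "img2", "label": "not_guitar"},
    {"id": "img3", "label": "guitar"},
    {"id": "img4", "label": "not_guitar"},
]
failures = check_dataset(
    rows,
    required_fields=["id", "label"],
    allowed_labels={"guitar", "not_guitar"},
    baseline_share={"guitar": 0.5},
)
print(failures or "all checks passed")
```

Run on every new version of a dataset, checks like these play the role that unit and regression tests play for code.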
much like in software lifecycles, cutting corners at each stage
cascades to subsequent versions, which leads to technical debt
Data Lifecycle
Dataset [Requirements] Analysis
- Requirements Analysis
- Stakeholder Input
- Privacy, compliance
- Trust & safety planning
Dataset Design
- Data acquisition methodology
- Rater guidelines
- Construct validation
Dataset Implementation
- Human labeled data
- Logging interaction data
Dataset Testing
- Representation metrics
- Fairness metrics
- Reliability metrics
- Approval process
Dataset Maintenance
- Updating data over time
- Extending to other languages
- Version control
- Storage and accessibility
Ben Hutchinson, 2020
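As one possible reading of the "representation metrics" in the Dataset Testing stage, the sketch below compares how often each group appears in a dataset with a reference share chosen by the dataset owners. The function name `representation_gaps`, the language codes and the reference shares are hypothetical; the talk does not define a specific metric.

```python
from collections import Counter


def representation_gaps(values, reference_shares):
    """Return observed share minus reference share for each group.

    Positive values mean the group is over-represented, negative values
    mean it is under-represented relative to the reference.
    """
    if not values:
        raise ValueError("dataset is empty")
    counts = Counter(values)
    n = len(values)
    return {group: counts[group] / n - share for group, share in reference_shares.items()}


# Hypothetical example: language of each record vs. an intended language mix.
languages = ["en"] * 8 + ["nl"] + ["bg"]
reference = {"en": 0.5, "nl": 0.25, "bg": 0.25}

for group, gap in representation_gaps(languages, reference).items():
    print(f"{group}: {gap:+.2f}")
```

Choosing the reference shares belongs to the requirements stage: they encode what the dataset is supposed to represent for the task at hand.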
TAKE HOME MESSAGE
https://en.wikipedia.org/wiki/Metamorphosis_II
data lifecycle - just like in software - is needed to
guide data research & development practices
data is the compass for AI - AI advances where
there is data
data is at the center - AI systems' success
depends on the quality of their data
data quality must be addressed in AI practices
- multitude of notions of truth
- necessity for data quality standards
data lifecycle is the backbone for data
excellence tools and practices to stay ahead of
future unintended AI behaviours
Collaborators
EthicalAI
Ben Hutchinson
Crowd Platform
Amol Wankhede
Anurag Batra
People + AI Research (PAIR)
Nithya Sambasivan
Kristen Olson
Shivani Kapania
Jess Holbrook
Andrew Zaldivar
Mahima Pushkarna
Maysam Moussalem
Praveen Paritosh, Ka Wong
Lora Aroyo, Devi Krishna
Likert team
high profile data failure
not bugs in the software, not mistake of humans
problems caused by quality in the data
just like software quality in 90’s - the same has to happen with data
examples of questionable data
crowdtruth relation extraction
how would you annotate it
how do we know and measure the quality of the data
how well does it represent the actual task we are trying to solve
like software we need to establish data quality standards

More Related Content

What's hot

Google Cloud GenAI Overview_071223.pptx
Google Cloud GenAI Overview_071223.pptxGoogle Cloud GenAI Overview_071223.pptx
Google Cloud GenAI Overview_071223.pptx
VishPothapu
 

What's hot (20)

The-CxO-Guide-to.pdf
The-CxO-Guide-to.pdfThe-CxO-Guide-to.pdf
The-CxO-Guide-to.pdf
 
Using the power of Generative AI at scale
Using the power of Generative AI at scaleUsing the power of Generative AI at scale
Using the power of Generative AI at scale
 
Leveraging Generative AI & Best practices
Leveraging Generative AI & Best practicesLeveraging Generative AI & Best practices
Leveraging Generative AI & Best practices
 
Generative-AI-in-enterprise-20230615.pdf
Generative-AI-in-enterprise-20230615.pdfGenerative-AI-in-enterprise-20230615.pdf
Generative-AI-in-enterprise-20230615.pdf
 
Charles Caldwell - Improve Your Life with AI.pdf
Charles Caldwell - Improve Your Life with AI.pdfCharles Caldwell - Improve Your Life with AI.pdf
Charles Caldwell - Improve Your Life with AI.pdf
 
UX Discovery 14th Metaverse & Technology 트래킹 기술
UX Discovery 14th Metaverse & Technology 트래킹 기술UX Discovery 14th Metaverse & Technology 트래킹 기술
UX Discovery 14th Metaverse & Technology 트래킹 기술
 
Google Cloud GenAI Overview_071223.pptx
Google Cloud GenAI Overview_071223.pptxGoogle Cloud GenAI Overview_071223.pptx
Google Cloud GenAI Overview_071223.pptx
 
Alteryx investor presentation
Alteryx investor presentationAlteryx investor presentation
Alteryx investor presentation
 
Explainable AI for non-expert users
Explainable AI for non-expert usersExplainable AI for non-expert users
Explainable AI for non-expert users
 
mlops.community meetup - ML Governance_ A Practical Guide.pptx
mlops.community meetup - ML Governance_ A Practical Guide.pptxmlops.community meetup - ML Governance_ A Practical Guide.pptx
mlops.community meetup - ML Governance_ A Practical Guide.pptx
 
iA Générative : #ChatGPT #MidJourney
iA Générative : #ChatGPT #MidJourney iA Générative : #ChatGPT #MidJourney
iA Générative : #ChatGPT #MidJourney
 
Information Architecture Deliverables
Information Architecture DeliverablesInformation Architecture Deliverables
Information Architecture Deliverables
 
Master Data Management - Aligning Data, Process, and Governance
Master Data Management - Aligning Data, Process, and GovernanceMaster Data Management - Aligning Data, Process, and Governance
Master Data Management - Aligning Data, Process, and Governance
 
Content In The Age of AI
Content In The Age of AIContent In The Age of AI
Content In The Age of AI
 
Generative AI Use-cases for Enterprise - First Session
Generative AI Use-cases for Enterprise - First SessionGenerative AI Use-cases for Enterprise - First Session
Generative AI Use-cases for Enterprise - First Session
 
Generative AI.pptx
Generative AI.pptxGenerative AI.pptx
Generative AI.pptx
 
Great Leadership and Talent Pay Off
Great Leadership and Talent Pay OffGreat Leadership and Talent Pay Off
Great Leadership and Talent Pay Off
 
Race in the workplace: The Black experience in the US private sector
Race in the workplace: The Black experience in the US private sectorRace in the workplace: The Black experience in the US private sector
Race in the workplace: The Black experience in the US private sector
 
GENERATIVE AI, THE FUTURE OF PRODUCTIVITY
GENERATIVE AI, THE FUTURE OF PRODUCTIVITYGENERATIVE AI, THE FUTURE OF PRODUCTIVITY
GENERATIVE AI, THE FUTURE OF PRODUCTIVITY
 
Analysis of big data and analytics market in latin america
Analysis of big data and analytics market in latin americaAnalysis of big data and analytics market in latin america
Analysis of big data and analytics market in latin america
 

Similar to Data excellence: Better data for better AI

EIS-Webinar-data.world-collab-2023-02-15.pptx
EIS-Webinar-data.world-collab-2023-02-15.pptxEIS-Webinar-data.world-collab-2023-02-15.pptx
EIS-Webinar-data.world-collab-2023-02-15.pptx
Earley Information Science
 
Spivack Blogtalk 2008
Spivack Blogtalk 2008Spivack Blogtalk 2008
Spivack Blogtalk 2008
Blogtalk 2008
 
My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone
My ESWC 2017 keynote: Disrupting the Semantic Comfort ZoneMy ESWC 2017 keynote: Disrupting the Semantic Comfort Zone
My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone
Lora Aroyo
 

Similar to Data excellence: Better data for better AI (20)

Technology Governance & Migration In The AI Era
Technology Governance & Migration In The AI EraTechnology Governance & Migration In The AI Era
Technology Governance & Migration In The AI Era
 
Knowledge Graphs, Ontologies, and AI Applications
Knowledge Graphs, Ontologies, and AI ApplicationsKnowledge Graphs, Ontologies, and AI Applications
Knowledge Graphs, Ontologies, and AI Applications
 
UX in the Age of AI: Leading with Design
UX in the Age of AI: Leading with DesignUX in the Age of AI: Leading with Design
UX in the Age of AI: Leading with Design
 
UX in the Age of AI: Leading with Design UXPA2018
UX in the Age of AI: Leading with Design UXPA2018UX in the Age of AI: Leading with Design UXPA2018
UX in the Age of AI: Leading with Design UXPA2018
 
Understanding the New World of Cognitive Computing
Understanding the New World of Cognitive ComputingUnderstanding the New World of Cognitive Computing
Understanding the New World of Cognitive Computing
 
Designing Trustable AI Experiences at IxDA Pittsburgh, Jan 2019
Designing Trustable AI Experiences at IxDA Pittsburgh, Jan 2019Designing Trustable AI Experiences at IxDA Pittsburgh, Jan 2019
Designing Trustable AI Experiences at IxDA Pittsburgh, Jan 2019
 
Designing Trustable AI Experiences at World Usability Day in Cleveland
Designing Trustable AI Experiences at World Usability Day in ClevelandDesigning Trustable AI Experiences at World Usability Day in Cleveland
Designing Trustable AI Experiences at World Usability Day in Cleveland
 
EIS-Webinar-data.world-collab-2023-02-15.pptx
EIS-Webinar-data.world-collab-2023-02-15.pptxEIS-Webinar-data.world-collab-2023-02-15.pptx
EIS-Webinar-data.world-collab-2023-02-15.pptx
 
Artificial Intelligence (AI) – Powering Data and Conversations.pptx
Artificial Intelligence (AI) – Powering Data and Conversations.pptxArtificial Intelligence (AI) – Powering Data and Conversations.pptx
Artificial Intelligence (AI) – Powering Data and Conversations.pptx
 
Prepping the Analytics organization for Artificial Intelligence evolution
Prepping the Analytics organization for Artificial Intelligence evolutionPrepping the Analytics organization for Artificial Intelligence evolution
Prepping the Analytics organization for Artificial Intelligence evolution
 
How do we train AI to be Ethical and Unbiased?
How do we train AI to be Ethical and Unbiased?How do we train AI to be Ethical and Unbiased?
How do we train AI to be Ethical and Unbiased?
 
Interactive XAI for ODSC East 2023
Interactive XAI for ODSC East 2023Interactive XAI for ODSC East 2023
Interactive XAI for ODSC East 2023
 
Catalyze Webcast - Five Myths Of RIA With Laurie Gray - 031808
Catalyze Webcast - Five Myths Of RIA With Laurie Gray - 031808Catalyze Webcast - Five Myths Of RIA With Laurie Gray - 031808
Catalyze Webcast - Five Myths Of RIA With Laurie Gray - 031808
 
IA in the Age of AI: Embracing Abstraction and Change at IA Summit 2018
IA in the Age of AI: Embracing Abstraction and Change at IA Summit 2018IA in the Age of AI: Embracing Abstraction and Change at IA Summit 2018
IA in the Age of AI: Embracing Abstraction and Change at IA Summit 2018
 
Streamlining Information Flows In The Digital Workplace
Streamlining Information Flows In The Digital WorkplaceStreamlining Information Flows In The Digital Workplace
Streamlining Information Flows In The Digital Workplace
 
Spivack Blogtalk 2008
Spivack Blogtalk 2008Spivack Blogtalk 2008
Spivack Blogtalk 2008
 
Trusted, Transparent and Fair AI using Open Source
Trusted, Transparent and Fair AI using Open SourceTrusted, Transparent and Fair AI using Open Source
Trusted, Transparent and Fair AI using Open Source
 
My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone
My ESWC 2017 keynote: Disrupting the Semantic Comfort ZoneMy ESWC 2017 keynote: Disrupting the Semantic Comfort Zone
My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone
 
Designing AI for Humanity at dmi:Design Leadership Conference in Boston
Designing AI for Humanity at dmi:Design Leadership Conference in BostonDesigning AI for Humanity at dmi:Design Leadership Conference in Boston
Designing AI for Humanity at dmi:Design Leadership Conference in Boston
 
Practical Applications of Machine Learning in Cybersecurity
Practical Applications of Machine Learning in CybersecurityPractical Applications of Machine Learning in Cybersecurity
Practical Applications of Machine Learning in Cybersecurity
 

More from Lora Aroyo

Crowdsourcing ambiguity aware ground truth - collective intelligence 2017
Crowdsourcing ambiguity aware ground truth - collective intelligence 2017Crowdsourcing ambiguity aware ground truth - collective intelligence 2017
Crowdsourcing ambiguity aware ground truth - collective intelligence 2017
Lora Aroyo
 

More from Lora Aroyo (20)

NeurIPS2023 Keynote: The Many Faces of Responsible AI.pdf
NeurIPS2023 Keynote: The Many Faces of Responsible AI.pdfNeurIPS2023 Keynote: The Many Faces of Responsible AI.pdf
NeurIPS2023 Keynote: The Many Faces of Responsible AI.pdf
 
CATS4ML Data Challenge: Crowdsourcing Adverse Test Sets for Machine Learning
CATS4ML Data Challenge: Crowdsourcing Adverse Test Sets for Machine LearningCATS4ML Data Challenge: Crowdsourcing Adverse Test Sets for Machine Learning
CATS4ML Data Challenge: Crowdsourcing Adverse Test Sets for Machine Learning
 
Harnessing Human Semantics at Scale (updated)
Harnessing Human Semantics at Scale (updated)Harnessing Human Semantics at Scale (updated)
Harnessing Human Semantics at Scale (updated)
 
CHIP Demonstrator presentation @ CATCH Symposium
CHIP Demonstrator presentation @ CATCH SymposiumCHIP Demonstrator presentation @ CATCH Symposium
CHIP Demonstrator presentation @ CATCH Symposium
 
Semantic Web Challenge: CHIP Demonstrator
Semantic Web Challenge: CHIP DemonstratorSemantic Web Challenge: CHIP Demonstrator
Semantic Web Challenge: CHIP Demonstrator
 
The Rijksmuseum Collection as Linked Data
The Rijksmuseum Collection as Linked DataThe Rijksmuseum Collection as Linked Data
The Rijksmuseum Collection as Linked Data
 
Keynote at International Conference of Art Libraries 2018 @Rijksmuseum
Keynote at International Conference of Art Libraries 2018 @RijksmuseumKeynote at International Conference of Art Libraries 2018 @Rijksmuseum
Keynote at International Conference of Art Libraries 2018 @Rijksmuseum
 
FAIRview: Responsible Video Summarization @NYCML'18
FAIRview: Responsible Video Summarization @NYCML'18FAIRview: Responsible Video Summarization @NYCML'18
FAIRview: Responsible Video Summarization @NYCML'18
 
Understanding bias in video news & news filtering algorithms
Understanding bias in video news & news filtering algorithmsUnderstanding bias in video news & news filtering algorithms
Understanding bias in video news & news filtering algorithms
 
StorySourcing: Telling Stories with Humans & Machines
StorySourcing: Telling Stories with Humans & MachinesStorySourcing: Telling Stories with Humans & Machines
StorySourcing: Telling Stories with Humans & Machines
 
Data Science with Humans in the Loop
Data Science with Humans in the LoopData Science with Humans in the Loop
Data Science with Humans in the Loop
 
Digital Humanities Benelux 2017: Keynote Lora Aroyo
Digital Humanities Benelux 2017: Keynote Lora AroyoDigital Humanities Benelux 2017: Keynote Lora Aroyo
Digital Humanities Benelux 2017: Keynote Lora Aroyo
 
DH Benelux 2017 Panel: A Pragmatic Approach to Understanding and Utilising Ev...
DH Benelux 2017 Panel: A Pragmatic Approach to Understanding and Utilising Ev...DH Benelux 2017 Panel: A Pragmatic Approach to Understanding and Utilising Ev...
DH Benelux 2017 Panel: A Pragmatic Approach to Understanding and Utilising Ev...
 
Crowdsourcing ambiguity aware ground truth - collective intelligence 2017
Crowdsourcing ambiguity aware ground truth - collective intelligence 2017Crowdsourcing ambiguity aware ground truth - collective intelligence 2017
Crowdsourcing ambiguity aware ground truth - collective intelligence 2017
 
Data Science with Human in the Loop @Faculty of Science #Leiden University
Data Science with Human in the Loop @Faculty of Science #Leiden UniversityData Science with Human in the Loop @Faculty of Science #Leiden University
Data Science with Human in the Loop @Faculty of Science #Leiden University
 
SXSW2017 @NewDutchMedia Talk: Exploration is the New Search
SXSW2017 @NewDutchMedia Talk: Exploration is the New SearchSXSW2017 @NewDutchMedia Talk: Exploration is the New Search
SXSW2017 @NewDutchMedia Talk: Exploration is the New Search
 
Europeana GA 2016: Harnessing Crowds, Niches & Professionals in the Digital Age
Europeana GA 2016: Harnessing Crowds, Niches & Professionals  in the Digital AgeEuropeana GA 2016: Harnessing Crowds, Niches & Professionals  in the Digital Age
Europeana GA 2016: Harnessing Crowds, Niches & Professionals in the Digital Age
 
"Video Killed the Radio Star": From MTV to Snapchat
"Video Killed the Radio Star": From MTV to Snapchat"Video Killed the Radio Star": From MTV to Snapchat
"Video Killed the Radio Star": From MTV to Snapchat
 
UMAP 2016 Opening Ceremony
UMAP 2016 Opening CeremonyUMAP 2016 Opening Ceremony
UMAP 2016 Opening Ceremony
 
Crowdsourcing & Nichesourcing: Enriching Cultural Heritage with Experts & Cr...
Crowdsourcing & Nichesourcing: Enriching Cultural Heritagewith Experts & Cr...Crowdsourcing & Nichesourcing: Enriching Cultural Heritagewith Experts & Cr...
Crowdsourcing & Nichesourcing: Enriching Cultural Heritage with Experts & Cr...
 

Recently uploaded

Structuring Teams and Portfolios for Success
Structuring Teams and Portfolios for SuccessStructuring Teams and Portfolios for Success
Structuring Teams and Portfolios for Success
UXDXConf
 

Recently uploaded (20)

How we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdfHow we scaled to 80K users by doing nothing!.pdf
How we scaled to 80K users by doing nothing!.pdf
 
IESVE for Early Stage Design and Planning
IESVE for Early Stage Design and PlanningIESVE for Early Stage Design and Planning
IESVE for Early Stage Design and Planning
 
AI revolution and Salesforce, Jiří Karpíšek
AI revolution and Salesforce, Jiří KarpíšekAI revolution and Salesforce, Jiří Karpíšek
AI revolution and Salesforce, Jiří Karpíšek
 
Extensible Python: Robustness through Addition - PyCon 2024
Extensible Python: Robustness through Addition - PyCon 2024Extensible Python: Robustness through Addition - PyCon 2024
Extensible Python: Robustness through Addition - PyCon 2024
 
IoT Analytics Company Presentation May 2024
IoT Analytics Company Presentation May 2024IoT Analytics Company Presentation May 2024
IoT Analytics Company Presentation May 2024
 
Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdfWhere to Learn More About FDO _ Richard at FIDO Alliance.pdf
Where to Learn More About FDO _ Richard at FIDO Alliance.pdf
 
WSO2CONMay2024OpenSourceConferenceDebrief.pptx
WSO2CONMay2024OpenSourceConferenceDebrief.pptxWSO2CONMay2024OpenSourceConferenceDebrief.pptx
WSO2CONMay2024OpenSourceConferenceDebrief.pptx
 
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdfIntroduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
Introduction to FDO and How It works Applications _ Richard at FIDO Alliance.pdf
 
Top 10 Symfony Development Companies 2024
Top 10 Symfony Development Companies 2024Top 10 Symfony Development Companies 2024
Top 10 Symfony Development Companies 2024
 
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
 
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdfHow Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
 
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdfThe Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
The Value of Certifying Products for FDO _ Paul at FIDO Alliance.pdf
 
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptxUnpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
 
What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024
 
The UX of Automation by AJ King, Senior UX Researcher, Ocado
The UX of Automation by AJ King, Senior UX Researcher, OcadoThe UX of Automation by AJ King, Senior UX Researcher, Ocado
The UX of Automation by AJ King, Senior UX Researcher, Ocado
 
AI presentation and introduction - Retrieval Augmented Generation RAG 101
AI presentation and introduction - Retrieval Augmented Generation RAG 101AI presentation and introduction - Retrieval Augmented Generation RAG 101
AI presentation and introduction - Retrieval Augmented Generation RAG 101
 
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdfSimplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
 
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
 
Structuring Teams and Portfolios for Success
Structuring Teams and Portfolios for SuccessStructuring Teams and Portfolios for Success
Structuring Teams and Portfolios for Success
 
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
 

Data excellence: Better data for better AI

  • 1. Data Excellence: Better Data for Better AI ODSC 2020 Lora Aroyo http://lora-aroyo.org @laroyo By Scanned from The Magic of M. C. Escher. (Harry N. Abrams, Inc. ISBN 0-8109-6720-0) by Justin Foote (talk)., Fair use, https://en.wikipedia.org/w/index.php?curid=3955850
  • 2. http://lora-aroyo.org @laroyo TAKE HOME MESSAGE 2 data lifecycle - just like in software - is needed to guide data research & development practices data is the compass for AI - AI advances where there is data data is at the center - AI systems success depends on the quality of their data https://en.wikipedia.org/wiki/Metamorphosis_II data quality must be addressed in AI practices - multitude of notions of truth - necessity for data quality standards data lifecycle is the backbone for data excellence tools and practices to stay ahead of future unintended AI behaviours
  • 3. http://lora-aroyo.org @laroyo 3 The Rise of the Machines “AI Winter” lab experiments Expert Systems small scale experiments
  • 4. http://lora-aroyo.org @laroyo 4 The Rise of the Machines “AI Winter” → “AI Breakthroughs in Games” IBM Watson Jeopardy DeepMind AlphaGo beat the humans
  • 5. http://lora-aroyo.org @laroyo 5 The Rise of the Machines “AI Winter” → “AI Breakthroughs in Games” → “Real World Tasks” Health diagnostics Flue prediction Weather prediction Text, Image and Video classification Text Generation Text Translation Conversational AI support the humans
  • 6. Mainstream Deployment of AI. “Real World Tasks” deployed in the wild → unintended behaviors: Microsoft Tay bot, IBM Watson Oncology, Amazon Rekognition, Google Photos, Apple Face ID, Facebook chat bots, various speech assistants.
  • 7. Data is the compass for AI. Getting computers to “see” the diversity of data: data quality is essential for guiding AI away from unintended behaviours.
  • 8. The Life of AI Data. “It exists!”: bootstrapping AI with data (Caltech101, LabelMe, Berkeley-3D). https://en.wikipedia.org/wiki/List_of_datasets_for_machine-learning_research
  • 9. The Life of AI Data. “It exists!” → “It is bigger!”: data hungry AI (ImageNet, SIFT10M, OpenImages, COCO, Web 1T 5-Gram). https://en.wikipedia.org/wiki/List_of_datasets_for_machine-learning_research
  • 10. The Life of AI Data. “It exists!” → “It is bigger!” → “It is better!” But before it got better ...
  • 11. The Life of AI Data. “It exists!” → “It is bigger!” → “It is better!” But before it got better ... it got worse ...
  • 12. Unintended Behaviors in AI. Adapted from “AI in the Open World: Discovering Blind Spots of AI”, SafeAI 2020, Ece Kamar.
  • 13. The Life of AI Data. “It exists!” → “It is bigger!” → “It is better!” But before it got better ... reactive data improvement.
  • 14. The Life of AI Data. “It exists!” → “It is bigger!” → “It is better!” To reach here we need proactive data improvement.
  • 15. The Life of AI Data. Alon Halevy, Peter Norvig, and Fernando Pereira. 2009. The Unreasonable Effectiveness of Data. IEEE Intelligent Systems 24, 2 (2009). In the decade since then, the research community has done a lot with quantity, but quality has been left behind.
  • 16. In the 90’s we introduced standards to achieve software reliability: introduced the software engineering lifecycle (requirements, design and testing); established processes for software maintenance (version control, sharing, documenting); established software quality metrics & processes. Ben Hutchinson, 2020
  • 17. Now we need the same for data: introduce a data lifecycle (requirements, design and testing); establish processes for dataset maintenance (version control, sharing, documenting); establish data quality metrics & processes. Ben Hutchinson, 2020
  • 18. Data Quality is not easy ... Data quality problems are typically not caused by software bugs or just by human errors, and datasets are not easy to debug. Data quality is typically the result of: how well a dataset represents the actual task; how the annotation is done; whether the quality metrics are adequate.
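The question on slide 18 of whether the quality metrics are adequate can be illustrated with a minimal sketch. It compares raw percent agreement between two annotators with Cohen's kappa, which corrects for agreement expected by chance. This is an illustration and not a method from the talk; the two label lists are hypothetical.

```python
from collections import Counter


def raw_agreement(a, b):
    """Fraction of items on which two annotators gave the same label."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_observed = raw_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: probability that both annotators pick the same
    # label if each labels independently with their own label frequencies.
    p_chance = sum(count_a[l] * count_b[l] for l in set(a) | set(b)) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical labels from two annotators on ten items (illustrative only).
ann_1 = ["yes"] * 9 + ["no"]
ann_2 = ["yes"] * 8 + ["no", "yes"]

agreement = raw_agreement(ann_1, ann_2)
kappa = cohens_kappa(ann_1, ann_2)
print(f"raw agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```

With a heavily skewed label distribution like this one, raw agreement looks high while kappa is close to zero, so the choice of metric changes the conclusion about annotation quality.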
  • 19. Data Quality is not only human error. Do these images depict a GUITAR? It is not easy to give a Y/N answer for most of our AI tasks. (Image grid with annotator answers: ✓ ✓ ✓ ✘ ✘ ✘ ✘ ✓ ✓)
  • 20. Data Quality should consider context of use. Do these images depict NEW ZEALAND? It is not easy to give a Y/N answer for most of our AI tasks; the answer typically depends on the context, on the task, on the usage, etc. (Image grid with annotator answers: ✓ ✘ ✓ ✓ ✘ ✘)
  • 21. Data Quality should include real world diversity. Do these images depict a WEDDING? It is not easy to give a Y/N answer for most of our AI tasks; the answer typically depends on the context, on the task, on the usage, etc. Disagreement is a signal of diversity and should be included in AI training. (Image grid with annotator answers: ✓ ✘ ✓ ✓ ✘ ✓)
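One simple way to include disagreement in training data, as slide 21 suggests, is to keep the distribution of annotator votes as a soft label instead of collapsing it to a majority vote. The sketch below assumes this approach; the talk does not prescribe it, and the image ids and votes are hypothetical.

```python
from collections import Counter


def soft_label(votes, classes):
    """Distribution of annotator votes over the given classes."""
    counts = Counter(votes)
    return {c: counts[c] / len(votes) for c in classes}


def majority_label(votes):
    """Single most frequent vote; discards how contested the item was."""
    return Counter(votes).most_common(1)[0][0]


# Hypothetical answers to "Does this image depict a WEDDING?"
votes_per_image = {
    "img_01": ["yes", "yes", "yes", "yes", "yes"],
    "img_02": ["yes", "no", "yes", "no", "yes"],
}
classes = ["yes", "no"]

soft = {img: soft_label(v, classes) for img, v in votes_per_image.items()}
hard = {img: majority_label(v) for img, v in votes_per_image.items()}

for img in votes_per_image:
    print(img, "majority:", hard[img], "soft:", soft[img])
```

Both images receive the same majority label, but only the soft label records that the second image is contested.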
  • 22. Data Quality is difficult even with experts. Does the sentence express a TREATS relation between Chloroquine and Malaria? “For prevention of malaria, use only in individuals traveling to malarious areas where CHLOROQUINE resistant P. falciparum MALARIA has not been reported.” “Rheumatoid arthritis and MALARIA have been treated with CHLOROQUINE for decades.” “Among 56 subjects reporting to a clinic with symptoms of MALARIA 53 (95%) had ordinarily effective levels of CHLOROQUINE in blood.” (Answers shown on the slide: ✓ ✘ ✓)
  • 23. DISAGREEMENT IS SIGNAL. Variety of sources for disagreement.
  • 24. TRIANGLE OF MEANING: model of semantic interpretation. Annotator disagreement is signal, not noise. Annotator disagreement is indicative of variation in human interpretation. Annotator disagreement is indicative of ambiguity, vagueness, similarity, over-generality, & quality. “Three Sides of CrowdTruth”, Human Computation Journal, v1, 2014, L. Aroyo, C. Welty. Workshop on “Subjectivity, Ambiguity and Disagreement (SAD) in Crowdsourcing”, The Web Conference 2019, https://sadworkshop.wordpress.com/
  • 25. CROWDTRUTH: three sides of human interpretation. Disagreement provides guidance in task analysis: items with poor semantics; items with salient terms; items difficult to classify; items that are ambiguous; subjective annotations; time-sensitive annotations; difficult annotation tasks; mis-translated annotations; users with/without specific knowledge; communities of thought; spammers. You can’t remove the corners… “Three Sides of CrowdTruth”, Human Computation Journal, v1, 2014, L. Aroyo, C. Welty
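The idea on slide 25 that disagreement points at both items and annotators can be sketched with two simple scores. These are simplified stand-ins chosen for illustration and are not the metrics defined in the CrowdTruth papers; all item ids, worker ids, and labels are hypothetical.

```python
from collections import Counter, defaultdict

# (item, worker, label) triples; all values are hypothetical.
annotations = [
    ("s1", "w1", "treats"), ("s1", "w2", "treats"), ("s1", "w3", "treats"),
    ("s2", "w1", "treats"), ("s2", "w2", "none"), ("s2", "w3", "none"),
    ("s3", "w1", "none"), ("s3", "w2", "none"), ("s3", "w3", "treats"),
]


def item_clarity(annotations):
    """Per item: share of votes that went to the most frequent label."""
    by_item = defaultdict(list)
    for item, _worker, label in annotations:
        by_item[item].append(label)
    return {
        item: Counter(labels).most_common(1)[0][1] / len(labels)
        for item, labels in by_item.items()
    }


def worker_agreement(annotations):
    """Per worker: mean share of co-annotators who chose the same label."""
    by_item = defaultdict(dict)
    for item, worker, label in annotations:
        by_item[item][worker] = label
    scores = defaultdict(list)
    for labels in by_item.values():
        for worker, label in labels.items():
            others = [l for w, l in labels.items() if w != worker]
            if others:
                scores[worker].append(
                    sum(l == label for l in others) / len(others)
                )
    return {w: sum(s) / len(s) for w, s in scores.items()}


clarity = item_clarity(annotations)
workers = worker_agreement(annotations)
print("item clarity:", clarity)
print("worker agreement:", workers)
```

A low item score is a prompt to inspect the item for ambiguity, and a low worker score is a prompt to inspect the worker's answers; in line with the slide, neither should be read as an error by default.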
  • 26. THE WORLD IS A SMOOTH SPECTRUM OF TRUTH
  • 27. 7 Myths about Human Annotation. One truth: knowledge acquisition typically assumes one correct interpretation for every example. Experts rule: knowledge is captured from domain experts. One is enough: a single expert’s knowledge is sufficient. Disagreement bad: when people disagree, they must not understand the problem. Detailed explanations help: if examples cause disagreement, adding instructions should help. Once done, forever valid: knowledge is not updated; new data is not aligned with old. All examples are created equal: triples are triples, one is not more important than another, they are all either true or false. … and we force the smoothness into a binary form. “Truth is a Lie: 7 Myths about Human Annotation”, AI Magazine 2014, L. Aroyo, C. Welty
  • 28. High Quality Data represents a phenomenon accurately and consistently over time and is replicable, reproducible, and maintainable over time; has empirical and explanatory power; and is collected, stored, and used responsibly. Rigorous Evaluation of AI Systems workshop, 2019, Human Computation (HCOMP), http://eval.how/. Evaluating Evaluation for AI Systems workshop, 2020, Association for the Advancement of Artificial Intelligence (AAAI), http://eval.how/aaai-2020/
  • 29. From Data Quality to Data Excellence. Data Quality is a point-estimate of goodness of data. Data Excellence is the set of practices and tools that result in high quality data.
  • 30. How do we achieve Data Excellence? Maintainability: well documented datasets with owners, which follow best practices for data at any scale. Reproducibility: basic and critical regression tests for datasets which support solid conclusions for decision making. Reliability: datasets which are internally sound and consistent; factors that affect the data are addressed or disclosed. Fidelity: data which faithfully, accurately, and comprehensively represents the captured phenomenon. Validity: datasets which explain aspects of the phenomena that they represent in terms of external measures. Utility: data which adequately and accurately achieves the intended product behavior. 1st International Workshop on Data Excellence: http://eval.how/dew2020/
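The "basic and critical regression tests for datasets" named under Reproducibility on slide 30 could look like the following minimal sketch. The record fields, the allowed label set, and the class-share threshold are all assumptions made for illustration; the talk does not specify concrete checks.

```python
from collections import Counter

# Hypothetical schema for one labeled record.
REQUIRED_FIELDS = ("id", "text", "label")


def check_dataset(rows, allowed_labels, max_class_share=0.9):
    """Return a list of problems found; an empty list means all checks pass."""
    if not rows:
        return ["dataset is empty"]
    problems = []
    # Check 1: every record has a non-empty value for each required field.
    for index, row in enumerate(rows):
        missing = [f for f in REQUIRED_FIELDS if row.get(f) in (None, "")]
        if missing:
            problems.append(f"row {index}: missing {missing}")
    # Check 2: record ids are unique.
    id_counts = Counter(row.get("id") for row in rows)
    duplicates = [i for i, n in id_counts.items() if i is not None and n > 1]
    if duplicates:
        problems.append(f"duplicate ids: {duplicates}")
    # Check 3: labels come from the agreed label set.
    label_counts = Counter(row.get("label") for row in rows)
    unknown = [
        l for l in label_counts
        if l not in (None, "") and l not in allowed_labels
    ]
    if unknown:
        problems.append(f"unknown labels: {unknown}")
    # Check 4: no single label dominates beyond the chosen threshold.
    top_label, top_count = label_counts.most_common(1)[0]
    if top_count / len(rows) > max_class_share:
        problems.append(f"label {top_label!r} exceeds share {max_class_share}")
    return problems


good = [
    {"id": 1, "text": "acoustic guitar on a stand", "label": "guitar"},
    {"id": 2, "text": "violin in a case", "label": "other"},
    {"id": 3, "text": "electric guitar on stage", "label": "guitar"},
    {"id": 4, "text": "piano keys", "label": "other"},
]
bad = [
    {"id": 1, "text": "acoustic guitar on a stand", "label": "guitar"},
    {"id": 1, "text": "", "label": "banjo"},
]

good_problems = check_dataset(good, {"guitar", "other"})
bad_problems = check_dataset(bad, {"guitar", "other"})
print(good_problems)
print(bad_problems)
```

Run on every new dataset version, checks like these play the role that regression tests play for software releases.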
  • 31. Data Lifecycle (Ben Hutchinson, 2020). Much like in software lifecycles, cutting corners at each stage cascades to subsequent versions, which leads to technical debt. Dataset Requirements Analysis: requirements analysis; stakeholder input; privacy, compliance; trust & safety planning. Dataset Design: data acquisition methodology; rater guidelines; construct validation. Dataset Implementation: human labeled data; logging interaction data. Dataset Testing: representation metrics; fairness metrics; reliability metrics; approval process. Dataset Maintenance: updating data over time; extending to other languages; version control; storage and accessibility.
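For the version control item under Dataset Maintenance on slide 31, one lightweight building block is a content fingerprint that changes whenever any record changes. The sketch below is one possible approach, assumed here for illustration and not taken from the talk; the records are hypothetical.

```python
import hashlib
import json


def dataset_fingerprint(rows):
    """Order-independent SHA-256 fingerprint of JSON-serializable records."""
    # Canonical JSON (sorted keys, fixed separators) so that equal records
    # always serialize to the same string; sorting the serialized records
    # makes the fingerprint independent of record order.
    canonical = sorted(
        json.dumps(r, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
        for r in rows
    )
    digest = hashlib.sha256()
    for line in canonical:
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


# Hypothetical dataset versions.
v1 = [
    {"id": 1, "label": "guitar"},
    {"id": 2, "label": "other"},
]
v1_reordered = [
    {"label": "other", "id": 2},
    {"label": "guitar", "id": 1},
]
v2 = [
    {"id": 1, "label": "guitar"},
    {"id": 2, "label": "guitar"},  # one label was corrected
]

print(dataset_fingerprint(v1))
print(dataset_fingerprint(v2))
```

Storing the fingerprint next to each trained model or reported result makes it possible to tell later exactly which dataset version was used.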
  • 32. TAKE HOME MESSAGE. Data lifecycle: just like in software, it is needed to guide data research & development practices. Data is the compass for AI: AI advances where there is data. Data is at the center: the success of AI systems depends on the quality of their data. Data quality must be addressed in AI practices: there is a multitude of notions of truth and a necessity for data quality standards. Data lifecycle is the backbone for data excellence tools and practices to stay ahead of future unintended AI behaviours. Image: https://en.wikipedia.org/wiki/Metamorphosis_II
  • 33. Collaborators EthicalAI Ben Hutchinson Crowd Platform Amol Wankhede Anurag Batra People + AI Research (PAIR) Nithya Sambasivan Kristen Olson Shivani Kapania Jess Holbrook Andrew Zaldivar Mahima Pushkarna Maysam Moussalem Praveen Paritosh Ka Wong Lora Aroyo Devi Krishna Likert team
  • 34. Data Excellence: Better Data for Better AI ODSC 2020 Lora Aroyo http://lora-aroyo.org @laroyo By Scanned from The Magic of M. C. Escher. (Harry N. Abrams, Inc. ISBN 0-8109-6720-0) by Justin Foote (talk)., Fair use, https://en.wikipedia.org/w/index.php?curid=3955850
  • 35. High profile data failures: not bugs in the software, not mistakes of humans, but problems caused by the quality of the data. Just like software quality in the 90’s, the same has to happen with data. Examples of questionable data: CrowdTruth relation extraction; how would you annotate it? How do we know and measure the quality of the data? How well does it represent the actual task we are trying to solve? Like software, we need to establish data quality standards.