Logically at the Factify 2022:
Multimodal Fact Verification
Jie Gao, Hella-Franziska Hoffmann, Stylianos Oikonomou,
David Kiskovski, Anil Bandhakavi
Feb 28, 2022
Table of contents
1 Introduction
2 Solution: Ensemble Approach
3 Experiments
4 Conclusion
Factify Challenge: Introduction - Task
Claim
Text: China’s famed wandering elephants are on the move again, heading southwest while a male who broke from the herd is still keeping his distance. https://t.co/o5j7PDDveJ
Document
Text: By Julia Hollingsworth and Zixu Wang, CNN. Updated 1:03 AM ET, Fri June 11, 2021. (CNN) At least a dozen buzzing drones monitor them around the clock. Wherever they go, they're escorted by police. And when they eat or sleep, they're watched by millions online. CNN's Jessie Yeung contributed to this report.
Label: SUPPORT_MULTIMODAL
SUPPORT: doc text supports claim text
_MULTIMODAL: images similar / about the same situation
Data challenge as part of De-Factify at AAAI ’22
Train pairs: 35000
Validation pairs: 7500
Test pairs: 7500
4 weeks to train/eval, 1 week to apply to test
Factify Challenge: Introduction - Usage
● Entailment prediction is a technique for claim verification, i.e., predicting whether the evidence supports or refutes the claim.
● Typically, given a tweet with a text message and an image, and a potential evidence article, can we automatically predict the veracity?
Overview
[Diagram: fact-checking pipeline with the stages Claim Detection, Claim Verification, Worthiness, Prioritising, Claim Matching, Evidence Retrieval, Veracity Prediction, and Produce Justification; Factify (Multimodal Entailment) is marked within the pipeline.]
Solution: Ensemble Model
● Train two unimodal models:
○ 3-way Textual Entailment:
“What is the relationship between
document and claim?”
support / refute / neutral
○ Image Relatedness:
“Is the doc. image contextually
related to the claim text + image?”
Y / N
● Combine the two unimodal models
with data-specific features into a
multimodal 5-way classifier.
Approach
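To make the relationship between the two unimodal outputs and the five Factify labels concrete, here is a minimal sketch of a hand-written composition rule. It is an illustration only: the deck combines the models with a learned classifier, not with this rule, and the function name `compose_label` and the lowercase label strings are assumptions made for the example.

```python
# Illustrative only: a fixed rule that composes a 3-way text label and a
# yes/no image-relatedness signal into one of the five Factify labels.
# The ensemble described in the slides is learned, not rule-based.

def compose_label(text_label: str, images_related: bool) -> str:
    """Combine a 3-way text entailment label with an image-relatedness flag."""
    if text_label == "refute":
        # Refute has no text/multimodal split, so the image flag is ignored.
        return "Refute"
    if text_label == "support":
        return "Support_Multimodal" if images_related else "Support_Text"
    if text_label == "neutral":
        return "Insufficient_Multimodal" if images_related else "Insufficient_Text"
    raise ValueError(f"unknown text label: {text_label!r}")
```

A learned combiner can deviate from this rule wherever the data does, which is one reason to feed it additional data-specific features.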
Experiments: 5-way Multimodal Entailment
● Ensemble Model:
sklearn's DecisionTreeClassifier with the ‘best’ splitter, ‘gini’ impurity as the split criterion, and a maximum tree depth of 8.
● Feature Creation:
○ Text Entailment: pre-trained BigBird model fine-tuned on the Factify data set
○ Image similarity: pre-trained ResNet-50 embeddings compared by cosine similarity
○ Image domains: sklearn one-hot encoders
Experiment setup
[Tables: Validation, Test]
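The setup above can be sketched roughly as follows. This is a sketch under stated assumptions, not the authors' code: all inputs are random stand-ins (the 16-dimensional vectors replace real ResNet-50 embeddings, the Dirichlet samples replace BigBird entailment probabilities, and the two domains are placeholders), and the way features are concatenated is a guess. Only the classifier settings (gini criterion, best splitter, depth limit of 8) come from the slide.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 200

# Hypothetical stand-ins for the real features (all random):
text_probs = rng.dirichlet(np.ones(3), size=n)   # 3-way text entailment probabilities
claim_emb = rng.normal(size=(n, 16))             # would be claim image embeddings
doc_emb = rng.normal(size=(n, 16))               # would be document image embeddings
domains = rng.choice(["example.com", "example.org"], size=n).reshape(-1, 1)
labels = rng.integers(0, 5, size=n)              # random 5-way labels

def cosine_similarity_rows(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Row-wise cosine similarity between two equally shaped matrices."""
    num = np.sum(a * b, axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return num / den

img_sim = cosine_similarity_rows(claim_emb, doc_emb).reshape(-1, 1)
domain_onehot = OneHotEncoder(handle_unknown="ignore").fit_transform(domains).toarray()

# One feature row per claim/document pair.
X = np.hstack([text_probs, img_sim, domain_onehot])

# Classifier settings as stated on the slide.
clf = DecisionTreeClassifier(
    criterion="gini", splitter="best", max_depth=8, random_state=0
)
clf.fit(X, labels)
predictions = clf.predict(X)
```

Because the labels here are random, the fitted tree carries no meaning; the point is only the shape of the feature matrix and the classifier configuration.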
[Table: Leaderboard]
Experiments: 3-way Textual Entailment
● As part of the design, we chose to train a separate model to address the textual entailment part of the multimodal task:
“Given a claim and an evidence document, determine if the text evidence supports, refutes, or is neutral towards the claim.”
● Best model setup:
○ pre-trained Hugging Face BigBird
○ fine-tuned for pairwise classification of claim / doc text pairs over 2 epochs with the AdamW optimizer, learning rate 2e-5, epsilon 1e-8, batch size 4, and a maximum sequence length of 1396 tokens.
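As a rough sanity check on the training budget, the hyperparameters above can be collected in a small config object and turned into a step count. The class and function names are hypothetical, and the calculation assumes that all 35000 training pairs are used and that there is no gradient accumulation, neither of which the slides state.

```python
from dataclasses import dataclass
import math

@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters as listed on the slide (field names are illustrative)."""
    epochs: int = 2
    learning_rate: float = 2e-5
    adam_epsilon: float = 1e-8
    batch_size: int = 4
    max_tokens: int = 1396

def optimizer_steps(num_pairs: int, cfg: FineTuneConfig) -> int:
    """Number of optimizer updates, assuming one update per batch."""
    steps_per_epoch = math.ceil(num_pairs / cfg.batch_size)
    return steps_per_epoch * cfg.epochs

cfg = FineTuneConfig()
# Assumption: the full 35000-pair training split is used for fine-tuning.
total_steps = optimizer_steps(35000, cfg)  # 8750 steps per epoch, 17500 in total
```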
Experiment setup
[Table: Validation Scores]
Label Mapping
Factify Label | Text Entailment Label
Support_Multimodal | Support
Support_Text | Support
Insufficient_Multimodal | Insufficient_Evidence
Insufficient_Text | Insufficient_Evidence
Refute | Refute
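The label mapping can be written down directly as a lookup table, which is one way the 5-way Factify labels might be collapsed into 3-way targets for the text entailment model. The names `FACTIFY_TO_TEXT_LABEL` and `to_text_labels` are hypothetical; the label strings are the ones shown in the mapping.

```python
# Mapping from the five Factify labels to the three text entailment labels.
FACTIFY_TO_TEXT_LABEL = {
    "Support_Multimodal": "Support",
    "Support_Text": "Support",
    "Insufficient_Multimodal": "Insufficient_Evidence",
    "Insufficient_Text": "Insufficient_Evidence",
    "Refute": "Refute",
}

def to_text_labels(factify_labels: list[str]) -> list[str]:
    """Collapse 5-way Factify labels into 3-way text entailment labels."""
    return [FACTIFY_TO_TEXT_LABEL[label] for label in factify_labels]

example = to_text_labels(["Support_Text", "Insufficient_Multimodal", "Refute"])
```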
Data Bias
[Figures: Text Length Distribution by Label (Train); OCR Text Length Distribution by Label (Train); Text Word Overlap by Label (Train/Val)]
Many of our model choices were inspired by inherent biases observed in the data. Generating large annotated gold data sets that appropriately represent the real-world fact-checking domain remains an ongoing challenge.
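A word-overlap statistic of the kind plotted by label could be computed along the following lines. The slides do not define the exact metric, so this is an assumption: the fraction of distinct claim words that also occur in the document, using a simple lowercase regex tokenizer. The helper names are hypothetical.

```python
import re

def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of distinct word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def word_overlap(claim: str, doc: str) -> float:
    """Fraction of distinct claim words that also appear in the document."""
    claim_words = tokenize(claim)
    if not claim_words:
        return 0.0  # avoid division by zero for empty claims
    return len(claim_words & tokenize(doc)) / len(claim_words)

# Two of the three claim words ("elephants", "move") occur in the document.
score = word_overlap("the elephants move", "Elephants move again")
```

Grouping such scores by label is enough to reveal whether a trivial lexical feature already separates some classes, which is the kind of bias the slide points at.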
[Figures: Image Similarity by Label (Val); Claim Image Source Distribution by Label (Train)]
Incorrect and Ambiguous Labels
Insufficient_Multimodal
claim: Special counsel Robert Mueller did not have sufficient evidence to prosecute obstruction, but does not exonerate President Trump. https://t.co/nfbBsVjDBG https://t.co/83P7RDQadK
doc: Attorney General William Barr will now review the report. Robert Mueller ends Russia investigation without more indictments: Source Special counsel Robert Mueller's much-anticipated report -- the product of nearly two years of investigation -- [..]
Support_Text
claim: President Trump and first lady Melania Trump paid their respects to Supreme Court Justice Ruth Bader Ginsburg as a crowd booed and chanted "Vote him out." https://t.co/M7m7kEIBg7 https://t.co/tWYfyKIdIF
doc: In an unprecedented move, her casket has been placed outside on the court steps. Remembering Supreme Court Justice Ruth Bader Ginsburg Three days of public mourning for Justice Ruth Bader Ginsburg, a champion of equality and pioneer of women's rights, began Wednesday when her casket arrived at the Supreme Court [..]
Conclusion and Discussion
Learnings:
● DecisionTree classifier was the best-performing model
● 3-way text entailment is a separate task with its own value
● DNN-based multimodal model suffers from overfitting (refer to paper for details)
● Clear data bias and ambiguous labels (e.g. “support_multimodal” vs “support_text”)
Recommendations:
● Improve data creation process to reduce bias
● More practical labels and annotation scheme for real-world applications/challenges
● Further experimentation with multimodal architectures
Thank You!
www.logically.ai
