1
WIS
Web
Information
Systems
Helping users discover perspectives: Enhancing opinion mining with joint topic models
Tim Draws, Jody Liu, Nava Tintarev
TU Delft, The Netherlands
t.a.draws@tudelft.nl
https://timdraws.net
2
Discovering perspectives
3
Discovering perspectives
[Figure: an unstructured set of textual opinions is turned ("?") into a structured set of perspectives, Perspective 1 to Perspective 10, labeled Supporting and Opposing]
4
Topic models
• Topic model = unsupervised model to discover
hidden structures (i.e., topics) in corpora of text
– Example: Latent Dirichlet Allocation (LDA) [1]
– Topics are probability distributions over words
– If applied to a corpus of documents related to a debate,
topics could be interpreted as perspectives
• Joint topic model = adding additional components
(e.g., sentiment analysis) to a classical topic
model (e.g., LDA)
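The idea that topics are probability distributions over words can be illustrated with a small sketch. The code below is a minimal collapsed Gibbs sampler for LDA written for illustration only; it is not the implementation used in the study. The toy corpus, the hyperparameters `alpha` and `beta`, and the helper names `lda_gibbs` and `top_words` are all assumptions.

```python
import random


def lda_gibbs(docs, n_topics, n_iter=100, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return one word distribution per topic."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    # Count tables: topic counts per document, word counts per topic, totals per topic.
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                # Remove the current token from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    # Each topic is a smoothed probability distribution over the vocabulary.
    topics = []
    for k in range(n_topics):
        denom = topic_total[k] + n_words * beta
        topics.append({w: (topic_word[k][word_id[w]] + beta) / denom for w in vocab})
    return topics


def top_words(topic, n=3):
    """Return the n most probable words of a topic."""
    return sorted(topic, key=topic.get, reverse=True)[:n]


# Hypothetical toy corpus of tokenized opinions (not the study data).
toy_docs = [
    ["choice", "women", "body", "control", "choice"],
    ["women", "body", "choice", "control"],
    ["life", "human", "being", "life"],
    ["human", "life", "being", "god"],
]
topics = lda_gibbs(toy_docs, n_topics=2)
for k, topic in enumerate(topics):
    print(k, top_words(topic))
```

In a debate corpus, the top words of each topic are what a reader would inspect and try to interpret as a perspective.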
5
Our paper
RQ1. Can joint topic models support users in discovering
perspectives in a corpus of opinionated documents?
RQ2. Do users interpret the output of joint topic models
in line with their personal pre-existing stance?
Contributions:
1. Perspective-annotated data set
2. User study
6
Data
Document | Stance | Perspective
You cannot be a Christian and support abortion… | Against | Abortion is the killing of a human being, which defies the word of God.
No one in the world has any right to judge over what someone else does with their body, … | For | Reproductive choice empowers women by giving them control over their own bodies.
Why put a child through the pain of an unloving mother… | For | A baby should not come into the world unwanted.
… | … | …
Final data set: 600 documents; 6 perspectives
7
Experimental setup
• Ran each model on the final
data set (i.e., for 6 topics)
• Between-subjects study: each
participant sees output of one
of the models
• Participants need to identify
the correct 6 perspectives from
the model output
8
Procedure
Step 1: Participants state:
• Age
• Gender
• Personal stance towards abortion
• Familiarity with the abortion debate
Step 2: Participants identify perspectives from the model output
Step 3: Participants state:
• Perceived usefulness
• Perceived awareness increase
• Confidence in task performance
9
Results: descriptive
• 158 participants (recruited from Prolific)
– After excluding 12 participants due to failing both honeypot topics
– 150 required according to power analysis
• 50.6% female, 49.4% male
• 33.3 years old on average (range 18 to 64)
• Most (57.8%) at least somewhat familiar with the topic
• Sample skewed towards the supporting viewpoint
10
Results: hypothesis tests
H1: Users find more correct perspectives when being exposed
to the output of a joint topic model compared to the output of a
regular topic model or baseline.
– We find a difference between models (p < 0.001, η² = 0.126)
– TAM is the only one that performs significantly better than the baseline
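For readers unfamiliar with the effect size reported here, the sketch below shows how η² is commonly computed, assuming the usual definition as the ratio of the between-group sum of squares to the total sum of squares. The function name `eta_squared` and the scores are hypothetical; they are not the study data and do not reproduce the reported value.

```python
def eta_squared(groups):
    """Eta squared: between-group sum of squares divided by total sum of squares."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
    )
    return ss_between / ss_total


# Hypothetical numbers of correctly identified perspectives in three conditions.
example_groups = [[3, 4, 5], [4, 5, 6], [5, 6, 7]]
print(eta_squared(example_groups))
```

Under this definition, η² = 0.126 would mean that about 12.6% of the variance in the outcome is associated with which model's output a participant saw.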
[Figure: Mean nCor per model; x-axis: Model (TF-IDF, LDA, JST, VODUM, TAM, LAM); y-axis: Mean nCor]
11
Results: hypothesis tests
H2: Users are more likely to identify sets of keywords as
perspectives that are in line with their personal stance compared
to perspectives that they do not agree with.
– No evidence for such a relationship (ρ = 0.122, p = 0.163)
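The statistic ρ is a rank correlation. As a minimal sketch, assuming the standard Spearman definition (the Pearson correlation of the ranks, with tied values given their average rank), it can be computed as follows. Which two variables were correlated in the study is not stated on the slide, so the inputs below are purely hypothetical.

```python
def average_ranks(values):
    """Assign 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical example: stance score vs. a count of stance-consistent selections.
stance = [1, 2, 3, 4, 5]
selections = [2, 1, 4, 3, 5]
print(spearman_rho(stance, selections))
```

A value near 0, as reported above, indicates little monotonic association between the two variables.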
13
Discussion and future work
• Why did TAM perform better?
– It extracted more keywords that appeared explicitly in the perspective expressions, such as:
  Abortion is the killing of a human being, which defies the word of God.
  Reproductive choice empowers women by giving them control over their own bodies.
  A baby should not come into the world unwanted.
• Future work: different domains, novel topic models
14
Take home
• Joint topic models such as TAM can perform
perspective discovery
• No evidence for tendency of users to interpret
output in line with their personal stance
• Implications for several areas: journalism,
policy-making, generating explanations
(All supplementary materials are openly available at
https://osf.io/uns63/.)
15
References
[1] D. Blei, A. Ng, and M. Jordan, “Latent Dirichlet allocation,” Journal of Machine Learning Research, vol. 3, pp. 993–1022, 05 2003.
[2] M. Paul and R. Girju, “A two-dimensional topic-aspect model for discovering multi-faceted topics,” in AAAI, vol. 1, 01 2010. [Online]. Available: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.226.3550&rep=rep1&type=pdf
[3] C. Lin and Y. He, “Joint sentiment/topic model for sentiment analysis,” in Proceedings of the 18th ACM Conference on Information and Knowledge Management, ser. CIKM ’09. New York, NY, USA: Association for Computing Machinery, 2009, pp. 375–384. [Online]. Available: https://doi.org/10.1145/1645953.1646003
[4] T. Thonet, G. Cabanac, M. Boughanem, and K. Pinel-Sauvagnat, “VODUM: A topic model unifying viewpoint, topic and opinion discovery,” in ECIR, vol. 9626. Toulouse, France: Springer, 03 2016, pp. 533–545.
[5] D. Vilares and Y. He, “Detecting perspectives in political debates,” in EMNLP. Association for Computational Linguistics, 01 2017, pp. 1573–1582.

ESR_factors_affect-clinic significance-Pathysiology.pptx
 
What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.
 

Helping Users Discover Perspectives: Enhancing Opinion Mining with Joint Topic Models

  • 1. WIS (Web Information Systems). Helping users discover perspectives: Enhancing opinion mining with joint topic models. Tim Draws, Jody Liu, Nava Tintarev. TU Delft, The Netherlands. t.a.draws@tudelft.nl. https://timdraws.net
  • 3. Discovering perspectives [Diagram: an unstructured set of textual opinions is turned (?) into a structured set of perspectives, Perspective 1 to Perspective 10, labeled from Supporting to Opposing]
  • 4. Topic models • Topic model = unsupervised model to discover hidden structures (i.e., topics) in corpora of text – Example: Latent Dirichlet Allocation (LDA) [1] – Topics are probability distributions over words – If applied to a corpus of documents related to a debate, topics could be interpreted as perspectives • Joint topic model = adding additional components (e.g., sentiment analysis) to a classical topic model (e.g., LDA)
  • 5. Our paper RQ1. Can joint topic models support users in discovering perspectives in a corpus of opinionated documents? RQ2. Do users interpret the output of joint topic models in line with their personal pre-existing stance? Contributions: 1. Perspective-annotated data set 2. User study
  • 6. Data
      Document | Stance | Perspective
      You cannot be a Christian and support abortion… | Against | Abortion is the killing of a human being, which defies the word of God.
      No one in the world has any right to judge over what someone else does with their body, … | For | Reproductive choice empowers women by giving them control over their own bodies.
      Why put a child through the pain of an unloving mother… | For | A baby should not come into the world unwanted.
      … | … | …
      Final data set: 600 documents; 6 perspectives
  • 7. Experimental setup • Ran each model on the final data set (i.e., for 6 topics) • Between-subjects study: each participant sees output of one of the models • Participants need to identify the correct 6 perspectives from the model output
  • 8. Procedure Step 1: Participants state: • Age • Gender • Personal stance towards abortion • Familiarity with the abortion debate Step 2: (task screen; not captured in this transcript) Step 3: Participants state: • Perceived usefulness • Perceived awareness increase • Confidence in task performance
  • 9. Results: descriptive • 158 participants (recruited from Prolific) – After excluding 12 participants due to failing both honeypot topics – 150 required according to power analysis • 50.6% female, 49.4% male • 33.3 years old on average (range 18 to 64) • Most (57.8%) at least somewhat familiar with the topic • Sample skewed towards the supporting viewpoint
  • 10. Results: hypothesis tests H1: Users find more correct perspectives when being exposed to the output of a joint topic model compared to the output of a regular topic model or baseline. – We find a difference between models (p < 0.001, η² = 0.126) – TAM is the only one that performs significantly better than the baseline [Figure: mean number of correct perspectives (MeannCor) per model: TF-IDF, LDA, JST, VODUM, TAM, LAM]
  • 11. Results: hypothesis tests H2: Users are more likely to identify sets of keywords as perspectives that are in line with their personal stance compared to perspectives that they do not agree with. – No evidence for such a relationship (ρ = 0.122, p = 0.163)
  • 12. Discussion and future work • Why did TAM perform better? – It extracted more keywords that appeared explicitly in the perspective expression: Abortion is the killing of a human being, which defies the word of God. / Reproductive choice empowers women by giving them control over their own bodies. / A baby should not come into the world unwanted. • Future work: different domains, novel topic models
  • 13. Take home • Joint topic models such as TAM can perform perspective discovery • No evidence for tendency of users to interpret output in line with their personal stance • Implications for several areas: journalism, policy-making, generating explanations (All supplementary materials are openly available at https://osf.io/uns63/.)
  • 14. References
      [1] D. Blei, A. Ng, and M. Jordan, “Latent dirichlet allocation,” Journal of Machine Learning Research, vol. 3, pp. 993–1022, 05 2003.
      [2] M. Paul and R. Girju, “A two-dimensional topic-aspect model for discovering multi-faceted topics,” in AAAI, vol. 1, 01 2010. [Online]. Available: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.226.3550&rep=rep1&type=pdf
      [3] C. Lin and Y. He, “Joint sentiment/topic model for sentiment analysis,” in Proceedings of the 18th ACM Conference on Information and Knowledge Management, ser. CIKM ’09. New York, NY, USA: Association for Computing Machinery, 2009, pp. 375–384. [Online]. Available: https://doi.org/10.1145/1645953.1646003
      [4] T. Thonet, G. Cabanac, M. Boughanem, and K. Pinel-Sauvagnat, “Vodum: A topic model unifying viewpoint, topic and opinion discovery,” in ECIR, vol. 9626. Toulouse, France: Springer, 03 2016, pp. 533–545.
      [5] D. Vilares and Y. He, “Detecting perspectives in political debates,” in EMNLP. Association for Computational Linguistics, 01 2017, pp. 1573–1582.
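
The H1 result on slide 10 reports an effect size (η² = 0.126) without showing how such a value is obtained. The following is a minimal sketch of a one-way ANOVA effect size, assuming one score per participant (number of correct perspectives) grouped by model. The scores below are made up for illustration and are not the study's data; the function name `one_way_anova` is likewise hypothetical.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (eta_squared, f_statistic) for a dict of group -> list of scores.

    eta squared = SS_between / SS_total
    F           = (SS_between / (k - 1)) / (SS_within / (N - k))
    """
    all_scores = [x for scores in groups.values() for x in scores]
    grand_mean = mean(all_scores)
    n_total = len(all_scores)
    k = len(groups)

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(scores) * (mean(scores) - grand_mean) ** 2
        for scores in groups.values()
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (x - mean(scores)) ** 2
        for scores in groups.values()
        for x in scores
    )
    ss_total = ss_between + ss_within

    eta_squared = ss_between / ss_total
    f_statistic = (ss_between / (k - 1)) / (ss_within / (n_total - k))
    return eta_squared, f_statistic


# Hypothetical scores, NOT the study's data.
example_scores = {
    "TF-IDF": [3, 4, 3, 4],
    "LDA": [4, 4, 3, 5],
    "TAM": [5, 5, 4, 6],
}

eta_sq, f_stat = one_way_anova(example_scores)
print(f"eta squared = {eta_sq:.3f}, F = {f_stat:.2f}")
```

An η² value is read as the share of total score variance that is attributable to the model a participant saw; the p-value and post hoc comparisons mentioned on the slide would need an additional step (for example a statistics package) that this sketch does not cover.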

Editor's Notes

  1. Introduce myself; second-year PhD.
  2. Imagine you are a journalist writing an article about the abortion debate. Abortion is a commonly debated topic with many people on both sides, and many perspectives. Explain stance-perspective difference. Naturally these debates are carried out online in news, social media, and fora. For you as a journalist it would be great to have an automatic way to distil these perspectives.
  3. Formalize. What existing techniques could be used here? Sentiment analysis and stance detection no good because supervised. Perspectives are unstructured and different for every topic → unsupervised.
  4. In sum, two research questions. To answer them, we: created a data set (openly available); conducted a user study showing that some joint topic models can perform perspective discovery.
  5. Needed data set of opinionated documents with perspective annotation. Documents: around 3000 debate forum posts on abortion. Human annotator noted stance and perspective. Perspectives taken from ProCon list of 31. Then balanced data set of 600 documents.
  6. First describe joint topic models, then baselines. Ran all these models on the final corpus and then conducted user study with their output.
  7. Between-subjects design, randomly assigned each participant to one model. Topic model output on the left (6 topics + two honeypots). Select one of 16 different perspectives for each topic. In Step 3 we measured experience with the task.
  8. Interesting that they were skewed, as we performed Prolific pre-screening.
  9. Describe again why and what we did in this hypothesis test; we used ANOVA. Post hoc tests: TAM is the only one that is better than the TF-IDF baseline model.
  10. Confirmation bias (ambiguous model output). Spearman correlation: no evidence.
  11. Normalized distribution over perspectives (x-axis). P1-P6 in the corpus, rest not. Plot shows how often each perspective was selected. Some perspectives were well represented in all models, like P5 (or people are familiar with them). TAM was good with perspectives that other models struggled with, such as P1 and P6. More exploratory results in the paper.
  12. Other topics: more sentiment-related words. Future work: different domains, novel topic models.
  13. Supplementary material is available on our repository. Generating explanations to help people overcome biases.
  14. Not in that order (see paper)
  15. Not in that order (see paper)
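
The Spearman correlation mentioned in note 10 (reported for H2 as ρ = 0.122) is a Pearson correlation computed on ranks. Below is a minimal sketch using only the standard library, with tied values given their average rank. The function names and the example numbers are hypothetical, and the slides do not state exactly which two variables were correlated, so the variable names here are placeholders.

```python
from statistics import mean


def average_ranks(values):
    """Assign 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed inputs."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    covariance = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    spread_x = sum((a - mx) ** 2 for a in rx)
    spread_y = sum((b - my) ** 2 for b in ry)
    return covariance / (spread_x * spread_y) ** 0.5


# Placeholder data, NOT the study's data: a stance score per participant and
# some measure of how stance-consistent their selections were.
stance = [-3, -1, 0, 2, 3, 1]
consistency = [0.2, 0.5, 0.4, 0.6, 0.3, 0.7]
print(f"rho = {spearman_rho(stance, consistency):.3f}")
```

Because it works on ranks, the coefficient only captures monotonic association; a value near zero, as reported on slide 11, gives no evidence of such a relationship.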