The document discusses structuring evidence from online discussions to synthesize knowledge. It describes annotating a corpus of Wikipedia deletion discussions to identify key decision factors for determining what content should be included. These factors were then used to build a computer system that semantically enriches the discussion data and generates a summary organized by decision factor. A user test found the experimental system was preferred as it made the discussions and decisions easier to understand. The process demonstrates how identifying a community's evidence criteria can help structure information and support knowledge synthesis.
How communities curate knowledge & how ontologists can help - EURECOM, 2015-01-19 - jodischneider
Invited talk given 2015-01-19 at EURECOM.
Two themes:
How do communities curate knowledge?
and
How can information technology help?
Q: How do communities curate knowledge?
A: Communities curate knowledge by discussing evidence and applying community standards to it.
In Wikipedia, 4 questions are used to evaluate borderline articles:
Notability – Is the topic appropriate for our encyclopedia?
Sources – Is the article well-sourced?
Maintenance – Can we maintain this article?
Bias – Is the article neutral? Is each point of view appropriately weighted?
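As a rough illustration of organizing evidence by these four decision factors, discussion comments could be bucketed under each question with keyword cues. The cue lists and function names below are invented for illustration; the actual system relied on manual annotation and semantic enrichment, not keyword matching.

```python
# Hypothetical sketch: bucket deletion-discussion comments under the four
# decision factors using keyword cues. The cue lists are invented for
# illustration; the actual system used manual annotation, not keywords.
FACTOR_CUES = {
    "Notability": ["notable", "notability", "significan"],
    "Sources": ["source", "citation", "reference"],
    "Maintenance": ["maintain", "cleanup", "update"],
    "Bias": ["neutral", "pov", "bias"],
}

def tag_factors(comment):
    """Return the decision factors whose cue words appear in the comment."""
    text = comment.lower()
    return [factor for factor, cues in FACTOR_CUES.items()
            if any(cue in text for cue in cues)]

def summarize(comments):
    """Group comments under each decision factor they mention."""
    summary = {factor: [] for factor in FACTOR_CUES}
    for comment in comments:
        for factor in tag_factors(comment):
            summary[factor].append(comment)
    return summary
```

A comment can invoke more than one factor ("no reliable sources and not notable"), so the grouping deliberately allows a comment to appear under several headings.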
Q: How can information technology help?
A: Information technology can organize evidence based on the criteria communities use.
In Wikipedia, we developed an alternate interface for deletion discussions.
Envisioning argumentation and decision making support for debates in open online collaboration communities - jodischneider
Paper for the First Workshop on Argumentation Mining at the 52nd Annual Meeting of the Association for Computational Linguistics, Baltimore, Maryland, June 26, 2014.
Abstract:
Argumentation mining, a relatively new area of discourse analysis, involves automatically identifying and structuring arguments. Following a basic introduction to argumentation, we describe a new possible domain for argumentation mining: debates in open online collaboration communities. Based on our experience with manual annotation of arguments in debates, we envision argumentation mining as the basis for three kinds of support tools, for authoring more persuasive arguments, finding weaknesses in others’ arguments, and summarizing a debate’s overall conclusions.
Full paper:
http://jodischneider.com/pubs/aclargmining2014.pdf
Proceedings with links:
http://acl2014.org/acl2014/W14-21/index.html
Workshop homepage:
http://www.uncg.edu/cmp/ArgMining2014/
An informatics perspective on argumentation mining - SICSA, 2014-07-09 - jodischneider
Informal talk for the SICSA argumentation mining workshop: http://www.arg-tech.org/index.php/sicsa-workshop-on-argument-mining-2014/
For more details, see two related papers:
(1) Automated argumentation mining to the rescue? Envisioning argumentation and decision-making support for debates in open online collaboration communities.
ACL First Workshop on Argumentation Mining (summary of my PhD work)
http://jodischneider.com/pubs/aclargmining2014.pdf
(2) Modeling Arguments in Scientific Papers
Jodi Schneider, Carol Collins, Lisa Hines, John R Horn and Richard Boyce
ArgDiaP conference
http://jodischneider.com/pubs/argdiap2014.pdf
Talking is (virtual) work: supporting online argumentation - 2013-09-18, Malta ... - jodischneider
In open collaboration systems, work gets done through talking. We support a particular kind of talk-based work, deletion discussions in Wikipedia, by categorizing and summarizing discussions. In a user test, 84% of participants found this beneficial.
This talk about my thesis was given 2013-09-18 in Malta at the Virtual Work training school:
http://dynamicsofvirtualwork.com/malta-training-school/
part of the COST action on Virtual Work
http://cost.eu/domains_actions/isch/Actions/IS1202
Slides for a workshop session on "Open Knowledge: Wikipedia and Beyond" facilitated by Brian Kelly and Simon Grant, Cetis at the Cetis 2014 conference at the University of Bolton on 17-18 June 2014.
See http://ukwebfocus.wordpress.com/events/cetis-2014-open-knowledge-wikipedia-and-beyond/
Wikipedia, the encyclopedia that anyone can edit, “can never work in theory, only in practice.” Accounting for one in every 200 page views on the Internet, it has become a part of our everyday lives. Wikipedia is changing the way we think about the economics of the web, the potential and the pitfalls of engaging the masses, and the role of professional information architects in a world in which content arrives from literally every direction.
In this session, we’ll explore the nuts-and-bolts of how the Wikipedia project works. Who writes Wikipedia, and why? How does the English Wikipedia maintain quality, consistent tagging, and coherent organization across over two million articles? What happens when contributors disagree? We will take a tour behind the scenes at Wikipedia to learn what happens when users are encouraged to - as they say on Wikipedia… “be bold.”
Musings at the Crossroads of Digital Libraries, Information Retrieval, and Scientometrics - Guillaume Cabanac
Digital documents support and shape people’s daily activities. Regarding Computer Science, such documents are the cornerstone of two areas of research: Digital Libraries and Information Retrieval. In this presentation, we discuss the research questions that we addressed in these areas, such as:
* Digital Libraries:
- How to transpose paper-based annotations into digital documents?
- How to measure the social validity of a statement according to the argumentative discussion it sparked off?
- How to harness a quiescent capital present in any organization: its documents?
* Information Retrieval
- Is document tie-breaking affecting the evaluation of Information Retrieval systems?
- How to retrieve documents matching keywords and spatiotemporal constraints?
- Do operators in search queries (e.g., ‘+’, ‘^’) improve the effectiveness of search results?
Each question gives us the opportunity to recall background knowledge, such as how to evaluate the effectiveness of a search engine.
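On that piece of background knowledge: search effectiveness is commonly summarized with set-based measures such as precision and recall at a cutoff k. A minimal sketch (function names are ours, not from the talk):

```python
# Background-knowledge sketch: two standard effectiveness measures for a
# ranked result list. `ranked` is the system's ordering of document ids;
# `relevant` is the set of documents judged relevant for the query.
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k retrieved documents that are relevant."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / k

def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant documents retrieved within the top k."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / len(relevant)
```

Precision rewards returning only relevant documents near the top; recall rewards finding all of them, so the two are usually reported together.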
Finally, we discuss some of our works related to Scientometrics, which may be defined as the study of science with scientific methods. We applied techniques of Information Retrieval to documents extracted from scientific Digital Libraries. We plan to introduce our findings to the following questions:
- How to recommend researchers according to their research topics and social clues?
- What is the landscape of research in Information Systems from the perspective of gatekeepers?
- What if submission date influenced the acceptance of conference papers?
Through this journey at the crossroads of Digital Libraries, Information Retrieval, and Scientometrics, we wish to pass on our enthusiasm for these subjects to academics and students alike.
Presented this deck numerous times at Wikimania, universities, and First Monday; this final version is for Barcamp NorthEast at Newcastle, May 2008. Questions or comments: http://cathyma.com
Where are all the Semantic Web agents? There are billions of "machine readable" open facts on the Semantic Web, i.e. Linked Open Data (LOD), isn't that enough? It looks like it's not. We're still far from seeing Lucy's and Pete's agents brilliantly solving their tasks with the help of other Semantic Web agents they can trust (Tim Berners Lee et al., The Semantic Web, Scientific American (2001) ). Despite its technological impact on many applications and areas, the Semantic Web promised to cause a breakthrough that we didn't yet experience. One issue is that LOD ontologies are not as linked as they should be. Another issue is that formalising only semi-structured Web pages or databases is not enough for making them able to operate. They also need to reason with commonsense knowledge, the encoding of which is a long-standing challenge in Artificial Intelligence. A third consideration is that most existing commonsense knowledge bases lack formal semantics and situational constraints. In this talk I will advocate the role of the Semantic Web as a provider of a knowledge graph of commonsense to Artificial Intelligence, and discuss ways and obstacles towards the achievement of this goal.
User Interests Identification From Twitter using Hierarchical Knowledge Base - Pavan Kapanipathi
Twitter, due to its massive growth as a social networking platform, has been in focus for the analysis of its user-generated content for personalization and recommendation tasks. A common challenge across these tasks is identifying user interests from tweets. Semantic enrichment of Twitter posts, to determine user interests, has been an active area of research in the recent past. These approaches typically use available public knowledge bases (such as Wikipedia) to spot entities and create entity-based user profiles. However, exploitation of such knowledge bases to create richer user profiles is yet to be explored. In this work, we leverage hierarchical relationships present in knowledge bases to infer user interests expressed as a Hierarchical Interest Graph. We argue that the hierarchical semantics of concepts can enhance existing systems to personalize or recommend items based on a varied level of conceptual abstractness. We demonstrate the effectiveness of our approach through a user study which shows an average of approximately eight of the top ten weighted hierarchical interests in the graph being relevant to a user's interests.
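One way such hierarchical inference could work, sketched under our own assumptions rather than as the paper's exact algorithm: entity-level interest scores are propagated up an (assumed acyclic) category hierarchy with a decay factor, so broader concepts accumulate weight from the specific entities beneath them.

```python
# Illustrative sketch, not the paper's exact algorithm: propagate entity-level
# interest scores up a category hierarchy (assumed acyclic) with a decay
# factor, so broader concepts accumulate weight from specific entities.
def propagate_interests(entity_scores, parents, decay=0.5):
    """entity_scores: {node: weight}. parents: {node: [parent, ...]}.
    Returns a weight for every node reached while walking up the hierarchy."""
    scores = dict(entity_scores)
    frontier = list(entity_scores.items())
    while frontier:
        node, weight = frontier.pop()
        for parent in parents.get(node, []):
            contribution = weight * decay
            scores[parent] = scores.get(parent, 0.0) + contribution
            frontier.append((parent, contribution))
    return scores
```

The decay keeps very abstract ancestors from dominating: an entity contributes less and less interest the further up the hierarchy its influence travels.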
A basic wiki primer. Examples are nonprofit but info applies to any organization. Covers:
What are the attributes of a wiki?
How do wikis differ from other commonly used communication and collaboration tools?
What kind of problems can a wiki solve? What are its uses?
What are examples of these use cases? (with screenshots)
How can you build a successful wiki?
Continued citation of bad science and what we can do about it - 2021-04-20 - jodischneider
Continued Citation of Bad Science and What We Can Do About It
Even papers that falsify data continue to be cited. I describe network and text analysis for studying how authors continue to cite bad science: articles retracted from the literature due to serious flaws or errors. I will present an in-depth case study of a human trial cited for over 10 years after it was retracted for falsifying data. Then, I will describe how the team scaled up to study a data set of 7000 retracted papers and hundreds of thousands of citations. Finally, I will discuss an ongoing Sloan-funded stakeholder consultation that is bringing editors, publishers, librarians, researchers, and research integrity experts together to address this problem.
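The core measurement behind such studies can be sketched in miniature: given citation records with publication years and a paper's retraction year, count and rate the citations that arrive after retraction. Record shapes and names below are illustrative, not the study's actual data format.

```python
# Hedged sketch of the basic measurement: which citations to a paper arrive
# after its retraction year, and at what rate. Record shapes are illustrative.
def post_retraction_citations(citations, retraction_year):
    """citations: iterable of (citing_paper, year) pairs.
    Returns the citations dated strictly after the retraction year."""
    return [(paper, year) for paper, year in citations
            if year > retraction_year]

def post_retraction_rate(citations, retraction_year):
    """Share of all citations that occur after retraction (0.0 if none)."""
    citations = list(citations)
    if not citations:
        return 0.0
    return len(post_retraction_citations(citations, retraction_year)) / len(citations)
```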
Biography: Jodi Schneider is Assistant Professor at the School of Information Sciences, University of Illinois at Urbana-Champaign, where she runs the Information Quality Lab. She studies the science of science through the lens of arguments, evidence, and persuasion with a special interest in controversies in science. Her recent work has focused on topics such as systematic review automation, semantic publication, and the citation of retracted papers. Interdisciplinarity (PhD in Informatics, MS Library & Information Science, MA Mathematics; BA Great Books/liberal arts) is a fundamental principle of her work. She has held research positions across the U.S. as well as in Ireland, England, France, and Chile. She leads the Alfred P. Sloan-funded project, Reducing the Inadvertent Spread of Retracted Science: Shaping a Research and Implementation Agenda. With Aaron Cohen and Neil Smalheiser she is working on the NIH R01 "Text Mining Pipeline to Accelerate Systematic Reviews in Evidence-Based Medicine". Talk with her about scoping reviews and about citation-based methods for updating systematic reviews!
Tuesday, April 20th, 2021
Noon-1PM Eastern
GWU - CNHS Informatics Seminar
Continued citation of bad science and what we can do about it - 2021-02-19 - jodischneider
Title: Continued Citation of Bad Science and What We Can Do About It
Abstract: Even papers that falsify data continue to be cited. Jodi describes network and text analysis for studying how authors continue to cite bad science: articles retracted from the literature due to serious flaws or errors. She will present an in-depth case study of a human trial cited for over 10 years after it was retracted for falsifying data. Then, she will describe how the team scaled up to study a data set of 7,000 retracted papers and hundreds of thousands of citations. Finally, she will discuss an ongoing Sloan-funded stakeholder consultation that is bringing editors, publishers, librarians, researchers, and research integrity experts together to address this problem.
The problems of post-retraction citation - and mitigation strategies that work - jodischneider
Presentation for the Bibliometrics & Research Assessment Symposium 2020 (bibSymp20) https://www.nihlibrary.nih.gov/services/bibliometrics/bibSymp20
October 9, 2020
Retraction is intended to remove articles from the citable literature. However, a series of studies spanning more than 30 years, from 1990 through 2020, have found that many retracted papers continue to be cited, and cited positively, even following misconduct-related retractions. For instance, a fraudulent clinical trial report retracted in 2008 continues to receive citations in 2020, and 96% of post-retraction citations do not mention its retraction - perhaps because the retraction is not marked on the publisher website and the retraction notice cannot be readily retrieved from 7 out of 8 databases (8 out of 9 database records) we tested. This talk draws on an ongoing systematic mapping study of research about retraction and our own research projects to summarize what is known about post-retraction citation in biomedicine. We outline practical steps that authors and reviewers can take to avoid being caught out by poorly marked retracted papers.
20 minutes including Q&A
Towards knowledge maintenance in scientific digital libraries with the keystone framework - jodischneider
JCDL2020 full paper.
Abstract:
Scientific digital libraries speed dissemination of scientific publications, but also the propagation of invalid or unreliable knowledge. Although many papers with known validity problems are highly cited, no auditing process is currently available to determine whether a citing paper’s findings fundamentally depend on invalid or unreliable knowledge. To address this, we introduce a new framework, the keystone framework, designed to identify when and how citing unreliable findings impacts a paper, using argumentation theory and citation context analysis. Through two pilot case studies, we demonstrate how the keystone framework can be applied to knowledge maintenance tasks for digital libraries, including addressing citations of a non-reproducible paper and identifying statements most needing validation in a high-impact paper. We identify roles for librarians, database maintainers, knowledge base curators, and research software engineers in applying the framework to scientific digital libraries.
doi:10.1145/3383583.3398514
Preprint: http://jodischneider.com/pubs/jcdl2020.pdf
Methods Pyramids as an Organizing Structure for Evidence-Based Medicine - SIGCM - jodischneider
Keynote talk 2020-08-01 for the JCDL Workshop on Conceptual Models: https://sig-cm.github.io/news/JCDL-2020-CFP/
Discussion points:
* Methods are a key part of the Knowledge Organizing Structure for Evidence-Based Medicine.
* Methods relate to how we GENERATE evidence.
* Different methods generate evidence of different kinds and strength.
* I believe Methods can be useful in mining claims and arguments from papers: methods AUTHORIZE claims.
* More specialized hierarchies of evidence can be found in medicine
* Various groups are complicating the “evidence pyramid” hierarchy of evidence.
Annotation examples. This is an overview of some of the software I have used for annotation (and a few extra features some of this software has). This was presented in the SwissUniversities Doctoral Programme, Language & Cognition, in the Module: Linguistic and corpus perspectives on argumentative discourse.
Screenshots are given of GATE, UAM Corpus Tool, Excel, BRAT, EPPI Reviewer, and a custom tool. In most cases there are references to one of my papers for further details.
I briefly describe a typical annotation process:
Find text of interest
Find phenomena of interest
Draft an annotation manual
Iteratively test annotation & revise manual
Find questionable annotations, check disagreements.
Revise the manual.
Iterate.
Annotate
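The "check disagreements" step in this process is typically quantified with an inter-annotator agreement statistic. A self-contained sketch of Cohen's kappa for two annotators (illustrative only; the annotation tools mentioned above often compute this for you):

```python
# Sketch of quantifying annotator disagreement: Cohen's kappa for two
# annotators labeling the same items, computed from scratch.
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two equal-length label sequences.
    Undefined (division by zero) when expected agreement equals 1."""
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum((freq_a[label] / n) * (freq_b[label] / n)
                   for label in set(labels_a) | set(labels_b))
    return (observed - expected) / (1 - expected)
```

Kappa of 1 means perfect agreement and 0 means agreement no better than chance; interpretation thresholds vary by field, which is one reason the manual gets revised and the annotation iterated.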
Argumentation mining: an introduction for linguists - Fribourg, 2019-09-02 - jodischneider
An introduction to argumentation mining for PhD students. This was presented in the SwissUniversities Doctoral Programme, Language & Cognition, in the Module: Linguistic and corpus perspectives on argumentative discourse. The presentation largely follows Chapters 1-4 and Chapter 10 of my book, Argumentation Mining, co-authored with Manfred Stede in the Synthesis Lectures on Human Language Technologies from Morgan & Claypool: https://doi.org/10.2200/S00883ED1V01Y201811HLT040
Topics:
My book w/computational linguist Manfred Stede: Argumentation Mining
What is argumentation?
Argumentation mining: a first look
Argumentative language
Challenges for argumentation mining
Argumentation structures
Corpus annotation
Why study argumentation mining?
Beyond Randomized Clinical Trials: emerging innovations in reasoning about health - jodischneider
Talk at the 3rd European Conference on Argumentation
ABSTRACT: Specialized fields may at any time invent new inference rules—that is, new warrants—to improve on their stock of resources for drawing and defending conclusions. Yet disagreement over the acceptability of an invented warrant can always be re-opened. The Randomized Clinical Trial (RCT) is widely regarded as the gold standard for making inferences about causal relationships between medical treatments and patient outcomes. Once controversial, RCT achieved broad acceptance within the field as a result of warrant-establishing arguments circulating in the medical literature starting in the 1950s. And RCT has accumulated a very impressive track record of generating new conclusions that withstand critical scrutiny.
Here we look at two emerging innovations whose purpose is to support reasoning about health, offering ways to generate different classes of conclusions. These innovations could be seen as complementary to RCTs, but for both there are also hints of challenge to the enormous prestige of RCTs. We see this most particularly in the gap that has developed between the RCT-generated fact base and the decisions doctors and health policy officials have to make about treatments for patients. We’ve mentioned before that specialized inference methods that become stabilized within an expert community can meet unexpected challenges when they become components of reasoning by other communities. The two innovations considered here each allow us to explore the tensions that arise from the contrasting perspectives of scientists, clinicians, and patients.
Publishers are caretakers of science. Part of that work is maintaining the integrity of scientific literature. Science builds directly upon past work, so we need to be sure that we are building upon a solid foundation and not faulty research. Publishers need to take an active role in monitoring and tracking faulty, retracted research and its influence. I'm asking publishers to (1) clearly mark retracted papers; (2) alert authors who have already cited a retracted paper; and (3) before publishing an article, check its bibliography for retracted papers.
Retracted papers should be clearly marked everywhere they appear, but today that is not the case. Publishers can also use the CrossRef CrossMark service, which lets readers check for article updates (such as retraction) from a little red ribbon at the top of an article. Checking for citations to retracted articles, and limiting future citations, can help science self-correct by shoring up its foundations.
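The third recommendation, screening a bibliography against known retractions, amounts to a set lookup; a minimal sketch under our own assumptions (function name and the DOI comparison convention are ours, not from the talk):

```python
# Minimal sketch, under our own assumptions: check each DOI in a
# submission's bibliography against a set of known retracted DOIs.
# DOIs are compared case-insensitively, as DOI matching conventionally is.
def flag_retracted(bibliography_dois, retracted_dois):
    """Return the cited DOIs that appear in the retracted set."""
    retracted = {doi.lower() for doi in retracted_dois}
    return [doi for doi in bibliography_dois if doi.lower() in retracted]
```

In practice the retracted-DOI set would come from a retraction database or publisher metadata; the hard part is keeping that list complete and current, not the lookup itself.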
The structure of citation networks provides evidence about how scientific information diffuses. Problematic citation patterns include citation bias (the selective citation of positive findings) and the continued citation of retracted literature (i.e., literature formally withdrawn due to error, fraud, or ethical problems). For instance, there is some evidence that positive results tend to receive more citations. The public-domain licensing of the Open Citations Corpus makes it possible, in principle, to estimate the likelihood that any network of research papers suffers from problematic citation. To date, problematic citation has been documented ad hoc, in several striking studies. In Alzheimer's disease research, biased citation, ignoring critical findings, was used to support successful U.S. NIH grant proposals (Greenberg 2009). Mistranslation of obesity research has been used to justify exertion game research (Marshall & Linehan 2017). Citation of fraudulent research about Chronic Obstructive Pulmonary Disease continued after its retraction (Fulton et al. 2015). The data resulting from such studies are of great use to my lab in replicating, and determining how to generalize, the detection of problematic citation patterns. Previously, the detection of problematic citation patterns has been a side effect of astute researchers noticing suspicious findings while conducting systematic literature reviews. This talk will describe work in progress in my lab on detecting problematic citation patterns using natural language processing, combined with network analysis on the Open Citations Corpus.
Modeling Alzheimer’s Disease research claims, evidence, and arguments from a ...jodischneider
Presentation: Jodi Schneider and Novejot Sandhu, “Modeling Alzheimer’s Disease Research Claims, Evidence, and Arguments from a Biology Research Paper.” 9th International Conference on Argumentation, International Society for the Study of Argumentation, Amsterdam, Netherlands, July 5, 2018
Abstract: Argument visualization may help make research papers easier to understand, which could both speed quality assessment within a discipline and help build interdisciplinary knowledge networks. This paper presents a case study of the arguments in a single high-profile paper on Alzheimer's disease research. Within this one paper, we analyze and hand-annotate the main claim, which is supported by 4 subclaims, in turn supported by data, methods, and materials. We also investigate how the paper imports and uses knowledge claims from other research papers. We create a specialized argument-based knowledge representation called a micropublication. In future work, we will investigate automatic argumentation mining for experimental biology research papers. Our long-term vision is to create literature-scale claim-argument networks that help more quickly use new knowledge about human health.
Innovations in reasoning about health: the case of the Randomized Clinical Tr...jodischneider
Presentation: Jodi Schneider and Sally Jackson, “Innovations in Reasoning About Health: The Case of the Randomized Clinical Trial.” 9th International Conference on Argumentation, International Society for the Study of Argumentation, Amsterdam, Netherlands, July 5, 2018
Abstract: Field-dependence in argumentation comes about through forms of inference invented by specialized fields. In recent work we introduced the concept of a "warranting device": (1) an inference license (2) invented for a specialized argumentative purpose and (3) backed by institutional, procedural, and material assurances of the dependability of conclusions generated by the device. Once established, fields employ such devices across many situations without further defense, even as the devices develop in response to newly-noticed problems.
Many new warranting devices have appeared over the past century to solve problems in reasoning about health and medicine, replacing earlier forms of medical reasoning. One such device is the randomized clinical trial. This case study traces its historical evolution and discusses some current movements toward competing device types.
Rhetorical moves and audience considerations in the discussion sections of ra...jodischneider
European Conference on Argumentation talk
Jodi Schneider, Graciela Rosemblat, Shabnam Tafreshi and Halil Kilicoglu “Rhetorical moves and audience considerations in the discussion sections of Randomized Controlled Trials of health interventions” [Conference Panel Presentation], 2nd European Conference on Argumentation: Argumentation and Inference, Fribourg, Switzerland, June 20-23
1 of 3 talks in Jodi Schneider and Sally Jackson, organizers, “Innovations in Reasoning and Arguing about Health ”[Conference Panel], 2nd European Conference on Argumentation: Argumentation and Inference, Fribourg, Switzerland, June 20-23.
Citation practices and the construction of scientific fact--ECA-facts-preconf...jodischneider
Citation practices and the construction of scientific fact. Presentation at the European Conference on Argumentation preconference on status, relevance, and authority of facts.
What WikiCite can learn from biomedical citation networks--Wikicite2017--2017...jodischneider
This is a quick, high-level tour of some ideas from evidence-based medicine, and from citation-related ontologies for argumentation and evidence curation in biomedicine.
Medication safety as a use case for argumentation mining, Dagstuhl seminar 16...jodischneider
Medication safety as a use case for argumentation mining
We present a use case for argumentation mining, from biomedical informatics, specifically from medication safety. Tens of thousands of preventable medical errors occur in the U.S. each year, due to limitations in the information available to clinicians. Current knowledge sources about potential drug-drug interactions (PDDIs) often fail to provide essential management recommendations and differ significantly in their coverage, accuracy, and agreement. The Drug Interaction Knowledge Base Project (Boyce, 2006-present; dikb.org) is addressing this problem.
Our current work uses knowledge representations and human annotation to represent clinically relevant claims and evidence. Our data model incorporates an existing argumentation-focused ontology, the Micropublications Ontology. Further, to describe more specific information, such as the types of studies that allow inference of a particular type of claim, we are developing an evidence-focused ontology called DIDEO (the Drug-drug Interaction and Drug-drug Interaction Evidence Ontology). On the curation side, we will describe how our research team is hand-extracting knowledge claims and evidence from the primary research literature, case reports, and FDA-approved drug labels for 65 drugs.
We think that medication safety could be an important domain for applying automatic argumentation mining in the future. In discussions at Dagstuhl, we would like to investigate how current argumentation mining techniques might be used to scale up this work. We can also discuss possible implications for representing evidence from other biomedical domains.
Talk for Dagstuhl Seminar 16161: Natural Language Argumentation: Mining, Processing, and Reasoning over Textual Arguments
http://www.dagstuhl.de/en/program/calendar/semhp/?semnr=16161
Acquiring and representing drug-drug interaction knowledge and evidence, Litm...jodischneider
Presentation to Diane Litman's lab at the University of Pittsburgh about modeling and acquiring evidence for the Drug Interaction Knowledge Base (DIKB) project.
Persons, documents, models: organising and structuring information for the We...jodischneider
A talk for the Moore Institute for Humanities.
People and documents are of enduring interest. Documents may be generated by individuals, collective groups, and administrations, on any number of topics. We are particularly interested in the relationships between people and documents. The most important relationships are creation (authors, illustrators, translators, ...), usage (e.g. association copies), and topic-of (e.g. people may be the subjects of biographies).
In this lecture, we will talk about several approaches for modeling, or representing, people and documents. We pay particular attention to computer-based approaches to organization, and to organizing information for websites. We will talk briefly about TEI and XML, and then focus on my area of research expertise: modeling "linked data", a widely adopted approach for interlinking data. Adopted by the UK and US governments and by search engines such as Google and Yahoo!, linked data has also been widely used in the digital humanities and by libraries, archives, and museums. It consists of naming objects of interest (be they authors, documents, or whatnot) and using standard data formats to enable interlinking.
2. Overview
o My Background & Research Themes
o Structuring Evidence in Wikipedia Discussions
o Supporting Systematic Review of Biomedical Evidence
3. Themes in My Research
o How do people collaborate to generate knowledge?
o What counts as evidence in a given community?
o How can structuring evidence help synthesize info?
4. What knowledge should be included in Wikipedia?
Jodi Schneider, Krystian Samp, Alexandre Passant, and Stefan Decker. “Arguments about Deletion: How Experience Improves the Acceptability of Arguments in Ad-hoc Online Task Groups.” In CSCW 2013.
Jodi Schneider and Krystian Samp. “Alternative Interfaces for Deletion Discussions in Wikipedia: Some Proposals Using Decision Factors. [Demo]” In WikiSym 2012.
Jodi Schneider, Alexandre Passant, and Stefan Decker. “Deletion Discussions in Wikipedia: Decision Factors and Outcomes.” In WikiSym 2012.
13. Problem: Newcomers are confused about Wikipedia’s standards.
o “Why should a local cricket club not have it's own page on this website? Obviously a valid club and been established for a while. Nothing offensive or false on the page. All need to do is put in Emsworth Cricket Club into a search engine and information comes up. Why just because it is a small team and not major does it not deserve it's own page on here?” (sic)
o “At the end of the day the club has history which being 200 years is just as special as a article on a breed of dog or something similar.”
o “really is worth a mention. Especially on a website, where pointless people ... gets a mention.” (sic)
18. Problem Summary
o Long, no-consensus discussions → Summarize discussions
o Newcomers are confused about Wikipedia's standards → Make article criteria more explicit
19. Approach: Structure Evidence
1. Understand what evidence the community uses to establish knowledge.
2. Structure the evidence.
3. Build a computer support system.
4. Test and refine the system.
21. Sample Corpus
o 72 discussions started on 1 day.
Each discussion has
• 3–33 messages
• 2–15 participants
o In total, 741 messages contributed by 244 users.
Each message has
• 3–350+ words
o 98 printed A4 sheets
22. Structuring the Data: Annotation
o Content analysis of the corpus
o Compare two different annotation approaches
o Iterative annotation
• Multiple annotators
• Refine to get good inter-annotator agreement
• 4 rounds of annotation
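The inter-annotator agreement refined here is commonly measured with Cohen's kappa, which corrects raw agreement for chance. A minimal sketch (the two annotators' factor labels below are invented for illustration, not the study's actual data):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items both annotators label identically.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement, from each annotator's label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[l] * freq_b[l] for l in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical decision-factor labels for ten comments:
a = ["Notability", "Sources", "Sources", "Bias", "Other",
     "Notability", "Sources", "Maintenance", "Bias", "Sources"]
b = ["Notability", "Sources", "Bias", "Bias", "Other",
     "Notability", "Sources", "Maintenance", "Sources", "Sources"]
print(round(cohens_kappa(a, b), 3))
```

Iterating annotation rounds until kappa is acceptably high is one standard way to operationalize "refine to get good inter-annotator agreement".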
23. 2 Types of Annotation
o 1. Walton’s Argumentation Schemes (Walton, Reed, and Macagno 2008)
• Informal argumentation (philosophical & computational argumentation)
• Identify & prevent errors in reasoning (fallacies)
• 60 patterns
o 2. Factors Analysis (Ashley 1991)
• Case-based reasoning
• E.g. factors for deciding cases in trade secret law, favoring either party (the plaintiff or the defendant).
26. Factor Example (used to justify ‘keep’)
4 Key Factors (& “Other”):
Notability – Anyone covered by another encyclopedic reference is considered notable enough for inclusion in Wikipedia.
Sources – Basic information about this album at a minimum is certainly verifiable, it's a major label release, and a highly notable band.
Maintenance – …this article is savable but at its current state, needs a lot of improvement.
Bias – It is by no means spam (it does not promote the products).
Other – I'm advocating a blanket “hangon” for all articles on newly-drafted players…
Jodi Schneider, Alexandre Passant & Stefan Decker, “Deletion Discussions in Wikipedia: Decision Factors and Outcomes”
27. Decision factors articulate values/criteria.
o 4 Factors in Deletion Discussions cover:
• 91% of comments
• 70% of discussions
o Readers who understand these criteria:
• Understand what content is appropriate.
• Are less likely to have content deleted, and less likely to take deletion personally.
28. To structure the data, we chose factors.
o 1. Walton’s Argumentation Schemes (Walton, Reed, and Macagno 2008)
• Most appropriate for writing support
• 15 categories + 2 non-argumentative categories
• Detailed analysis of content
o 2. Factors Analysis (drawing on Ashley 1991)
• Close to the community rules & policies
• 4 categories + 1 catchall
• Good domain coverage
29. Approach: Structure Evidence
1. Understand what evidence the community uses to establish knowledge.
2. Structure the evidence.
3. Build a computer support system.
4. Test and refine the system.
35. Build a computer support system.
Pipeline: Original Discussion → Semantic Enrichment (using an Ontology) → Semantically Enriched RDFa → Querying → Queryable User Interface with Barchart
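A rough end-to-end sketch of this kind of pipeline, with plain dictionaries standing in for the RDFa enrichment; the keyword lists are invented placeholders, not the system's actual enrichment rules:

```python
# Sketch: raw comments -> decision-factor labels -> per-factor counts (barchart data).
# The FACTOR_KEYWORDS lists are illustrative placeholders only.
from collections import Counter

FACTOR_KEYWORDS = {
    "Notability": ["notable", "notability"],
    "Sources": ["source", "verifiable", "citation"],
    "Maintenance": ["cleanup", "improve", "maintain"],
    "Bias": ["neutral", "pov", "spam"],
}

def enrich(comment):
    """Attach a decision-factor label to a comment (falling back to 'Other')."""
    text = comment.lower()
    for factor, keywords in FACTOR_KEYWORDS.items():
        if any(k in text for k in keywords):
            return {"text": comment, "factor": factor}
    return {"text": comment, "factor": "Other"}

def barchart_counts(comments):
    """Per-factor comment counts, the data behind the interface's barchart."""
    return Counter(e["factor"] for e in map(enrich, comments))

discussion = [
    "The band is clearly notable.",
    "No reliable source covers this club.",
    "Article needs cleanup but is savable.",
    "Reads like spam to me.",
    "Keep, per the hangon request.",
]
print(barchart_counts(discussion))
```

In the real system this classification came from human annotation rather than keyword matching; the sketch only shows how labeled comments feed the per-factor summary view.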
47. Approach: Structure Evidence
1. Understand what evidence the community uses to establish knowledge.
2. Structure the evidence.
3. Build a computer support system.
4. Test and refine the system.
53. Survey measures and statistical significance
PU* – Perceived usefulness, p < .001
PE* – Perceived ease of use, p = .001
DC – Decision completeness
PF – Perceived effort
IC* – Information completeness, p = .039
(* statistically significant)
55. Results: 84% prefer our system.
“Information is structured and I can quickly get an overview of the key arguments.”
“The ability to navigate the comments made it a bit easier to filter my mind set and to come to a conclusion.”
“It offers the structure needed to consider each factor separately, thus making the decision easier. Also, the number of comments per factor offers a quick indication of the relevance and the deepness of the decision.”
16 of 19 respondents, in a 20-participant user test (1 participant did not take the final survey).
56. Approach: Structure Evidence
1. Understand what evidence the community uses to establish knowledge.
2. Structure the evidence.
3. Build a computer support system.
4. Test… & refine the system.
57. Summary
o Information technology can organize information based on a community’s key decision factors.
o In Wikipedia, we developed an alternate interface for deletion discussions.
o In Wikipedia, 4 questions are used to evaluate borderline articles:
o Notability – Is the topic appropriate for our encyclopedia?
o Sources – Is the article well-sourced?
o Maintenance – Can we maintain this article?
o Bias – Is the article neutral? POV appropriately weighted?
58. Summary: Our Process
1. Get to know a community and its needs.
Ethnography
2. Structure the data.
Annotation & ontology development
3. Build a computer support system.
Web standards: HTML, JavaScript, RDF/OWL, SPARQL
4. Test & refine the system.
Human computer interaction
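As a rough illustration of the RDF and SPARQL step in this process (a sketch only; the predicate names are invented, not the project's ontology), comments and their factors can be stored as subject-predicate-object triples and matched with a wildcard pattern query:

```python
# Toy triple store illustrating the RDF + SPARQL-style querying step.
# Predicate names here ("hasFactor", "supports") are invented for this sketch.
TRIPLES = [
    ("comment1", "hasFactor", "Notability"),
    ("comment1", "supports", "keep"),
    ("comment2", "hasFactor", "Sources"),
    ("comment2", "supports", "delete"),
    ("comment3", "hasFactor", "Notability"),
    ("comment3", "supports", "keep"),
]

def query(triples, s=None, p=None, o=None):
    """Match triples against a pattern; None acts as a wildcard, like a SPARQL variable."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# "Which comments argue from Notability?"
notability = [s for s, _, _ in query(TRIPLES, p="hasFactor", o="Notability")]
print(notability)
```

A real implementation would express the same pattern as a SPARQL query over the RDFa-enriched pages; the point is that factor labels, once in triple form, are directly queryable.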
61. Info overload now goes beyond papers
Bastian, Glasziou, and Chalmers. "75 trials and 11 systematic reviews a day: how will we ever keep up?" PLoS Medicine 7.9 (2010): e1000326.
62. For medication safety, how to structure evidence on drug-drug interactions and keep it up-to-date?
Jodi Schneider, Paolo Ciccarese, Tim Clark and Richard D. Boyce. “Using the Micropublications ontology and the Open Annotation Data Model to represent evidence within a drug-drug interaction knowledge base.” 4th Workshop on Linked Science 2014—Making Sense Out of Data (LISC2014) at ISWC 2014.
Mathias Brochhausen, Jodi Schneider, Daniel Malone, Philip E. Empey, William R. Hogan and Richard D. Boyce. “Towards a foundational representation of potential drug-drug interaction knowledge.” First International Workshop on Drug Interaction Knowledge Representation (DIKR-2014) at the International Conference on Biomedical Ontologies (ICBO 2014).
Jodi Schneider, Carol Collins, Lisa Hines, John R. Horn and Richard Boyce. “Modeling Arguments in Scientific Papers to Support Pharmacists.” At ArgDiaP 2014, The 12th ArgDiaP Conference: From Real Data to Argument Mining, Warsaw, Poland.
63. Part of a Larger Effort
o “Addressing gaps in clinically useful evidence on drug-drug interactions”
o 4-year project, U.S. National Library of Medicine R01 grant (PI, Richard Boyce; 1R01LM011838-01)
o Since February 2013: evidence panel of domain experts (Carol Collins, Lisa Hines, John R. Horn, Phil Empey) & informaticists (Tim Clark, Paolo Ciccarese, Jodi Schneider)
o Programmer: Yifan Ning
65. Prescribers consult drug interaction references which are maintained by expert pharmacists.
Medscape, Epocrates, Micromedex 2.0
67. Goals
o Support evidence-based updates to drug-interaction reference databases.
o Make sense of the EVIDENCE:
• New clinical trials
• Adverse drug event reports
• Drug product labels
• FDA regulatory updates
http://jama.jamanetwork.com/article.aspx?articleid=18345467
69. Evidence Base Competency Questions
o 40 competency questions, such as:
• List all evidence by drug, drug pair, …
• List all default assumptions (assertions not supported by evidence)
• Which single evidence items act as support or rebuttal for multiple assertions of type X? (e.g., substrate_of assertions)
• What data, methods, and materials were used in the study reported in evidence item X?
• Which research group conducted the study reported in evidence item X?
• What evidence has been deprecated since my last visit?
• Which assertions are supported by a specific FDA guidance statement?
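Two of these competency questions can be sketched against a toy in-memory evidence base; the records and field names below are invented for illustration, not the project's actual data model:

```python
# Hypothetical evidence records; the field names are invented for this sketch.
EVIDENCE = [
    {"id": "ev1", "drug_pair": ("fluoxetine", "tramadol"), "supports": ["a1", "a2"]},
    {"id": "ev2", "drug_pair": ("fluoxetine", "tramadol"), "supports": ["a1"]},
    {"id": "ev3", "drug_pair": ("ketoconazole", "midazolam"), "supports": ["a3"]},
]

def evidence_by_drug_pair(evidence, pair):
    """Competency question: list all evidence for a drug pair (order-independent)."""
    return [e["id"] for e in evidence
            if frozenset(e["drug_pair"]) == frozenset(pair)]

def multi_assertion_evidence(evidence):
    """Competency question: which single evidence items support multiple assertions?"""
    return [e["id"] for e in evidence if len(e["supports"]) > 1]

print(evidence_by_drug_pair(EVIDENCE, ("tramadol", "fluoxetine")))
print(multi_assertion_evidence(EVIDENCE))
```

In the actual project such questions would be posed against the ontology-backed knowledge base (e.g., as SPARQL queries) rather than Python lists; the sketch just shows what "answering a competency question" means operationally.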
70. An Ontology for Representing Evidence
Clark, Ciccarese, Goble (2014) Micropublications: a semantic model for claims, evidence, arguments and
annotations in biomedical communications
83. Next steps
o Continuing data model development & testing.
o NLP support: Create a pipeline for extracting
potential drug-drug interaction mentions from
scientific & clinical literature.
o NLP + "expertsourcing" and crowdsourcing
(distributed annotation).
o Test annotation tools: usability for domain experts.
o Resolving links to paywalled PDFs.
90. Walton’s Argumentation Schemes
Example Argumentation Scheme:
Argument from Rules – “we apply rule X”
Critical Questions
1. Does the rule require carrying out this type of
action?
2. Are there other established rules that might conflict
with or override this one?
3. Are there extenuating circumstances or an excuse
for noncompliance?
Walton, Reed, and Macagno 2008
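A scheme and its critical questions pair naturally in a data structure, which is one way a support tool could prompt users to probe an argument's weaknesses. The scheme text follows Walton, Reed, and Macagno (2008); the class itself is an illustrative sketch:

```python
from dataclasses import dataclass

@dataclass
class ArgumentationScheme:
    """An argumentation scheme paired with its critical questions (sketch)."""
    name: str
    premise: str
    critical_questions: list

argument_from_rules = ArgumentationScheme(
    name="Argument from Rules",
    premise="We apply rule X",
    critical_questions=[
        "Does the rule require carrying out this type of action?",
        "Are there other established rules that might conflict with "
        "or override this one?",
        "Are there extenuating circumstances or an excuse for noncompliance?",
    ],
)

# A tool could surface these as prompts when this scheme is detected.
for q in argument_from_rules.critical_questions:
    print("-", q)
```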
Online discussions are the focus of the first project, which addresses the question of “What knowledge should be included in Wikipedia?"
Wikipedia is extremely popular: it’s the world’s 7th most visited website. But what knowledge gets included?
It’s a little known fact that Wikipedia deletes articles. For most readers, messages like these are the only sign of articles at risk for deletion,
or deleted articles.
In fact, 1 in 4 Wikipedia articles is deleted.
While many articles are deleted without discussion, each week about 500 borderline articles are considered for deletion, through open online discussions that anyone can comment on.
Here is an example discussion. First, someone nominates the article for deletion. In this case, the article is about a baseball pitcher. The nominator says that we should delete the article: Heath Totten doesn’t merit an article since he doesn’t have a very good record and hasn’t played in a few years.
The second message responds and suggests keeping the article. This message gives new evidence to support keeping the article about Heath Totten. That he is actively playing.
We find that there are a few problems with these discussions. First of all, some discussions have no consensus, even after lengthy discussion. The same article may be repeatedly proposed for deletion, in some cases over 20 times.
One goal of this work is to summarize long discussions.
Second, newcomers are confused about Wikipedia’s standards. Newcomers make comments like these:
"Why just because it is a small team and not major does it not deserve it’s (sic) own page on here?"
"just as special as a article on a breed of dog"
"especially on a website where pointless people get a mention"
Making the criteria
A second goal of this work is to make the community standards more explicit.
Newcomers also do not understand particular terminology, such as “reliable secondary source”. A common argument from an old-hand in our corpus is that “Notability [is] not demonstrated in a reliable secondary source”.
Newcomers misunderstand what Wikipedia counts as a “reliable secondary source”. Here, a newcomer replies that the article “will have refs from other sources” once the website it is describing goes live. To a Wikipedian, this is not a convincing argument, because how does this person know this? The "refs from other sources" sound like press releases – but reliable secondary sources must be independent.
So again, this shows the need to make the community standards more explicit.
Technically started or relisted
Corpus is https://en.wikipedia.org/wiki/Wikipedia:Articles_for_deletion/Log/2011_January_29
Categories (Walton’s argumentation schemes) vs. process (factors analysis)
Very few content standards need to be clearly communicated to readers in order to bring significant benefit.
69.5% of discussions and 91% of comments are well-represented by just four factors: Notability, Sources, Maintenance and Bias. The best way to avoid deletion is for readers to understand these criteria.
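Coverage figures like these come from tallying the share of annotated comments whose main argument falls under one of the four factors. A minimal sketch with toy labels (the real corpus labels and counts differ):

```python
# Toy coverage computation. Labels are illustrative, not the real annotations.
FACTORS = {"Notability", "Sources", "Maintenance", "Bias"}

comment_labels = ["Notability", "Sources", "Other", "Bias",
                  "Notability", "Maintenance", "Sources", "Notability",
                  "Bias", "Sources"]

covered = sum(1 for label in comment_labels if label in FACTORS)
coverage = covered / len(comment_labels)
print(f"{coverage:.0%} of comments covered by the four factors")  # 90%
```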
20 novice participants used both systems
“The ability to navigate the comments made it a bit easier to filter my mind set and to come to a conclusion.”
“summarise and, at the same time, evaluate which factor should be considered determinant for the final decision”
Identify and explicitly represent arguments, and in particular
successful arguments that are persuasive to a given audience.
Adverse drug events are a leading cause of death
Image from https://www.njpharmacy.com/wp-content/uploads/2013/02/drug-interactions-checker.png
Image from http://www.clipartbest.com/clipart-McLLpbGKi
Adverse drug events are a leading cause of death
Images from
http://www.knowabouthealth.com/android-version-of-medscape-app-ready-to-download/7568/
Android Play store
http://amazingsgs.blogspot.com/2011/10/top-5-free-android-medical-apps-for.html
Most sources of clinically-oriented PDDI knowledge disagree substantially in their content,
including about which drug combinations should never be co-administered. For
example, only one quarter of 59 contraindicated drug pairs were listed in three PDDI
information sources[4], only 18 (28%) of 64 pharmacy information and clinical decision
support systems correctly identified 13 PDDIs considered clinically significant
by a team of drug interaction experts[5], and four clinically oriented drug information
compendia agreed on only 2.2% of 406 PDDIs considered to be “major” by at least
one source[6].
From our paper: http://ceur-ws.org/Vol-1309/paper2.pdf
4. Wang, L.M., Wong, M., Lightwood, J.M., Cheng, C.M.: Black box
warning contraindicated comedications: concordance among three
major drug interaction screening programs. Ann. Pharmacother. 44,
28–34 (2010).
5. Saverno, K.R., Hines, L.E., Warholak, T.L., Grizzle, A.J., Babits, L.,
Clark, C., Taylor, A.M., Malone, D.C.: Ability of pharmacy clinical
decision-support software to alert users about clinically important
drug-drug interactions. J. Am. Med. Inform. Assoc. JAMIA. 18, 32–
37 (2011).
6. Abarca, J., Malone, D.C., Armstrong, E.P., Grizzle, A.J., Hansten,
P.D., Van Bergen, R.C., Lipton, R.B.: Concordance of severity ratings
provided in four drug interaction compendia. J. Am. Pharm. Assoc.
JAPhA. 44, 136–141 (2004).
40 competency questions
https://docs.google.com/document/d/1o0DYpu9FuXGCz861OOGkhYKA-KWMY-hHRBQ-R8IlqXc/edit
Not the only competency questions – also have e.g.
Queries Supporting Drug Interaction Management
https://docs.google.com/spreadsheets/d/1ikYsOB09XHUQiSl-KPlDZBQWScbi15rHUeyfOcUQz5M/edit#gid=0
Very precise specification of the entities
Improve sensitivity of information retrieval (recall/precision)
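Sensitivity here corresponds to recall in information-retrieval terms. The standard definitions, as a small self-contained sketch with hypothetical document IDs:

```python
def precision_recall(retrieved, relevant):
    """Standard IR measures:
    precision = |retrieved ∩ relevant| / |retrieved|
    recall (sensitivity) = |retrieved ∩ relevant| / |relevant|
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    return hits / len(retrieved), hits / len(relevant)

# Toy example: 3 of 4 retrieved items are relevant; 3 of 5 relevant items found.
p, r = precision_recall({"d1", "d2", "d3", "d4"},
                        {"d1", "d2", "d3", "d5", "d6"})
print(p, r)  # 0.75 0.6
```

Precise entity specification helps both sides of this trade-off: fewer spurious matches (precision) and fewer missed ones (recall).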
From http://dailymed.nlm.nih.gov/dailymed/fda/fdaDrugXsl.cfm?setid=13bb8267-1cab-43e5-acae-55a4d957630a&type=display
Evidence entry form from:
https://docs.google.com/viewer?a=v&pid=sites&srcid=ZGVmYXVsdGRvbWFpbnxkZGlrcmFuZGlyfGd4OjE0ZGIwY2IwNzJhOWNjMjY
From http://dailymed.nlm.nih.gov/dailymed/fda/fdaDrugXsl.cfm?setid=13bb8267-1cab-43e5-acae-55a4d957630a&type=display
For adding annotations: Existing MP plugin for Domeo
For viewing annotations: we want them highlighted in a web-based interface, BUT resolving annotations requires a method for pointing into paywalled/subscription PDF & HTML documents
An existing Micropublication plugin for Domeo [Ciccarese2014] is being modified as part of the project. Our plan is to use the revised plugin to support the evidence board in collecting evidence and associated annotation data. It will also enable the broader community to access and view annotations of PDDIs highlighted in a web-based interface. We anticipate that this approach will enable a broader community of experts to review each PDDI recorded in the DIKB and examine the underlying research study to confirm its appropriateness and relevance to the evidence base.
The usability of the annotation plug-in is critically important so that the panel of domain experts will not face barriers to annotating and entering evidence. This will require usability studies of the new PDDI Micropublication plugin. Another issue is that many PDDI evidence items can be found only in PDF documents. Currently, the tool chain for PDF annotation is relatively weak: compared to text and HTML, PDF annotation tools are not as widely available and not as familiar to end-users. Suitable tools will have to be integrated into the revised plugin.
PDF documents may be in proprietary portals or academic library systems
Annotations in the data model are a set of RDF resources that connect some target to a set of resources that are in some way about it.
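That target/body pattern can be sketched as a plain data structure, in the spirit of the W3C Web Annotation model. All URLs, field names, and values below are hypothetical:

```python
# Sketch of an annotation resource: it connects a target (the thing being
# annotated, here a text span in a PDF) to bodies (resources about it).
annotation = {
    "type": "Annotation",
    "target": {
        "source": "http://example.org/article.pdf",   # hypothetical document
        "selector": {
            "type": "TextQuoteSelector",
            "exact": "drug A increased the AUC of drug B",
        },
    },
    "body": [
        {"type": "TextualBody",
         "value": "Evidence for a PDDI between drug A and drug B"},
        {"id": "http://example.org/evidence/e1"},     # hypothetical resource
    ],
}
print(len(annotation["body"]))  # 2
```

The paywalled-PDF problem above is visible here: the `source` pointer only resolves for readers who can actually reach the document.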
We would count this as an Argument from Rules
Major Premise: If carrying out types of actions including A is the established rule for x, then (unless the case is an exception), a must carry out A.
Minor Premise: Carrying out types of actions including A is the established rule for a.
Conclusion: Therefore, a must carry out A.
Earlier in CSCW: Jodi Schneider, Krystian Samp, Alexandre Passant, Stefan Decker. “Arguments about Deletion: How Experience Improves the Acceptability of Arguments in Ad-hoc Online Task Groups”. In Computer Supported Cooperative Work and Social Computing (CSCW). San Antonio, TX, February 23-27, 2013.
Used as categories
Initial annotation
60 categories (each Walton argumentation scheme)
all arguments in each message
Round 4
15 most common argumentation schemes
main argument in each message
Good inter-annotator agreement for a hard task: 54% agreement (vs. 12% expected by chance) between 2 annotators
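One common way to fold those two figures into a single chance-corrected score is Cohen's kappa. A minimal sketch, assuming 54% observed agreement and 12% expected by chance as stated above:

```python
def cohens_kappa(observed, expected):
    """Cohen's kappa: agreement corrected for chance agreement."""
    return (observed - expected) / (1 - expected)

# Using the figures above: 54% observed agreement, 12% expected by chance.
kappa = cohens_kappa(0.54, 0.12)
print(round(kappa, 3))  # 0.477
```

A kappa around 0.48 is usually read as moderate agreement, which is respectable for a fine-grained annotation task with many categories.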