The document discusses a workshop on measuring, evaluating, and managing open online communities held in Brest, France. It provides biographical information on Jodi Schneider, a PhD student studying web sociotechnical systems using multidisciplinary approaches. Schneider's work focuses on extracting and representing argumentation from social media by studying genre, metadata, user properties, dialogue goals and context. The goal is to better understand online discussions and support knowledge sharing in open communities.
1. Digital Enterprise Research Institute www.deri.ie
Argumentative Discussions on the Web
Jodi Schneider
@jschneider
“Towards a Virtual Institute for the Measurement,
Evaluation, and Management of Open Online
Communities” - EST Exploratory Workshop
Wednesday 6 Feb 2013
Telecom Bretagne, Brest, France
Copyright 2011 Digital Enterprise Research Institute. All rights reserved.
Enabling Networked Knowledge
2. Ph.D. student
o Study the Web &
sociotechnical systems
o Multidisciplinary
approaches
o Create new methods &
repurpose existing ones
o Bias: structure data
@jschneider jodischneider.com
3. Key Point
The overall popularity of an opinion is not as
important as the reasons supporting it
Image credit: http://www.nickmilton.com/2012/03/when-people-trust-crowds.html
4. Arguments
Claim: Jaffa Cakes are cakes
Justification: official EU ruling; cakes go hard when
stale
Tweet by @robeastaway
https://twitter.com/robeastaway/status/135838892694839296
5. Arguments
o Claim: If Canon does not
make the repair, you
should take them to small
claims court.
o Justification: it’s easy,
cheap, and can be done
online;
it’s within your rights
Joint work with Adam Wyner (Aberdeen)
6. What aspects of social media are relevant to
extracting and representing argumentation?
o Genre
o Metadata
o Properties of users
o Goals of a particular dialogue
o Context
o Informal language
o Implicit info
o Sentiment techniques
o Subjectivity and objectivity
Jodi Schneider, Brian Davis, and Adam Wyner. “Dimensions of
Argumentation in Social Media.” In 18th International Conference
on Knowledge Engineering and Knowledge Management (EKAW
2012).
10. Goals
Dialogue type        Goal                                   (three rating columns; headings illegible in source)
Inquiry              Find and Verify Evidence               High    Low     Low
Discovery            Find and Defend a Suitable Hypothesis  High    Low     Low
Information seeking  Acquire or Give Information            High    Low     Low
Deliberation         Coordinate Goals and Actions           High    Varies  High
Persuasion           Persuade Other Party                   Varies  High    High
Negotiation          Get What You Most Want                 Low     Varies  High
Eristic              Verbally Hit Out at Opponent           Low     High    High
D. N. Walton and E. C. W. Krabbe. Commitment in Dialogue. State
University of New York Press, Albany, 1995.
12. My thesis in the Social Semantic Web
o Investigate Wikipedia discussions
o Provide semantic support
o Evaluate the impact on the community
13. Deletion threatens Wikipedia
o 1 in 4 new Wikipedia articles is deleted – within
minutes or hours
o Demotivating!
• 1 in 3 newcomers start by writing a new article
• 7X less likely to stay if their article is deleted!
o Can we support editor retention?
14.
15. Article creators
o Misunderstand policy
• “I do understand that articles on wikipedia need to be
sourced… it is due to have two [sources] once [our
website goes] live”
o Express high levels of emotion
• “To be honest it's been a real turn off adding articles to
WP and I don't think I will add articles again. So smile and
enjoy.”
o Learn from discussions
• “much as it would break my heart … it is perhaps sensible
that the piece is deleted.”
Jodi Schneider, Alexandre Passant, and Stefan Decker. “Deletion Discussions
in Wikipedia: Decision Factors and Outcomes.” In WikiSym2012.
16. Novices’ arguments
o Structurally different to experts’ arguments
o More problematic arguments from novices
• Personal preference
• Requesting a favor
• Analogy to other cases
• No harm in keeping an article
• Large number of search engine hits
Jodi Schneider, Krystian Samp, Alexandre Passant, Stefan Decker.
“Arguments about Deletion: How Experience Improves the Acceptability of
Arguments in Ad-hoc Online Task Groups”. In CSCW 2013.
17. Articulate criteria
Factor         Example (used to justify `keep')
Notability     Anyone covered by another encyclopedic reference is
               considered notable enough for inclusion in Wikipedia.
Sources        Basic information about this album at a minimum is certainly
               verifiable, it's a major label release, and a highly notable
               band.
Maintenance    …this article is savable but at its current state, needs a
               lot of improvement.
Bias           It is by no means spam (it does not promote the products).
Other          I'm advocating a blanket "hangon" for all articles on
               newly-drafted players
4 Factors cover
• 91% of comments
• 70% of discussions
Jodi Schneider, Alexandre Passant, and Stefan Decker. “Deletion Discussions in
Wikipedia: Decision Factors and Outcomes.” In WikiSym2012.
18. Add semantic structure
Implementation based on Jodi Schneider and Krystian Samp
“Alternative Interfaces for Deletion Discussions in Wikipedia: Some
Proposals Using Decision Factors. [Demo]” In WikiSym2012.
19. 84% prefer our system
“Information is structured and I can quickly get an
overview of the key arguments.”
“The ability to navigate the comments made it a bit
easier to filter my mind set and to come to a
conclusion.”
“It offers the structure needed to consider each factor
separately, thus making the decision easier. Also, the
number of comments per factor offers a quick
indication of the relevance and the deepness of the
decision.”
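The per-factor overview that participants describe can be sketched as a simple grouping step. This is only an illustration under stated assumptions, not the implementation shown in the demo: the `Comment` type, its fields, and the sample comments are hypothetical, and each comment is assumed to have already been labelled with one of the decision factors named on the earlier slide.

```python
from dataclasses import dataclass

# Factor names come from the slides; everything else here is hypothetical.
FACTORS = ("Notability", "Sources", "Maintenance", "Bias", "Other")


@dataclass(frozen=True)
class Comment:
    author: str  # who wrote the comment
    factor: str  # decision factor the comment was labelled with
    text: str    # comment body


def group_by_factor(comments):
    """Return {factor: [comments]}, keeping every factor even when it is empty."""
    grouped = {factor: [] for factor in FACTORS}
    for comment in comments:
        if comment.factor not in grouped:
            raise ValueError(f"unknown factor: {comment.factor!r}")
        grouped[comment.factor].append(comment)
    return grouped


def factor_counts(comments):
    """Number of comments per factor, the 'quick indication' mentioned in the quote."""
    return {factor: len(items) for factor, items in group_by_factor(comments).items()}


# Invented sample discussion, for illustration only.
discussion = [
    Comment("user_a", "Notability", "Covered by another encyclopedic reference."),
    Comment("user_b", "Sources", "Basic information is verifiable."),
    Comment("user_c", "Notability", "A highly notable band."),
]

counts = factor_counts(discussion)
for factor in FACTORS:
    print(f"{factor}: {counts[factor]}")
```

Keeping empty factors in the result lets an interface show that nobody has yet argued about, say, Bias, which is itself useful information for a reader.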
21. Acknowledgements
o Thanks to collaborators!
• Adam Wyner
• Brian Davis
• Krystian Samp
o Overall Ph.D. funding, Science Foundation Ireland Grant No.
SFI/08/CE/I1380 (Líon-2), advised by Alexandre Passant &
Stefan Decker
The issue is whether it is the right product for the buyer, which is a matter not only of the pros and cons, but also of the explanations and counterarguments given. In our view, current approaches detect problems, but obscure the chains of reasoning about them.
Image: http://www.nickmilton.com/2012/03/when-people-trust-crowds.html
Isn’t it funny that people tweet about this
Difference between cakes and biscuits? When stale, cakes go hard, biscuits go soft. Hence Jaffa Cakes are cakes. (Was official EU ruling).
https://twitter.com/robeastaway/status/135838892694839296
http://www.amazon.co.uk/review/R34F9M7871K2W0/ref=cm_cr_rev_detmd_pl?ie=UTF8&asin=B004M8S152&cdForum=FxM7A99WYT0UJ9&cdMsgID=Mx35ZEPGH4SV7JK&cdMsgNo=1&cdPage=1&cdSort=oldest&cdThread=Tx1C124R97075W9&store=photo#Mx35ZEPGH4SV7JK
Argumentation!
Claims
Justifications
Contrary Claims & Disagreements…
4 Dimensions of Expression
To extract well-formed knowledge bases of argument, we must first chart out the various dimensions of social media, to point the way towards the aspects that argumentation reconstruction will need to consider, so that we later can isolate these aspects.
Social media encompasses numerous genres, each with its own conversational style, which affects what sort of rhetoric and arguments may be made. One key feature is the extent to which a medium is used for broadcasts (e.g. monologues) versus conversations (e.g. dialogues). In each genre, a prototypical message or messages could be described, but these vary across genres due to social conventions and technical constraints. De Moor and Efimova compared rhetorical and argumentative aspects of listservs and blogs, identifying features such as the likelihood that messages receive responses, whether spaces are owned by communities or by a single individual, and the timeline for replies [5]. Important message characteristics include the typical and allowable message length (e.g. space limitations on microblogs) and whether messages may be continually refined by a group (such as in StackOverflow).
Metadata associated with a post (such as poster, timestamp, and subject line for listservs) and additional structure (such as pingbacks and links for blogs) can also be used for argumentation. For example, a user’s most recent post is generally taken to identify their current view, while relationships between messages can indicate a shared topic, and may be associated with agreement or disagreement.
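The "most recent post identifies the current view" heuristic can be sketched with nothing more than poster and timestamp metadata. The `Post` type, the `current_views` helper, and the sample thread below are hypothetical names and data introduced for illustration; the sketch assumes timestamps are reliable and comparable.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Post:
    poster: str          # who wrote the message
    timestamp: datetime  # when it was posted
    text: str            # message body


def current_views(posts):
    """Map each poster to their most recent post (on a tie, the first one seen wins)."""
    latest = {}
    for post in posts:
        seen = latest.get(post.poster)
        if seen is None or post.timestamp > seen.timestamp:
            latest[post.poster] = post
    return latest


# Invented thread, for illustration only.
thread = [
    Post("alice", datetime(2013, 2, 1, 9, 0), "Keep: the band is notable."),
    Post("bob", datetime(2013, 2, 1, 9, 30), "Delete: no independent sources."),
    Post("alice", datetime(2013, 2, 1, 10, 15), "Changed my mind, delete."),
]

views = current_views(thread)
for poster, post in sorted(views.items()):
    print(poster, "->", post.text)
```

The heuristic is deliberately crude: a later post that merely asks a question would overwrite an earlier stated position, so in practice it would need to be combined with some check that the post actually expresses a view.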
Users are different, and properties of users are factors that contribute not only to the substance of a user’s comment but also to how that user reacts to the comments of others. These include demographic information such as the user’s age, gender, location, education, and so on. In a specific domain, additional user expectations or constraints could also be added.
Different users are persuaded by different kinds of information. To solve people’s problems from knowledge bases, particularly when those knowledge bases contain inconsistencies, it would therefore be useful to understand the purposes and goals that people have.
The goals of a particular dialogue therefore also matter. These have been considered in argumentation theory: Walton & Krabbe have categorized dialogue types based on the initial situation, the participant’s goal, and the goal of the dialogue [11]. The types they distinguish are inquiry, discovery, information seeking, deliberation, persuasion, negotiation and eristic. These are abstractions: any single conversation moves through various dialogue types. For example, a deliberation may be paused in order to delve into information seeking, then resumed once the needed information has been obtained.
Higher-level context would also be useful: different amounts of certainty are needed for different purposes. Some of that is inherent in a task: reasoning about what kind of medical treatment to seek for a long-term illness, based on PatientsLikeMe, requires more certainty than deciding what to buy based on product reviews.
Informal language is typical of social media. Generic language processing issues, such as misspellings and abbreviations, slang, language mixing, emoticons, and unusual use of punctuation, must be resolved in order to enable text mining (and subsequently argumentation mining) on informal language. Indirect forms of speech, such as sarcasm, irony, and innuendo, are also common. A step-by-step approach, focusing first on what can be handled, is necessary.
Another aspect of the informality is that much information is left implicit. Therefore, inferring from context is essential. Elliptical statements require us to infer common world knowledge, and connecting to existing knowledge bases will be needed.
We apply sentiment techniques to provide candidates for argumentation mining and especially to identify textual markers of subjectivity and objectivity. The arguments that are made about or against purported facts have a different form from the arguments that are made about opinions. Arguments about objective statements provide the reasons for believing a purported fact or how certain it is. Subjective arguments might indicate, for instance, which users would benefit from a service or product (those similar to the poster). Another area where subjective arguments may appear is discussions of the trust and credibility of the people making the arguments.
Summarising my research to date
Skills, knowledge and track record I can bring to
Understand & improve online discussions
Enabling reuse of arguments and opinions from online social disputes
“only 0.6 percent of those whose articles are met with deletion stayed editing, compared to 4.4 percent of the users whose articles remained”, http://enwp.org/Wikipedia:Wikipedia_Signpost/2011-04-04/Editor_retention
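The “7X less likely to stay” figure on the deletion slide is consistent with the two percentages quoted here, as a quick check shows. The variable names are mine; the numbers are the ones in the quotation.

```python
# Share of article creators who kept editing, in percent, as quoted above.
stayed_if_deleted = 0.6
stayed_if_kept = 4.4

# How many times more likely a creator is to stay when the article is kept.
ratio = stayed_if_kept / stayed_if_deleted
print(f"{ratio:.1f}x")
```

The ratio comes out a little above seven, which the slide rounds to 7X.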
Mentoring in discussions is effective: Article creators who receive mentoring seem to
o Make more edits to the article
o Continue editing
o Increase understanding of policy
Experts argue from precedent
Novices: values, analogy, cause to effect
Jodi Schneider, Krystian Samp, Alexandre Passant, and Stefan Decker. Arguments about Deletion: How Experience Improves the Acceptability of Arguments in Ad-hoc Online Task Groups. Computer-Supported Cooperative Work and Social Computing (CSCW 2013).
500 discussions per week about deleting borderline articles
20 novice participants used both systems
“The ability to navigate the comments made it a bit easier to filter my mind set and to come to a conclusion.”
“summarise and, at the same time, evaluate which factor should be considered determinant for the final decision”
Want to reuse social media data
o Growing amount of social media data
o Want to reuse data
o Need to analyse & structure it