Understanding and improving Wikipedia article discussion spaces SAC2011

How can we make Wikipedia Talk pages easier for readers, editors, and administrators to use? What kind of structure can be added?

Symposium on Applied Computing (SAC 2011) paper presentation slides from Taichung, Taiwan

Wikipedia’s article discussion spaces (“Talk pages”) form a large and growing proportion of the encyclopedia, used for collaboration and article improvement. So far there is no in-depth account of how article Talk pages are used, what is wrong with them, and how they can be improved. This paper reports on three contributions promoting the understanding and improvement of these spaces:
(1) Wikipedia editor interviews provide an increased understanding of readers’ and editors’ needs,
(2) a large-scale comparative content analysis adds to knowledge of what kinds of discussions and coordination occur on Talk pages,
(3) a prototype bookmarklet-based system, which we test in a formative user evaluation, integrates lightweight semantics.

Full paper at http://jodischneider.com/pubs/sac2011.pdf



  • “Only 10% of participants knew that Wikipedia has a policy against posting original research.” – Antin & Cheshire, CSCW 2010
  • Talk pages are LONG! Six Talk pages can yield over 100 printed pages [3], and an individual Talk page may yield 50 printed pages.
  • Trust & credibility layer: Golbeck, Computing with Social Trust, Springer 2008; Hartig, Querying Trust in RDF Data with tSPARQL, ESWC 2009; W3C Provenance Incubator Group Final Report
  • 2 Wikipedia editors, 2 Wikipedia administrators
  • 20 pages per category
  • Partial example of RDFa markup
  • We can also retrieve posts by novices, or posts which have no replies. Or both!
    • SELECT ?comment ?reply ?user ?name
    • WHERE
    • {
    • ?comment a sioc:Post ;
    • sioc:has_creator ?user .
    • OPTIONAL { ?user sioc:name ?name . }
    • OPTIONAL { ?comment sioc:has_reply ?reply . }
    • FILTER (!BOUND(?name))
    • FILTER (!BOUND(?reply))
    • }
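
The OPTIONAL / FILTER (!BOUND(...)) pattern in the note above selects posts for which an optional value is missing. A minimal Python sketch of the same selection, assuming the data is held as plain (subject, predicate, object) tuples rather than in a SPARQL endpoint; the sample triples and the helper names `objects` and `unnamed_unanswered_posts` are hypothetical and not part of the slides.

```python
# Hypothetical sample data: identifiers are illustrative only.
TRIPLES = [
    ("#post1", "rdf:type", "sioc:Post"),
    ("#post1", "sioc:has_creator", "#userA"),
    ("#userA", "sioc:name", "Alice"),
    ("#post1", "sioc:has_reply", "#post2"),
    ("#post2", "rdf:type", "sioc:Post"),
    ("#post2", "sioc:has_creator", "#userB"),  # creator has no sioc:name, post has no reply
    ("#post3", "rdf:type", "sioc:Post"),
    ("#post3", "sioc:has_creator", "#userA"),  # creator is named, post has no reply
]


def objects(triples, subject, predicate):
    """Return every object o such that (subject, predicate, o) is in triples."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def unnamed_unanswered_posts(triples):
    """Posts whose creator has no sioc:name AND which have no sioc:has_reply."""
    results = []
    for s, p, o in triples:
        if p == "rdf:type" and o == "sioc:Post":
            has_reply = bool(objects(triples, s, "sioc:has_reply"))
            for user in objects(triples, s, "sioc:has_creator"):
                has_name = bool(objects(triples, user, "sioc:name"))
                if not has_name and not has_reply:
                    results.append((s, user))
    return results
```

Dropping one of the two conditions gives the "novices only" or "no replies only" variants.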

Presentation Transcript

  • Understanding and Improving Wikipedia Article Discussion Spaces. Jodi Schneider, Alexandre Passant, John Breslin. ACM SAC, 2011-03-24, Taichung, Taiwan
  • Wikipedia editors are leaving faster than they can be replaced (Felipe Ortega, via http://www.businessinsider.com/chart-of-the-day-wikipedia-editors-2009-11)
  • How do we turn readers into editors?
    • Ensure people know they can edit
    • Make editing easier
    • Help people learn how things work by reading discussions!
      • “Reading Talk pages – the behind-the-scenes discussions about Wikipedia articles – signals a transition towards more active forms of participation.” – Antin & Cheshire, CSCW 2010
    • Make more edits “stick”
      • Understand what kinds of contributions are accepted
        • Provide support for creating good arguments
        • Avoid need for reverts
  • Wikipedia Discussion Space: “Talk page”
  • Talk pages need semantics
    • Lots of conversations
      • Viégas: “the fastest growing areas of Wikipedia are devoted to coordination and organization”
    • When are people agreeing/disagreeing?
      • Not well understood!
    • Very little study of Talk pages
      • Largest study: 60 pages, 2 types. Discovered: Featured Articles have 10x discussion!
    • Immense variation between pages
    Data from Stvilia et al.
  • Social Semantic Web
  • My Research Questions
    • What do Wikipedians do on Talk pages?
    • What kind of arguments happen on Talk pages?
    • Can we add structure to make pages “fit” how editors and readers use them?
  • Three ways of understanding Talk pages
    • Interviews with editors and administrators
      • What do Wikipedians do on Talk pages?
    • Hand content analysis of 100 Talk pages
      • What kind of arguments happen on Talk pages?
    • Developing & using a semantic model
      • Can we add structure to make pages “fit” how editors and readers use them?
  • 1. Interviews
    • Administrators
      • Frequently monitor conversations
      • Know + meet co-editors
      • Make community-related edits such as adding infoboxes
      • More likely to move/rename articles and Talk pages
    • Editors
      • Mostly read Talk pages
      • “Get the scoop”: what’s controversial? More details?
      • More likely to read older conversations
      • May learn policy and procedures
  • 2. Content Analysis
    • 100 Talk pages
    • 5 categories of pages
      • Most editors (of the article)
      • Most visits (to the article)
      • Controversial
      • Featured Articles
      • Random
    • 15 classifications
  • Classification: Example
    • Reference to...
      • Sources outside the wiki: “... Not sure where to put it but I’ll leave it here as somebody might find it useful”
      • Reverts, removed material, or controversial edits: “I noticed some people edit the page into what it will be in 10 minutes but someone is reverting it...just let it be”
      • Edits the discussant made: “Added the About.com review since the review was part of the reception section.”
    • Requests for...
      • Help with another article, portal, etc.: “This is just to invite attention to the page Facebook statistics just created…”
  • The 15 Classifications
    • References to…
      • Vandalism
      • Guidelines and policies
      • Sources outside Wikipedia
      • Reverts, removed material, or controversial edits
      • Edits the discussant made
      • Internal Wikipedia resources
    • Requests for…
      • Editing coordination
      • Information
      • Help with another article
      • Peer review
    • Etc.
      • Off-topic remarks
      • Polls
      • Information boxes
      • Images
      • Other
  • 3a. Developing a content-based semantic model
    • Represent article structure
      • Reuse existing ontologies (FOAF, SIOC)
    • Represent content (based on the content analysis)
      • Winnow the 15 classifications: relevance & plausibility
        • “Relevant” for querying and retrieving information
        • “Plausible” that a person would mark their own comment
          • “Off topic”
          • “Request for help”
  • Represent thread structure: sioc:Thread, sioc:Post
  • Express relationships: sioc:links_to http://en.wikipedia.org/wiki/Template:WikiProject_Computing
  • Reuse SIOC & FOAF for structure
    • Article
      • sioct:WikiArticle
    • Link the article to the Talk page
      • sioc:has_discussion
    • Discussion threads
      • sioc:Thread
    • Individual comments
      • sioc:Post
    • Commenter
      • foaf:Person / sioc:UserAccount
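
The structural mapping on this slide can be sketched as a small triple generator. This is a minimal Python sketch, not the authors' implementation: the function name `talk_page_triples`, the input shape, and the use of `sioc:has_parent` to attach a thread to its Talk page are assumptions made for illustration; the remaining terms are the ones listed on the slide or used in the slides' RDFa example.

```python
def talk_page_triples(article, talk_page, threads):
    """Describe an article and its Talk page as (subject, predicate, object) tuples.

    threads: dict mapping a thread identifier to a list of
             (post identifier, user account identifier) pairs.
    """
    triples = [
        (article, "rdf:type", "sioct:WikiArticle"),
        (article, "sioc:has_discussion", talk_page),
    ]
    for thread, posts in threads.items():
        triples.append((thread, "rdf:type", "sioc:Thread"))
        # Assumption: the slides do not say how a thread is attached to its page.
        triples.append((thread, "sioc:has_parent", talk_page))
        for post, user in posts:
            triples.append((post, "rdf:type", "sioc:Post"))
            triples.append((post, "sioc:has_container", thread))
            triples.append((post, "sioc:has_creator", user))
            triples.append((user, "rdf:type", "sioc:UserAccount"))
    return triples


# Hypothetical usage with made-up identifiers.
example = talk_page_triples(
    "http://en.wikipedia.org/wiki/Example",
    "http://en.wikipedia.org/wiki/Talk:Example",
    {"#Thread1": [("#Thread1Post1", "http://en.wikipedia.org/wiki/User:Example")]},
)
```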
  • Our SIOC WikiTalk ontology
    • WikiDiscussionItem
      • ReferenceItem
        • ReferenceToEdit
        • ReferenceToGuidelinesOrPolicies
        • ReferenceToInternalResources
        • ReferenceToRevertsOrControversialOrRemovedMaterial
        • ReferenceToVandalism
      • RequestItem
        • RequestEditingCoordination
        • RequestHelpElsewhere
        • RequestInfo
        • RequestPeer-review
    • http://rdfs.org/sioc/wikitalk
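
One way to work with this hierarchy is as a child-to-parent table. A minimal Python sketch, assuming the bullet nesting on the slide denotes a subclass relationship; the names `SUBCLASS_OF`, `ancestors`, and `is_a` are hypothetical helpers, and the class names are copied from the slide as written.

```python
# Child -> parent, read off the slide's bullet nesting (an assumption).
SUBCLASS_OF = {
    "ReferenceItem": "WikiDiscussionItem",
    "RequestItem": "WikiDiscussionItem",
    "ReferenceToEdit": "ReferenceItem",
    "ReferenceToGuidelinesOrPolicies": "ReferenceItem",
    "ReferenceToInternalResources": "ReferenceItem",
    "ReferenceToRevertsOrControversialOrRemovedMaterial": "ReferenceItem",
    "ReferenceToVandalism": "ReferenceItem",
    "RequestEditingCoordination": "RequestItem",
    "RequestHelpElsewhere": "RequestItem",
    "RequestInfo": "RequestItem",
    "RequestPeer-review": "RequestItem",
}


def ancestors(cls):
    """Return the chain of parent classes, nearest first."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def is_a(cls, parent):
    """True if cls is parent or one of its descendants."""
    return cls == parent or parent in ancestors(cls)
```

With such a table, a query for all "RequestItem" comments can also match posts marked with the more specific classes.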
  • 3b. Using our semantic model
    • Mark up Wikipedia Talk pages by hand with RDFa
    • Query to find comments meeting specified criteria
      • JavaScript and SPARQL
    • Formative evaluations
      • Browsing talk pages, with & without highlighting, to identify particular types of comments
  • Partial example of RDFa markup
    • <p about="#Thread2Post1" typeof="siocwt:RequestEditingCoordination"
    • rel="sioc:has_container" href="#Rule_Interchange_Format"></p>
    • <div about="#Thread2Post1" rel="sioc:has_creator"
    • href="http://en.wikipedia.org/wiki/User:Nloth">
    • <div about="#Thread2Post1" rel="sioc:last_activity_date"
    • content="20091116T0432-0000" datatype="xsd:dateTime">
    • <p>I'd support having <a href="http://en.wikipedia.org/wiki/Rule_Interchange_Format">Rule Interchange Format</a> merged into this article … <a href="http://en.wikipedia.org/wiki/User:Nloth" title="User:Nloth">Nloth</a> (<a href="http://en.wikipedia.org/wiki/User_talk:Nloth" title="User talk:Nloth">talk</a>) 04:32, 16 November 2009 (UTC) </p></div> </div>
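
The attributes in the markup above can be read back as triples. The following is a minimal Python sketch using the standard library's `html.parser`; it only looks at the `about`, `typeof`, `rel`, `href`, and `content` attributes that appear in the example and is not a conforming RDFa processor. The class name `SimpleRDFaExtractor`, the function `extract_triples`, and the shortened `MARKUP` sample are assumptions for illustration.

```python
from html.parser import HTMLParser


class SimpleRDFaExtractor(HTMLParser):
    """Collect (subject, predicate, object) tuples from a few RDFa attributes."""

    def __init__(self):
        super().__init__()
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        subject = attributes.get("about")
        if subject is None:
            return  # this sketch ignores elements without an explicit subject
        if "typeof" in attributes:
            self.triples.append((subject, "rdf:type", attributes["typeof"]))
        if "rel" in attributes:
            # The example puts the object in href (links) or content (literals).
            value = attributes.get("href", attributes.get("content"))
            if value is not None:
                self.triples.append((subject, attributes["rel"], value))


def extract_triples(html):
    parser = SimpleRDFaExtractor()
    parser.feed(html)
    parser.close()
    return parser.triples


# Shortened version of the slide's example markup.
MARKUP = """
<p about="#Thread2Post1" typeof="siocwt:RequestEditingCoordination"
   rel="sioc:has_container" href="#Rule_Interchange_Format"></p>
<div about="#Thread2Post1" rel="sioc:has_creator"
     href="http://en.wikipedia.org/wiki/User:Nloth"></div>
"""
```

Filtering the resulting tuples on the `rdf:type` value is also the selection step a highlighting tool would need before styling matching posts.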
  • Using the markup: JavaScript bookmarklets
    • Highlight posts based on the ontology class – e.g. ReferenceToEdit
  • Retrieve RequestInfo posts in WikiProject Computing
    • We retrieve the “RequestInfo” posts with SPARQL:
    • SELECT ?comment ?page
    • WHERE
    • {
    • ?page sioc:links_to <http://en.wikipedia.org/wiki/Template:WikiProject_Computing> .
    • ?comment sioc:has_container ?page ;
    • a sioc:Post ; a siocwt:RequestInfo .
    • }
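
The graph pattern in this query can be traced step by step in plain Python. This is a minimal sketch, assuming the data is available as (subject, predicate, object) tuples instead of being queried through a SPARQL engine; the sample triples and the function name `request_info_posts` are hypothetical.

```python
PROJECT = "http://en.wikipedia.org/wiki/Template:WikiProject_Computing"

# Hypothetical sample data: page and post identifiers are illustrative only.
TRIPLES = [
    ("Talk:Semantic_Web", "sioc:links_to", PROJECT),
    ("#post1", "rdf:type", "sioc:Post"),
    ("#post1", "rdf:type", "siocwt:RequestInfo"),
    ("#post1", "sioc:has_container", "Talk:Semantic_Web"),
    ("#post2", "rdf:type", "sioc:Post"),
    ("#post2", "rdf:type", "siocwt:ReferenceToEdit"),
    ("#post2", "sioc:has_container", "Talk:Semantic_Web"),
    ("#post3", "rdf:type", "sioc:Post"),
    ("#post3", "rdf:type", "siocwt:RequestInfo"),
    ("#post3", "sioc:has_container", "Talk:Other_Page"),
]


def request_info_posts(triples, project=PROJECT):
    """Return (comment, page) pairs matching the query's three conditions."""
    facts = set(triples)
    # ?page sioc:links_to <project>
    pages = {s for s, p, o in facts if p == "sioc:links_to" and o == project}
    results = []
    for s, p, o in sorted(facts):
        # ?comment sioc:has_container ?page
        if p == "sioc:has_container" and o in pages:
            # ?comment a sioc:Post ; a siocwt:RequestInfo
            if (s, "rdf:type", "sioc:Post") in facts and (
                s, "rdf:type", "siocwt:RequestInfo"
            ) in facts:
                results.append((s, o))
    return results
```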
  • Summary
    • We can increase the effectiveness of Wikipedia Talk pages by understanding how they are used
    • We add semantic structure to Wikipedia Talk pages which can be used to extract socially useful info
    • Social Semantic Web expertise can benefit Wikipedia
  • Thank You!
    • Questions & Comments?
    • Contact:
    • [email_address]
    • Thanks to SAC-STAP for travel support and to Science Foundation Ireland for Ph.D. funding Grant No. SFI/09/CE/I1380 (Líon2)!
  • Our Wikipedia-Related Research
    • “Understanding and Improving Wikipedia Article Discussion Spaces.” In SAC 2011 (Web Track), Taichung, Taiwan, March 21-25, 2011.
    • “Enhancing MediaWiki Talk pages with Semantics for Better Coordination - A Proposal.” In The Fifth Workshop on Semantic Wikis: Linking Data and People Workshop at 7th Extended Semantic Web Conference (ESWC), Crete, Greece, May 31, 2010.
    • “A Content Analysis: How Wikipedia Talk Pages Are Used.” In WebSci2010, Web Science Conference. Raleigh, NC, April 26 & 27 2010.
  • References
    • Antin, J., & Cheshire, C. (2010). Readers are not free-riders: Reading as a form of participation on Wikipedia. CSCW 2010. doi: 10.1145/1718918.1718942
    • Stvilia, Twidale, Smith & Gasser, "Information Quality Work Organization in Wikipedia," JASIST 2008. doi: 10.1002/asi.2081
    • Viégas, Wattenberg, Kriss & Ham, "Talk Before You Type: Coordination in Wikipedia," HICSS 2007. doi: 10.1109/HICSS.2007.511
  • Further image credits
    • Felipe Ortega’s dissertation research
    • Wikipedia logo
    • Talk page screenshots from http://en.wikipedia.org/wiki/Talk:{articlename}