A Content Analysis: How Wikipedia Talk Pages Are Used
Jodi Schneider, Alexandre Passant & John G. Breslin
Motivation

Wikipedia’s coordination costs (the number of Talk page edits for each article edit) have increased dramatically [1].

We are analyzing Talk pages to suggest how Semantic Web technologies (like structured annotations) could improve coordination.

[Figure: A typical discussion in a Wikipedia Talk page]

Content Analysis

We used 15 comment types; a comment could have multiple types. We started with Viégas’ 11 types [2]:
1. Requests for editing coordination
2. Requests for information
3. References to vandalism
4. References to guidelines/policies
5. References to internal resources
6. Off-topic remarks
7. Polls
8. Requests for peer review
9. Information boxes
10. Images
11. Other
We added 4 new types:
1. References to external sources
2. Discussing reverts/removed material/controversial edits
3. Reference to edits made oneself
4. Recruiting help for another article/portal

Semantic Web Opportunities

We propose structured, meaningful annotations: the type of comment. Comment types could enable new ways to browse Talk pages, using Semantic Web technologies. We could instantaneously gather and show all comments of a certain type.

We have created a lightweight ontology, based on SIOC, where classes in the ontology correspond to common comment types we identified in the content analysis [4]:
http://rdfs.org/sioc/wikitalk

Users would tick checkboxes to indicate a comment’s type(s).

A JavaScript plugin could then highlight only certain comment types, for instance all “References to external sources”. With SPARQL, we could show all “help requests” from a group of pages.
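
The idea of gathering all comments of one type across a group of pages can be illustrated with a minimal Python sketch. It is not the proposed ontology, plugin, or SPARQL query: the comment types are represented by the poster's labels as plain strings (the actual class names in the ontology are not given here), and the names Comment, annotate, and comments_of_type, as well as the sample pages and comment texts, are hypothetical and invented for illustration.

```python
from dataclasses import dataclass

# Labels of the 15 comment types listed on the poster. These are plain
# strings for illustration, not the class names of the actual ontology.
COMMENT_TYPES = frozenset({
    "Requests for editing coordination",
    "Requests for information",
    "References to vandalism",
    "References to guidelines/policies",
    "References to internal resources",
    "Off-topic remarks",
    "Polls",
    "Requests for peer review",
    "Information boxes",
    "Images",
    "Other",
    "References to external sources",
    "Discussing reverts/removed material/controversial edits",
    "Reference to edits made oneself",
    "Recruiting help for another article/portal",
})


@dataclass(frozen=True)
class Comment:
    """One Talk page comment plus the type(s) a user ticked for it."""
    page: str
    text: str
    types: frozenset


def annotate(page, text, *types):
    """Build a Comment, rejecting labels outside the 15 known types."""
    unknown = set(types) - COMMENT_TYPES
    if unknown:
        raise ValueError(f"unknown comment type(s): {sorted(unknown)}")
    return Comment(page=page, text=text, types=frozenset(types))


def comments_of_type(comments, comment_type, pages=None):
    """Return comments carrying comment_type, optionally limited to pages."""
    return [
        c for c in comments
        if comment_type in c.types and (pages is None or c.page in pages)
    ]


# Hypothetical sample data, invented for this sketch.
sample = [
    annotate("Talk:Example A", "Is there a source for the 1998 figure?",
             "Requests for information", "References to external sources"),
    annotate("Talk:Example A", "I reverted the last change; see policy.",
             "Discussing reverts/removed material/controversial edits",
             "References to guidelines/policies"),
    annotate("Talk:Example B", "This journal article covers the topic.",
             "References to external sources"),
]

# All comments of one type, across every page in the sample.
external = comments_of_type(sample, "References to external sources")
# The same query restricted to a chosen group of pages.
only_b = comments_of_type(sample, "References to external sources",
                          pages={"Talk:Example B"})
print([c.page for c in external])
```

Because a comment may carry several types, each comment stores a set of labels rather than a single one; the same comment can then appear in the results for more than one type.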




Method

We are examining 100 Talk pages, 20 from each of these categories:
1.  Articles with the most contributors
2.  Most-viewed articles
3.  Controversial articles
4.  Featured Articles
5.  Random sample
This will help us to identify the types of conversations and the variance between pages. Existing studies focus on 1 or 2 article types and use small sample sizes of 6 to 60 articles.

[Figure: Talk page postings by type.]
‘Coordination’ is the most common type of comment. Comment types depend on the page type. Discussions of ‘reverts/removed material/controversial edits’ are three times as likely on Talk pages of controversial articles. ‘Guidelines’ and ‘sources’ are commonly discussed. Info boxes are common in “most views” and “controversial” samples.

References

[1] B. Stvilia, M.B. Twidale, L.C. Smith, and L. Gasser, “Information Quality Work Organization in Wikipedia,” JASIST, vol. 59, 2008, pp. 983-1001.
[2] F.B. Viégas, M. Wattenberg, J. Kriss, and F.V. Ham, “Talk Before You Type: Coordination in Wikipedia,” HICSS 2007, pp. 78-87.
[3] J. Schneider, A. Passant, and J.G. Breslin, “A Content Analysis: How Wikipedia Talk Pages Are Used,” WebScience 2010, Raleigh, North Carolina.
[4] ibid, “Enhancing MediaWiki Talk pages with Semantics for Better Coordination - A Proposal,” The Fifth Workshop on Semantic Wikis: Linking Data and People at the 7th Extended Semantic Web Conference (ESWC), Crete, Greece: 2010.

Acknowledgements

The work presented in this paper has been funded by Science Foundation Ireland under Grant No. SFI/08/CE/I1380 (Líon-2).

A Content Analysis: How Wikipedia Talk Pages Are Used (WebSci2010 poster)
