CHI2007 talk on Conflicts in Wikipedia

Aniket Kittur, Bongwon Suh, Bryan Pendleton, Ed H. Chi.

He Says, She Says: Conflict and Coordination in Wikipedia.

In Proc. of ACM Conference on Human Factors in Computing Systems (CHI2007), pp. 453--462, April 2007. ACM Press. San Jose, CA.

http://www-users.cs.umn.edu/~echi/papers/2007-CHI/2007-Wikipedia-coordination-PARC-CHI2007.pdf

  • Hope you guys enjoy this. Let us know if anyone has questions. We have been doing more research on Wikipedia analysis. Check out our blog at http://asc-parc.blogspot.com
  • Thank you. Today I’m going to be talking about conflict and coordination in Wikipedia. This is joint work with... Most everyone knows that Wikipedia is an online encyclopedia that anyone can edit. But as I was putting this talk together I thought to myself “how can I describe what makes Wikipedia so special?” And luckily I found this video clip of Steve Carell from the TV show The Office describing it in a much more... interesting way than I possibly could.

Presentation Transcript

  • He Says, She Says: Conflict and Coordination in Wikipedia
    • Aniket Kittur, Bongwon Suh, Bryan Pendleton, Ed Chi
    • UCLA; Augmented Social Cognition Group, Palo Alto Research Center
  • What is Wikipedia?
    • “Wikipedia is the best thing ever. Anyone in the world can write anything they want about any subject, so you know you’re getting the best possible information.”
    - Steve Carell, The Office
  • Spreading conflict
  • Policy and procedure
    • “The degree of success that one meets in dealing with conflicts... often depends on the efficiency with which one can quote policy and precedent.”
    - Wikipedia admin (survey data)
  • Collaborative work beneath the surface
    • Visitors only look at article pages
    • But much of Wikipedia is made up of other pages
      • Conflict resolution, coordination, policies and procedures
  • Characterizing coordination and conflict
  • Exponential growth
  • Costs of growth
    • Increase in conflict and coordination costs
      • Software development (Boehm, 1981; Brooks, 1975)
      • MUDs/MOOs (Curtis, 1992; Dibbell, 1993)
      • Mailing lists (Sproull & Kiesler, 1991)
    • How has growth affected Wikipedia?
      • Millions of new users and articles
  • Infrastructure
    • Analyze entire history of Wikipedia
      • Every edit to every article
    • Large amount of data
      • 4+ million pages
      • 58+ million revisions
      • 800+ GB
      • as of June 2006
    • Distributed processing
      • Hadoop distributed filesystem
      • Map/reduce to process data in parallel
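The map/reduce step above can be illustrated with a small in-process sketch. It is not Hadoop code and not the authors' pipeline: the revision records, the `page_type` title rule, and the `run_mapreduce` helper are all hypothetical, chosen only to show how a mapper emits one key per edit and a reducer sums the counts.

```python
from collections import defaultdict

# Hypothetical revision records: (page_title, editor, year)
REVISIONS = [
    ("Dokdo", "alice", 2005),
    ("Talk:Dokdo", "bob", 2005),
    ("User talk:Alice", "bob", 2006),
    ("Wikipedia:Requests for comment", "carol", 2006),
    ("Dokdo", "carol", 2006),
]

# Simplified, assumed mapping from title prefix to page type
PREFIX_TO_TYPE = {
    "Talk": "article talk",
    "User": "user",
    "User talk": "user talk",
    "Wikipedia": "procedure",
}

def page_type(title):
    """Classify a page by its title prefix; unknown prefixes count as articles."""
    prefix, sep, _rest = title.partition(":")
    if not sep:
        return "article"
    return PREFIX_TO_TYPE.get(prefix, "article")

def map_revision(rev):
    """Map step: emit ((year, page type), 1) for one revision."""
    title, _editor, year = rev
    yield (year, page_type(title)), 1

def reduce_counts(key, values):
    """Reduce step: sum the counts collected for one key."""
    return key, sum(values)

def run_mapreduce(records, mapper, reducer):
    """Group mapper output by key, then apply the reducer to each group."""
    groups = defaultdict(list)
    for rec in records:
        for key, value in mapper(rec):
            groups[key].append(value)
    return dict(reducer(k, vs) for k, vs in sorted(groups.items()))

counts = run_mapreduce(REVISIONS, map_revision, reduce_counts)
```

In a real cluster the grouping would be done by the framework across machines; here it is a single dictionary so the data flow stays visible.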
  • Types of work
    • Direct work: immediately consumable (article)
    • Indirect work: coordination, conflict (talk, user, procedure)
    • Maintenance work: reverts, vandalism
  • Less direct work
    • Decrease in proportion of edits to article page
    70%
  • More indirect work
    • Increase in proportion of edits to user talk
    8%
  • More indirect work
    • Increase in proportion of edits to user talk
    • Increase in proportion of edits to procedure
    11%
  • More maintenance work
    • Increase in proportion of edits that are reverts
    7%
  • More wasted work
    • Increase in proportion of edits that are reverts
    • Increase in proportion of edits reverting vandalism
    1-2%
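The slides do not say how reverts were detected. A minimal sketch, assuming a revision counts as a revert when its full text is identical to some earlier revision other than its immediate predecessor, could compare content hashes. The function name and the sample history are hypothetical.

```python
import hashlib

def find_reverts(texts):
    """Return indices of revisions that restore an earlier version of the page.

    Assumption: a revert is an exact match with some earlier revision,
    excluding a repeat of the immediately preceding one (a null edit).
    """
    seen = set()
    previous = None
    reverts = []
    for index, text in enumerate(texts):
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in seen and digest != previous:
            reverts.append(index)
        seen.add(digest)
        previous = digest
    return reverts

# Hypothetical page history: revision 2 is vandalism, revision 3 undoes it
HISTORY = ["A", "A B", "spam", "A B", "A B C", "A B C"]
revert_indices = find_reverts(HISTORY)
revert_share = len(revert_indices) / len(HISTORY)
```

Hashing keeps the comparison cheap on long histories because only digests, not full texts, are held in memory.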
  • Global level
    • Conflict and coordination costs are growing
      • Less direct work (articles)
      • More indirect work (article talk, user, procedure)
      • More maintenance work (reverts, vandalism)
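The global trends above are shares of total edits. As a rough sketch, the direct and indirect shares for a period can be computed from per-page-type edit counts. The counts below are invented for illustration and are not the talk's data; maintenance work is left out because it is defined by reverts rather than by page type.

```python
# Hypothetical edit counts by page type for two periods (not the talk's data)
EDITS = {
    "early": {"article": 80, "article talk": 10, "user": 6, "procedure": 4},
    "late": {"article": 60, "article talk": 20, "user": 12, "procedure": 8},
}

# Page types treated as indirect work, following the slide's grouping
INDIRECT = {"article talk", "user", "procedure"}

def work_shares(by_type):
    """Return the fraction of edits that are direct and indirect work."""
    total = sum(by_type.values())
    if total == 0:
        return {"direct": 0.0, "indirect": 0.0}
    direct = by_type.get("article", 0) / total
    indirect = sum(n for kind, n in by_type.items() if kind in INDIRECT) / total
    return {"direct": direct, "indirect": indirect}

shares = {period: work_shares(counts) for period, counts in EDITS.items()}
```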
  • Characterizing coordination and conflict
  • Conflict at the article level
    • What defines conflict in articles?
    • Build a characterization model of article conflict
      • Identify page features and metrics associated with conflict
      • Automatically identify high-conflict articles
  • Page metrics
    • Chose metrics for identifying conflict in articles
      • Easily computable, scalable
    Metric type                        Page type
    Revisions (#)                      Article, talk, article/talk
    Page length                        Article, talk, article/talk
    Unique editors                     Article, talk, article/talk
    Unique editors / revisions         Article, talk
    Links from other articles          Article, talk
    Links to other articles            Article, talk
    Anonymous edits (#, %)             Article, talk
    Administrator edits (#, %)         Article, talk
    Minor edits (#, %)                 Article, talk
    Reverts (#, by unique editors)     Article
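Several of these metrics are simple counts over a page's revision list. The sketch below shows one way to compute a few of them; the `Revision` record and its field names are assumptions made for the example, not the authors' data model.

```python
from dataclasses import dataclass

@dataclass
class Revision:
    """Hypothetical revision record with only the flags needed here."""
    editor: str
    minor: bool = False
    anonymous: bool = False
    admin: bool = False

def page_metrics(revisions):
    """Compute count and share metrics for one page's revisions."""
    total = len(revisions)

    def count_and_share(flag):
        hits = sum(1 for rev in revisions if getattr(rev, flag))
        return hits, (hits / total if total else 0.0)

    minor_n, minor_pct = count_and_share("minor")
    anon_n, anon_pct = count_and_share("anonymous")
    admin_n, admin_pct = count_and_share("admin")
    editors = {rev.editor for rev in revisions}
    return {
        "revisions": total,
        "unique_editors": len(editors),
        "editors_per_revision": len(editors) / total if total else 0.0,
        "minor_edits": minor_n,
        "minor_pct": minor_pct,
        "anonymous_edits": anon_n,
        "anonymous_pct": anon_pct,
        "admin_edits": admin_n,
        "admin_pct": admin_pct,
    }

# Hypothetical talk-page history
talk_history = [
    Revision("alice"),
    Revision("bob", minor=True),
    Revision("10.0.0.1", anonymous=True),
    Revision("alice", admin=True),
]
metrics = page_metrics(talk_history)
```

Running the same function on the article history and the talk history gives the "article" and "talk" variants of each metric.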
  • Defining conflict
    • Operational definition for conflict
    • Revisions tagged controversial
    • Conflict revision count
  • Machine learning
    • Predict conflict from page metrics
      • Training set of “controversial” pages
      • Support vector machine regression predicting # controversial revisions (SMOreg; Smola & Schölkopf, 1998)
    • Not just conflict/no conflict, but how much conflict
  • Performance: Cross-validation
    • 5x cross-validation, R² = 0.897
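To make the evaluation concrete, here is a minimal sketch of k-fold cross-validation scored with R². It swaps in a one-feature least-squares line for the SVM regression used in the talk, and the data are synthetic, so it shows only the evaluation procedure, not the reported result.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def cross_validated_r2(xs, ys, k=5):
    """Predict each point from a model trained on the other folds, then score."""
    predictions = [0.0] * len(xs)
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        test = [i for i in range(len(xs)) if i % k == fold]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in test:
            predictions[i] = slope * xs[i] + intercept
    return r_squared(ys, predictions)

# Synthetic data: a linear trend with a small alternating disturbance
xs = list(range(20))
ys = [3 * x + 2 + (1 if x % 2 else -1) for x in xs]
score = cross_validated_r2(xs, ys, k=5)
```

Scoring on held-out folds matters here because a model can fit its own training pages well and still generalize poorly to new ones.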
  • Determinants of conflict
    Highly weighted metrics of conflict model (↑ more conflict, ↓ less conflict):
    • ↑ Revisions (talk)
    • ↑ Minor edits (talk)
    • ↓ Unique editors (talk)
    • ↑ Revisions (article)
    • ↓ Unique editors (article)
    • ↑ Anonymous edits (talk)
    • ↓ Anonymous edits (article)
  • Identifying untagged articles
    • Detect conflicts for unlabeled articles
      • Majority of articles have never been conflict tagged
    • Testing model generalization
      • Applied model to untagged articles
      • Sample rated by expert Wikipedians
    • Significant positive correlation with predicted scores
      • By rank correlation, p < 0.013 (Spearman’s rho)
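Spearman's rho, used above to compare predicted scores with expert ratings, is the Pearson correlation of the two rank orderings. The following sketch computes it with average ranks for ties; the function names are hypothetical and the sample values are invented, not the study's ratings.

```python
import math

def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Invented example: model scores vs. expert ratings for five articles
predicted = [1, 2, 3, 4, 5]
expert = [2, 1, 4, 3, 5]
rho = spearman_rho(predicted, expert)
```

A rank-based measure suits this comparison because expert ratings are ordinal and need not be on the same scale as the model's predicted revision counts.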
  • Characterizing coordination and conflict
  • Conflict at the user level
    • How can we identify conflict between users?
    • Reverts as a proxy for user conflict
    • Revert patterns between users
    • Force directed layout to cluster users
      • Group similar viewpoints
      • Find conflicts between groups
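The idea of clustering users from revert patterns can be sketched with a simplified spring model. This is not the layout algorithm used in the talk: it assumes every pair of users is joined by a spring whose rest length is long if they have reverted each other and short otherwise. The revert log, the parameter values, and the function names are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical revert log: (reverting_user, reverted_user)
REVERTS = [
    ("a1", "b1"), ("b1", "a1"), ("a2", "b1"),
    ("b2", "a1"), ("a2", "b2"), ("b2", "a2"),
]

def revert_counts(log):
    """Count reverts per unordered pair of distinct users."""
    return Counter(frozenset(pair) for pair in log if pair[0] != pair[1])

def layout(users, counts, near=0.5, far=3.0, step=0.05, iterations=500):
    """Place users in 2D so that reverting pairs drift apart and others together."""
    n = len(users)
    # Deterministic start: users evenly spaced on the unit circle
    pos = {
        user: [math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n)]
        for i, user in enumerate(users)
    }
    for _ in range(iterations):
        move = {user: [0.0, 0.0] for user in users}
        for u, v in combinations(users, 2):
            dx = pos[v][0] - pos[u][0]
            dy = pos[v][1] - pos[u][1]
            dist = math.hypot(dx, dy) or 1e-9
            target = far if counts[frozenset((u, v))] else near
            # Positive pull draws the pair together, negative pushes it apart
            pull = step * (dist - target) / dist
            move[u][0] += pull * dx
            move[u][1] += pull * dy
            move[v][0] -= pull * dx
            move[v][1] -= pull * dy
        for user in users:
            pos[user][0] += move[user][0]
            pos[user][1] += move[user][1]
    return pos

def distance(pos, u, v):
    """Euclidean distance between two placed users."""
    return math.hypot(pos[u][0] - pos[v][0], pos[u][1] - pos[v][1])

users = sorted({user for pair in REVERTS for user in pair})
counts = revert_counts(REVERTS)
positions = layout(users, counts)
```

With this toy log, users who never revert each other end up close together, which is the visual cue for a shared viewpoint.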
  • Dokdo/Takeshima opinion groups
    • Group A
    • Group B
    • Group C
    • Group D
  • Terri Schiavo
    • Mediators
    • Sympathetic to parents
    • Sympathetic to husband
    • Anonymous (vandals/spammers)
  • Summary: Characterizing Wikipedia
    • Coordination costs and conflict are increasing
    • Global-level: Trend identification
      • Decrease in direct article work
      • Increase in indirect coordination work
      • Increase in maintenance work
    • Article-level: Prediction using Machine learning
      • Identify characteristics of article conflict
      • Detect conflict-heavy articles needing extra attention
    • User-level: User Conflict Visualization
      • Make sense of user conflicts and identify shared viewpoints
  • Future Work
    • Applied to many domains
      • Corporate memory (Socialtext)
      • Intelligence gathering (Intellipedia)
      • Scholarly research (Scholarpedia)
      • Collaborative problem solving (Lostpedia)
    • Application: Social Dashboard
      • Identify high conflict articles
      • Surface editing patterns to readers
      • Route attention to articles that need it most
  • Thank you!