"Visual analysis of controversy in user-generated encyclopedias": A critical summary


This is a critical summary of an article by Ulrik Brandes and Jürgen Lerner, published in Information Visualization, volume 7, issue 1.

Published in: Education, Technology

  1. Summarizing "Visual analysis of controversy in user-generated encyclopedias" by Brandes, U., & Lerner, J. (2008). Information Visualization, 7(1), 34-48. Diego Maranan, IAT 814, 22 October 2009
  2. About the article • In a nutshell: contributes to the understanding of authorship dynamics in collaboratively edited articles on controversial topics • 16 pages • 18 citations on Google Scholar • University of Konstanz, Germany • Social network analysis and graph layouts since 1999
  3. Article contents • Why this paper matters: 693 words • What's been done before: 635 words • Their unique contribution: 228 words • How they acquire the data: 307 words • How they parse and filter the data: 121 words • How they mine and represent the data: 3757 words • How they make the visualization interactive: 362 words • Examples: 724 words • Conclusion: 329 words
  4. Why this matters • Wikipedia: reliability depends on its "neutrality" • Neutrality depends on the dynamics of the author community – Who contributes? – What are the disputes? – What are the roles of the authors? • Roles – Agree or disagree – Regular or sporadic – Contribute a lot or a little – Revise or be revised ("reactionary" or "revolutionary")
  5. What's been done before • Wikipedia research – Disagreement based on the number of reverts (Kittur et al 2007) • NLP-based research on Web 2.0 applications – Analysis of polarizing factors in user opinions (aka people like The Da Vinci Code for different reasons than they hate it) (Chen et al 2006) – Identifying polarizing language and partitioning into opinion groups (Nigam and Hurst 2004) – Metrics for polarity + buzz + author dispersion (Glance et al 2005) • Non-NLP – Vocabulary used by opposing sides tends to be largely identical (Han & Kamber 2006) – Links carry less noisy information than text (Agrawal et al 2003) – Simpler • Visualization-related – Force-directed layout (equally spaced nodes) – Multidimensional scaling
  6. Finished product: "Who-revises-whom" network in 2D space
  7. Legend (property: visual encoding) • Opposition (to another node): distance from other node; thickness of edge connecting to other node • Involvement (number of edits made): node area; number and aggregate thickness of edges • Role (revisor vs. revisee): luminance of edge (in relation to a particular author); shape (in relation to all authors in general) • Variance in edit frequency: node brightness
  8. Ingredients: Wikipedia XML dump of revision history file. This visualization uses the following revision data: • Timestamp • Author name (or IP address) • A flag to denote whether a revision was a revert … that's it.
  9. How to get from XML dump to network visualization? Three major problems • Inferring who revises whom (data mining issue) • Obtaining a metric for conflict between authors (data mining issue) • Meaningfully mapping the authors into a 2D network based on conflict (visual representation issue) Remaining problems are relatively easy • Visually coding individual involvement and participation variance • Visually coding conflict • Presenting aggregate involvement (bar chart) • Choosing and designing interactive features
  10. Problem 1: Inferring who revises whom • You are probably revising the revision immediately previous to yours if – A short amount of time has passed since the latest revision – You revise several times in a row – Your revision is a revert • Example: Alice and Bob
  11. Problem 1: Inferring who revises whom. Therefore… • Assumption: authors promptly revise the revisions they are interested in • Does not detect revisions where the revisor is revising old content (Assumption: these are a minority compared to intensely conflictual revisions, which happen very quickly) • The probability that a revision by author u following a revision by author v can be interpreted as "u is revising v" is approximated by
  12. Problem 2: Obtaining a metric for conflict between authors. Agrawal et al (2003) • Quoting an author from a previous Usenet post creates a quotation link between authors (Link => relationship) • Quotation link implies disagreement in controversies ("You said such-and-such. Well, here's what I think.") • Limited applicability; "assumes a single topic per posting and poster is 'for' or 'against' that standpoint" (Nigam)
  13. Problem 2: Obtaining a metric for conflict between authors. Therefore… • Authors revise only what they don't agree with. (They don't revise to support a statement.) • A revision between authors is a "vote" for conflict between authors • The degree of conflict d_uv for authors u, v is given by
  14. Problem 3: Mapping authors to a 2D space. A celebration of matrix math. Symmetric adjacency matrix A (d_uv = d_vu). Let x_u and x_v be the x-coordinates of authors u and v respectively. Then, if u and v are in conflict and lie on opposite sides of the origin, x_u x_v d_uv is large and negative.
  15. Problem 3: Mapping authors to a 2D space. Solve for all x by minimizing the sum of x_u x_v d_uv over all pairs of authors. How? Find the smallest eigenvalue, λ_min, associated with A. The associated eigenvector has all the x-coordinates. Solve this: A x = λ_min x. Claim: this results in an optimized arrangement of authors along the x-axis. Do the same for the y-axis by taking the second smallest eigenvalue, λ'_min.
  16. Problem 3: Mapping authors to a 2D space. Results: Authors with the largest d_uv entry are furthest away from each other. Two unrelated bipolar conflicts are separated along the x and y axes. (They prove this for the general case.)
  17. Problem 3: Mapping authors to a 2D space. Scaling and tripolar conflicts. "Real arguments are not quite so polarized," say the authors. Degrees of tripolar conflict. Claim: scale the y-values by λ_min / λ'_min ... really?
  18. Problem 3: Mapping authors to a 2D space. [Figure: axes labeled "Conflict 1 axis" and "Conflict 2 axis"] • 1 set of opposing opinions. Ex: ripe vs unripe bananas • 2 sets of opposing opinions (conflict 1 is independent of conflict 2). Ex: ripe vs unripe bananas & green vs red apples • What does this mean? How do you represent 3-polar conflicts in a Cartesian presentation space, which inherently employs greater-than and less-than ordering? Ex: bananas vs apples vs oranges
  19. Problem 3: Mapping authors to a 2D space. The explanation about scaling makes more sense after normalization around an ellipse. Place all the points around an ellipse by "exploding" them from the center.
  20. Legend (property: visual encoding) • Opposition (to another node): distance from other node; thickness of edge connecting to other node • Involvement (number of edits made): node area; number and aggregate thickness of edges • Role (revisor vs. revisee): luminance of edge (in relation to a particular author); shape (in relation to all authors in general) • Variance in edit frequency: node brightness
  21. Filtering by time intervals and by number of edges shown
  22. Contributions • Extends multidimensional scaling by visualizing independent conflict(s) (2 bipolar conflicts) • Builds on the work by Agrawal and Kittur • Extends the work of Kittur et al to include non-revert-based disagreements Some limitations • Only a representation; ignores lesser conflicts (but highlights important ones) • Major assumptions about the nature of content based on limited data! • Very specific application Some criticisms • Question about the meaning of the 2D space ("Absolute value of x and y coordinate indicates involvement in the conflict") • Illustrations don't match the description in the text
  23. Question for you: How do you represent multipolar conflict in general in 2D? (Thanks.)
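
The eigenvector step on slides 14 and 15 can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`mat_vec`, `normalize`, `smallest_eigenpairs`) and the toy conflict matrix are hypothetical, and shifted power iteration with deflation stands in for whatever eigensolver the paper uses. The sketch minimizes the sum of x_u x_v d_uv over unit-length coordinate vectors, which amounts to taking the eigenvectors of the two smallest eigenvalues of the symmetric conflict matrix. The rescaling of the y-values mentioned on slide 17 is not applied.

```python
import math
import random


def mat_vec(a, x):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def normalize(x):
    """Scale x to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return [v / norm for v in x]


def smallest_eigenpairs(a, k=2, iters=1000, seed=0):
    """Approximate the k smallest eigenpairs of a symmetric matrix a.

    Assumption: power iteration on (shift*I - a) replaces a real eigensolver.
    """
    n = len(a)
    # Gershgorin bound: no eigenvalue of a exceeds the largest absolute row
    # sum, so the dominant eigenvector of (shift*I - a) belongs to the
    # smallest eigenvalue of a.
    shift = max(sum(abs(v) for v in row) for row in a)
    rng = random.Random(seed)
    found = []
    for _pair in range(k):
        x = normalize([rng.uniform(-1.0, 1.0) for _i in range(n)])
        for _step in range(iters):
            ax = mat_vec(a, x)
            y = [shift * xi - axi for xi, axi in zip(x, ax)]
            # Deflation: remove components along eigenvectors already found.
            for _value, prev in found:
                dot = sum(p * yi for p, yi in zip(prev, y))
                y = [yi - dot * p for yi, p in zip(y, prev)]
            x = normalize(y)
        # Rayleigh quotient x^T a x gives the eigenvalue for unit-length x.
        eigenvalue = sum(xi * axi for xi, axi in zip(x, mat_vec(a, x)))
        found.append((eigenvalue, x))
    return found


# Hypothetical toy data: authors 0 and 1 are in strong conflict (d = 3),
# authors 2 and 3 are in a separate, weaker conflict (d = 2).
conflict = [
    [0.0, 3.0, 0.0, 0.0],
    [3.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 2.0],
    [0.0, 0.0, 2.0, 0.0],
]

(lam_x, xs), (lam_y, ys) = smallest_eigenpairs(conflict)
layout = list(zip(xs, ys))  # one (x, y) position per author
for author, (x, y) in enumerate(layout):
    print(f"author {author}: x={x:+.3f}, y={y:+.3f}")
```

For this toy matrix the two smallest eigenvalues are -3 and -2, so authors 0 and 1 land on opposite sides of the x-axis while authors 2 and 3 separate along the y-axis, matching the "two unrelated bipolar conflicts" case on slide 16. The sign of each eigenvector is arbitrary, so the layout may come out mirrored.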