It’s a Network,Not an Encyclopedia
Upcoming SlideShare
Loading in...5
×
 

It’s a Network, Not an Encyclopedia

on

  • 1,323 views

Kane, G. (2009). It’s a Network, Not an Encyclopedia: A Social Network Perspective on Wikipedia Collaboration. Academy of Management Annual Meeting Proceedings.

Kane, G. (2009). It’s a Network, Not an Encyclopedia: A Social Network Perspective on Wikipedia Collaboration. Academy of Management Annual Meeting Proceedings.

Statistics

Views

Total Views
1,323
Views on SlideShare
1,120
Embed Views
203

Actions

Likes
0
Downloads
4
Comments
0

10 Embeds 203

http://ewakuacyjna.blogspot.com 143
http://sonetlab.fbk.eu 45
http://ewakuacyjna.blogspot.de 4
http://ewakuacyjna.blogspot.it 4
http://translate.googleusercontent.com 2
http://ewakuacyjna.blogspot.co.uk 1
http://ewakuacyjna.blogspot.com.es 1
http://ewakuacyjna.blogspot.in 1
http://www.slideshare.net 1
http://ewakuacyjna.blogspot.gr 1
More...

Accessibility

Categories

Upload Details

Uploaded via as Microsoft PowerPoint

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

It’s a Network,Not an Encyclopedia It’s a Network, Not an Encyclopedia Presentation Transcript

  • IT’S A NETWORK, NOT AN ENCYCLOPEDIA: A SOCIAL NETWORK PERSPECTIVE ON WIKIPEDIA COLLABORATION Gerald C. Kane (and Sam Ransbotham) Assistant Professor of Information Systems Carroll School of Management Boston College Paper presented at the Academy of Management Annual Meeting 2009 – Chicago Presentation adapted from http://www.profkane.com/uploads/7/9/1/3/79137/aom_2009.pptx
  • WIKIPEDIA AS KNOWLEDGE NETWORK
    • Wikipedia is a good environment for studying collaborative processes (Kane and Fichman 2009) .
    • Research on collaboration on Wikipedia (Butler et al 2008, Kittur and Kraut 2008, Wilkinson and Huberman 2008) .
      • Views articles as independent
      • But contributors collaborate on multiple articles, can transfer content, process, and reputational knowledge
    • Assumption: Wikipedia’s article are not independent, but interconnected
    • RQ: How the connections between WP articles influence article quality?
  • SOCIAL NETWORK ANALYSIS (SNA)
    • SNA used to study collaborative environments (Borgatti and Foster 2003, Constant et al 1996) .
    • Structure of the network and node’s position within structure describes how knowledge flows through network.
    • Two-mode network is classic, but underused, network conceptualization
      • One type of node (contributor) is viewed as the tie connecting other types of node (articles).
      • Article is unit of analysis.
    • Centrality is among most widely used measures, refers to whether an individual node is situated within core or periphery of network (Wasserman and Faust 1994, Scott 2000).
      • Central nodes typically perform better, because have better access to knowledge contained in network.
  • TWO-MODE NETWORKS (EXAMPLE)
    • “ Mode” = “a distinct set of entities on which the structural variables are measured. […] Structural variables measured on a single set o actors give rise to one-mode network” (Wasserman & Faust, 1994, p. 29)
    • Two-mode network = a network dataset containing two sets of actors. Measure: which actors from one set have ties to actors in the other set
    • Affiliation network : special type of two-mode network, with one set of actors and one set of events. Relation: which actor participate in which events?
  • TWO-MODE NETWORKS (EXAMPLE)
    • People that participate in social events – film festivals (incidence matrix):
    • Adjacency matrix (events by events)
    Cannes Berlin Venice Moscow Shangai Julia 1 1 1 1 0 Sean 1 1 1 0 1 Audrey 0 1 1 1 0 Tom 0 0 1 0 1 Cannes Berlin Venice Moscow Shangai Cannes 2 2 1 1 Berlin 2 3 2 1 Venice 2 3 2 2 Moscow 1 2 2 0 Shangai 1 1 2 0
  • TWO-MODE NETWORKS (EXAMPLE)
    • People that participate in social events – film festivals (incidence matrix):
    • Adjacency matrix (actor by actor)
    Cannes Berlin Venice Moscow Shangai Julia 1 1 1 1 0 Sean 1 1 1 0 1 Audrey 0 1 1 1 0 Tom 0 0 1 0 1 Julia Sean Audrey Tom Julia 3 3 1 Sean 3 2 2 Audrey 3 2 1 Tom 1 2 1
  • TWO TYPES OF CENTRALITY
    • Different measures of centrality assess node’s position at different levels of the network (Faust 1997).
    • Degree centrality = local network
      • Number of nodes directly connected to a node and strength of those relationships.
      • H1: The degree centrality of an article in the two-mode matrix of articles and contributors will be positively related to article quality.
    • Eigenvector centrality = global network.
      • How important (central) are the nodes to which the focal node is directly connected to.
      • H2: The eigenvector centrality of an article in the two-mode matrix of articles and contributors will be positively related to quality.
  • EIGENVECTOR CENTRALITY - EXAMPLE A B 1 B 2 B 3
  • EIGENVECTOR CENTRALITY - EXAMPLE A B 1 B 2 B 3
  • SETTING AND METHOD
    • Sampled 300 (out of 15,000) medical articles on Wikipedia in Wikiproject Medicine (WP:MED).
    • Qualitative analysis confirmed that WP:MED was largely independent sub-community on Wikipedia.
      • 75% of most active contributors worked mostly or exclusively on medical articles.
      • 1/3 medical professionals, 1/3 patients, 1/3 Wikipedians.
    • WP:MED assesses quality of all articles:
      • Featured Article (best): FA
      • A-quality: A
      • Good Article: GA
      • B-quality: B
      • Start-quality (worst): Start
    Collected all FA, A, GA (~100) Random sample of B, Start (~200)
  • TWO-MODE NETWORK OF ARTICLES AND CONTRIBUTORS Squares = contributors Circles = articles Red = Featured Articles Orange = A-quality Articles Yellow = Good Articles Light Blue = B-quality Articles Dark Blue = Start-quality articles
  • VARIABLES
    • Dependent variable: quality
    • Independent variables: Transformed 300 (articles) x 1800 (contributors) two-mode network into 300 x 300 matrix (article by article) . Nodes are articles, ties are the n° of editors who contributed to each pair of articles. UCINet:
      • Degree centrality
      • Eigenvector centrality
  • VARIABLES
    • Control variables:
      • Topic Importance
        • Most important article are likely to receive more attention, to attract divergent opinions and have a greater base of knowledge.
        • (1-4). Assigned by WP:MED.
      • Popularity :
        • Past research: editors may be more likely to contribute to high-traffic articles; patients tend to click on the top search results.
        • Traffic : Average number page views/ month 1Q2008 (Alexa.com).
        • Google Rank : Is it the top Google result? (60% were).
      • Direct collaboration (article and discussion pages).
        • Number of unique contributors , average edits/contributor , number of anonymous contributors .
  • DATA ANALYSIS
    • Ordinal Regression in SPSS 16
      • When there is a progressive relationship within a categorical dependent variable, but it is unclear the magnitude of difference between the categories.
      • Negative log-log link function chosen because high proportion of lower quality articles (200).
  • HYPOTHESIS
    • H1: The degree centrality of an article in the 2-mode incidence matrix of articles and editors will be positively related to article quality.
    • H2: The eigenvector centrality of an article in the 2-mode incidence matrix of articles and editors will be positively related to article quality.
  • Results MODEL 1 MODEL 2 Est. SE Wald Sig. Est. SE Wald Sig. Traffic 0.00** 0.00 7.76 0.005 0.00 0.00 2.26 0.13 Google Rank 0.43** 0.15 7.97 0.005 0.43** 0.16 7.51 0.01 Importance -0.82*** 0.12 49.08 0.000 -0.91*** 0.12 53.72 0.00 Contributors (A) 0.00 0.00 1.17 0.279 0.00 0.00 0.15 0.69 Contributors (D) 0.00 0.00 0.08 0.776 0.00 0.00 0.68 0.41 Edits/Contributor(A) 0.63*** 0.12 28.59 0.000 0.61** 0.12 24.35 0.00 Edits/Contributor(D) -0.06 0.05 1.67 0.196 -0.08 0.05 2.16 0.14 Anon (Article) 4.15*** 1.04 16.00 0.000 3.47*** 1.06 10.72 0.00 Anon (Discussion) -2.06** 0.75 7.54 0.006 -1.46* 0.76 3.71 0.05 Deg. Cent. 0.00** 0.00 7.29 0.01 Eig. Cent. 0.33*** 0.09 12.60 0.00 Pseudo R-Square Cox and Snell 0.379 0.474 Nagelkerke 0.407 0.510
  • RESULTS
    • Both hypotheses supported.
      • Degree and Eigenvector centrality highly significant (p<.01+) in the expected direction.
      • Only one of the measures of direct collaboration is significant (edits/contributor), underscoring importance of network variables.
    • Topic importance. Most highly significant variable in model, but negatively correlated with quality. Implications for value of medical knowledge on WP.
    • Anonymity is positively related on article page, but negative related on talk page. Differential effect noted elsewhere (Sia et al., 2002) .
  • IMPLICATIONS FOR THEORY AND PRACTICE
    • Theoretical Implications.
      • Collaborative features associated with quality in community-based knowledge creation settings.
      • Network variables critically important for predicting quality, should not examine as independent collaborative environments.
      • Validation in other settings needed.
    • Managerial implications.
      • Companies are beginning to employ similar communities for strategic purposes (Tapscott 2006).
      • Should approach as integrated collaborative network, not simply independent efforts.
  • FUTURE DIRECTIONS
    • Connect more directly with existing theories of information quality (e.g. Constant et al. 1996).
      • Volume of information = number of authors.
      • Diversity of information = degree centrality
      • Quality of information = eigenvector centrality.
    • Recruiting team of 4 th year medical school students to validate WP rankings.
      • Pilot (60 articles) = 90% accuracy and 85% IRR
    • Collecting entire WP:MED.
      • full text history of 2,026,992 revisions to all of the 16,354 Wikipedia medical articles
      • Eliminates question of biased sampling, reduces multicollinearity.
      • Additional Controls: Age, Length, Complexity, Sections, References.