Analyzing Multidimensional Networks within MediaWikis
WikiSym 2013
Hong Kong, China
August 7, 2013
Brian Keegan, Ph.D....


The MediaWiki platform supports popular socio-technical systems such as Wikipedia as well as thousands of other wikis. This software encodes and records a variety of relationships about the content, history, and editors of its articles, such as hyperlinks between articles, discussions among editors, and editing histories. These relationships can be analyzed using standard techniques from social network analysis; however, extracting relational data from Wikipedia has traditionally required specialized knowledge of its API, information retrieval, network analysis, and data visualization, which has inhibited scholarly analysis. We present a software library called the NodeXL MediaWiki Importer that extracts a variety of relationships from the MediaWiki API and integrates with the popular NodeXL network analysis and visualization software. This library allows users to query and extract a variety of multidimensional relationships from any MediaWiki installation with a publicly accessible API. We present a case study examining the similarities and differences between different relationships for the Wikipedia articles about "Pope Francis" and "Social media." We conclude by discussing the implications this library has for both theoretical and methodological research as well as community management, and we outline future work to expand the capabilities of the library.


Analyzing Multidimensional Networks within MediaWikis

  1. Analyzing Multidimensional Networks within MediaWikis
     WikiSym 2013, Hong Kong, China, August 7, 2013
     Brian Keegan, Ph.D. (@bkeegan), Arber Ceni, Marc A. Smith, Ph.D. (@marc_smith)
  2. Outline
     •  Motivation
     •  Relationships within MediaWikis
     •  Multidimensional network exploration
     •  NodeXL platform
     •  NodeXL MediaWiki Importer
     •  Case Study
     •  Demo
  3. Motivation
     •  Collaboration is fundamentally relational
     •  Use network analysis methods to understand success of wikis!
     •  A variety of MediaWiki meta-data accessible through API are relational
     •  Build on top of existing network analysis package to simplify retrieval, structuring, cleanup, and visualization
  4. Relationship types
     •  User-Object relationships (e, a)
     •  User-User relationships (e, e)
     •  Object-Object relationships (a, a)
  5. User-Object relationships
     •  Editing
        •  user e makes a revision to article a
     •  Watchlist
        •  user e has article a on watchlist
     •  Affiliation
        •  user e is a member of project a
  6. Undirected User-User relationships
     •  Co-authorship
        •  e1 and e2 edited the same article
     •  Co-affiliation
        •  e1 and e2 are members of the same project
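The co-authorship relationship (e1 and e2 edited the same article) can be derived from user-object editing records by projecting them onto users. The following is a minimal sketch under stated assumptions, not the importer's actual code: the function name `coauthorship_edges` and the toy `edits` list are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def coauthorship_edges(edits):
    """Project (editor, article) pairs into undirected editor-editor ties.

    Hypothetical helper: two editors are tied if they edited the same article.
    """
    editors_by_article = defaultdict(set)
    for editor, article in edits:
        editors_by_article[article].add(editor)

    edges = set()
    for editors in editors_by_article.values():
        # Sort so each undirected pair is stored once, as (smaller, larger).
        for e1, e2 in combinations(sorted(editors), 2):
            edges.add((e1, e2))
    return edges


# Toy editing records (made up for illustration).
edits = [("e1", "a1"), ("e2", "a1"), ("e3", "a2"), ("e1", "a2")]
coauthor_ties = coauthorship_edges(edits)
print(sorted(coauthor_ties))
```

The same projection applied to membership records instead of edits would give the co-affiliation relationship.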
  7. Directed User-User relationships
     •  Discussion
        •  e1 left a message on e2’s talk page
     •  Article trajectory
        •  e2 modified the article after e1
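The article trajectory relationship (e2 modified the article after e1) can be read off an article's revision history. This sketch assumes that only immediately consecutive revisions create a tie and that repeated edits by the same editor are skipped; the slides do not specify either choice. The name `trajectory_edges` and the integer timestamps are hypothetical.

```python
def trajectory_edges(revisions):
    """Build directed editor-to-editor ties from one article's revisions.

    revisions: iterable of (timestamp, editor) pairs in any order.
    Assumption: an edge (e1, e2) means e2 made the revision right after e1.
    """
    ordered = sorted(revisions)  # oldest revision first
    edges = []
    for (_, previous), (_, current) in zip(ordered, ordered[1:]):
        if previous != current:  # skip an editor following themselves
            edges.append((previous, current))
    return edges


# Toy revision history (made up for illustration).
revisions = [(3, "e3"), (1, "e1"), (2, "e2"), (4, "e3"), (5, "e1")]
trajectory = trajectory_edges(revisions)
print(trajectory)
```

Swapping the roles of editors and articles (one editor's contributions, ordered by time) would give the editor trajectory relationship between articles.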
  8. Undirected Object-Object relationships
     •  Shared authorship
        •  a1 and a2 were edited by the same users
     •  Category co-membership
        •  a1 and a2 are members of the same categories
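Shared authorship (a1 and a2 were edited by the same users) is the mirror-image projection onto articles, and counting the shared editors gives each tie a weight. The sketch below is illustrative only: `shared_authorship` and its `min_shared` threshold are assumed names, and the threshold of 2 is one possible reading of "share multiple co-authors" in the case study, not a documented setting.

```python
from collections import Counter, defaultdict
from itertools import combinations


def shared_authorship(edits, min_shared=1):
    """Weight article-article ties by how many editors the articles share.

    Hypothetical helper; min_shared is an assumed filtering parameter.
    """
    articles_by_editor = defaultdict(set)
    for editor, article in edits:
        articles_by_editor[editor].add(article)

    weights = Counter()
    for articles in articles_by_editor.values():
        for a1, a2 in combinations(sorted(articles), 2):
            weights[(a1, a2)] += 1
    return {pair: w for pair, w in weights.items() if w >= min_shared}


# Toy editing records (made up for illustration).
edits = [
    ("e1", "a1"), ("e1", "a2"),
    ("e2", "a1"), ("e2", "a2"),
    ("e3", "a2"), ("e3", "a3"),
]
all_ties = shared_authorship(edits)
strong_ties = shared_authorship(edits, min_shared=2)
print(all_ties, strong_ties)
```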
  9. Directed Object-Object relationships
     •  Hyperlinks
        •  a1 has a link to a2
     •  Editor trajectory
        •  a2 is modified by a user after a1
  10. Multidimensional networks
     •  Multiple types of links between nodes
        •  Hyperlink
        •  Shared authorship
        •  Category co-membership
     •  Presence of overlapping ties may explain collaboration more richly
     •  Absence of overlapping ties may reveal anomalies for follow-on analysis
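One way to look for overlapping ties is to keep each relationship as a named layer and record, for every pair of articles, the layers in which the pair is tied. This is a rough sketch, assuming that direction is ignored (hyperlinks are treated as undirected here for simplicity); `tie_layers` and the toy `layers` dictionary are hypothetical.

```python
from collections import defaultdict


def tie_layers(layers):
    """Map each unordered node pair to the set of layers that tie it.

    layers: dict of layer name -> iterable of (node, node) edges.
    Assumption: edge direction is ignored when comparing layers.
    """
    by_pair = defaultdict(set)
    for name, edges in layers.items():
        for u, v in edges:
            by_pair[frozenset((u, v))].add(name)
    return by_pair


# Toy layers (made up for illustration).
layers = {
    "hyperlink": [("a1", "a2"), ("a2", "a3")],
    "shared_authorship": [("a2", "a1"), ("a3", "a4")],
    "category_co_membership": [("a1", "a2")],
}
by_pair = tie_layers(layers)

# Pairs tied in several layers versus pairs tied in only one.
overlapping = {tuple(sorted(p)) for p, names in by_pair.items() if len(names) > 1}
single_layer = {tuple(sorted(p)) for p, names in by_pair.items() if len(names) == 1}
print(overlapping, single_layer)
```

Pairs in `single_layer` are the candidates for follow-on analysis: for example, two articles that link to each other but share no editors.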
  11. Network exploration (figure)
  12. Network exploration (figure)
  13. NodeXL Platform
     •  https://nodexl.codeplex.com/
     •  Lower barriers to entry by using spreadsheet workflows
     •  Network analysis plug-in for Microsoft Excel
     •  “Spigots” to import network data from Twitter, Facebook, Flickr, Email, YouTube, and WWW
  14. NodeXL MediaWiki Importer
     •  https://wikiimporter.codeplex.com/
     •  Graph data provider for NodeXL → new “spigot”
     •  Queries MediaWiki API through DotNetWikiBot framework
     •  Given a Page and a Site, returns a PageList
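The importer itself is built on the .NET DotNetWikiBot framework, whose calls are not reproduced here. As a language-neutral illustration of the kind of MediaWiki API request involved, the sketch below builds a standard `action=query&prop=links` URL and turns a response of the usual JSON shape into hyperlink edges. The helper names and the sample response are hypothetical, no network request is made, and result continuation (paging through long link lists) is deliberately not handled.

```python
from urllib.parse import urlencode

# English Wikipedia endpoint; any MediaWiki site with a public API could be used.
API_URL = "https://en.wikipedia.org/w/api.php"


def links_query_url(title, api_url=API_URL):
    """Build the URL asking which pages the given page links to."""
    params = {
        "action": "query",
        "prop": "links",
        "titles": title,
        "pllimit": "max",
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


def hyperlink_edges(response):
    """Turn a parsed JSON response into (source title, target title) edges."""
    edges = []
    for page in response["query"]["pages"].values():
        for link in page.get("links", []):
            edges.append((page["title"], link["title"]))
    return edges


url = links_query_url("Pope Francis")

# Hand-written sample in the shape of a JSON response (not real data).
sample_response = {
    "query": {
        "pages": {
            "1": {
                "pageid": 1,
                "ns": 0,
                "title": "Seed",
                "links": [{"ns": 0, "title": "Target A"}, {"ns": 0, "title": "Target B"}],
            }
        }
    }
}
edges = hyperlink_edges(sample_response)
print(url)
print(edges)
```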
  15. NodeXL MediaWiki Importer (screenshot; callouts: “Relationship to crawl”, “Boundary conditions”)
  16. Case Study
     •  Compare the structures of different relationships across two types of English Wikipedia articles
        •  “Social media”
        •  “Pope Francis”
     •  Node layout via “Harel-Koren Fast Multiscale”
        •  Spring-embedding layout to emphasize clusters of ties
     •  Nodes grouped via “Clauset-Newman-Moore”
        •  Nodes assigned to group if more ties within group than outside
     •  “Group-in-a-box” layout
        •  Ties within group visualized individually, ties between groups collapsed together
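The Clauset-Newman-Moore method groups nodes by greedily merging groups so as to increase modularity, a score that rewards ties within groups over ties between them. The sketch below computes only that score for a given grouping of an undirected, unweighted network; it is not the greedy merging algorithm and not NodeXL's implementation. The function name `modularity` and the toy network are assumptions for illustration.

```python
def modularity(edges, groups):
    """Modularity Q of a grouping of an undirected, unweighted network.

    Q = sum over groups of (internal ties / m) - (group degree / 2m)^2,
    where m is the total number of ties.
    """
    m = len(edges)
    group_of = {node: i for i, members in enumerate(groups) for node in members}
    internal = [0] * len(groups)  # ties with both ends in the group
    degree = [0] * len(groups)    # tie ends that fall in the group
    for u, v in edges:
        degree[group_of[u]] += 1
        degree[group_of[v]] += 1
        if group_of[u] == group_of[v]:
            internal[group_of[u]] += 1
    return sum(
        internal[i] / m - (degree[i] / (2 * m)) ** 2 for i in range(len(groups))
    )


# Toy network: two triangles joined by a single bridging tie.
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
q_two_groups = modularity(edges, [{1, 2, 3}, {4, 5, 6}])
q_one_group = modularity(edges, [{1, 2, 3, 4, 5, 6}])
print(q_two_groups, q_one_group)
```

Splitting the toy network into its two triangles scores higher than leaving it as one group, which is the kind of grouping the method favors.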
  17. Co-authorship (panels: Pope Francis, Social media)
     Nodes are editors who contributed to article. Links together if they contributed to other articles.
  18. Article trajectory (panels: Pope Francis, Social media)
     Nodes are editors who contributed to article. Links together if they edited after one another.
  19. User discussion (panels: Pope Francis, Social media)
     Nodes are editors who contributed to article. Links together if they left messages on other users’ talk.
  20. Shared authorship (panels: Pope Francis, Social media)
     Nodes are other articles edited by the users who contributed to article. Links together if they share multiple co-authors.
  21. Hyperlink (panels: Pope Francis, Social media)
     Nodes are articles linked from seed article. Links together if they link to each other.
  22. Structural Typologies (figure)
  23. Discussion
     •  Wikipedia and other MediaWiki projects contain a variety of complex and multidimensional relationships among users and objects
     •  NodeXL MediaWiki Importer is a tool for simplifying complex data extraction and analysis workflows
     •  NodeXL provides a powerful suite of tools to analyze and visualize the structure of multidimensional relationships
     •  Empirical testing of social theories as well as diagnosing the health of online communities
  24. Future work
     •  Incorporating additional meta-data
        •  Editors (registered, edit count, block count, tenure)
        •  Objects (namespace, age, edit count, assessment, pageviews)
        •  Content-level features (images, keywords)
        •  Temporal features
     •  Additional relationships
        •  Inter-language links
        •  Backlinks
        •  Wiki-love
        •  Blocks (users and objects)
  25. THANK YOU!
     Brian Keegan, Ph.D. (@bkeegan), Arber Ceni, Marc A. Smith, Ph.D. (@marc_smith)
