Empirical studies about Wikipedia
  • 1. Dynamics of libre software communities Introduction to empirical studies about contributions to Wikipedia. Master on Free Software
  • 2. 1. Contextual framework ● Libre and open source software has been thoroughly studied. ● However, that is not the case for open content creation. – Wikipedia is the most successful example of a libre content creation project. ● A freely accessible and editable encyclopaedia.
  • 3. 2. Wikipedia features ● Based on MediaWiki.
  • 4. 2. Wikipedia features ● Embedded tools in MediaWiki.
  • 5. 2. Wikipedia features ● Wiki style to add and update content. – It supports multimedia content, image galleries, bibliographic references, mathematical equations... ● Provides many templates for automatically arranging and indexing content. – Table of contents. – Archive pages. – Transcluded pages.
  • 6. 2. Wikipedia features ● Talk pages.
  • 7. 2. Wikipedia features ● Everyone can edit content. – Respect the neutral point of view (NPoV). – Provide sources. Do not use proprietary content. ● http://en.wikipedia.org/wiki/Wikipedia:Annotated_article – Trustworthy content? ● Nature magazine's study about the accuracy of Wikipedia articles. – Vandalism. – “Edit wars”.
  • 8. 3. Main areas of study. ● We can study many aspects of Wikipedia. – General statistics and evolution over time. – Community-related parameters and the content creation process. ● Authors' reputation. – Article content: mainly quality. – Content semantics... – Social networks...
  • 9. 4. “Measuring Wikipedia” ● By Jakob Voss. – Focused on the German edition. – Exponential growth. – Article size distribution --> tends to a log-normal distribution. – Distinct authors per article ● Power law γ=2.7 – Distinct articles per author ● Power law γ=1.5 – Edits per author ● Power law γ=0.5
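To make the power-law exponents above more concrete, here is a minimal Python sketch of how an exponent γ could be estimated from a list of observed counts (for example, distinct authors per article). It uses the standard continuous maximum-likelihood estimator, γ ≈ 1 + n / Σ ln(x_i / x_min), which is an assumption of this sketch and not necessarily the fitting method used in the study. The function names, the `x_min` cutoff and the synthetic data are hypothetical; the estimator only applies when γ > 1, and for small integer counts the continuous approximation is biased.

```python
import math
import random


def estimate_exponent(samples, x_min=1.0):
    """Continuous maximum-likelihood estimate of a power-law exponent.

    Only values >= x_min are used. Assumes p(x) ~ x**(-gamma) with gamma > 1.
    """
    tail = [x for x in samples if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if not tail or log_sum == 0:
        raise ValueError("need at least one sample strictly above x_min")
    return 1.0 + len(tail) / log_sum


def sample_power_law(gamma, n, x_min=1.0, seed=0):
    """Draw n synthetic values from a continuous power law (inverse transform)."""
    rng = random.Random(seed)
    return [
        x_min * (1.0 - rng.random()) ** (-1.0 / (gamma - 1.0))
        for _ in range(n)
    ]


if __name__ == "__main__":
    # Synthetic data only: no real Wikipedia measurements are used here.
    data = sample_power_law(gamma=2.7, n=20000)
    print(f"estimated gamma: {estimate_exponent(data):.2f}")
```

With real data, `samples` would be replaced by the measured counts, and the choice of `x_min` would need to be justified, since the estimate is sensitive to where the power-law tail is assumed to start.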
  • 10. 4. “On the evolution of Wikipedia” ● By Almeida, Mozafari and Cho. – Revision and creation of articles. ● Follows a self-similar process. – Number of articles contributed per user. ● Decreasing over time. – Time between edits to an article. ● Power law distribution. – Users tend to focus their contributions on a small set of articles.
  • 11. 4. “Studying cooperation and conflicts...” ● By Viégas, Wattenberg and Dave. – Types of vandalism. ● Mass deletion, offensive copy, phony copy... ● Half of mass deletions are reverted within 3 minutes. – Visualization of the negotiation process. ● Edit wars --> zig-zag patterns. – Contents of pages with at least 100 edits tend to grow linearly. – Superb visualization tool --> History flow.
  • 12. 4. “On the inequality of contributions...” ● By Ortega, Gonzalez-Barahona and Robles. – Gini coefficients (total number of revisions). ● Between 0.925 and 0.963. ● They tend to decrease as the number of authors and the number of articles grow. – Evolution over time of Gini coefficients. ● Stabilized between 80-85% in the top-ten language editions (last 2 years). – Japanese language edition. ● Outlier values --> deserves further investigation.
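As an illustration of the Gini coefficient mentioned above, the following Python sketch computes it from a list of per-author revision counts. It uses the usual sorted-values formula, G = 2 Σ i·x_i / (n Σ x_i) − (n + 1)/n, and is only a sketch under that assumption, not the code used in the study. The function name and the example counts are hypothetical. A value of 0 means every author contributed equally; values close to 1 mean that a few authors account for almost all revisions.

```python
def gini(values):
    """Gini coefficient of a list of non-negative contribution counts."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Weight each value by its rank (1-based) in ascending order.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return (2 * weighted) / (n * total) - (n + 1) / n


if __name__ == "__main__":
    # Hypothetical revision counts per author, not real Wikipedia data.
    equal = [5, 5, 5, 5]
    skewed = [0, 0, 0, 10]
    print(gini(equal))   # all authors contribute the same amount
    print(gini(skewed))  # a single author makes every revision
```

For a finite population of n authors the maximum value is (n − 1)/n, so coefficients computed for editions with very few authors are not directly comparable with those of large editions.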