From libre software to Wikipedia: A tour of open collaboration
Felipe Ortega
Libresoft, Universidad Rey Juan Carlos
e-mail: firstname.lastname@example.org
Twitter | Identi.ca: @jfelipe
Xerox PARC, June 14, 2011
By Diego Grez, CC-BY-SA 3.0, Wikimedia Commons
“Think of how Wikipedia works, how Amazon harnesses user annotation on its site, the way photo-sharing sites like Flickr are bleeding out into other applications... We're entering an era in which software learns from its users and all of the users are connected.”
Tim O'Reilly. TIME Magazine, 24 October 2005.
By Felipe Ortega, CC-BY-SA 3.0
In the beginning...
● ...all started with “real programmers” and FLOSS.
  ● FSF, GNU, free licenses.
  ● Open source goes into industry.
  ● Libre software becomes ubiquitous.
● However:
  ● Crowdsourced != Open source.
  ● Much better if results encourage reuse and distribution of derivative works.
The “paradox” of open collaboration
“Wikipedia is the best thing ever. Anyone in the world can write anything they want about any subject, so you know you are getting the best possible information.”
Michael Scott (played by Steve Carell), The Office, "The Negotiation" [3.18], 5 April 2007
3 lessons from libre software
● Onion model.
● Generational relay.
● Lasting participation.
By El_T, Public Domain, from Wikimedia Commons
Onion model
The Social Structure of Free and Open Source Software Development. Crowston & Howison, 2005.
Lasting participation
● Robles, González-Barahona and Michlmayr. Evolution of Volunteer Participation in Libre Software Projects: Evidence from Debian. OSS 2005.
● Half-life ratio = 7.5 years!
● More than 50% of the maintainers in Debian 2.0 were still present in Debian 3.1.
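A rough way to see how the half-life figure and the Debian 2.0 to 3.1 observation fit together is to treat maintainer retention as exponential decay. This is a modelling assumption for illustration only, not necessarily the method used in the cited paper, and the function name is hypothetical. The gap between the two releases (July 1998 to June 2005) is taken as roughly 6.9 years.

```python
def fraction_remaining(years: float, half_life: float = 7.5) -> float:
    """Share of an initial cohort still active after `years`,
    assuming (as a simplification) pure exponential decay."""
    return 0.5 ** (years / half_life)

# Approximate gap between Debian 2.0 (July 1998) and Debian 3.1 (June 2005).
years_between_releases = 6.9

share = fraction_remaining(years_between_releases)
print(f"Expected share of maintainers still present: {share:.1%}")
```

Under this simplification the expected share comes out slightly above one half, which agrees with more than 50% of the maintainers still being present.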
Thesis. Wikipedia: A quantitative analysis.
● Apply lessons from libre software to understand the open collaborative process in Wikipedia.
  ● Content production.
  ● Effort distribution.
  ● Implications for quality.
  ● Participation and sustainability.
Tool: WikiXRay
Automated analysis of Wikipedia dumps.
http://git.libresoft.es/WikiXRay
[Diagram labels: Wikimedia Download Center; Download; Compressed dumps; DB dumps; Local MySQL Server; WikiXRay; Preparation for data mining; Analysis (scripts + GNU R); Results evaluation]
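To give a flavour of what automated analysis of a dump can look like, here is a minimal sketch that streams a MediaWiki-style XML export and counts revisions per contributor. It is not WikiXRay code: the sample data, `local_name` and `revisions_per_editor` are hypothetical, and a real pipeline would read the compressed dump and load the results into a database instead of a counter in memory.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

# Tiny made-up sample shaped like a MediaWiki XML export.
# Real dumps carry an XML namespace and many more fields per revision.
SAMPLE_DUMP = """<mediawiki>
  <page>
    <title>Example</title>
    <revision><contributor><username>Alice</username></contributor></revision>
    <revision><contributor><username>Bob</username></contributor></revision>
    <revision><contributor><username>Alice</username></contributor></revision>
  </page>
  <page>
    <title>Other</title>
    <revision><contributor><ip>192.0.2.1</ip></contributor></revision>
  </page>
</mediawiki>"""

def local_name(tag: str) -> str:
    """Drop an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]

def revisions_per_editor(stream) -> Counter:
    """Stream the XML and count revisions by username (or IP if anonymous)."""
    counts = Counter()
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local_name(elem.tag) == "revision":
            for child in elem.iter():
                if local_name(child.tag) in ("username", "ip"):
                    counts[child.text] += 1
                    break
            elem.clear()  # free memory; full dumps are far too large to keep in RAM
    return counts

counts = revisions_per_editor(io.BytesIO(SAMPLE_DUMP.encode("utf-8")))
for editor, n in counts.most_common():
    print(editor, n)
```

Streaming with `iterparse` and clearing each processed element keeps memory use flat, which matters once the input is a full-history dump.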
New articles created in Wikipedia
Entered steady state in 2006, before the graph of monthly edits became stable (2007).
Interaction: talk pages
[Chart: share of talk vs. no-talk, 0% to 100%, for EN, DE, FR, PL, JA, NL, IT, PT, ES, SV]
0.0086% (old talk pages deleted)
Contributions per editor
● Upper truncated Pareto distribution.
● Limit on the maximum number of revisions by human editors.
● Better to have more editors than to increase contributions per editor.
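For readers unfamiliar with the upper truncated Pareto distribution, the sketch below writes out its CDF and draws samples by inverting it. The shape and bounds (`ALPHA`, `LOWER`, `UPPER`) are made-up illustrative values, not parameters fitted to Wikipedia data; `UPPER` plays the role of the ceiling on revisions a human editor can make.

```python
import random

def truncated_pareto_cdf(x: float, alpha: float, lower: float, upper: float) -> float:
    """CDF of a Pareto distribution restricted to [lower, upper]."""
    if x <= lower:
        return 0.0
    if x >= upper:
        return 1.0
    return (1 - (lower / x) ** alpha) / (1 - (lower / upper) ** alpha)

def sample_truncated_pareto(u: float, alpha: float, lower: float, upper: float) -> float:
    """Inverse-CDF sampling: map a uniform value u in [0, 1) to a revision count."""
    return lower * (1 - u * (1 - (lower / upper) ** alpha)) ** (-1 / alpha)

# Hypothetical parameters, chosen only for illustration.
ALPHA, LOWER, UPPER = 1.2, 1.0, 10_000.0

rng = random.Random(42)
samples = [sample_truncated_pareto(rng.random(), ALPHA, LOWER, UPPER)
           for _ in range(10_000)]

light_editors = sum(s < 10 for s in samples) / len(samples)
print(f"Share of simulated editors with fewer than 10 revisions: {light_editors:.1%}")
print(f"Largest simulated contribution: {max(samples):.0f} (never above {UPPER:.0f})")
```

Because the tail is cut off at `UPPER`, no single simulated editor can grow without bound, so total output scales mainly with the number of editors.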
Monthly effort distribution in Wikipedia
● Constant over the whole history!
● Ortega, F., González-Barahona, J., Robles, G. On the inequality of contributions to Wikipedia. HICSS 2008.
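One common way to put a number on how unequally effort is spread among editors in a given month is the Gini coefficient (0 = everyone contributes equally, values near 1 = a few editors do almost everything). The sketch below assumes that metric for illustration, and the per-month revision counts are invented data, not results from the cited paper.

```python
def gini(values) -> float:
    """Gini coefficient of a list of non-negative contribution counts."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the ascending-sorted values.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n

# Hypothetical revisions per editor for two months (illustrative only).
monthly_revisions = {
    "2006-01": [1, 1, 2, 3, 5, 40, 120],
    "2006-02": [1, 2, 2, 4, 6, 55, 130],
}

for month, revisions in monthly_revisions.items():
    print(month, f"Gini = {gini(revisions):.2f}")
```

Computing the coefficient month by month and plotting it over time is one way to check whether the effort distribution stays roughly flat.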
Profile of editors in Featured Articles
● Most Featured Articles are at least 1,000 days old.
● 10 times more editors in FAs than in non-FAs, almost 200 times in EN (!!).
● FAs are reviewed by significantly older authors (+3 years actively contributing to Wikipedia).
[Chart: FAs vs. non-FAs]
The Digital Potlatch
● Book with J. Rodríguez (in Spanish).
  ● Ed. Cátedra, expected September 2011.
● Interdisciplinary.
  ● Anthropology + Engineering.
● Meritocracy in Wikipedia.
● Effort recognition.
● Motivations.
● Implications for quality.
Public Domain, from Wikimedia Commons
Future lines of work
● Study causes of change in evolution patterns and reverts.
  ● “The singularity is not near”, ASC @PARC, WikiSym 2009.
● Edit diffs to study contribution patterns.
● Different types of content.
● Cross-relation with traffic patterns.
By Bios, CC-BY-SA 3.0, from Wikimedia Commons