From libre software to Wikipedia:
A tour of open collaboration
Felipe Ortega
Libresoft, Universidad Rey Juan Carlos
e-mail: jfelipe@libresoft.es
Twitter | Identi.ca: @jfelipe
Xerox PARC
June 14, 2011
By Diego Grez, CC-BY-SA 3.0, Wikimedia Commons
“Think of how Wikipedia works, how Amazon harnesses
user annotation on its site, the way photo-sharing sites
like Flickr are bleeding out into other applications...
We're entering an era in which software learns from
its users and all of the users are connected”.
Tim O'Reilly.
TIME Magazine, 24 October 2005.
By Felipe Ortega, CC-BY-SA 3.0
In the beginning...
● ...it all started with “real programmers” and FLOSS.
● FSF, GNU, free licenses.
● Open source goes into industry.
● Libre software becomes ubiquitous.
● However
● Crowdsourced != Open source.
● Much better if the results encourage reuse and
distribution of derivative works.
The “paradox” of open collaboration
“Wikipedia is the best thing ever. Anyone in the world can
write anything they want about any subject, so you know
you are getting the best possible information.”
Michael Scott (played by Steve Carell)
The Office, "The Negotiation" [3.18], 5 April 2007
3 lessons from libre software
● Onion model.
● Generational relay.
● Lasting participation.
By El_T, Public Domain, from Wikimedia Commons
Onion model
The Social Structure of Free and Open Source Software Development
Crowston & Howison, 2005
Lasting participation
● Robles, González-Barahona and Michlmayr.
Evolution of Volunteer Participation in Libre Software
Projects: Evidence from Debian. OSS 2005.
Half-life ratio = 7.5 years!
More than 50% of the maintainers in Debian 2.0 were still present in Debian 3.1
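A minimal sketch of how the two figures above relate, assuming that maintainer retention decays exponentially (the slide does not state the model). The function name `fraction_remaining` is hypothetical, and the elapsed time of about 6.9 years between Debian 2.0 and Debian 3.1 is an approximation taken from the release dates (mid-1998 and mid-2005), not from the slide.

```python
def fraction_remaining(years_elapsed, half_life_years=7.5):
    """Fraction of the original maintainers still active after a given time.

    Assumes simple exponential decay with the given half-life.
    """
    return 0.5 ** (years_elapsed / half_life_years)


# Assumed gap between the Debian 2.0 and Debian 3.1 releases, in years.
YEARS_BETWEEN_RELEASES = 6.9

if __name__ == "__main__":
    share = fraction_remaining(YEARS_BETWEEN_RELEASES)
    print(f"Expected share of maintainers remaining: {share:.2f}")
```

Under these assumptions the expected share is slightly above one half, which is consistent with more than 50% of the Debian 2.0 maintainers still being present in Debian 3.1.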
Thesis. Wikipedia: A quantitative
analysis.
● Apply lessons from libre software to understand
the open collaboration process in Wikipedia.
● Content production.
● Effort distribution.
● Implications for quality.
● Participation and sustainability.
Tool: WikiXRay
Automated analysis of Wikipedia dumps.
http://git.libresoft.es/WikiXRay
[Diagram: WikiXRay workflow. Compressed DB dumps from the Wikimedia Download Center → download → local MySQL server → preparation for data mining → analysis (scripts + GNU R) → results evaluation.]
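To illustrate the kind of processing such a workflow automates, here is a minimal sketch that counts revisions per contributor from a dump-like XML stream. It is not WikiXRay code: WikiXRay loads the dumps into a local MySQL database, whereas this sketch parses XML directly with the Python standard library. The sample document is a simplified, hypothetical stand-in for a real dump, and the helper names are assumptions.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

# Simplified, hypothetical sample; real dumps are far larger and namespaced.
SAMPLE_DUMP = b"""<mediawiki>
  <page>
    <title>Example</title>
    <revision>
      <contributor><username>Alice</username></contributor>
    </revision>
    <revision>
      <contributor><username>Bob</username></contributor>
    </revision>
    <revision>
      <contributor><username>Alice</username></contributor>
    </revision>
    <revision>
      <contributor><ip>192.0.2.1</ip></contributor>
    </revision>
  </page>
</mediawiki>"""


def local_name(tag):
    """Drop an XML namespace prefix such as '{uri}revision'."""
    return tag.rsplit("}", 1)[-1]


def revisions_per_contributor(xml_stream):
    """Stream through the XML and count revisions per user name or IP."""
    counts = Counter()
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if local_name(elem.tag) != "revision":
            continue
        for child in elem.iter():
            if local_name(child.tag) in ("username", "ip"):
                counts[child.text] += 1
                break
        elem.clear()  # release the revision's children to keep memory low
    return counts


if __name__ == "__main__":
    for name, n in revisions_per_contributor(io.BytesIO(SAMPLE_DUMP)).most_common():
        print(name, n)
```

Streaming with `iterparse` avoids loading the whole file into memory, which matters because full-history dumps are very large.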
New articles created in Wikipedia
Entered a steady state in 2006,
before the number of monthly edits
stabilized (2007)
Contributions per editor
● Upper truncated Pareto
distribution.
● Upper limit on the number of
revisions a human editor
can make.
● Better to have more
editors than more
contributions
per editor.
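For reference, the standard form of the upper-truncated Pareto distribution can be written as follows. The symbols are notation introduced here, not taken from the slide: $\alpha$ is the shape parameter, $\gamma$ the lower bound and $\nu$ the upper truncation point, which would correspond to the maximum number of revisions a human editor can make.

```latex
F(x) = \frac{1 - (\gamma / x)^{\alpha}}{1 - (\gamma / \nu)^{\alpha}},
\qquad 0 < \gamma \le x \le \nu, \quad \alpha > 0 .
```

As $\nu \to \infty$ the denominator tends to 1 and the expression reduces to the ordinary Pareto distribution, $F(x) = 1 - (\gamma / x)^{\alpha}$.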
Monthly effort distribution in Wikipedia
Constant over the whole history!
Ortega, F., González-Barahona, J., Robles, G.
On the inequality of contributions to Wikipedia.
HICSS 2008.
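One common way to quantify inequality of contributions is the Gini coefficient, computed over the number of edits per editor in a given month. The sketch below assumes that measure for illustration; the function name and the sample numbers are hypothetical and are not data from the study.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = equal, near 1 = unequal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the values sorted in ascending order.
    weighted = sum((rank + 1) * x for rank, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


if __name__ == "__main__":
    # Hypothetical edits per editor in one month.
    edits_per_editor = [1, 1, 2, 3, 5, 40, 250]
    print(f"Gini coefficient: {gini(edits_per_editor):.2f}")
```

Computing this value for every month and plotting the series is one way to check whether the concentration of effort stays constant over time.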
Profile of editors in Featured Articles
● Most Featured Articles are at least 1,000 days old.
● 10 times more editors in FAs than in non-FAs,
almost 200 times more in EN (!!).
● FAs are reviewed by significantly more experienced authors
(more than 3 years actively contributing to Wikipedia).
[Figure panels: FAs, non-FAs]
The Digital Potlatch
● Book with J. Rodríguez (in Spanish).
● Ed. Cátedra, expected September 2011.
● Interdisciplinary.
● Anthropology + Engineering.
● Meritocracy in Wikipedia.
● Effort recognition.
● Motivations.
● Implications for quality.
Public Domain, from Wikimedia Commons
Future lines of work
● Study causes of change in
evolution patterns and reverts.
● “The singularity is not near”,
ASC @PARC, WikiSym 2009.
By Bios, CC-BY-SA 3.0, from Wikimedia Commons
● Edit diffs to study contribution patterns.
● Different types of content.
● Cross-relation with traffic patterns.
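As an illustration of what studying edit diffs could involve, the following sketch counts the words added and removed between two revisions of a text, using the Python standard library. It is only one possible approach, not a method described in the talk; the function name and sample sentences are hypothetical.

```python
import difflib


def word_diff_stats(old_text, new_text):
    """Return (words_added, words_removed) between two revision texts."""
    old_words = old_text.split()
    new_words = new_text.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    added = removed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed += i2 - i1  # words present only in the old revision
        if tag in ("replace", "insert"):
            added += j2 - j1  # words present only in the new revision
    return added, removed


if __name__ == "__main__":
    old = "Wikipedia is an encyclopedia"
    new = "Wikipedia is a free encyclopedia"
    print(word_diff_stats(old, new))
```

Aggregating such counts per editor would distinguish editors who add large amounts of text from those who mostly make small corrections.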