Slideshow transcript
Slide 1: Corporate Involvement in FLOSS: Study of Presence in Debian Code over Time. Gregorio Robles, Santiago Dueñas, Jesús González-Barahona. Universidad Rey Juan Carlos, libresoft.urjc.es. Limerick, June 13th 2007
Slide 2: Introduction & Motivation. The FLOSS landscape has a wide variety of participants: individuals, universities, foundations... and, of course, companies! Companies in FLOSS have already been studied, especially focusing on: Why? (building a community, marketing...); How? (business models, etc.); Where? (software domains...); ...
Slide 3: Goal & Research Questions. Goal: to measure the involvement of companies in FLOSS, specifically of those that deliver code to the community. Questions: How much are companies involved? Has it changed over time? How many companies? Who are the main actors?
Slide 4: The Means. Analyze the source code available in Debian for contributions by companies. More than 10,000 source code packages in Debian 3.1. We will apply our methodology from Debian 2.0 (1998) to Debian 3.1 (2005)
Slide 5: Looking for contributions. Scanning for copyright statements. Companies have a high motivation to be the copyright holders!
Slide 6: Methodology. File selection: source code files (>30 programming languages). 'Ownergrep': heuristic looking for ©. Cleaning, multiple entries & merging. Avoid double counting
Slide 7: The 'ownergrep' Regular expression: .*copyright (?:\(c\))?[\d,\-\s:]+(?:by\s+)?([^\d]*) Ownergrep algorithm based on one by Rishab A. Ghosh and Vipul Prakash
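A minimal sketch of how such an expression could be applied to one line of a source file. The expression in the transcript appears to have lost its backslashes; the form used here restores them and is an assumption, as are the function name `extract_holder` and the case-insensitive matching.

```python
import re

# Assumed restoration of the slide's expression (backslashes re-added).
OWNERGREP = re.compile(
    r".*copyright (?:\(c\))?[\d,\-\s:]+(?:by\s+)?([^\d]*)",
    re.IGNORECASE,
)

def extract_holder(line):
    """Return the text following a copyright notice, or None if absent."""
    match = OWNERGREP.match(line)
    if match is None:
        return None
    holder = match.group(1).strip()
    return holder or None

if __name__ == "__main__":
    print(extract_holder(" * Copyright (c) 1998, 2000 IBM Corporation"))
    print(extract_holder("Copyright 1995-1997 by Spencer Kimball and Peter Mattis"))
```

The captured group stops at the first digit, so trailing years or version numbers are not included in the holder name.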
Slide 8: Cleaning & Multiple entries. Ad-hoc heuristics: lower case, removing white spaces, dots, etc. Splitting of joint © statements, e.g. "Spencer Kimball and Peter Mattis", "IBM corporation and others". But... the same copyright holder may appear in (many) different ways! They should really be merged into a unique one
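The cleaning and splitting steps might look roughly like the sketch below. The exact rules used in the study are not given on the slide; the punctuation set, the split on "and", and the decision to drop the placeholder "others" are all assumptions, and `clean_holder` and `split_joint` are hypothetical names.

```python
import re

def clean_holder(raw):
    """Lower-case a holder name, drop dots and similar marks, tidy spaces."""
    name = raw.lower()
    name = re.sub(r"[.,;]", "", name)          # assumed punctuation set
    return re.sub(r"\s+", " ", name).strip()

def split_joint(raw):
    """Split a joint statement on 'and' and clean each part."""
    parts = re.split(r"\s+and\s+", raw, flags=re.IGNORECASE)
    cleaned = [clean_holder(part) for part in parts]
    # Assumption: 'others' names no concrete holder, so it is dropped.
    return [name for name in cleaned if name and name != "others"]

if __name__ == "__main__":
    print(split_joint("Spencer Kimball and Peter Mattis"))
    print(split_joint("IBM corporation and others"))
```

Splitting on "and" is a blunt rule: a holder whose own name contains "and" would be split incorrectly, which is one reason such heuristics count as a threat to validity.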
Slide 9: Example: IBM Corporation international business machines international business machines inc corporation ibm deutschland entwicklung international business machines gmbh, ibm corporation corp international business machines ibm crop corp international business machines international business machines ibm entwicklung gmbh, ibm corporation and corporation ibm corp ibm ibm deutschland entwicklung gmbh ibm deutschland international business machines, ... and 7 other ways! inc
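One simple way to merge such variants is an alias table that maps every cleaned spelling to a single canonical name. The sketch below uses a few spellings that appear on the slide; the table, the canonical key "ibm", the function name `merge_holders`, and the counts in the usage example are illustrative assumptions, not the study's data or method.

```python
from collections import Counter

# Hypothetical alias table: cleaned variant -> canonical holder.
ALIASES = {
    "international business machines": "ibm",
    "international business machines inc": "ibm",
    "ibm corp": "ibm",
    "ibm corporation": "ibm",
    "ibm deutschland entwicklung gmbh": "ibm",
}

def merge_holders(counts):
    """Sum per-holder counts after mapping each variant to its canonical name."""
    merged = Counter()
    for name, amount in counts.items():
        merged[ALIASES.get(name, name)] += amount
    return merged

if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(merge_holders({"ibm corp": 2, "ibm corporation": 3, "sun microsystems": 4}))
```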
Slide 10: (Avoiding) Double counting. Using a hash for each file. Files are not compared pairwise (2-to-2), but only when they share the same filename
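A sketch of duplicate detection restricted to files with the same name: a file counts as a duplicate only if an earlier file had both the same base name and the same content hash. The slide does not say which hash function was used; SHA-256 and the function name `unique_files` are assumptions.

```python
import hashlib
import os

def unique_files(files):
    """files: iterable of (path, content_bytes). Return the paths to keep."""
    seen = set()
    kept = []
    for path, content in files:
        # Key on (filename, hash): identical content under a different
        # filename is deliberately not treated as a duplicate.
        key = (os.path.basename(path), hashlib.sha256(content).hexdigest())
        if key not in seen:
            seen.add(key)
            kept.append(path)
    return kept

if __name__ == "__main__":
    sample = [
        ("a/util.c", b"x"),
        ("b/util.c", b"x"),   # same name, same content: duplicate
        ("c/other.c", b"x"),  # same content, different name: kept
        ("d/util.c", b"y"),   # same name, different content: kept
    ]
    print(unique_files(sample))
```

Keying on the filename avoids comparing every file against every other one, at the cost of missing copies that were renamed.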
Slide 11: Results. Contribution doubles every 2 years. Companies have been verified manually!! by means of Google searches (thanks Diego!)
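To get a feel for what "doubling every 2 years" implies, the short calculation below assumes idealized exponential growth with a fixed doubling time. That is a simplification of the reported trend, and `growth_factor` is a hypothetical helper.

```python
def growth_factor(years, doubling_time=2.0):
    """Multiplicative growth after `years`, assuming a constant doubling time."""
    return 2 ** (years / doubling_time)

if __name__ == "__main__":
    # One year: about 1.41x, i.e. roughly 41% growth per year.
    print(round(growth_factor(1), 3))
    # Seven years (Debian 2.0 in 1998 to Debian 3.1 in 2005): about 11.3x.
    print(round(growth_factor(7), 1))
```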
Slide 12: Companies vs. rest (with care)
Slide 13: Top10-contributing Companies Debian 3.1 Sarge 2005
Slide 14: Top10-contributing Companies Debian 2.0 Hamm 1998
Slide 15: Hall Of Fame. SUN (1,2,2,2,1); Netscape (X,1,1,X,7); Silicon Graphics (3,4,4,5,4); IBM (X,X,6,1,2); Xerox (4,6,10,X,X). [Other top companies can be found in the paper]
Slide 16: Summing up. Approach to quantify the contribution that companies give back to the community. Threats to validity: use of heuristics (ownergrep, cleaning, merging); completeness (use of © statements). Source code by companies doubles every two years! The share of contribution has remained almost constant over the last 7 years
Slide 17: Any questions? Thanks for your attendance and interest! More information is available at http://libresoft.urjc.es





