Involvement Of Companies In Debian

From amaneiro, 9 months ago

CC Attribution-ShareAlike License

Slideshow transcript

Slide 1: Corporate Involvement in FLOSS
Study of Presence in Debian Code over Time
Gregorio Robles, Santiago Dueñas, Jesús González-Barahona
Universidad Rey Juan Carlos, libresoft.urjc.es
Limerick, June 13th 2007

Slide 2: Introduction & Motivation
- The FLOSS landscape has a wide variety of participants: individuals, universities, foundations... and, of course, companies!
- Companies in FLOSS have already been studied, especially focusing on the
  - Why? (Building a community, marketing...)
  - How? (Business models, etc.)
  - Where? (Software domains...)
  - ...

Slide 3: Goal & Research Questions
- To measure the involvement of companies in FLOSS, specifically of those that deliver code to the community
  - How much are companies involved?
  - Has it changed over time?
  - How many companies?
  - Who are the main actors?

Slide 4: The Means
- Analyze the source code available in Debian for contributions by companies
  - More than 10,000 source code packages in Debian 3.1
- We will apply our methodology from Debian 2.0 (1998) to Debian 3.1 (2005)

Slide 5: Looking for contributions
- Scanning for copyright statements
- Companies have a high motivation to be the copyright holders!

Slide 6: Methodology
- File selection: source code files
  - >30 programming languages
- 'Ownergrep': heuristic looking for ©
- Cleaning, multiple entries & merging
- Avoid double counting

Slide 7: The 'ownergrep'
- Regular expression: .*copyright (?:\(c\))?[\d,-\s:]+(?:by\s+)?([^\d]*)
- Ownergrep algorithm based on one by Rishab A. Ghosh and Vipul Prakash
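A minimal sketch of how a pattern like the one on this slide could be applied line by line. It assumes the slide's expression is read with its backslash escapes restored (`\(c\)`, `\d`, `\s`), that matching is case-insensitive, and that the hyphen inside the character class is a literal (it is escaped here so that Python's `re` treats it that way). The function name `find_owner` is hypothetical and not taken from the slides.

```python
import re

# Assumed reading of the slide's pattern; escapes and IGNORECASE are assumptions.
OWNER_RE = re.compile(
    r".*copyright (?:\(c\))?[\d,\-\s:]+(?:by\s+)?([^\d]*)",
    re.IGNORECASE,
)


def find_owner(line):
    """Return the text following the copyright notice and years, or None."""
    match = OWNER_RE.match(line)
    if match is None:
        return None
    owner = match.group(1).strip()
    return owner or None


if __name__ == "__main__":
    print(find_owner("# Copyright (C) 1995-1997 Spencer Kimball and Peter Mattis"))
    print(find_owner(" * Copyright 2001, 2002 by IBM Corporation"))
    print(find_owner("int main(void)"))
```

The captured string is still raw at this point: it may name several holders and carries whatever spelling the file used, which is what the cleaning and merging steps address.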

Slide 8: Cleaning & Multiple entries
- Ad-hoc heuristics
  - lower case, removing white spaces, dots, etc.
- Splitting of joint © statements
  - Spencer Kimball and Peter Mattis
  - IBM corporation and others
- But... the same copyright holder may appear in (many) different ways!
  - They should really be merged into a unique one
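The cleaning and splitting steps could look roughly like the sketch below. The exact heuristics used in the study are not given on the slide, so the rules here (lower-casing, dropping dots, collapsing runs of white space, splitting on the word "and") are assumptions, and the names `clean` and `split_holders` are hypothetical.

```python
import re


def clean(name):
    """Normalize one holder name: lower case, no dots, single spaces."""
    name = name.lower().replace(".", "")
    return re.sub(r"\s+", " ", name).strip()


def split_holders(statement):
    """Split a joint copyright statement on 'and', then clean each part."""
    parts = re.split(r"\s+and\s+", statement, flags=re.IGNORECASE)
    return [clean(part) for part in parts if clean(part)]


if __name__ == "__main__":
    print(split_holders("Spencer Kimball and Peter Mattis"))
    print(split_holders("IBM  Corporation and others."))
```

Splitting on "and" is deliberately naive: a holder whose name itself contains "and" would be split in two, which is one reason such heuristics need manual checking.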

Slide 9: Example: IBM Corporation
- international business machines
- international business machines inc
- international business machines, inc
- international business machines corp
- international business machines corporation
- ibm
- ibm corp
- ibm crop
- ibm corporation
- ibm deutschland
- ibm deutschland entwicklung gmbh
- ibm deutschland entwicklung gmbh, ibm corporation
- ibm entwicklung gmbh, ibm corporation
- ... and 7 other ways!
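One simple way to merge such variants is a hand-maintained alias table that maps every known spelling to a single canonical name before counting. The sketch below is an assumption about how merging might be done, not the study's actual procedure; the variant strings come from this slide, while the canonical label "ibm", the name `merge_counts`, and the file counts in the usage example are made up for illustration.

```python
from collections import Counter

# A few of the spellings shown on the slide, all mapped to one canonical name.
IBM_VARIANTS = [
    "international business machines",
    "ibm corp",
    "ibm crop",
    "ibm deutschland",
    "ibm deutschland entwicklung gmbh",
    "ibm",
]
ALIASES = {variant: "ibm" for variant in IBM_VARIANTS}


def merge_counts(file_counts):
    """Sum per-holder file counts after mapping each holder to its canonical name."""
    merged = Counter()
    for holder, count in file_counts.items():
        merged[ALIASES.get(holder, holder)] += count
    return merged


if __name__ == "__main__":
    # Illustrative numbers only, not results from the study.
    sample = {
        "ibm corp": 3,
        "ibm crop": 1,
        "international business machines": 2,
        "sun microsystems": 5,
    }
    print(merge_counts(sample))
```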

Slide 10: (Avoiding) Double counting
- Using a hash for each file
- Hashes are not compared 2-to-2 across all files, but only among files with the same filename
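The idea of hashing each file and comparing only files that share a filename can be sketched as follows. The hash function used in the study is not stated, so SHA-256 is an assumption here, as are the function name `unique_files` and the in-memory `(path, content)` input format, chosen to keep the example self-contained.

```python
import hashlib
import posixpath


def unique_files(files):
    """Keep the first copy of each (filename, content hash) pair.

    files: iterable of (path, content_bytes) tuples.
    Two files count as duplicates only if they share both the same
    base filename and the same content hash.
    """
    seen = set()
    kept = []
    for path, content in files:
        key = (posixpath.basename(path), hashlib.sha256(content).hexdigest())
        if key in seen:
            continue  # same name and same content: already counted
        seen.add(key)
        kept.append(path)
    return kept


if __name__ == "__main__":
    files = [
        ("pkg-a/src/util.c", b"int x;"),
        ("pkg-b/lib/util.c", b"int x;"),   # duplicate of the first
        ("pkg-c/util.c", b"int y;"),       # same name, different content
        ("pkg-d/other.c", b"int x;"),      # same content, different name
    ]
    print(unique_files(files))
```

Keying on the filename first keeps the comparison cheap, at the cost of missing copies that were renamed.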

Slide 11: Results
- Contribution doubles every 2 years
- Companies have been verified manually!!
  - by means of Google searches (thanks Diego!)
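As a rough illustration of what "doubling every 2 years" implies, the growth can be written as an exponential. The symbols C, C_0 and t_0 are introduced here for illustration only, and the calculation assumes the doubling rate held exactly over the whole period studied (Debian 2.0 in 1998 to Debian 3.1 in 2005).

```latex
C(t) = C_0 \cdot 2^{(t - t_0)/2}, \qquad
\frac{C(2005)}{C(1998)} = 2^{7/2} \approx 11.3
```

Under that assumption, company-contributed code would have grown about elevenfold across the seven years covered.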

Slide 12: Companies vs. rest (with care)

Slide 13: Top 10 Contributing Companies: Debian 3.1 (Sarge, 2005)

Slide 14: Top 10 Contributing Companies: Debian 2.0 (Hamm, 1998)

Slide 15: Hall Of Fame
- SUN (1,2,2,2,1)
- Netscape (X,1,1,X,7)
- Silicon Graphics (3,4,4,5,4)
- IBM (X,X,6,1,2)
- Xerox (4,6,10,X,X)
- [Other top companies can be found in the paper]

Slide 16: Summing up
- Approach to quantify the contribution that companies give back to the community
- Threats to validity:
  - use of heuristics: ownergrep, cleaning, merging
  - completeness (use of © statements)
- Source code by companies doubles every two years!
- The share of contribution has remained almost constant over the last 7 years

Slide 17: Any questions? Thanks for your attendance and interest! More information is available at http://libresoft.urjc.es