
Involvement Of Companies In Debian

Published in: Economy & Finance, Technology

  1. Corporate Involvement in FLOSS: Study of Presence in Debian Code over Time
     Gregorio Robles, Santiago Dueñas, Jesús González-Barahona
     Universidad Rey Juan Carlos, libresoft.urjc.es
     Limerick, June 13th 2007
  2. Introduction & Motivation
     • The FLOSS landscape has a wide variety of participants: individuals, universities, foundations... and, of course, companies!
     • Companies in FLOSS have already been studied, especially focusing on:
       - Why? (Building a community, marketing...)
       - How? (Business models, etc.)
       - Where? (Software domains...)
       - ...
  3. Goal & Research Questions
     • To measure the involvement of companies in FLOSS, specifically of those that deliver code to the community
       - How large is the involvement of companies?
       - Has it changed over time?
       - How many companies?
       - Who are the main actors?
  4. The Means
     • Analyze source code available in Debian for contributions by companies
       - More than 10,000 source code packages in Debian 3.1
     • We will apply our methodology from Debian 2.0 (1998) to Debian 3.1 (2005)
  5. Looking for contributions
     • Scanning for copyright statements
     • Companies have a high motivation to be the copyright holders!
  6. Methodology
     • File selection: source code files
       - >30 programming languages
     • 'Ownergrep': heuristic looking for ©
     • Cleaning, multiple entries & merging
     • Avoid double counting
  7. The 'ownergrep'
     • Regular expression
       .*copyright (?:\(c\))?[\d,\-\s:]+(?:by\s+)?([^\d]*)
     • Ownergrep algorithm based on one by Rishab A. Ghosh and Vipul Prakash
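
A minimal sketch of how a pattern like the one on this slide might be applied to a source file header. The escapes used in the code (`\(c\)`, `\d`, `\s`, and an escaped hyphen) and the case-insensitive matching are assumptions, as the slide does not state them explicitly; `find_owners` and the sample header are hypothetical names made up for illustration.

```python
import re

# Assumed reading of the slide's pattern; the escaped hyphen and the
# IGNORECASE flag are choices made for this sketch, not taken from the study.
OWNER_RE = re.compile(
    r".*copyright (?:\(c\))?[\d,\-\s:]+(?:by\s+)?([^\d]*)",
    re.IGNORECASE,
)

def find_owners(text):
    """Return the raw holder string captured on each matching line."""
    owners = []
    for line in text.splitlines():
        match = OWNER_RE.match(line)
        if match:
            # Trim comment debris and trailing punctuation around the name.
            owner = match.group(1).strip(" .,*/")
            if owner:
                owners.append(owner)
    return owners

# Hypothetical file header used only to exercise the pattern.
sample = """/*
 * Copyright (C) 1998, 1999 Example Corp.
 * Copyright 2001-2003 by Jane Doe
 */"""
print(find_owners(sample))
```

The captured strings are still raw at this point: variants of the same holder and joint statements are left for the cleaning and merging steps.
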
  8. Cleaning & Multiple entries
     • Ad-hoc heuristics
       - lower case, removing white spaces, dots, etc.
     • Splitting of joint © statements
       - Spencer Kimball and Peter Mattis
       - IBM corporation and others
     • But... the same copyright holder may appear in (many) different ways!
       - They should really be merged into a single entry
  9. Example: IBM Corporation
     • international business machines
     • international business machines inc
     • international business machines, inc
     • international business machines corp
     • international business machines corporation
     • ibm
     • ibm corp
     • ibm crop
     • ibm deutschland
     • ibm deutschland entwicklung gmbh
     • ibm deutschland entwicklung gmbh, ibm corporation
     • ibm entwicklung gmbh, ibm corporation
     • ... and 7 other ways!
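
The cleaning, splitting and merging steps could be sketched roughly as follows. The alias table is hypothetical and deliberately tiny (the slides say the merging relied on ad-hoc heuristics and manual checks), and the function names `normalize`, `split_joint` and `canonical_holders` are made up for this illustration.

```python
import re

# Hypothetical alias table: maps cleaned variants to one canonical holder.
ALIASES = {
    "ibm": "ibm",
    "ibm corp": "ibm",
    "ibm corporation": "ibm",
    "international business machines": "ibm",
    "international business machines corp": "ibm",
}

def normalize(name):
    """Lower-case, drop dots and commas, collapse whitespace."""
    name = name.lower()
    name = re.sub(r"[.,]", " ", name)
    return re.sub(r"\s+", " ", name).strip()

def split_joint(statement):
    """Split a joint statement such as 'A and B' into separate holders."""
    return [part for part in re.split(r"\s+and\s+", statement) if part]

def canonical_holders(statement):
    """Clean a raw statement, split it, and merge known variants."""
    return [ALIASES.get(part, part) for part in split_joint(normalize(statement))]

print(canonical_holders("Spencer Kimball and Peter Mattis"))
print(canonical_holders("IBM Corporation and others"))
print(canonical_holders("International Business Machines Corp."))
```

A lookup table keeps the merge decisions explicit and reviewable, which fits a workflow where each company is verified by hand; it does not attempt fuzzy matching of misspellings.
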
  10. (Avoiding) Double counting
      • Using a hash for each file
      • Not comparing every pair of files, but only files with the same name
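
One way to read this slide is that a file is counted once per combination of file name and content hash. The sketch below follows that reading; the choice of SHA-256, the function name `count_once`, and the sample paths are assumptions, since the slides do not name a hash function or show the actual procedure.

```python
import hashlib
import posixpath
from collections import defaultdict

def count_once(files):
    """Return the paths to count, given (path, content_bytes) pairs.

    A file is skipped when an earlier file with the same base name had
    identical content. Files with different names are never compared.
    """
    seen = defaultdict(set)  # base name -> digests already counted
    kept = []
    for path, data in files:
        name = posixpath.basename(path)
        digest = hashlib.sha256(data).hexdigest()
        if digest not in seen[name]:
            seen[name].add(digest)
            kept.append(path)
    return kept

# Hypothetical package contents used only to exercise the function.
files = [
    ("pkg-a/src/util.c", b"int f(void);\n"),
    ("pkg-b/lib/util.c", b"int f(void);\n"),  # same name, same content
    ("pkg-b/lib/util.h", b"int f(void);\n"),  # same content, other name
    ("pkg-c/util.c", b"int g(void);\n"),      # same name, other content
]
print(count_once(files))
```

Grouping by file name first keeps the comparison cheap: only digests of equally named files are checked against each other, instead of every file against every other file.
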
  11. Results
      • Contribution doubles every 2 years
      • Companies have been verified manually!!
        - by means of Google searches (thanks Diego!)
  12. Companies vs. rest (with care)
  13. Top 10 contributing companies: Debian 3.1 (Sarge, 2005)
  14. Top 10 contributing companies: Debian 2.0 (Hamm, 1998)
  15. Hall Of Fame
      • SUN (1,2,2,2,1)
      • Netscape (X,1,1,X,7)
      • Silicon Graphics (3,4,4,5,4)
      • IBM (X,X,6,1,2)
      • Xerox (4,6,10,X,X)
      • [Other top companies can be found in the paper]
  16. Summing up
      • Approach to quantify the contribution that companies give back to the community
      • Threats to validity:
        - use of heuristics: ownergrep, cleaning, merging
        - completeness (use of © statements)
      • Source code by companies doubles every two years!
      • Share of contribution has remained almost constant over the last 7 years
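
To get a feel for what "doubles every two years" implies over the 7-year span mentioned on this slide, the arithmetic can be written out as below. This is only a calculation on the stated rate, assuming smooth exponential growth; it is not a figure reported by the study, and `growth_factor` is a name made up for this sketch.

```python
def growth_factor(years, doubling_period=2.0):
    """Growth factor for a quantity that doubles every doubling_period years."""
    return 2.0 ** (years / doubling_period)

# Seven years at one doubling per two years: 2 ** 3.5
print(round(growth_factor(7), 1))  # 11.3
```
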
  17. Any questions?
      Thanks for your attendance and interest!
      More information is available at http://libresoft.urjc.es
