Challenges in Software Ecosystem Research

In this talk we discuss the results of a survey of software ecosystem researchers conducted in October-December 2014. Researchers were asked to identify current trends in ecosystem research as well as the challenges the research community has to address in the coming years. We complement the trends identified by the community with a review of some recent results on software ecosystems.

Published in: Software

  1. 1. Challenges in Software Ecosystems Research Alexander Serebrenik Eindhoven University of Technology The Netherlands @aserebrenik Tom Mens UMons Belgium @tom_mens
  2. 2. Software ecosystems in scientific literature [Figure: publications per year, 1996-2014, y-axis 0-500; series: Scholar full text, DBLP titles]
  3. 3. Future challenges?
  4. 4. Definition of an ecosystem Example of an ecosystem Trends and challenges
  5. 5. 164 authors of an article or a book chapter on SECO, or of a paper at IWSECO, WEA or Big Systems 2014; 141 authors with a valid email address; 26* answered the survey. * Response rate 18.4%, comparable with other surveys
  6. 6. Definition of an ecosystem Respondent: “Defining everything as an ecosystem. <…> The word is trend-ish and it causes misunderstandings in the field.”
  7. 7. “The complex system of plant, animal, fungal, and microorganism communities and their associated non-living environment interacting as an ecological unit. Ecosystems have no fixed boundaries”
  8. 8. Definition of an ecosystem
      <biological> | [Lungu 2008]                 | [Jansen et al. 2009]                                        | [Manikas, Hansen 2013]
      communities  | software projects            | actors                                                      | actors
      environment  | environment                  | shared market for software and services, shared platform    | common technological platform
      interaction  | developed and evolve together | exchange of information, resources & artefacts             | symbiotic relationships
  9. 9. social technical economical [Lungu 2008] [Bosch, Bosch-Sijtsema 2009] [*Moore 1993] [Jansen et al. 2009] [Mitleton-Kelly 2003] [Manikas, Hansen 2013] Definition of an ecosystem
  10. 10. companies, app stores, OS foundations, programming languages, operating systems, forges & social ecosystems. Example of an ecosystem. Based on the literature
  15. 15. Definition of an ecosystem Example of an ecosystem Respondent: “Defining everything as an ecosystem. <…> The word is trend-ish and it causes misunderstandings in the field.” social, economical, technical: different perspectives on the same artefacts, or different artefacts altogether?
  16. 16. Trends and challenges: 26 survey answers and a literature study yield 29 challenges in 8 categories
  17. 17. “One challenge is to be able to characterize the wealth of the community wrt the wealth of the software components. What is the impact of different collaboration and development practices on the quality of the ecosystem?” Trends and challenges
  18. 18. “One challenge is to be able to characterize the wealth of the community wrt the wealth of the software components. What is the impact of different collaboration and development practices on the quality of the ecosystem?” Trends and challenges (ecosystem quality, socio-technical)
  19. 19. “One challenge is to be able to characterize the wealth of the community wrt the wealth of the software components. What is the impact of different collaboration and development practices on the quality of the ecosystem?” Trends and challenges (ecosystem quality, socio-technical). SECOs may consist of many systems. Analysing all these systems as a whole may raise some technical problems, due to the quantity of data to take into account. (data analytics, amount (volume)). Large databases with comparable information about the details of a large collection of ecosystems, so that any research could be conducted in a repeatable and comparable way. (database of comparable info, reproducible research)
  20. 20. Software Ecosystems are/lead to Big Data
  21. 21. Privacy: digital trace data (male, likes games, NYC)
  22. 22. Privacy: surveys Minority respondents are easy to identify Reproducibility vs privacy
  23. 23. Non-sensitive: Zip, Age, Nationality. Sensitive: Condition.
      #  | Zip   | Age | Nationality | Condition
      1  | 13053 | 28  | Russian     | Heart Disease
      2  | 13068 | 29  | American    | Heart Disease
      3  | 13068 | 21  | Japanese    | Viral Infection
      4  | 13053 | 23  | American    | Viral Infection
      5  | 14853 | 50  | Indian      | Cancer
      6  | 14853 | 55  | Russian     | Heart Disease
      7  | 14850 | 47  | American    | Viral Infection
      8  | 14850 | 49  | American    | Viral Infection
      9  | 13053 | 31  | American    | Cancer
      10 | 13053 | 37  | Indian      | Cancer
      11 | 13068 | 36  | Japanese    | Cancer
      12 | 13068 | 35  | American    | Cancer
  24. 24. Non-sensitive: Zip, Age, Nationality. Sensitive: Condition.
      #  | Zip   | Age   | Nationality | Condition
      1  | 130** | <30   | *           | Heart Disease
      2  | 130** | <30   | *           | Heart Disease
      3  | 130** | <30   | *           | Viral Infection
      4  | 130** | <30   | *           | Viral Infection
      5  | 1485* | >40   | *           | Cancer
      6  | 1485* | >40   | *           | Heart Disease
      7  | 1485* | >40   | *           | Viral Infection
      8  | 1485* | >40   | *           | Viral Infection
      9  | 130** | 30-40 | *           | Cancer
      10 | 130** | 30-40 | *           | Cancer
      11 | 130** | 30-40 | *           | Cancer
      12 | 130** | 30-40 | *           | Cancer
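The generalized table resembles the usual illustration of k-anonymity, although the slides do not name the technique. A minimal sketch, assuming Zip, Age and Nationality act as quasi-identifiers (the names `ROWS`, `group_sizes` and `conditions_per_group` are hypothetical), counts how many records share each generalized combination:

```python
from collections import Counter, defaultdict

# Rows copied from the generalized table: (zip, age, nationality, condition).
ROWS = [
    ("130**", "<30", "*", "Heart Disease"),
    ("130**", "<30", "*", "Heart Disease"),
    ("130**", "<30", "*", "Viral Infection"),
    ("130**", "<30", "*", "Viral Infection"),
    ("1485*", ">40", "*", "Cancer"),
    ("1485*", ">40", "*", "Heart Disease"),
    ("1485*", ">40", "*", "Viral Infection"),
    ("1485*", ">40", "*", "Viral Infection"),
    ("130**", "30-40", "*", "Cancer"),
    ("130**", "30-40", "*", "Cancer"),
    ("130**", "30-40", "*", "Cancer"),
    ("130**", "30-40", "*", "Cancer"),
]


def group_sizes(rows):
    """Count records per combination of the assumed quasi-identifiers."""
    return Counter(row[:3] for row in rows)


def conditions_per_group(rows):
    """Collect the distinct sensitive values seen in each group."""
    groups = defaultdict(set)
    for *quasi, condition in rows:
        groups[tuple(quasi)].add(condition)
    return dict(groups)


# Smallest group size: every record is indistinguishable from at least k - 1 others.
k = min(group_sizes(ROWS).values())
print("k =", k)
for group, conditions in conditions_per_group(ROWS).items():
    print(group, sorted(conditions))
```

Every group holds four records, yet the 30-40 group contains a single condition, so knowing that a person falls in that group still reveals the sensitive value.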
  26. 26. Are some challenges more important than others?
  27. 27. Second survey • Group A: respondents of the first survey who provided their email addresses: 26 answers, 20 with email (invited), 14 responses (70%) • Group B: extended list of ecosystem experts (outside Group A): 148 invited, 142 valid addresses, 38* responses (~27%) • Better response rate: 32.1% vs 18.4% (first survey) * One of the respondents who provided an email was not invited
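The percentages on this slide can be rechecked with a few lines of arithmetic. This is only a sketch: it assumes the 32.1% figure pools both groups over the invited Group A members plus the valid Group B addresses, and `response_rate` is a hypothetical helper name.

```python
def response_rate(responses, invited):
    """Return the response rate as a percentage, rounded to one decimal."""
    return round(100 * responses / invited, 1)


first_survey = response_rate(26, 141)             # first survey: 26 of 141 valid addresses
group_a = response_rate(14, 20)                   # Group A: 14 of 20 invited
group_b = response_rate(38, 142)                  # Group B: 38 of 142 valid addresses
second_survey = response_rate(14 + 38, 20 + 142)  # assumption: both groups pooled

print(first_survey, group_a, group_b, second_survey)
```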
  28. 28. No difference between Group A and Group B • Analysis of Similarities (ANOSIM): R = -0.07564 (the closer R is to 1, the more dissimilar the groups) • Permutational Multivariate Analysis of Variance Using Distance Matrices (ADONIS): p-value = 0.192 [Image: Adonis, Unknown, restored by Duquesnoy (1597–1643), Louvre]
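To make the R statistic concrete, here is a minimal sketch of the textbook ANOSIM definition, R = (mean between-group rank - mean within-group rank) / (M / 2), where M is the number of pairs. The data, the function names and the distance function are hypothetical; the survey answers and the tooling used in the talk are not shown on the slide, and the permutation test that yields a p-value is omitted.

```python
from itertools import combinations


def average_ranks(values):
    """Rank values starting at 1; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def anosim_r(points, labels, distance):
    """Compute the ANOSIM R statistic from ranked pairwise dissimilarities."""
    pairs = list(combinations(range(len(points)), 2))
    ranks = average_ranks([distance(points[i], points[j]) for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if labels[i] == labels[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if labels[i] != labels[j]]
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return (mean_between - mean_within) / (len(pairs) / 2)


# Hypothetical one-dimensional data: two clearly separated groups.
points = [0, 1, 2, 10, 11, 12]
labels = ["A", "A", "A", "B", "B", "B"]
r_separated = anosim_r(points, labels, lambda a, b: abs(a - b))
print(r_separated)
```

Perfectly separated groups give R = 1, while a value near zero (such as the slightly negative one on the slide) indicates that between-group dissimilarities are no larger than within-group ones.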
  29. 29. Ordering challenges 1. Consider both groups as one set of answers 2. Per question, count the triple (#very important, #moderately important, #slightly important) 3. Sort the questions by lexicographic order on these triples
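The ordering procedure can be sketched as follows. The counts are hypothetical, not the survey results, and the sketch assumes that larger counts rank first.

```python
# Hypothetical answer counts per challenge:
# (#very important, #moderately important, #slightly important).
counts = {
    "challenge A": (30, 15, 5),
    "challenge B": (30, 18, 2),
    "challenge C": (25, 25, 2),
}


def rank_challenges(counts):
    """Sort challenge names by their triples, highest first.

    Python compares tuples lexicographically, so ties on the first
    count are broken by the second, then by the third.
    """
    return sorted(counts, key=lambda name: counts[name], reverse=True)


ranking = rank_challenges(counts)
print(ranking)
```

Challenges A and B tie on "very important" answers, so B wins on "moderately important"; C comes last despite having the most "moderately important" answers.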
  30. 30. Top Three 1. Reproducible and Comparable Research [Providing databases with information about the details of a large collection of ecosystems] 2. Reproducible and Comparable Research [Making research results about ecosystems available in a reproducible way] 3. Offer more advanced ecosystems analysis (e.g., case studies, qualitative and quantitative analysis) [Use more advanced statistical techniques (e.g., survival analysis, econometric aggregation, contrasts)]
  31. 31. 4. Understanding and improving the design, architecture, quality and health of software ecosystems [Socio-technical perspective, e.g., comparing the health of the community against the health of the ecosystem components] 5. Ecosystem Governance [Design perspective, e.g., actively supporting the stakeholders' decisions] 6. Understanding and improving an ecosystem's dynamics and evolution [Generalisation perspective, e.g., transferring insights from evolution of individual software systems to evolution of ecosystems] 7. Understanding and improving the design, architecture, quality and health of software ecosystems [Social perspective, e.g., creating an active community around the ecosystem] 8. Interdisciplinary research [Applying ecosystem research techniques to non-classical software ecosystems, e.g., spreadsheets or Matlab Simulink models] 9. Understanding and improving an ecosystem's dynamics and evolution [Design perspective, e.g., providing upgrade strategies when one of the ecosystem elements changes] 10. Ecosystem Governance [Generalisation perspective, e.g., going beyond anecdotal evidence]
  32. 32. Reproducible Research: SE problem?
      Raw data | Processed data set | Tools & scripts | #MSR papers 2004-2009
      Y        | Y                  | Y               | 2
      Y        | Y                  | N               | 2
      Y        | P                  | Y               | 1
      Y        | P                  | P               | 2
      Y        | P                  | N               | 2
      Y        | N                  | Y               | 16
      Y        | N                  | P               | 19
      Y        | N                  | N               | 64
      P        | N                  | Y               | 1
      P        | N                  | N               | 2
      N        | Y                  | N               | 2
      N        | P                  | N               | 1
      N        | N                  | Y               | 7
      N        | N                  | P               | 2
      N        | N                  | N               | 31
      N/A      | N/A                | N/A             | 17
      Source: Robles 2010
      Ghezzi, Gall 2013: • Replicated 25 • Partially 27 • Not replicated 36
  33. 33. Reproducible and Comparable Research [Providing databases with information about the details of a large collection of ecosystems] Enough? Too big to share? Up-to-date? Still relevant? 1TB
  34. 34. Culture http://www.nickcobbcopywriter.com/wp-content/uploads/2013/03/whats-in-it-for-me.jpg
  35. 35. Advanced statistics 3. Offer more advanced ecosystems analysis (e.g., case studies, qualitative and quantitative analysis) [Use more advanced statistical techniques (e.g., survival analysis, econometric aggregation, contrasts)]
  36. 36. Advanced statistics Two distributions: • t-test • Mann-Whitney Multiple distributions: 1. ANOVA / KW 2. pairwise t-test / MW Tests can be inconsistent with each other. We need a one-phase test! [slide dated 11/08/15]
  37. 37. Advanced statistics Idea:
      Pair | Low   | High
      B-A  | -0.56 | -0.44
      C-A  | -0.50 | -0.31
      D-A  | -0.32 | -0.03
      C-B  | -0.01 | 0.24
      D-B  | 0.24  | 0.47
      D-C  | 0.09  | 0.40
      A→B, A→C, A→D, D→B, D→C
      Konietschke, F., Hothorn, L.A., and Brunner, E. Rank-based multiple test procedures and simultaneous confidence intervals. Electron. J. Stat. 6 (2012), 738–759.
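The arrows on this slide can be reproduced from the Low/High columns. The sketch below assumes that Low and High bound an interval for each pair X-Y, that an interval containing zero yields no arrow, and that an interval lying entirely below zero is drawn as Y→X (entirely above zero as X→Y). This reading is inferred from the slide's own numbers rather than stated on it, and `significant_edges` is a hypothetical name.

```python
# Intervals copied from the slide: (X, Y) -> (low, high) for the pair "X-Y".
INTERVALS = {
    ("B", "A"): (-0.56, -0.44),
    ("C", "A"): (-0.50, -0.31),
    ("D", "A"): (-0.32, -0.03),
    ("C", "B"): (-0.01, 0.24),
    ("D", "B"): (0.24, 0.47),
    ("D", "C"): (0.09, 0.40),
}


def significant_edges(intervals):
    """Return an arrow for every pair whose interval excludes zero."""
    edges = []
    for (x, y), (low, high) in intervals.items():
        if high < 0:
            edges.append((y, x))   # interval entirely below zero: Y -> X
        elif low > 0:
            edges.append((x, y))   # interval entirely above zero: X -> Y
        # otherwise the interval contains zero and no arrow is drawn
    return edges


edges = significant_edges(INTERVALS)
print(", ".join(f"{a}→{b}" for a, b in edges))
```

Only C-B straddles zero, which is why the slide shows five arrows for six pairs. All decisions come from one set of intervals, so they cannot contradict each other the way a two-phase procedure can.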
  38. 38. T̃ and Software Ecosystems • Stack Overflow and GitHub - Vasilescu et al. SocialCom 2013 • Simulink models - Dajsuren et al. QoSA 2013 • GNOME - Vasilescu et al. ESE 2014 • Stack Exchange sites - Wang et al. ICSME 2014 • jEdit, ArgoUML, KOffice - Sun et al. Inf & Software Technology 2015
  39. 39. Advanced statistics Mean, median, sum; Gini, Theil, Kolm… Choice of an aggregation technique provides different insights but can also affect the validity of the results! C. Gini, “Measurement of inequality of incomes,” The Economic Journal, 1921. H. Theil, Economics and Information Theory. North-Holland, 1967. A.B. Atkinson, “On the measurement of inequality,” Journal of Economic Theory, 1970. …
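A small sketch shows why the choice of aggregation matters. The two metric distributions are hypothetical, and the Gini and Theil formulas are the standard textbook ones rather than anything specific to the cited studies.

```python
import math
from statistics import mean, median


def gini(values):
    """Gini coefficient: mean absolute difference over twice the mean."""
    n = len(values)
    mu = mean(values)
    total = sum(abs(x - y) for x in values for y in values)
    return total / (2 * n * n * mu)


def theil(values):
    """Theil index; defined here for strictly positive values only."""
    mu = mean(values)
    return sum((x / mu) * math.log(x / mu) for x in values) / len(values)


# Hypothetical metric values (e.g. size per component) with the same mean.
equal = [10, 10, 10, 10]
skewed = [1, 1, 1, 37]

for name, data in (("equal", equal), ("skewed", skewed)):
    print(name, mean(data), median(data), round(gini(data), 3), round(theil(data), 3))
```

Both samples have the same mean and sum, yet the median and the inequality indices separate them clearly, so a study that reports only the mean would treat them as identical.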
  40. 40. Gini, Theil & Software Ecos • Qualitas - Spasojević et al. ICSME 2014 • GNOME - Mens, Goeminne IWSECO 2011, Vasilescu et al. ESE 2014 • Debian - Serebrenik, vd Brand ICSM 2010 • Market shares - Yu, First Monday 2012
  41. 41. Advanced statistics % of entities still used after time t? Kaplan, E. L.; Meier, P. (1958). "Nonparametric estimation from incomplete observations". J. Amer. Statist. Assn. 53 (282): 457–481
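The question on this slide is what the Kaplan-Meier estimator answers: S(t) is the product, over observed event times t_i ≤ t, of (1 - d_i / n_i), with d_i events among n_i entities still at risk. A minimal sketch with hypothetical data follows; `kaplan_meier` is an illustrative name, not a function from any of the cited studies.

```python
def kaplan_meier(durations, observed):
    """Return [(t, S(t))] for each time at which an event was observed.

    durations[i]: time until the event, or until observation stopped.
    observed[i]:  True if the event happened, False if the entity was
                  still in use when observation ended (censored).
    """
    event_times = sorted({t for t, e in zip(durations, observed) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, e in zip(durations, observed) if d == t and e)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical lifetimes (in months) of five entities; two are censored.
durations = [2, 3, 3, 5, 8]
observed = [True, True, False, True, False]
curve = kaplan_meier(durations, observed)
for t, s in curve:
    print(t, round(s, 2))
```

Censored entities count as "at risk" until they drop out of observation, which is what lets the estimator use incomplete histories instead of discarding them.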
  42. 42. Survival & Software Ecos • FLOSSMetrics DB - Samoladas et al. Information & Software Technology 2010 • Debian packages - Claes et al. MSR 2015 • Databases in Java projects - Goeminne, Mens ICSME 2015
  43. 43. Threats to validity • Representativeness of the respondents wrt the research community
  44. 44. National Oceanic and Atmospheric Administration, USA
