
Invited Talk MESOCA 2014: Evolving software systems: emerging trends and challenges

Software evolution research is a thriving area of software engineering research. Recent years have seen a growing interest in a variety of evolution topics, as witnessed by the growing number of publications dedicated to the subject. Without attempting to be complete, in this talk we provide an overview of emerging trends in software evolution research, such as the extension of the traditional boundaries of software, growing attention for social and socio-technical aspects of software development processes, and interdisciplinary research that applies techniques from other research areas to study software evolution, and software evolution research techniques to other research areas. As a large body of software evolution research is empirical in nature, we are confronted by important challenges pertaining to the reproducibility of the research and its generalizability.

Published in: Science

  1. Software Evolution anno 2014: directions and challenges. Alexander Serebrenik, @aserebrenik, a.serebrenik@tue.nl
  2. Time for a new book!
  3. 2008 vs. 2014: From systems to ecosystems
  4. Business-oriented view: “a set of actors functioning as a unit and interacting with a shared market for software and services, together with the relationships among them.” With thanks to International Data Corporation (IDC)
  5. Development-centric view: a collection of software projects that are developed and evolve together in the same environment. With thanks to Bram Adams
  6. Socio-technical view: a community of persons (end-users, developers, debuggers, …) contributing to a collection of projects
  7. Technical, Scientific, Practical, Legal and ethical
  8. Technical challenges
  9. Technical challenges: • eliminate non-names • eliminate specific quirks • group “similar” names (first/last name, textual similarity, latent semantic analysis) • (correct groups manually)
  10. Technical challenges: • eliminate non-names • eliminate specific quirks • group “similar” names (first/last name, textual similarity, latent semantic analysis) • (correct groups manually)
  11. Technical challenges: Structured data 2008 Unstructured data 2014
  12. Technical challenges: Structured data 2008 Unstructured data 2014
  13. Scientific challenges
  14. Scientific challenges
      Raw data   Processed data set   Tools & scripts   #MSR papers 2004-2009
      Y          Y                    Y                 2
      Y          Y                    N                 2
      Y          P                    Y                 1
      Y          P                    P                 2
      Y          P                    N                 2
      Y          N                    Y                 16
      Y          N                    P                 19
      Y          N                    N                 64
      P          N                    Y                 1
      P          N                    N                 2
      N          Y                    N                 2
      N          P                    N                 1
      N          N                    Y                 7
      N          N                    P                 2
      N          N                    N                 31
      N/A        N/A                  N/A               17
      We share raw data but rarely share tools – reinventing the wheel anybody?
  15. Practical challenges: • How can we share our big data with other researchers? • Different formats, different tools, storage problems, … • How can we make our research results useful to practitioners and development communities? • How can we build tools and dashboards that integrate our findings?
  16. http://www.intracto.com/blog/online-privacy-belangrijk Legal and ethical challenges (especially for survey data)
  17. k-anonymity
  18. k-anonymity, l-diversity, t-closeness
  19. 2008 vs. 2014: From “traditional” to “non-traditional” artifacts: What is software?
  20. http://ctms.engin.umich.edu/CTMS/index.php?example=Introduction&section=SimulinkModeling Maintainability??? Evolution???
  21. BumbleBee: a refactoring tool for spreadsheets. With thanks to Felienne Hermans
  22. http://help.eclipse.org/juno/index.jsp?topic=%2Forg.eclipse.m2m.atl.doc%2Fguide%2Fconcepts%2FModel-Transformation.html
  23. • describe evolutionary steps • relate to changes of other artifacts • describe prevalence in practice • support automation http://help.eclipse.org/juno/index.jsp?topic=%2Forg.eclipse.m2m.atl.doc%2Fguide%2Fconcepts%2FModel-Transformation.html
  24. New kind of verification artifacts 2008 2009 2012 2013
  25. 2008 vs. 2014: From technical to socio-technical perspective: Who are these people? What do they do?
  26. MEN: > 90% in WordPress & Drupal, > 95% in FLOSS surveys, > 87% in GNOME, > 70% in software-related jobs (NSF)
  27. FLOSS 2013 Europe, US, CA, AU Brazil/Argentina
  28. Technical challenges: How can we reliably and efficiently identify gender, age, location?
  29. Name + Location = Gender
  30. Lonzo ⇒ Alonzo, w35l3y ⇒ wesley. Name + Location = Gender
  31. Heuristics: title + first h1. HTML: <title>Ben Kamens</title> … <h1>We’re willing to be embarrassed about what we <em>haven’t</em> done…</h1> Extracted text: Ben Kamens We’re willing to be embarrassed about what we haven’t done… Stanford Named Entity Tagger: <PERSON>Ben Kamens</PERSON> We’re willing to be embarrassed about what we haven’t done…
  32. Quality of gender resolution: Survey
      Self-identification   As inferred: M   F    ?    Total
      M                                  60  3    43   106
      F                                  2   5    4    11
      + avatars, other social media sites (manually)
      Self-identification   As inferred: M   F    ?    Total
      M                                  90  3    13   106
      F                                  2   9    0    11
  33. .cpp .po .jpg /test/ /library/ .doc makefile .sql .conf (slide dated 22-9-2014)
  34. Occasional contributors, Frequent contributors
  35. Technical challenges: How can we reliably and efficiently identify human activities?
  36. Technical challenges: How can we reliably and efficiently identify human activities?
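
Sketch for slides 9-10 (identity merging). The bullets there list three steps: eliminate non-names, eliminate quirks, group "similar" names. The Python below is only a minimal illustration of those steps, not the tooling used in the talk. The `NON_NAMES` set, the 0.8 threshold, and the use of `difflib` as the textual-similarity measure are all assumptions; latent semantic analysis and the manual correction step are not covered.

```python
import re
from difflib import SequenceMatcher

# Hypothetical list of aliases that are not personal names (assumption).
NON_NAMES = {"root", "admin", "unknown", "nobody", "info"}


def normalize(alias):
    """Remove common quirks: case, e-mail domain, separators, digits.

    Note: this keeps ASCII letters only, so it is lossy for names with
    accents or non-Latin scripts; a real pipeline would need more care.
    """
    alias = alias.lower().split("@")[0]
    alias = re.sub(r"[._\-]+", " ", alias)
    alias = re.sub(r"[^a-z ]", "", alias)
    return " ".join(alias.split())


def similar(a, b, threshold=0.8):
    """Textual similarity via difflib's ratio (one possible measure)."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def group_aliases(aliases, threshold=0.8):
    """Greedily put each alias into the first group with a similar member."""
    groups = []
    for alias in aliases:
        name = normalize(alias)
        if not name or name in NON_NAMES:
            continue  # step 1: eliminate non-names
        for group in groups:
            if any(similar(name, normalize(m), threshold) for m in group):
                group.append(alias)
                break
        else:
            groups.append([alias])
    return groups


example = ["John Smith", "john.smith@example.org", "root", "Jane Doe"]
groups = group_aliases(example)
```

Greedy grouping depends on input order and can both over-merge and under-merge, which is presumably why the slide ends with "(correct groups manually)".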
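
Sketch for slides 17-18 (k-anonymity, l-diversity). The slides only name the privacy models, so the following is a minimal check based on their usual definitions: a table is k-anonymous if every combination of quasi-identifier values occurs at least k times, and l-diverse if every such group contains at least l distinct values of the sensitive attribute. The column names and the survey records are made up for illustration. t-closeness, which compares value distributions, is left out.

```python
from collections import defaultdict


def equivalence_classes(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    classes = defaultdict(list)
    for record in records:
        key = tuple(record[q] for q in quasi_identifiers)
        classes[key].append(record)
    return classes


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every equivalence class has at least k records."""
    classes = equivalence_classes(records, quasi_identifiers)
    return all(len(group) >= k for group in classes.values())


def is_l_diverse(records, quasi_identifiers, sensitive, l):
    """True if every class has at least l distinct sensitive values."""
    classes = equivalence_classes(records, quasi_identifiers)
    return all(
        len({r[sensitive] for r in group}) >= l
        for group in classes.values()
    )


# Made-up survey responses (assumption, not data from the talk).
survey = [
    {"age": "20-29", "country": "NL", "answer": "yes"},
    {"age": "20-29", "country": "NL", "answer": "no"},
    {"age": "30-39", "country": "BE", "answer": "yes"},
    {"age": "30-39", "country": "BE", "answer": "yes"},
]
quasi = ["age", "country"]
```

In this toy table the second group is 2-anonymous but not 2-diverse: both respondents gave the same answer, so knowing the group reveals the answer.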
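
Sketch for slides 29-30 (Name + Location = Gender). The two rewrites on slide 30, "Lonzo ⇒ Alonzo" and "w35l3y ⇒ wesley", suggest a normalization step before a lookup keyed on name and country. The code below illustrates that idea under loud assumptions: the leet-speak mapping, the nickname table, and especially the tiny `GENDER` table are toy entries invented for this example, not the resources used in the talk.

```python
# Digit-to-letter substitutions; ambiguous in general ("1" can be "l" or "i").
LEET = str.maketrans({"0": "o", "1": "l", "3": "e", "4": "a", "5": "s", "7": "t"})

# Toy nickname table; only the Lonzo entry comes from the slide.
NICKNAMES = {"lonzo": "alonzo"}

# Toy (first name, country) -> gender table, for illustration only.
GENDER = {
    ("wesley", "US"): "M",
    ("alonzo", "US"): "M",
    ("andrea", "IT"): "M",
    ("andrea", "DE"): "F",
}


def canonical_first_name(name):
    """Lower-case, undo leet speak, take the first token, expand nicknames."""
    parts = name.lower().translate(LEET).split()
    if not parts:
        return ""
    return NICKNAMES.get(parts[0], parts[0])


def infer_gender(name, country):
    """Return "M", "F", or "?" when the pair is not in the table."""
    return GENDER.get((canonical_first_name(name), country), "?")
```

The "andrea" rows show why location is part of the key: the same first name can resolve differently per country.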
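
Sketch for slide 31 (Heuristics: title + first h1). The slide takes the page title and the first h1 of a personal web page and passes the text to a named-entity tagger. The following shows one way to do the extraction step with Python's standard `html.parser`. The class and function names are hypothetical, and the tagging step (Stanford Named Entity Tagger on the slide) is not reproduced here.

```python
from html.parser import HTMLParser


class TitleAndFirstH1(HTMLParser):
    """Collect the text of <title> and of the first <h1> only."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.h1 = ""
        self._current = None   # "title", "h1", or None
        self._h1_done = False  # ignore any later <h1>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._current = "title"
        elif tag == "h1" and not self._h1_done:
            self._current = "h1"

    def handle_endtag(self, tag):
        if tag == self._current:
            if tag == "h1":
                self._h1_done = True
            self._current = None

    def handle_data(self, data):
        # Text inside nested tags such as <em> is kept as well.
        if self._current == "title":
            self.title += data
        elif self._current == "h1":
            self.h1 += data


def title_and_first_h1(html):
    """Return (title, first h1) with whitespace collapsed."""
    parser = TitleAndFirstH1()
    parser.feed(html)
    parser.close()
    return " ".join(parser.title.split()), " ".join(parser.h1.split())
```

The two strings can then be concatenated and handed to whichever person-name recognizer is available.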
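
Worked numbers for slide 32 (Quality of gender resolution). The two tables on that slide can be summarized by how many respondents were resolved at all and how many of the resolved ones match the self-identification. The counts below are copied from the slide; the metric names `coverage` and `accuracy_resolved` are labels chosen for this sketch, not terms from the talk.

```python
# Rows: self-identified gender; columns: inferred gender ("?" = unresolved).
BEFORE = {"M": {"M": 60, "F": 3, "?": 43}, "F": {"M": 2, "F": 5, "?": 4}}
# After adding avatars and other social media sites (manually).
AFTER = {"M": {"M": 90, "F": 3, "?": 13}, "F": {"M": 2, "F": 9, "?": 0}}


def summarize(table):
    """Compute coverage and accuracy among resolved cases."""
    total = sum(sum(row.values()) for row in table.values())
    unresolved = sum(row["?"] for row in table.values())
    correct = sum(row[gender] for gender, row in table.items())
    resolved = total - unresolved
    return {
        "total": total,
        "resolved": resolved,
        "correct": correct,
        "coverage": resolved / total,
        "accuracy_resolved": correct / resolved,
    }


before = summarize(BEFORE)
after = summarize(AFTER)
```

Out of 117 respondents, the name-based inference resolves 70 (65 of them correctly), while the manual additions raise this to 104 resolved (99 correctly). The gain is mostly in coverage, roughly 60% to 89%, with accuracy among resolved cases staying above 90% in both tables.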
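
Sketch for slides 33-36 (identifying human activities). Slide 33 lists file extensions and path fragments, which hints at classifying the files a contributor touches into activity types. The code below is one possible rule-based version. The patterns come from the slide, but the activity labels attached to them ("code", "localization", and so on) and the rule order are assumptions made for this example.

```python
import posixpath
from collections import Counter

# Extension -> activity label; the labels are assumptions.
EXTENSION_ACTIVITY = {
    ".cpp": "code",
    ".po": "localization",
    ".jpg": "image",
    ".doc": "documentation",
    ".sql": "database",
    ".conf": "configuration",
}


def classify(path):
    """Map a repository path to an activity label.

    Directory rules are checked first, so a .cpp file under /test/
    counts as "test" rather than "code" (a design choice of this sketch).
    """
    p = "/" + path.lower().lstrip("/")
    name = posixpath.basename(p)
    if "/test/" in p:
        return "test"
    if "/library/" in p:
        return "library"
    if name == "makefile":
        return "build"
    return EXTENSION_ACTIVITY.get(posixpath.splitext(name)[1], "unknown")


def activity_profile(paths):
    """Count activity labels over the files touched by one contributor."""
    return Counter(classify(p) for p in paths)


profile = activity_profile(["src/main.cpp", "src/test/main_test.cpp", "po/nl.po", "Makefile"])
```

Comparing such profiles between occasional and frequent contributors (slide 34) would be a natural next step, although the slides do not say how that comparison was done.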
