2. MOTIVATION
Getting a better idea of what’s going on in
the software engineering research community
through a quantitative approach
3. RELATED WORKS
• C. Ghezzi - Keynote at ICSE 2008
  Reflections on 40+ years of software engineering research and beyond
• L. Briand - Keynote at ICSM 2011
  Useful software engineering research: leading a double agent life
• D. Rosenblum - Keynote at ASE 2012
  Whither software engineering research?
4. SUBJECTS OF OUR STUDY
• Researchers
• Research topics
• Affiliations
• Geographical areas
11. AUTHOR ANALYSIS
Who published the most?
Are there sub-communities?
12. MOST PROLIFIC AUTHORS
Number of papers per author, overall (Software Engineering) and per venue:

Software Engineering   ICSE           ASE          ESEC/FSE        TSE          TOSEM
Basili 60              Boehm 28       Xie 24       Clarke 8        Basili 33    Notkin 13
Notkin 56              Basili 26      Grundy 18    D. Jackson 8    Briand 26    Rothermel 8
Kramer 49              Osterweil 23   Hosking 16   Ernst 7         Weyuker 18   Roman 6
Harrold 46             Kramer 21      Egyed 16     Notkin 7        Knight 17    Wolf 6
Xie 46                 Notkin 21      Lo 16        Uchitel 7       Kramer 16    Harrold 6
13. SUB-COMMUNITY DETECTION
For each venue we consider its most prolific authors.
We compute the set similarity (Jaccard index) between all pairs of venues:

$J(A, B) = \frac{|A \cap B|}{|A \cup B|}$

where A and B are the sets of most prolific authors of two venues.
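A minimal sketch of this pairwise comparison. The author sets are the five names listed per venue on the "Most prolific authors" slide; the size of the top list used in the actual study is not stated, so five is an assumption, and the names `jaccard`, `top_authors` and `similarity` are illustrative only.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets: |A ∩ B| / |A ∪ B| (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Top authors per venue, copied from the "Most prolific authors" slide.
# Assumption: only the five names shown there are used.
top_authors = {
    "ICSE": {"Boehm", "Basili", "Osterweil", "Kramer", "Notkin"},
    "ASE": {"Xie", "Grundy", "Hosking", "Egyed", "Lo"},
    "ESEC/FSE": {"Clarke", "D. Jackson", "Ernst", "Notkin", "Uchitel"},
    "TSE": {"Basili", "Briand", "Weyuker", "Knight", "Kramer"},
    "TOSEM": {"Notkin", "Rothermel", "Roman", "Wolf", "Harrold"},
}

# Similarity for every unordered pair of venues.
similarity = {
    (u, v): jaccard(top_authors[u], top_authors[v])
    for u, v in combinations(top_authors, 2)
}

for (u, v), score in sorted(similarity.items(), key=lambda kv: -kv[1]):
    print(f"{u:9s} {v:9s} {score:.2f}")
```

A high score between two venues suggests that the same group of authors dominates both; a score of zero means their top lists are disjoint.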
23. TOPICS IN THE ’70S
Topic                     Fraction of papers
Programming Languages     16.71%
Performance               7.95%
Operating Systems         7.29%
Database Systems          6.84%
Formal Methods            6.65%
Software Architectures    6.14%
Knowledge Engineering     5.69%
Distributed Systems       4.94%
Software Maintenance      4.18%

• Programming Languages is by far the most represented topic
• Topics from other fields
24. TOPICS IN THE ’80S
Topic                      Fraction of papers
Programming Languages      10.48%
Distributed Systems        9.30%
Knowledge Engineering      8.47%
Software Reliability       6.68%
Formal Methods             6.51%
Information Systems        5.55%
Software Maintenance       5.04%
Models                     4.35%
Artificial Intelligence    3.74%

• Significant rise of Distributed Systems (4.94% in the ’70s, 9.30% in the ’80s)
• Other fields, related to distributed systems
• Not only code
25. TOPICS IN THE ’90S
Topic                     Fraction of papers
Formal Methods            8.29%
Programming Languages     8.13%
Distributed Systems       6.80%
Software Maintenance      6.55%
Software Architectures    5.34%
Software Quality          4.80%
Knowledge Engineering     4.67%
Models                    4.65%
Information Systems       4.40%

• Change of the most published topic: Formal Methods overtakes Programming Languages
• Focus on software quality
26. TOPICS IN THE 2000S
Topic                     Fraction of papers
Formal Methods            9.93%
Programming Languages     8.37%
Testing                   6.86%
Software Maintenance      6.58%
Software Reliability      6.22%
Software Quality          5.72%
Models                    4.80%
Empirical Studies         4.76%
Software Architectures    4.38%

• Still a lot of emphasis on software quality
• Analysis of open source repositories
27. NEED FOR A FINER ANALYSIS
Topics change constantly, not once in a decade
SOLUTION: sliding window
instead of fixed subdivision
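A minimal sketch of the sliding-window idea, assuming each paper is reduced to a `(year, topic)` pair. The window length, the step, the function name and the toy data below are all illustrative assumptions; the slides do not state the values used in the study.

```python
from collections import Counter


def topic_fractions_by_window(papers, window=5, step=1):
    """Return {(start, end): {topic: fraction}} for overlapping year windows.

    papers: iterable of (year, topic) pairs.
    Windows are [start, start + window - 1] and advance by `step` years.
    If the data spans fewer than `window` years, the result is empty.
    """
    papers = list(papers)
    if not papers:
        return {}
    first = min(year for year, _ in papers)
    last = max(year for year, _ in papers)
    result = {}
    for start in range(first, last - window + 2, step):
        end = start + window - 1
        counts = Counter(topic for year, topic in papers if start <= year <= end)
        total = sum(counts.values())
        if total:  # skip windows that contain no papers
            result[(start, end)] = {t: c / total for t, c in counts.items()}
    return result


# Toy data, invented for illustration only.
toy_papers = [
    (2000, "Testing"),
    (2001, "Testing"),
    (2002, "Formal Methods"),
    (2003, "Formal Methods"),
]
windows = topic_fractions_by_window(toy_papers, window=2)
for span, fractions in windows.items():
    print(span, fractions)
```

Because consecutive windows overlap, a topic's share changes gradually from one window to the next, which makes a shift visible in the year it happens rather than only at a decade boundary.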
33. PER-VENUE INSIGHTS
Venue       Peculiarities
TSE         Biased towards empirical works
TOSEM       More focused on formal aspects
ICSE        Balanced with respect to other venues
ESEC/FSE    Formal, with interests in testing, modeling and requirements engineering
ASE         Interests in program analysis and automated reasoning
35. AFFILIATION PROFILE
Author      Affiliation
Author A    1
Author B    2
Author C    2

Affiliation profile
Affiliation 1    33%
Affiliation 2    66%
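The example on this slide can be sketched as follows, assuming that each author contributes an equal share of the paper to their affiliation. The function name `affiliation_profile` is illustrative.

```python
from collections import Counter


def affiliation_profile(author_affiliations):
    """Map each affiliation to the fraction of the paper's authors it accounts for.

    author_affiliations: one affiliation label per author of the paper.
    """
    counts = Counter(author_affiliations)
    total = sum(counts.values())
    return {aff: n / total for aff, n in counts.items()}


# Three authors: one from Affiliation 1, two from Affiliation 2.
profile = affiliation_profile(["Affiliation 1", "Affiliation 2", "Affiliation 2"])
for aff, share in profile.items():
    print(f"{aff}: {share:.1%}")
```

The shares are 1/3 and 2/3, which the slide shows truncated to 33% and 66%.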
36. MOST PROLIFIC AFFILIATIONS
Affiliation                              Papers
IBM                                      186.32
Carnegie Mellon University               166.52
University of Texas, Austin              122.62
University of Maryland                   106.83
Microsoft                                101.63
AT&T Bell Laboratories                   101.37
University of California, Irvine         98.17
Georgia Institute of Technology          94.75
Massachusetts Institute of Technology    93.24
University of Virginia                   81.55

ALL FROM THE USA
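The fractional paper counts are consistent with summing, for each affiliation, its share in every paper's affiliation profile. That reading is an assumption, not something the slides state. A minimal sketch with invented toy profiles:

```python
from collections import defaultdict


def papers_per_affiliation(profiles):
    """Sum per-paper affiliation shares and rank affiliations by the total.

    profiles: iterable of {affiliation: share} dicts, one per paper,
    where the shares of each paper add up to 1.
    """
    totals = defaultdict(float)
    for profile in profiles:
        for affiliation, share in profile.items():
            totals[affiliation] += share
    # Highest total first.
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


# Toy profiles, invented for illustration only.
toy_profiles = [
    {"Affiliation 1": 1 / 3, "Affiliation 2": 2 / 3},
    {"Affiliation 1": 1.0},
    {"Affiliation 2": 0.5, "Affiliation 3": 0.5},
]
ranking = papers_per_affiliation(toy_profiles)
for affiliation, total in ranking.items():
    print(f"{affiliation}: {total:.2f}")
```

Under this scheme the totals over all affiliations add up to the number of papers, so a paper with many co-authoring institutions is not counted more than once.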
37. PER-VENUE INSIGHTS
Venue       Peculiarities
TSE         The venue with the most industrial contributions
TOSEM       European universities among the top contributors
ICSE        A balanced set of the contributors seen in the other venues
ESEC/FSE    Despite ESEC, there is no bias towards Europe
ASE         Industrial contribution is less relevant. Some affiliations appear only in its top list.

• TSE: is the industrial contribution linked to the presence of empirical works?
• TOSEM: is Europe more formal?
• ESEC/FSE: it is representative
40. GEOGRAPHICAL AREAS
• North America
• South America
• Europe
• Africa
• Asia & Oceania
41. LOCATION OF A PAPER
Affiliation profile
Affiliation 1    20%
Affiliation 2    30%
Affiliation 3    50%

Locations
Affiliation 1    North America
Affiliation 2    Europe
Affiliation 3    Europe

Location profile
North America    20%
Europe           80%
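The mapping on this slide amounts to summing the affiliation shares that fall in the same geographical area. A minimal sketch using the slide's own numbers; the function and variable names are illustrative.

```python
from collections import defaultdict


def location_profile(affiliation_profile, locations):
    """Aggregate a paper's affiliation shares by geographical area.

    affiliation_profile: {affiliation: share of the paper}
    locations: {affiliation: geographical area}
    """
    totals = defaultdict(float)
    for affiliation, share in affiliation_profile.items():
        totals[locations[affiliation]] += share
    return dict(totals)


# Values taken from the slide.
paper_profile = {"Affiliation 1": 0.20, "Affiliation 2": 0.30, "Affiliation 3": 0.50}
paper_locations = {
    "Affiliation 1": "North America",
    "Affiliation 2": "Europe",
    "Affiliation 3": "Europe",
}
areas = location_profile(paper_profile, paper_locations)
for area, share in areas.items():
    print(f"{area}: {share:.0%}")
```

An affiliation missing from `paper_locations` raises a `KeyError` here, which is a deliberate choice in this sketch so that unmapped affiliations are noticed instead of silently dropped.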
43. CONCLUSION
Academic literature contains a lot of
information about a scientific community
With data mining techniques we can unveil it
and get some interesting insights