Abstract:
Though in essence an engineering discipline, software engineering research has always struggled to demonstrate impact. This is reflected in part by the funding challenges the discipline faces in many countries, the difficulty we have attracting industrial participants to our conferences, and the scarcity of papers reporting industrial case studies.
There are clear historical reasons for this but we nevertheless need, as a community, to question our research paradigms and peer evaluation processes in order to improve the situation. From a personal standpoint, relevance and impact are concerns that I have been struggling with for a long time, which eventually led me to leave a comfortable academic position and a research chair to work in industry-driven research.
I will use some concrete research project examples to argue why we need more inductive research, that is, research working from specific observations in real settings to broader generalizations and theories. Among other things, the examples will show how a more thorough understanding of practice and closer interactions with practitioners can profoundly influence the definition of research problems, and the development and evaluation of solutions to these problems. Furthermore, these examples will illustrate why, to a large extent, useful research is necessarily multidisciplinary. I will also address issues regarding the implementation of such a research paradigm and show how our own bias as a research community worsens the situation and undermines our very own interests.
On a more humorous note, the title hints at the fact that being a scientist in software engineering who aims at having an impact on practice often entails leading two parallel careers and playing different roles to different peers and partners.
Bio:
Lionel Briand heads the Certus Center on software verification and validation at Simula Research Laboratory, where he leads research projects with industrial partners. He is also a professor at the University of Oslo (Norway). Before that, he was on the faculty of the Department of Systems and Computer Engineering at Carleton University, Ottawa, Canada, where he was full professor and held the Canada Research Chair (Tier I) in Software Quality Engineering. He is the co-editor-in-chief of Empirical Software Engineering (Springer) and a member of the editorial boards of Software and Systems Modeling (Springer) and Software Testing, Verification and Reliability (Wiley). He served on the editorial board of IEEE Transactions on Software Engineering from 2000 to 2004. Lionel was elevated to the grade of IEEE Fellow for his work on the testing of object-oriented systems. His research interests include model-driven development, testing and verification, search-based software engineering, and empirical software engineering.
Lionel Briand ICSM 2011 Keynote
1. Useful Software Engineering Research: Leading a Double-Agent Life
Lionel Briand
Certus Center for Software Verification
and Validation
Simula Research Laboratory
& University of Oslo, Norway
4. Software Engineering Research Funding & Relevance
➜ Software Engineering (SE) research should be a
top priority in most countries (except Sweden)
➜ But that is not the case anymore. Hard numbers
are hard to get, and in some cases well protected
➜ Symptoms
➥ Listed priorities by research councils, funding
➥ University hiring
➥ Large centers or institutes being established or closed down
➜ May be partly related to lack of relevance?
➥ Industry participation in leading SE conferences
➥ Application/industry tracks not first class citizens
➥ A very small percentage of research work ever used and assessed on
real industrial software
5. Basili’s and Meyer’s Take
➜ Many of the advances in software engineering have come out
of non-university sources
➜ “Academic research has had its part, honorable but
limited.” (Meyer)
➜ Large-scale labs don’t get funded in SE the way they do in other engineering and scientific disciplines (Basili, Meyer)
➜ Software Engineering is “big science” (Basili)
➜ One significant difference though is that we cannot entirely
recreate the phenomena we study within four walls – This,
as discussed later, has significant consequences
➜ Question: What is our responsibility in all this?
6. Engineering Research
➜ “Engineering: The application of scientific and
mathematical principles to practical ends such as
the design, manufacture, and operation of
efficient and economical structures, machines,
processes, and systems.” (American Heritage
Dictionary)
➜ Engineering research:
➥ Problem driven
➥ Real world requirements
➥ Scalability
➥ Human factors, where it matters
➥ Economic tradeoffs and cost-benefit analysis
➥ Actually doing it on real artifacts, not just talking
about it
7. A Representative Example
➜ Parnin and Orso (ISSTA, 2011) looked at
automated debugging techniques
➜ 50 years of automated debugging research
➜ Only 5 papers have evaluated automated
debugging techniques with actual programmers
➜ Focus since ~2001: dozens of papers ranking program statements according to their likelihood of containing a fault (sketch below)
➜ Experiment
➥ How do programmers use the ranking?
➥ Do they see the bugs?
➥ Is the ranking important?
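To make the ranking idea concrete: the techniques Parnin and Orso evaluated are typified by spectrum-based fault localization, which scores each statement by how often failing versus passing tests execute it. Below is a minimal sketch in the style of the Tarantula formula; the coverage data is invented and no single paper's exact setup is reproduced.

```python
# Spectrum-based fault localization, Tarantula-style (illustrative).
def rank_statements(coverage, passed):
    """coverage: test name -> set of executed statement ids
    passed:   test name -> True if that test passed"""
    total_pass = sum(1 for ok in passed.values() if ok)
    total_fail = len(passed) - total_pass
    statements = set().union(*coverage.values())
    score = {}
    for s in statements:
        p = sum(1 for t in coverage if s in coverage[t] and passed[t])
        f = sum(1 for t in coverage if s in coverage[t] and not passed[t])
        fr = f / total_fail if total_fail else 0.0
        pr = p / total_pass if total_pass else 0.0
        score[s] = fr / (fr + pr) if fr + pr else 0.0
    # Higher score = more suspicious; the studied tools expect the
    # programmer to inspect statements in this order.
    return sorted(statements, key=score.get, reverse=True)

ranking = rank_statements(
    coverage={"t1": {1, 2, 3}, "t2": {1, 3}, "t3": {2, 3}},
    passed={"t1": False, "t2": True, "t3": True},
)
print(ranking)  # statement 3, executed by every passing test, ranks last
```

Parnin and Orso's question was precisely whether inspecting statements in this ranked order actually helps real programmers.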
8. Results from Parnin and Orso’s Study
• Only low performers strictly followed the ranking
• Only one out of 10 programmers who checked a
buggy statement stopped the investigation
• Automated support did not speed up debugging
• Developers wanted explanations rather than
recommendations
• We cannot abstract the human away in our
research
• “… we must steer research towards more promising
directions that take into account the way programmers
actually debug in real scenarios.”
9. What Happened?
➜ How people debug and what information they need
is poorly understood
➥ Probably varies a great deal according to context and
skills
➜ Researchers focused on providing a solution that
was a mismatch for the actual problem
➜ That line of research became fashionable: a lot of
(cool) ideas could be easily applied and compared,
without involving human participants
➜ Resulted in many, many papers …
➜ Many other examples in SE, e.g., Clone detection?
10. Other Examples
➜ Adaptive Random Testing: Many papers since 2004 (sketch below)
➥ Mostly simulations and small artifacts, unrealistic failure rates. Arcuri and Briand (2011), ISSTA
➜ Regression testing: Arguably the most studied
testing problem, perhaps most studied software
engineering problem …
➥ "However, empirical evaluation and application of
regression testing techniques at industrial level seems
to remain limited. Out of the 159 papers …, only 12
papers consider industrial software artefacts as a
subject of the associated empirical studies. This
suggests that a large-scale industrial uptake of these
techniques has yet to occur.” Yoo and Harman (2011)
➥ Possible reason: Strong focus on white-box, not black-box regression testing? Scalability and practicality?
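For context, the best-known ART variant generates each new test by sampling a candidate set and keeping the candidate farthest from all previously executed tests. A minimal sketch over a one-dimensional numeric domain follows; the domain, distance function, and parameters are assumptions for illustration.

```python
import random

# Fixed-size-candidate-set Adaptive Random Testing (illustrative).
def art_inputs(num_tests, candidates=10, lo=0.0, hi=100.0):
    executed = [random.uniform(lo, hi)]  # first input is purely random
    while len(executed) < num_tests:
        pool = [random.uniform(lo, hi) for _ in range(candidates)]
        # Keep the candidate whose nearest executed input is farthest away.
        best = max(pool, key=lambda c: min(abs(c - e) for e in executed))
        executed.append(best)
    return executed

print(art_inputs(5))
```

The nested distance loop is the practical catch: the cost grows quadratically with the number of tests, an overhead Arcuri and Briand argue is hard to justify at realistic failure rates.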
11. Industry-Driven Research
➜ Let’s now take an entirely different angle …
➜ Research driven by industry needs
➜ Simula’s motto: “The industry is our lab”
➜ Go through recent and successful projects (mostly
@ Simula) with industry partners
➜ Summarize what happened, our experience
➜ Draw conclusions and lessons learned
➥ Patterns for successful research
➥ Challenges and possible solutions
12. Mode of Collaboration
[Diagram: iterative collaboration between the industry partners and the research center – problem identification → problem formulation → study of the state of the art → candidate solutions → initial validation → realistic validation → release of the solution (Gorschek et al., IEEE Software 2006)]
14. Project Example 1
➜ Context: Testing in communication systems (Cisco)
➜ Original scientific problem: Modeling and model-based test case generation, oracle, coverage strategy
➜ Practical observation: Access to test network
infrastructure limited (emulate network traffic,
etc.). Models get too large and complex.
➜ Modified research objectives: (1) How to select an optimal subset of test cases matching the time budget (sketch below), (2) Modeling cross-cutting concerns
➜ References: Hemmati et al. (2010),
Ali et al. (2011)
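The first modified objective is essentially a budgeted selection problem. The toy sketch below illustrates its shape with a greedy coverage-per-minute heuristic; it is an invented illustration of the problem, not the similarity-based selection actually studied by Hemmati et al. (2010).

```python
# Greedy test selection under a time budget (illustrative).
def select_tests(tests, budget):
    """tests: list of (name, cost_in_minutes, set_of_covered_features)"""
    selected, covered, remaining = [], set(), budget
    pool = list(tests)
    while pool:
        affordable = [t for t in pool if t[1] <= remaining]
        if not affordable:
            break
        # Pick the test adding the most new coverage per minute.
        best = max(affordable, key=lambda t: len(t[2] - covered) / t[1])
        if not best[2] - covered:
            break  # no affordable test adds coverage
        selected.append(best[0])
        covered |= best[2]
        remaining -= best[1]
        pool.remove(best)
    return selected, covered

suite = [("t1", 5, {"a", "b"}), ("t2", 3, {"b"}), ("t3", 8, {"c", "d"})]
print(select_tests(suite, budget=10))  # selects only 't1'; 't3' no longer fits
```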
16. Project Example 2
➜ Context: Testing image segmentation algorithms
for medical applications (Siemens)
➜ Original scientific problem: Define specific test
strategies for segmentation algorithms
➜ Practical observations: Algorithms are validated by
using highly specialized medical experts.
Expensive and slow. No obvious test oracle
➜ Modified research objective: Learning oracles for image segmentation algorithms in medical applications. Machine learning. (Sketch below.)
➜ Reference: Frouchni et al. (2011)
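The core idea, casting the oracle as a classifier trained on expert-labeled examples, can be sketched as follows. The overlap features, toy pixel masks, and random-forest model are all illustrative assumptions; Frouchni et al. (2011) describe the actual features and learning setup.

```python
from sklearn.ensemble import RandomForestClassifier

def overlap_features(produced, reference):
    """Simple features comparing two masks given as sets of pixel coordinates."""
    inter = len(produced & reference)
    union = len(produced | reference)
    jaccard = inter / union if union else 1.0
    total = len(produced) + len(reference)
    dice = 2 * inter / total if total else 1.0
    return [jaccard, dice, abs(len(produced) - len(reference))]

# Tiny synthetic training set: (produced mask, reference mask, expert verdict).
ref = {(x, y) for x in range(10) for y in range(10)}
labeled = [
    (ref, ref, 1),                                            # perfect: pass
    ({(x, y) for x in range(9) for y in range(10)}, ref, 1),  # near miss: pass
    ({(x, y) for x in range(3) for y in range(3)}, ref, 0),   # badly off: fail
    (set(), ref, 0),                                          # empty: fail
]
X = [overlap_features(p, r) for p, r, _ in labeled]
y = [verdict for _, _, verdict in labeled]
oracle = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# The learned oracle now classifies new segmentations without an expert.
new_seg = {(x, y) for x in range(8) for y in range(10)}
print(oracle.predict([overlap_features(new_seg, ref)]))  # likely [1]
```

Once trained, such an oracle can screen the bulk of test outputs automatically and reserve expert time for the borderline cases.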
18. Project Example 3
➜ Context: Subsea integrated control systems (FMC)
➜ Original scientific problem: Architecture-driven integration in
systems of systems
➜ Practical observations: Each subsea installation is unique
(variant), the software configuration is extremely complex
(hundreds of interrelated variation points in software and
hardware)
➜ Modified research objective: Product-line architectures in integrated control systems to support the configuration process (sketch below)
➜ Note: Despite decades of research in
PLA, we could not find a methodology
fitting our requirements
➜ Reference: Behjati et al. (2011)
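To give a feel for why configuration is the bottleneck: with hundreds of interrelated variation points, a configuration is valid only if it satisfies cross-cutting constraints. The toy checker below illustrates that problem shape; all names and constraints are invented, and SimPL (Behjati et al., 2011) is a full modeling methodology, not this.

```python
# Variation points and their allowed values (illustrative).
variation_points = {
    "pump_count": {1, 2, 4},
    "controller": {"ctrl_a", "ctrl_b"},
    "redundancy": {"none", "dual"},
}

# Interdependencies between variation points, as predicates over a config.
constraints = [
    ("dual redundancy needs at least 2 pumps",
     lambda c: c["redundancy"] != "dual" or c["pump_count"] >= 2),
    ("ctrl_a supports at most 2 pumps",
     lambda c: c["controller"] != "ctrl_a" or c["pump_count"] <= 2),
]

def check(config):
    errors = [f"{vp}: invalid value {config.get(vp)!r}"
              for vp, allowed in variation_points.items()
              if config.get(vp) not in allowed]
    errors += [f"violated: {label}"
               for label, holds in constraints if not holds(config)]
    return errors

print(check({"pump_count": 4, "controller": "ctrl_a", "redundancy": "dual"}))
# ['violated: ctrl_a supports at most 2 pumps']
```

At industrial scale, guiding engineers through such a constrained space is the kind of support the modified objective calls for.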
20. Project Example 4
➜ Context: safety-critical embedded systems in
the energy and maritime sectors, e.g., fire and
gas monitoring, process shutdown, dynamic
positioning (Kongsberg Maritime)
➜ Original scientific problem: Model-driven
engineering for failure-mode and effect analysis
➜ Practical observations: Certification meetings
with third-party certifiers. Certification is
lengthy, expensive, etc. Traceability in large
complex systems a priority.
➜ Modified research objective: Traceability between safety requirements and system design decisions. Solution based on SysML and a simple traceability language along with model slicing. (Sketch below.)
➜ Reference: Sabetzadeh et al. (2011)
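The slicing idea can be illustrated independently of SysML: given trace links from safety requirements to design elements, plus dependencies among elements, the slice for a requirement is everything reachable from its traced elements. The model and names below are invented; the actual solution uses SysML and a dedicated traceability language (Sabetzadeh et al., HASE 2011).

```python
from collections import deque

# Requirement -> design elements it traces to (illustrative).
traces = {"REQ-GasAlarm": ["GasDetector"], "REQ-Shutdown": ["ShutdownLogic"]}

# Design element -> elements it depends on (illustrative).
depends = {
    "GasDetector": ["SensorBus", "AlarmPanel"],
    "ShutdownLogic": ["SensorBus", "ValveDriver"],
    "AlarmPanel": ["PowerUnit"],
}

def slice_for(requirement):
    """All design elements a certifier must inspect for this requirement."""
    seen, queue = set(), deque(traces[requirement])
    while queue:
        elem = queue.popleft()
        if elem not in seen:
            seen.add(elem)
            queue.extend(depends.get(elem, []))
    return seen

print(sorted(slice_for("REQ-GasAlarm")))
# ['AlarmPanel', 'GasDetector', 'PowerUnit', 'SensorBus']
```

The payoff in certification meetings is that a third-party certifier sees only the slice relevant to the safety requirement under discussion, not the whole system model.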
22. Project Example 5
➜ Context: Technology qualification (TQ)
in maritime sector (DNV)
➜ Original scientific problem: Model-based
quantitative safety analysis
➜ Practical observations: TQ is not purely
objective, quantitative argument. Great
complexity (e.g., sources of
information) and expert judgment.
Many stakeholders.
➜ Modified research objective: Modeling safety arguments to support quantitative reasoning and decision making by several stakeholders (sketch below)
➜ Reference: Sabetzadeh et al. (2011)
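One way to picture quantitative reasoning over a safety argument that rests on expert judgment: experts give probability ranges for the leaf claims, and Monte Carlo simulation propagates them to the top-level claim. The argument structure and numbers below are invented; the actual approach combines goal models, expert elicitation, and probabilistic simulation (Sabetzadeh et al., HASE 2011).

```python
import random

# Expert-elicited probability ranges (low, high) per leaf claim (illustrative).
leaf_claims = {
    "sensors_reliable": (0.95, 0.99),
    "software_verified": (0.90, 0.98),
    "operators_trained": (0.85, 0.95),
}

def simulate(runs=100_000):
    """Estimate P(top claim), assuming the top claim is the AND of all leaves."""
    hits = 0
    for _ in range(runs):
        if all(random.random() < random.uniform(lo, hi)
               for lo, hi in leaf_claims.values()):
            hits += 1
    return hits / runs

print(f"P(top-level safety claim) ~= {simulate():.3f}")
```

For the stakeholders, the point is not the single number but seeing how sensitive the top-level claim is to each expert estimate.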
23. Two Other Examples on the ICSM Program
➜ Erik Rogstad et al., “Industrial Experiences with
Automated Regression Testing of a Legacy
Database Application”
➜ Amir Reza Yazdanshenas and Leon Moonen,
“Crossing the Boundaries While Analyzing
Heterogeneous Component-Based Software
Systems”
24. Successful Research Patterns
➜ Successful: Innovative and high impact
➜ Inductive research: Working from specific
observations in real settings to broader
generalizations and theories
➥ Field studies and replications, analyze commonalities
➜ Scalability and practicality considerations must be
part of the initial research problem definition
➜ Researching by doing: Hands-on research. Apply what exists in a well-defined, realistic context, with clear objectives. The observed limitations become the research objectives.
➜ Multidisciplinary: other CS, engineering, or non-technical domains
25. So What?
➜ Making a conscious effort to understand the
problem first
➥ Precisely identify the requirements for an applicable solution
➥ More papers focused on understanding the problems
➥ Making industry tracks first class citizens in SE conferences
➜ Better relationships between academia and
industry
➥ Different models, e.g., Research-based innovation centers in Norway
➥ Common labs (e.g., the NASA Software Engineering Laboratory, SEL)
➥ Exposing PhD students to industry practice: Ethical considerations
(Fixing the PhD, Nature)
➜ Playing an active role in solving the problem, e.g.,
action research-like
26. So What?
➜ Work on end-to-end solutions: Pieces of solutions
are interdependent. Necessary for impact.
➜ Beyond professors and students
➥ Labs with interdisciplinary teams of professional
scientists and engineers within or collaborating with
universities
➥ Used to be the case with corporate research labs: Bell Labs, Xerox PARC, HP Labs, NASA SEL, etc.
➥ Now: Fraunhofer (Germany), Simula (Norway),
Microsoft Research (US), SEI (US), SnT
(Luxembourg)
➥ Corporate labs versus publicly supported ones?
➥ Key point: The level of basic funding must allow high
risk research, performed by professional scientists,
focused on impact in society
27. The NASA SEL Experience Factory Model
[Diagram: the Experience Factory model, Basili et al. (NASA SEL)]
28. Academic Challenges
➜ Our CS legacy … emancipating ourselves as an engineering
discipline
➥ Systems engineering departments?
➜ How cool is it? SE research is more driven by “fashion” than
needs, a quest for silver bullets
➥ We can only blame ourselves
➜ Counting papers, and the JSS ranking based on it, does not help
➥ We are pressuring ourselves into irrelevance
➜ Taking academic tenure and promotion seriously
➥ What about rewarding impact?
➜ One’s research must cover a broader ground and be somewhat
opportunistic – this pushes us out of our comfort zone
➜ Resources to support industry collaborations
➥ Large lab infrastructure, engineers, time
29. Industrial Challenges
➜ From a discussion with Bran Selic …
➜ Short term versus longer term goals (next
quarter’s forecast is the priority)
➜ Industrial research groups are often disconnected
from their own business units and external
researchers may be perceived as competitors
➜ A company’s intellectual property regulations may conflict with those of the research institution
➜ Complexity of industrial systems and technology
➥ Cannot be transplanted into artificial settings for research – studies in real settings are needed
➥ Substantial domain knowledge is required
30. A Double-Agent Life
[Cartoon: the double agent’s two personas – the Scientist, trying to be discreet but inquisitive, and the Practitioner (sanitized); captions: “A new idea (as initially perceived by our partners)” and “Warning: No research here”]
31. Conclusions
➜ Software engineering is obviously important in all
aspects of society, but academic software
engineering research is not perceived the same
way
➜ The academic community, at various levels, is
partly responsible for this
➜ How we take up the challenge of increasing our
impact will determine the future of the profession
➜ There are solutions, but no silver bullet
➜ We all have a role to play in this, as deans,
department chairs, professors, scientists,
reviewers, conference organizers, journal editors,
etc. We can all be double-agents …
32. Empirical Software Engineering
➜ Springer, 6 issues a year
➜ Both research papers and industry experience
reports
➜ 2nd highest impact factor among SE research
journals
➜ “Applied software engineering research with a
significant empirical component”
33. References
➜ Ali, Briand, Hemmati, “Modeling Robustness Behavior Using Aspect-Oriented Modeling to Support Robustness Testing of Industrial Systems”, Software and Systems Modeling (Springer), forthcoming, 2011.
➜ Arcuri and Briand, “Adaptive Random Testing: An Illusion of Effectiveness”, ISSTA, 2011.
➜ Basili, “Learning Through Application: The Maturing of the QIP in the SEL”, in Making Software: What Really Works and Why We Believe It, edited by Andy Oram and Greg Wilson, O’Reilly, 2011, pp. 65-78.
➜ Behjati, Yue, Briand and Selic, “SimPL: A Product-Line Modeling Methodology for Families of Integrated Control Systems”, Simula Technical Report 2011-14 (V. 2), submitted.
➜ Hemmati, Briand, Arcuri, Ali, “An Enhanced Test Case Selection Approach for Model-Based Testing: An Industrial Case Study”, FSE, 2010.
➜ Frouchni, Briand, Labiche, Grady, and Subramanyan, “Automating Image Segmentation Verification and Validation by Learning Test Oracles”, Information and Software Technology (Elsevier), forthcoming, 2011.
34. References II
➜ Sabetzadeh, Nejati, Briand, Evensen Mills, “Using SysML for Modeling of Safety-Critical Software–Hardware Interfaces: Guidelines and Industry Experience”, HASE, 2011.
➜ Parnin and Orso, “Are Automated Debugging Techniques Actually Helping Programmers?”, ISSTA, 2011.
➜ Sabetzadeh et al., “Combining Goal Models, Expert Elicitation, and Probabilistic Simulation for Qualification of New Technology”, HASE, 2011.
➜ Yoo and Harman, “Regression Testing Minimization, Selection and Prioritization: A Survey”, Software Testing, Verification and Reliability (Wiley), forthcoming.
➜ Bertrand Meyer’s blog: http://bertrandmeyer.com/2010/04/25/the-other-impediment-to-software-engineering-research/