Theories in Empirical Software Engineering

Slides from the International Advanced School on Empirical Software Engineering 2015, held as part of the Empirical Software Engineering International Week in Beijing. The slides are posted with the permission of the main organiser Roel Wieringa.

Published in: Science

  1. Theories in Empirical Software Engineering. Roel Wieringa. Sidekicks: Daniel Méndez, Lutz Prechelt. 21 October 2015, IASESE
  2. Who are we?
     • Roel Wieringa, University of Twente, The Netherlands, http://wwwhome.ewi.utwente.nl/~roelw/
     • Lutz Prechelt, FU Berlin, http://www.mi.fu-berlin.de/w/Main/LutzPrechelt
     • Daniel Méndez, TU München, http://www4.in.tum.de/~mendezfe/
  3. Who are you? Quick round
     • Who are you?
     • What is your experience in conducting empirical studies?
     • What are your expectations?
  4. What do you think? Why do we need scientific theories in software engineering?
  5. Level 4. Methodology (the study of research methods)
       a. Notion of conceptual framework; statements about them
       b. Notion of generalization; statements about them
     Level 3. Theory (statement about many research results)
       a. Conceptual framework
       b. Generalization
     Level 2. Research questions (what, how, when, where, …, why) aimed at generalizable knowledge, research method, and research result
     Level 1. Practice domain: SW, methods, tools, processes (as is / to be)
     Level labels (top to bottom): Looking at research from the sky; General knowledge is the gold we are after; Hard work to grow knowledge; Grass roots
     • Everything on the slides in this talk, except the examples, is at level 4.
     • The examples on these slides contain explicit level indications.
     • The separate example slides report about research that contains 2 and 3.
     • The reported research studies some aspect of 1.
  6. Agenda
     09:00 – 10:30 Opening and Introduction
     10:30 – 11:00 Coffee break
     11:00 – 12:30 Inferring Theories from Data
     12:30 – 13:30 Lunch
     13:30 – 15:00 Designing Research based on Theories
     15:00 – 15:30 Coffee break
     15:30 – 16:30 Hands-on Working Session and Q&A
     16:30 – 17:00 Wrap up (all)
  7. What is a Scientific Theory
  8. Scientific theories
     • A theory is a belief that there is a pattern in phenomena
     • A scientific theory is a theory that
       – Has survived tests against experience
         • Observation, measurement
         • Possibly experiment, simulation, trials
       – Has survived criticism by critical peers
         • Anonymous peer review
         • Publication
         • Replication
  9. Examples (level 3)
     • Theory of cognitive dissonance
     • Theory of electromagnetism
     • The Balance theorem in social networks
     • Theories X, Y, Z, and W of (project) management
     • Technology Acceptance Model
     • Hannay et al. “A Systematic Review of Theory Use in Software Engineering Experiments”. IEEE Transactions on Software Engineering 33(2), February 2007
     • Lim et al. “Theories Used in Information Systems Research: Identifying Theory Networks in Leading IS Journals”. ICIS 2009, paper 91
     • Non-examples
       – Speculations based on imagination rather than fact: conspiracy theories about who killed John Kennedy
       – Opinions that cannot be refuted: The Dutch lost the World Championship because they play like prima donnas
  10. Design theories (level 4)
      • A design theory is a scientific theory about an artifact in a context
      • Vriezekolk: What is a theory
      • Méndez: What is a theory
  11. The Structure of Theories
  12. The structure of scientific theories (level 4)
      1. Conceptual framework
         – Constructs used to express beliefs about patterns in phenomena
         – E.g. the concepts of beamforming, of multi-agent planning, of data location compliance (level 3)
      2. Generalizations
         – Stated in terms of these concepts; they express beliefs about patterns in phenomena
         – E.g. the relation between angle of incidence and phase difference
         – E.g. a statement about delay reduction on airports (level 3)
      • Generalizations have a scope, a.k.a. target of generalization
  13. The structure of design theories (level 4)
      1. Conceptual framework
      2. Generalizations
         – Artifact specification × Context assumptions → Effects
         – Effects satisfy a requirement to some extent
  14. Two kinds of conceptual structures
      1. Architectural structures: class of systems, components with capabilities, interactions
         – E.g. entities, (de)composition, taxonomies, cardinality, events, processes, procedures, constraints, … (level 4)
         – Useful for case-based research (observational case studies, case experiments, simulations, technical action research)
         – Typically qualitative
      2. Statistical structures: population, variables with probability distributions, relations among variables
         – Useful for sample-based research (surveys, statistical difference-making experiments)
         – Typically quantitative
  15. • Prechelt: What is a theory, the structure of theories
      • Vriezekolk: The structure of theories
      • Méndez: The structure of theories
  16. The Use of Theories
  17. Uses of a conceptual framework
      • Framing a problem or artifact: choosing which concepts to use
        – Using the theory of infectious diseases to understand a patient’s symptoms
        – Using concepts of force & energy to understand the behavior of a machine
        – Using the concept of a coordination gatekeeper to understand a distributed SE project
        (all three examples at level 1)
      • Describe a problem or specify an artifact: using the concepts
      • Generalize about the problem or artifact
      • Analyze a problem or artifact (i.e. analyze the framework)
  18. Functions of generalizations
      – Explanation: explain phenomena by identifying causes, mechanisms or reasons
      – Prediction: state what will happen in the future
      – Design: use generalizations to justify a design choice
  19. • Prechelt: the use of theories
      • Vriezekolk: the use of theories
      • Méndez: the use of theories
  20. Usability of theories
      • When is a design theory
          Context assumptions × Artifact design → Effects
        usable by a practitioner?
        1. He/she is able to recognize the context assumptions,
        2. and to acquire/build the artifact under the constraints of practice;
        3. the effects will indeed occur;
        4. he/she can observe this; and
        5. the effects will contribute to stakeholder goals/satisfy requirements.
      • The practitioner has to assess the risk that each of these fails
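The five conditions above can be read as a checklist that a practitioner walks through for a given design theory. The following is a minimal sketch of that reading, not something taken from the slides: the class name `UsabilityAssessment`, the three-valued encoding (True / False / None for "uncertain"), and the mapping of the MSR example's answers (slide 61) onto those values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# The five usability conditions, paraphrased from the slide.
CONDITIONS = (
    "recognize context assumptions",
    "acquire/build artifact under constraints of practice",
    "effects will occur",
    "effects can be observed",
    "effects contribute to stakeholder goals",
)


@dataclass
class UsabilityAssessment:
    """Hypothetical record of a practitioner's judgment per condition."""
    theory: str
    # Assumed encoding: True = holds, False = fails, None = uncertain.
    judgments: Dict[str, Optional[bool]]

    def open_risks(self) -> List[str]:
        """Conditions that are not known to hold, in slide order."""
        return [c for c in CONDITIONS if self.judgments.get(c) is not True]

    def usable(self) -> bool:
        """A theory counts as usable here only if no condition is at risk."""
        return not self.open_risks()


# Illustrative encoding of the MSR example from slide 61 (an interpretation).
msr = UsabilityAssessment(
    theory="MSR -> information about process improvement opportunities",
    judgments={
        CONDITIONS[0]: True,   # context is recognizable
        CONDITIONS[1]: None,   # "considerable effort" in the reported case
        CONDITIONS[2]: None,   # no evidence that the effects occur
        CONDITIONS[3]: False,  # effects could not be observed reliably
        CONDITIONS[4]: None,   # no evidence of process improvement
    },
)
print(msr.usable(), msr.open_risks())
```

Treating "uncertain" as an open risk matches the slide's point that the practitioner has to assess the risk of each condition failing; a stricter or looser rule would be an equally defensible choice.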
  21. • Prechelt: the usability of theories
      • Vriezekolk: the usability of theories
      • Méndez: the usability of theories
  23. Scientific Inference
  24. Case-based inference
      • Descriptive inference: describing observations
      • Abductive inference: providing an explanation
      • Analogic inference: generalizing to similar cases
      [Diagram: data, observations, explanations, and generalizations, linked by description, abduction, and analogy; annotations: proposition(s) to generalize, scope of generalization]
  25. • Architectural explanation must be the basis of the analogic generalization
      • Otherwise, we engage in wishful/magical thinking
        – You have observed that some small companies did not put a customer representative on-site in an agile project;
        – you explain this as a result of tight resources (level 3);
        – you generalize by analogy that this will happen in (almost) all small companies (level 3).
      [Diagram: the case-based inference diagram, annotated twice with “Architectural”]
  26. Sample-based inference
      • Descriptive inference: describe sample statistics
      • Statistical inference: generalize to population parameters
      • Abductive inference: provide an explanation
      • Analogic inference: expand the scope of a theory based on similarity
      [Diagram: data, observations, generalizations, and explanations, linked by description, statistical inference, abduction, and analogy]
  27. • Causal explanations can be supported by sample-based designs (treatment group/control group)
      • Generalization from a population to similar populations must be based on architectural explanation
        – In an experiment with a sample of students you observe a difference between treatment group and control group;
        – by randomness you generalize to the population of students;
        – your explanation: this difference is caused by the treatment (level 3);
        – in turn explained by cognitive processes of students (level 3);
        – generalized by analogy to novice software engineers (level 3).
      [Diagram: the sample-based inference diagram, annotated with “Architectural” and “Causal & Architectural”]
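To make the treatment group/control group step concrete, here is a minimal sketch of one way to check whether an observed difference could plausibly be due to random assignment alone. The slides do not name a specific test; the permutation test, the function name `permutation_test`, and the scores below are all assumptions chosen for illustration.

```python
import random
from statistics import mean


def permutation_test(treatment, control, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference of group means.

    Returns (observed_difference, p_value). The p-value estimates how often
    a random relabeling of the pooled scores yields a difference at least
    as large in absolute value as the one observed.
    """
    rng = random.Random(seed)
    observed = mean(treatment) - mean(control)
    pooled = list(treatment) + list(control)
    n_t = len(treatment)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = mean(pooled[:n_t]) - mean(pooled[n_t:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Invented scores for a treatment group and a control group of students.
treated = [10, 11, 12, 13, 14]
untreated = [1, 2, 3, 4, 5]
difference, p = permutation_test(treated, untreated)
print(difference, p)
```

A small p-value supports only the statistical step (the difference is unlikely under random assignment). The causal and architectural explanations, and the analogy to novice software engineers, remain separate inferences as described on the slide.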
  28. • Vriezekolk: Inferring theories from data
      • Méndez: Inferring theories from data
      • Prechelt: Applying/inferring theories to/from data
  30. Research Design
  31. The research setup
      • In experiments we are interested in the effect of the treatment on the Object of Study (OoS)
        – Requires the capability to apply treatment and control
      • In observational studies we are interested in the structure and dynamics of the OoS itself
        – Only weak support for causality
      [Diagram: Population; Sample of Objects of Study (each represents one or more population elements); Treatment instruments; Measurement instruments]
  32. • Case-based designs
        – Provide architectural explanations
        – Generalize by architectural analogy
        – Nondeterminism across cases is not quantified
      • Sample-based designs
        – Collect sample statistics
        – Infer properties of the distribution over the population
        – May be purely descriptive!
        – Possibly a causal explanation
        – To generalize further, need architectural explanation too
        – Nondeterminism within the population is quantified, but not across analogous populations
  33. Field versus lab
      • If a phenomenon cannot be (re)produced in the lab, it can only be investigated in the field
      • Which of the following designs can be done in a lab?
        – No treatment (observational study)
          • Case-based inference: observational case study
          • Sample-based inference: survey
        – Treatment (experimental study)
          • Case-based inference: single-case mechanism experiment (e.g. simulation, test of individual OoS); technical action research (e.g. test with client, pilot project)
          • Sample-based inference: statistical difference-making experiment (treatment group / control group designs)
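The two-by-two classification above can be written down as a small lookup table. This is only an illustrative sketch: the names `Inference`, `DESIGNS`, and `candidate_designs` are hypothetical, and the table merely restates the four cells of the slide.

```python
from enum import Enum


class Inference(Enum):
    CASE_BASED = "case-based"
    SAMPLE_BASED = "sample-based"


# Keys: (kind of inference, whether the researcher applies a treatment).
DESIGNS = {
    (Inference.CASE_BASED, False): ["observational case study"],
    (Inference.SAMPLE_BASED, False): ["survey"],
    (Inference.CASE_BASED, True): [
        "single-case mechanism experiment",
        "technical action research",
    ],
    (Inference.SAMPLE_BASED, True): ["statistical difference-making experiment"],
}


def candidate_designs(inference: Inference, applies_treatment: bool):
    """Return the research designs listed in the matching cell of the table."""
    return DESIGNS[(inference, applies_treatment)]


# The RE survey discussed later in the deck: sample-based, no treatment.
print(candidate_designs(Inference.SAMPLE_BASED, False))
```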
  34. • Vriezekolk: The research setup
      • Méndez: The research setup
      • Prechelt: The research setup
  36. Hands-on Working Session
  37. Hands-on Working Session
      1. What is your research question?
      2. Describe a research setup to answer it
      3. What inferences do you plan to base on this setup?
      Groups of 3
      • 15:30 Each person first drafts a flipchart with his/her answers for own research
      • 15:45 Each group member comments on the two flipcharts of others in his/her group, in particular on:
        – Are the answers clear?
        – Are the answers defensible?
      • 16:30 Each person finalizes (for now) his/her flipchart
      • 16:31 Paste to the wall. See what you can learn from other designs.
      • 16:45 Plenary wrap-up
  38. Q&A. You probably can’t ask […] anyway, so ask us!
  40. • Méndez & Wagner. “Naming the pain in requirements engineering: A design for a global family of surveys and first results from Germany”. Information & Software Technology, 2015
      • Kalinowski et al. “Towards Building Knowledge on Causes of Critical Requirements Engineering Problems”. Twenty-Seventh International Conference on Software Engineering and Knowledge Engineering (SEKE 2015), pp. 1-6
  41. • International on-line survey of requirements engineering professionals’ opinion about causes and effects of RE problems
      • Research questions
        – RQ 1 What are the expectations on a good RE?
        – RQ 2 How is RE defined, applied, and controlled?
        – RQ 3 How is RE continuously improved?
        – RQ 4 Which contemporary problems exist in RE, and what implications do they have?
        – RQ 5 Are there observable patterns of expectations, status quo, and problems in RE?
      • Observational research
  42. What is a theory
      • The researchers formulated 34 hypotheses about
        – RE improvement
          • Is beneficial
          • Is challenging
        – RE standardization
          • Hampers creativity
          • Improves quality
          • …
        – Company-specific standards
          • …
  43. • This theory (consisting of 34 proposed generalizations) is tested against
        – Opinions of professionals, based on their experience
        – Critical peer review in the publication process
      • The opinions of professionals are themselves theories based on experience,
        – but not subjected to systematic tests
        – nor to critical peer reviews
  44. The structure of theories
      1. Conceptual framework
         – Requirements, needs, goals, specification, RE skill, etc.
      2. Generalizations
         – All of the claims about social mechanisms on previous slides
  45. Brazilian theory of social mechanisms that lead to incomplete requirements
      Artifact: requirements engineering project. Context: software development
      [Diagram. Components: customer, project team, requirements engineer, product, requirements specification. Cause labels: no solution approach, agile approach, no experience, RE considered unimportant, no RE qualification, no time, team too small, different interests, no domain knowledge, no template, poor techniques, no completeness check, RE considered unimportant, no RE skills, unclear needs, unrealistic expectations, no engagement, unclear requirements, frequent changes, poorly defined]
  46. German theory of social mechanisms that lead to incomplete requirements
      [Diagram. Components: customer, project team, requirements engineer, product, requirements specification. Cause labels: no solution approach, agile approach, no experience, RE considered unimportant, no RE qualification, no time, team too small, different interests, no domain knowledge, no contact person, solution orientation, no template, poor techniques, no completeness check, no company standard, RE considered unimportant, no RE skills, unclear needs, unrealistic expectations, no engagement, unclear requirements, no contact person, solution orientation, domain complexity, frequent changes, poorly defined, business dept conflict]
  47. • The conceptual structure of social mechanisms in the previous two slides is architectural:
        – Components
        – Interactions
      • The conceptual structure of the causal theories on the next slides is statistical:
        – Variables
        – Distribution over population
  48. • Brazilian respondents’ theory about causes and effects of incomplete requirements
  49. • German respondents’ theory about causes and effects of incomplete requirements
  50. The use of theories
      • “Requirements are incomplete because customers have unclear needs and have no RE skills”
        – Frame a phenomenon: requirements can be completely specified
        – Describe it: describe all mechanisms that are responsible for incomplete requirements
        – Specify a treatment: train the customer in RE skills (??)
        – Analyze it: (none given)
        – Generalize about it: claim that this is responsible for incomplete requirements more often / always
        – Predict an effect: predict that it will happen in the next project
        – Explain an effect: explain that incompleteness is due to unclear needs and absence of RE skills in the customer
  51. Usability of theories
      • The theory of 34 hypotheses is not intended to be used by professionals to improve their practice. Consider the theory “improving RE skills reduces requirements incompleteness”
      1. Professional is able to recognize Context assumptions
         – Yes: recognizable when there is requirements engineering
      2. Able to acquire/build Artifact under constraints of practice
         – That depends on the available budget (time, money) for RE training
      3. The effects will indeed occur
         – That depends on the training, and on other factors causing RE incompleteness
      4. He/she can observe this
         – Hard to say whether requirements are more complete
      5. They will contribute to stakeholder goals/satisfy requirements
         – Hard to say whether RE completeness will contribute to stakeholder goals
  52. Inferring theories from data
      – Description
        • Interpretation of the answers of the respondents
        • Descriptive statistics
      – Statistical inference
        • No statistical inference
      – Abductive inference
        • The assumed explanation of the respondents’ answers is that they base them on experience
      – Analogic inference
        • Other professionals will answer similarly, but possibly differently across countries/cultures
  53. The research setup
      • Population: all RE professionals
      • Sample of Objects of Study (each represents one or more population elements): sample of RE professionals
      • Treatment instruments: no treatment
      • Measurement instruments: on-line survey tool, questionnaire
  55. Lutz Prechelt, Alexander Pepper. “Why Software Repositories Are Not Used For Defect-Insertion Circumstance Analysis More Often: A Case Study”. Information and Software Technology
  56. “Why Software Repositories Are Not Used For Defect-Insertion Circumstance Analysis More Often: A Case Study” (Lutz Prechelt, Alexander Pepper, Information and Software Technology)
      • Pepper tried to mine software repositories of the content management system Fiona, produced by Infopark, in order to identify correlates of defect insertion, hoping that they can be used to improve the software process.
        – Engineering cycle of the client
      • Pepper and Prechelt observed this.
        – Case study
      • Validation of a community-wide development of MSR (mining software repositories) techniques for DICA (defect-insertion circumstance analysis).
        – Engineering cycle of research community
      • Research question that emerged from the case: why are MSR techniques for DICA not used more often?
  57. What is a theory
      • Theory 1, held by the community:
        – MSR can provide information about improvement opportunities of the software process (p. 3, right column) [descriptive generalization]
      • Artifact: MSR
      • Context: any software development process
  58. • Theory 2, proposed by Prechelt and Pepper based on the case study:
        – R1: …
        – …
        – R5: There is no affordable method to assess the reliability of the results of MSR in DICA
        – R6: The reliability of MSR results in DICA is low
        – R5 and R6 are the major reasons why MSR is not used for DICA
      • Artifact: MSR
      • Context: organizations that develop web applications for a long period of time, confuse defects with issues, and have no dedicated staff to maintain bug tracks (sect 8.1)
      Annotations: descriptive generalizations; rational explanation of a phenomenon (= architectural explanation, where some components are actors that have goals and may have reasons for actions)
  59. The structure of theories
      • Conceptual framework
        – Definitions of change, defect, rework, issue, bug, bugfix, defect insertion, defect correction
        – Difficulty, cost, utility, reliability of a technique
          • NB1: concepts shared with the OoS
          • NB2: architectural framework
      • Generalizations
        – Previous slide
          • NB: they are about the effects of a class of artifacts in a class of contexts
  60. The use of theories
      • “MSR can provide information about improvement opportunities of the software process”
        – Frame a phenomenon: software improvement is a problem of lack of data about the software process
        – Describe it: describe software repositories
        – Specify a treatment: specify MSR techniques, tools and steps
        – Analyze it: analyze the meaning of the output of MSR
        – Generalize about it: claim that the outcome will be obtained in all software processes
        – Predict an effect: predict that it will happen in the next project
        – Explain an effect: explain that an improvement has occurred because of removal of a weak spot in the process
  61. Usability of theories
      1. Professional is able to recognize Context assumptions
         – Yes
      2. Able to acquire/build Artifact under constraints of practice
         – Prechelt & Pepper: considerable effort in their case
      3. The effects will indeed occur
         – No evidence that reliable information about processes will be produced
      4. He/she can observe this
         – No: considerable uncertainty whether effects have occurred
      5. They will contribute to stakeholder goals/satisfy requirements
         – No evidence that process improvements will occur
  62. Applying existing theories to data and inferring new or updated theories from data
      • Description
        – Case descriptions of every step
        – Interpretation of every step in terms of R1 – R6
      • Statistical inference
        – Not possible from a case
        – (but there is one inside this case to investigate the relation between defect descriptions and issue descriptions)
      • Abductive inference
        – Explanation of non-use in terms of R1 – R6
        – Rational explanation in terms of reasons of actors
      • Analogic inference
        – Descriptions and explanation generalized by analogy
        – Discussion of external validity
      How did it happen?
      • Existing theory 1 assumed, and falsified
      • New theory 2 emerged from the data and from opinions of actors in the OoS. Or were the propositions R1 – R6 specified before the case study was started?
  63. The research setup
      • Population: other software development organizations and their repositories
      • Object of Study: one complex Object of Study, Infopark and its software repositories
      • Treatment: the 4-step procedure listed in sect 2.3, performed by Pepper at Infopark
      • Treatment instruments: MSR tools
      • Measurement instruments: MSR tools providing data; Pepper’s work notes; Pepper’s memory (sect 8.3)
      • Sources of evidence (p. 5): context information, raw data of version archive and bugtracker, analysis steps taken and not taken, issues and arguments of those steps, data provided by MSR tools, Infopark’s interpretation of the outcomes of the steps
  65. Vriezekolk, Etalle & Wieringa. “Experimental Validation of a Risk Assessment Method”. 21st Working Conference on Requirements Engineering: Foundations for Software Quality (REFSQ), 2015
  66. • Lab experiment to test the reliability of a method, RASTER, to assess risk of telecom availability
        – Research question: How reliable is RASTER?
        – Research setup: Six groups of three students each had to estimate likelihood and impact of a list of non-availability risks for an email service, using the RASTER method
  67. What is a theory
      • Design theory
        – RASTER × professionals providing services during incidents and disasters → availability risk assessments
      • Theory of the experiment
        – Sources of variability in assessment are
          • Ambiguity or incompleteness of the method description
          • Misunderstanding of the method
          • Lack of experience
          • Lack of motivation
          • Case complexity
          • Disturbance from the environment
      Annotations on each of the two theories: Empirical test, peer review? Artefact, context
  68. The structure of theories
      Design theory
      1. Conceptual framework
         – RASTER concepts (infrastructure component, vulnerability, risk, impact, likelihood, …)
      2. The design generalization
      Theory of the experiment
      1. Conceptual framework
         – Risk assessor, team, target of assessment, assessment environment
      2. Generalizations
         – Claims about mechanisms that produce variability
  69. The use of theories
      • “RASTER × Professionals → risk assessments”
        – Frame a phenomenon: risk assessments are made by professionals
        – Describe it: describe telco infrastructure architecture and its vulnerabilities
        – Specify a treatment: use RASTER to assess risks
        – Analyze it: trace risks to architecture components
        – Generalize about it: claim that other professionals would find the same risks of similar telco architectures
        – Predict an effect: predict that this will happen in the next project
        – Explain an effect: explain assessments in terms of RASTER method and ToA
  70. Usability of theories
      1. Professional is able to recognize Context assumptions
         – Yes
      2. Able to acquire/build Artifact under constraints of practice
         – RASTER requires relatively little training; RA is expensive, but not due to RASTER
      3. The effects will indeed occur
         – Has been shown in experiments and pilots
      4. He/she can observe this
         – Plain for all to see
      5. They will contribute to stakeholder goals/satisfy requirements
         – Goal is to obtain accurate and reliable assessments
  71. Inferring theories from data
      – Description
        • Outcome of RAs on paper
        • Krippendorff’s alpha to measure interrater agreement
        • Outcome of exit questionnaires to assess sources of variability
      – Statistical inference
        • Sample non-random, and too small
      – Abductive inference: observed variability explained by
        1. lack of expert knowledge,
        2. differences in assumptions,
        3. difficulty to choose between adjacent ordinal values for likelihood
      – Analogic inference
        • 1 and 2 are absent/reduced in the field, so less variability there
        • 3 motivates improvement of the method to reduce this phenomenon
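As an illustration of the agreement measure named above, the following is a minimal sketch of Krippendorff's alpha for nominal data, computed from a coincidence matrix. It is not the authors' analysis: the slide mentions ordinal likelihood values, for which an ordinal distance metric would be more appropriate, and the function name and the toy ratings are assumptions.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal ratings.

    units: list of lists; each inner list holds the ratings that the raters
    gave to one unit, with None marking a missing rating.
    """
    # Coincidence matrix: each ordered pair of ratings within a unit
    # contributes 1 / (m - 1), where m is the number of ratings of that unit.
    coincidences = Counter()
    for ratings in units:
        values = [r for r in ratings if r is not None]
        m = len(values)
        if m < 2:
            continue  # a unit with fewer than two ratings carries no information
        for a, b in permutations(range(m), 2):
            coincidences[(values[a], values[b])] += 1.0 / (m - 1)

    totals = Counter()
    for (c, _k), weight in coincidences.items():
        totals[c] += weight
    n = sum(totals.values())
    if n <= 1:
        raise ValueError("not enough pairable ratings")

    # Nominal metric: disagreement is 1 for different values, 0 for equal ones.
    observed = sum(w for (c, k), w in coincidences.items() if c != k) / n
    expected = sum(
        totals[c] * totals[k] for c in totals for k in totals if c != k
    ) / (n * (n - 1))
    if expected == 0:
        raise ValueError("all ratings are identical; alpha is undefined")
    return 1.0 - observed / expected


# Toy data: two raters, four units, one disagreement.
ratings = [["low", "low"], ["low", "high"], ["high", "high"], ["high", "high"]]
print(krippendorff_alpha_nominal(ratings))
```

Alpha equals 1 for perfect agreement and is near 0 when agreement is no better than chance, which is why it suits a reliability question such as "How reliable is RASTER?".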
  72. The research setup
      • Population: RA professionals in telco, doing RA in a quiet room
      • Sample of Objects of Study: self-selected sample of students, in a quiet room
      • Treatment: application of RASTER to a small case
      • Treatment instruments: oral instruction, written case description and RASTER help
      • Measurement instruments: personal observation, exit questionnaire, RASTER forms
      • Similarities and dissimilarities! Both were used to reason from sample to population
      1. Theory of variability formulated;
      2. Designed a research setup that minimized the impact of these sources;
      3. Explained observed variation in terms of this theory;
      4. Used this to generalize to the population and to improve RASTER