Research Methods
     in Computer Science
(Serge Demeyer — University of Antwerp)

Antwerp Systems and software Modelling

     Universiteit Antwerpen
Helicopter View


       How to perform research ?                    How to write research ?
      (and get “empirical” results)                (and get papers accepted)

          How many of you have
        done / will do a case-study ?

1. Research Methods                                                            2
Zürich Kunsthaus   Antwerp Middelheim
1. Research Methods
  • Origins of Computer Science
  • Research Philosophy
Research Methods
  • 1. Feasibility study
  • 2. Pilot Case
  • 3. Comparative study
  • 4. Observational Study [a.k.a. Ethnography]
  • 5. Literature survey
  • 6. Formal Model
  • 7. Simulation
  • Studying a Case
    vs. Performing a Case Study
    + Proposition
    + Unit of Analysis
    + Threats to Validity

What is (Ph.D.) Research ?

       Human             Elementary
      Knowledge            School               High School      Bachelor

                                   Ph.D.                              Ph.D.
            Master             (early stages)                      (finished)

Computer Science

     All science is either physics or stamp collecting (E. Rutherford)
                         We study artifacts produced by humans

            Computer science is no more about computers than
               astronomy is about telescopes. (E. Dijkstra)

                  Computer science
                                                      Computer engineering

                                           Software Engineering

Science vs. Engineering

                       Science                          Engineering

             Biology                                 Civil Engineering
             Chemistry           Computer ???        Electronics
             Mathematics     Software Engineering    Chemistry and Materials
             Geography                               Electro-Mechanical

Mathematical Origins
Turing Machines
  • Halting problem                       (inductive) Reasoning
                                            • logical argumentation
Algorithmic Complexity                          + formal models,
  • P = ? NP                                       theorem proving, …
                                                + axioms & lemmas
Compilers                                       + foo, bar type of examples
  • Chomsky hierarchy                       • “deep” and generic universal
  • Relational model

   Gödel theorem: consistency of the system is not provable in the system.

                      A complete and consistent set of axioms
                       for all of mathematics is impossible

Engineering Origins
Computer Engineering                     Empirical Approach
  • Moore’s law: “the number of           • Tom De Marco: “you cannot
    transistors on a chip will double       control what you cannot
    about every two years”                  measure”
       + Self-fulfilling prophecy              + quantify
  • Hardware technology                        + mathematical model
       + RISC vs. CISC                    • Pareto principle
       + MPSoC                                 + 80 % - 20 % rule
  • Compiler optimization                        (80% of the effects come
       + peephole optimization                   from 20% of the causes)
       + branch prediction

                     As good as your next observation.
      Premise: The sun has risen in the east every morning up until now.
      Conclusion: The sun will also rise in the east tomorrow. … Or Not ?
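
The 80 % - 20 % rule on this slide can be checked on measured data. A minimal sketch, assuming a list of per-cause effect sizes; the function name `share_of_top_causes` and the defect counts are hypothetical and only serve as illustration:

```python
def share_of_top_causes(effects, top_fraction=0.2):
    """Fraction of the total effect contributed by the largest causes."""
    if not effects:
        raise ValueError("effects must not be empty")
    ranked = sorted(effects, reverse=True)
    # Always look at one cause at least, even for very short lists.
    top_n = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:top_n]) / sum(ranked)


# Hypothetical defect counts per module (illustrative data, not a measurement).
defects_per_module = [40, 30, 10, 5, 4, 3, 3, 2, 2, 1]
share = share_of_top_causes(defects_per_module)
print(f"Top 20% of modules account for {share:.0%} of the defects")
```

Whether the observed share is close to 80 % is an empirical question for each data set; the rule is a heuristic, not a law.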

Influence of Society
                           Lives are at stake
                           (e.g., automatic pilot,
                           nuclear power plants)

                                         Huge amounts of money
                                         are at stake
                                         (e.g., Ariane V crash,
                                         Denver Airport Baggage)

                      Software became Ubiquitous
                        … it’s not a hobby anymore

                                                      Corporate success or failure
                                                      is at stake (e.g., telephone
                                                      billing, VTM launching 2nd

Interdisciplinary Nature

                        Science                Engineering


                      Economics                Sociology

The Oak Forest
Robert Zünd - 1882
Franz and Luciano
Franz Gertsch - 1973
Objective             Subjective

   • Plato’s cave

   • Scientific Paradigm (Kuhn)
       + Dominant paradigm / Competing paradigms / Paradigm shift
            ➡ Normal science vs. Revolutionary science

Dominant view on Research Methods
Physics                                   Medicine
(“The” Scientific method)                 (Double-blind treatment)
  • form hypothesis about a                 • form hypothesis about a
    phenomenon                                treatment
  • design experiment                       • select experimental and control
  • collect data                              groups that are comparable
  • compare data to hypothesis                except for the treatment
  • accept or reject hypothesis             • collect data
       + … publish (in Nature)              • commit statistics on the data
  • get someone else to repeat              • treatment      difference
    experiment (replication)                  (statistically significant)

                      Cannot answer the “big” questions
                      … in timely fashion
                      •smoking is unhealthy
                      •climate change
                      •Darwin’s theory vs. intelligent design
                      •agile methods
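
The "statistically significant" step of the treatment/control comparison can be sketched as a permutation test. This is one possible test among many, chosen here because it needs no library beyond the standard one; the function name, the group sizes and the measurements are assumptions made for illustration:

```python
import random


def permutation_p_value(treatment, control, rounds=5_000, seed=0):
    """Estimate how often a random regrouping yields a difference in means
    at least as large as the observed one (two-sided)."""
    def mean(values):
        return sum(values) / len(values)

    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(treatment) - mean(control))
    pooled = list(treatment) + list(control)
    n = len(treatment)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        if abs(mean(pooled[:n]) - mean(pooled[n:])) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (rounds + 1)


# Hypothetical task completion times in minutes (illustrative only).
with_tool = [12, 14, 11, 15, 13, 16, 14, 15]
without_tool = [22, 25, 21, 24, 23, 26, 22, 25]
p = permutation_p_value(with_tool, without_tool)
print(f"p-value: {p:.4f}")
```

A small p-value only says that the difference is unlikely under random regrouping; it does not show that the groups were comparable except for the treatment.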

Experiment principles
  source: C. Wohlin, P. Runeson, M. Höst, M. Ohlsson, B. Regnell, and A. Wesslén.
  Experimentation in Software Engineering - An Introduction. Kluwer Academic Publishers,
  2000

  “Boring to read” syndrome
    • Too much focus on proper research procedure

                                Experiment objective

  THEORY
                                       cause-effect
                       Cause                                     Effect
                      construct                                construct

                      Treatment                                Outcome

             Independent variable                        Dependent variable
                                  Experiment operation

Research Methods in Computer Science
Different Sources

   • Marvin V. Zelkowitz and Dolores R. Wallace, "Experimental Models for
     Validating Technology", IEEE Computer, May 1998.
   • Easterbrook, S. M., Singer, J., Storey, M, and Damian, D. Selecting Empirical
     Methods for Software Engineering Research. Appears in F. Shull and J.
     Singer (eds) "Guide to Advanced Empirical Software Engineering",
     Springer, 2007.
   • Gordana Dodig-Crnkovic, “Scientific Methods in Computer Science”
   • Andreas Höfer, Walter F. Tichy, Status of Empirical Research in Software
     Engineering, Empirical Software Engineering Issues, p. 10-19,
     Springer, 2007.

[Figure: bar chart of the percentage of papers (0 to 40) per validation method,
 for 1985 (243 papers), 1990 (217 papers) and 1995 (152 papers).
 Validation methods: Static analysis, Lessons learned, Legacy data,
 Literature search, Field study, Assertion, Case study, Project monitoring,
 Simulation, Dynamic analysis, Synthetic, Replicated, No experimentation]

[Paper excerpt shown on the slide]
"… lection method that conforms to any one of the 12 given data collection
methods.
   Our 12 methods are not the only ways to classify data collection, although
we believe they are the most comprehensive. For example, Victor Basili [6]
calls an experiment in vivo when it is run at a development location and in
vitro when it is run in an isolated, controlled setting. According to Basili, a
project may involve one team of developers or multiple teams, and an experiment
may involve one project or multiple projects. This variability permits eight
different experiment classifications. On the other hand, Barbara Kitchenham [7]
considers nine classifications of experiments divided into three general
categories: a quantitative experiment to identify measurable benefits of using
a method or tool, a qualitative experiment to assess the features provided by a
method or tool, and a benchmarking experiment to determine performance.

MODEL VALIDATION
   To test whether the classification presented here …

… validate the claims in the paper. For completeness we added the following two
classifications:
   1. Not applicable. Some papers did not address some new technology, so the
      concept of data collection does not apply. For example, a paper
      summarizing a recent conference or workshop wouldn’t be applicable.
   2. No experiment. Some papers describing a new technology contained no
      experimental validations.
   In our survey, we were interested in the data collection methods employed by
the authors of the papers in order to determine our classification scheme’s
comprehensiveness. We tried to distinguish between data used as a demonstration
of concept (which may involve some measurements as a “proof of concept,” but
not a full validation of the method) and a true attempt at validation of their
results.
   As in the study by Walter Tichy, [8] we considered a demonstration of
technology via example as part of the analytical phase. The paper had to go
beyond that …"

Case studies - Spectrum

 case studies are widely used in computer science                      7. Simulation
   “studying a case” vs. “doing a case study”
                                                                       • what if ?
                                                           6. Formal Model
                                                           • underlying concepts ?
                                             5. Literature survey
                                             • what is known/unknown ?
                                 4. Observational Study
                                 • What is “it” ?
                      3. Comparative study
                      • is it better ?
         2. Pilot Case, Demonstrator
         • is it appropriate ?
 1. Feasibility study
 • is it possible ?
                                                    Source: Personal experience
                                                    (Guidelines for Master Thesis Research –
                                                    University of Antwerp)

The Sixteenth of September
            René Magritte
Feasibility Study
Here is a new idea, is it possible ?
           ➡ Metaphor: Christopher Columbus and western route to India

   • Is it possible to solve a specific kind of problem … effectively ?
        + computer science perspective (P = NP, Turing test, …)
        + engineering perspective (build efficiently; fast — small)
        + economic perspective (cost effective; profitable)
   • Is the technique new / novel / innovative ?
        + compare against alternatives
             ➡ See literature survey; comparative study

   • Proof by construction
       + build a prototype
       + often by applying on a “CASE”

   • Conclusions
       + primarily qualitative; "lessons learned"
       + quantitative
         - economic perspective: cost - benefit
         - engineering perspective: speed - memory footprint

The Prophet
Pablo Gargallo
Pilot Case (a.k.a. Demonstrator)
Here is an idea that has proven valuable; does it work for us ?
           ➡ Metaphor: Portugal (Amerigo Vespucci) explores western route

   • proven valuable
       + accepted merits (e.g. “lessons learned” from feasibility study)
       + there is some (implicit) theory explaining why the idea has merit
   • does it work for us
       + context is very important

   • Demonstrated on a simple yet representative “CASE”
       + “Pilot case” ≠ “Pilot Study”

   • Proof by construction
       + build a prototype
       + apply on a “case”

   • Conclusions
       + primarily qualitative; "lessons learned"
       + quantitative; preferably with predefined criteria
           ➡ compare to context before applying the idea !!

Walking man
  Standing Figure
– Alberto Giacometti
Comparative Study
Here are two techniques, which one is better ?
  • for a given purpose !
       + (Not necessarily absolute ranking)
  • Where are the differences ? What are the tradeoffs ?

   • Criteria check-list
       + predefined
          - should not favor one technique
       + qualitative and quantitative
          - qualitative: how to remain unbiased ?
          - quantitative: represent what you want to know ?
       + Criteria check-list should be complete and reusable !
             ➡ If done well, most important contribution (replication !)
             ➡ See literature survey

   • Score criteria check-list
       + Often by applying the technique on a “CASE”

   • Compare
       + typically in the form of a table
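
The steps above (predefined criteria check-list, scoring, comparison table) can be sketched in a few lines. The criteria, the technique names and the scores on a 1 to 5 scale are hypothetical; a real study would fix and justify them before scoring:

```python
# Hypothetical predefined check-list; defined before any technique is scored.
CRITERIA = ["setup effort", "precision", "scalability"]

# Hypothetical scores (1 = poor, 5 = good), e.g. obtained by applying
# each technique on the same "CASE".
scores = {
    "Technique A": {"setup effort": 2, "precision": 4, "scalability": 3},
    "Technique B": {"setup effort": 4, "precision": 3, "scalability": 3},
}


def comparison_table(criteria, scores):
    """Render one row per criterion and one column per technique."""
    names = list(scores)
    for name in names:
        missing = [c for c in criteria if c not in scores[name]]
        if missing:
            # An incomplete check-list would bias the comparison.
            raise ValueError(f"{name} has no score for: {missing}")
    width = max([len("criterion")] + [len(c) for c in criteria])
    rows = [f"{'criterion':<{width}} | " + " | ".join(names)]
    for criterion in criteria:
        cells = " | ".join(f"{scores[n][criterion]:^{len(n)}}" for n in names)
        rows.append(f"{criterion:<{width}} | {cells}")
    return "\n".join(rows)


table = comparison_table(CRITERIA, scores)
print(table)
```

The sketch deliberately prints the scores side by side instead of summing them, since the comparison is for a given purpose and not necessarily an absolute ranking.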

Observational Study [Ethnography]
Understand phenomena through observations
         ➡ Metaphor: Dian Fossey “Gorillas in the Mist”

   • systematic collection of data derived from direct observation of
     everyday life
       + a phenomenon is best understood in the fullest possible context
           ➡ observation & participation
           ➡ interviews & questionnaires

   • Observing a series of “CASES”
       + observation vs. participation ?

   • example: Action Research
         + Action research is carried out by people who usually recognize a problem or limitation in
           their workplace situation and, together, devise a plan to counteract the problem,
           implement the plan, observe what happens, reflect on these outcomes, revise the plan,
           implement it, reflect, revise and so on.

   • Conclusions
       + primarily qualitative: classifications/observations/…

Torben Giehler   Paul Klee
Matterhorn         Niesen
Literature Survey
What is known ? What questions are still open ?

   • source: B. A. Kitchenham, “Procedures for Performing Systematic
     Reviews”, Keele University Technical Report EBSE-2007-01, 2007

  • “comprehensive”
           ➡ precise research question is prerequisite
      + defined search strategy (rigor, completeness, replication)
      + clearly defined scope
        - criteria for inclusion and exclusion
      + specify information to be obtained
        - the “CASES” are the selected papers

   • outcome is organized

                 classification   taxonomy       conceptual model

                      table         tree            frequency
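
A defined search strategy with inclusion and exclusion criteria can be written down as an executable filter, which makes the selection step repeatable. A minimal sketch; the `Paper` record, the venues, the keywords and the candidate papers are all hypothetical:

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    title: str
    year: int
    venue: str
    keywords: frozenset


def select(papers, venues, first_year, last_year, required, excluded):
    """Keep papers inside the scope that match the inclusion keywords
    and none of the exclusion keywords."""
    selected = []
    for paper in papers:
        in_scope = paper.venue in venues and first_year <= paper.year <= last_year
        on_topic = required <= paper.keywords and not (excluded & paper.keywords)
        if in_scope and on_topic:
            selected.append(paper)
    return selected


# Hypothetical candidates returned by a search (illustrative only).
candidates = [
    Paper("Paper 1", 2005, "Venue X", frozenset({"dynamic analysis", "comprehension"})),
    Paper("Paper 2", 2007, "Venue X", frozenset({"dynamic analysis", "testing"})),
    Paper("Paper 3", 1995, "Venue X", frozenset({"dynamic analysis", "comprehension"})),
    Paper("Paper 4", 2007, "Venue Y", frozenset({"dynamic analysis", "comprehension"})),
]

cases = select(
    candidates,
    venues={"Venue X", "Venue Y"},
    first_year=1999,
    last_year=2008,
    required=frozenset({"dynamic analysis", "comprehension"}),
    excluded=frozenset({"testing"}),
)
# One way to organize the outcome: frequency of selected papers per year.
per_year = Counter(paper.year for paper in cases)
print([paper.title for paper in cases], dict(per_year))
```

Keyword matching is only a first pass; in practice the criteria are also applied by reading abstracts, and the automated filter mainly documents the scope.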

Literature survey - example
Bas Cornelissen, Andy Zaidman, Arie van Deursen, Leon Moonen, Rainer Koschke. A
Systematic Survey of Program Comprehension through Dynamic Analysis.
IEEE Transactions on Software Engineering (TSE): 35(5): 684-702, 2009.

[Figure: Fig. 1. Overview of the systematic survey process.]

[Paper excerpt shown on the slide]
"… vague, as it does not specify which properties are analyzed. To allow the
definition to serve in multiple problem domains, the exact properties under
analysis are left open.
   While the definition of dynamic analysis is rather abstract, we can elaborate
upon the benefits and limitations of using dynamic analysis in program
comprehension contexts. The benefits that we consider are:
   • The preciseness with regard to the actual behavior of the software system,
     for example, in the context of object-oriented software with its late
     binding mechanism.
   • The fact that a goal-oriented strategy can be used, which entails the
     definition of an execution scenario such that only the parts of interest
     of the software system are analyzed."

Vojin Bakic
Formal Model
How can we understand/explain the world ?
  • make a mathematical abstraction of a certain problem
      + analytical model, stochastic model, logical model, re-write
        system, ...
      + often explained using a “CASE”
  • prove some important characteristics
      + based on inductive reasoning, axioms & lemmas, …

 • which factors are irrelevant (excluded) and which are not (included) ?
 • which properties are worthwhile (proven) ?
          ➡ See literature survey



Eiger, Mönch and Jungfrau in the Morning Sun

                                       Eiffel Tower
Simulation
What would happen if … ?

   • study circumstances of phenomena in detail
       + simulated because real world too expensive; too slow or impossible
   • make prognoses about what can happen in certain situations
       + test using real observations, typically obtained via a “CASE”

 • which circumstances are irrelevant (excluded) and which are not
    (included) ?
 • which properties are worthwhile (to be observed/predicted) ?
           ➡ See literature survey

  • distributed systems (grid); network protocols
       + too expensive or too slow to test in real life
  • embedded systems — simulating hardware platforms
       + impossible to observe real clock-speed / memory footprint / …
            ➡ Heisenberg uncertainty principle
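
A small example of the network protocol case: a Monte Carlo simulation of retransmissions over a lossy link. This is a sketch under strong assumptions (independent losses, retransmit until delivered); the function name and the loss rate are hypothetical. The closed-form mean of the geometric distribution stands in for the real observations against which a simulation would normally be tested:

```python
import random


def simulate_transmissions(loss_probability, packets=100_000, seed=42):
    """Average number of sends per delivered packet when every lost
    packet is retransmitted until it arrives."""
    if not 0 <= loss_probability < 1:
        raise ValueError("loss_probability must be in [0, 1)")
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total_sends = 0
    for _ in range(packets):
        sends = 1
        while rng.random() < loss_probability:  # packet lost, send again
            sends += 1
        total_sends += sends
    return total_sends / packets


loss = 0.2  # hypothetical loss rate
simulated = simulate_transmissions(loss)
expected = 1 / (1 - loss)  # mean of a geometric distribution
print(f"simulated: {simulated:.3f}, expected: {expected:.3f}")
```

The modelling choices are exactly the questions raised above: burst losses, timeouts and congestion are excluded here, so the prognosis only holds where those circumstances are irrelevant.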

Case studies - Revisited

 case studies are widely used in computer science
   “studying a case” vs. “doing a case study”
                                                             7. Simulation
                                                             test prognoses with real observations obtained via a “CASE”
                                                     6. Formal Model
                                                     often explained using a “CASE”

                                            5. Literature survey
                                            “CASES” = selected papers
                                  4. Observational Study
                                  Observing a series of “CASES”
                      3. Comparative study
                      Score criteria check-list; often by applying on a “CASE”
          2. Pilot Case, Demonstrator
          Demonstrated on a simple yet representative “CASE”
  1. Feasibility study
  Proof by construction; often by applying on a “CASE”

Case Study Research
  • Origins of Computer Science
  • Research Philosophy
Research Methods
  • 1. Feasibility study
  • 2. Pilot Case
  • 3. Comparative study
  • 4. Observational Study [a.k.a. Ethnography]
  • 5. Literature survey
  • 6. Formal Model
  • 7. Simulation
Conclusion
  • Studying a Case
     vs. Performing a Case Study
    + Proposition
    + Unit of Analysis
    + Threats to Validity

Sources
  • Robert K. Yin. Case Study Research: Design and Methods. 3rd Edition. SAGE
    Publications. California, 2009.
  • Bent Flyvbjerg, "Five Misunderstandings About Case Study Research."
    Qualitative Inquiry, vol. 12, no. 2, April 2006, pp. 219-245.
  • Runeson, P. and Höst, M. 2009. Guidelines for conducting and reporting
    case study research in software engineering. Empirical Softw. Eng. 14,
    2 (Apr. 2009), 131-164.

Spectrum of cases
 Toy-example
  created for explanation
  • foo, bar examples
  • simple model; illustrates differences

 Exemplar
  accepted teaching vehicle
  • “textbook example”
  • simple but illustrates relevant issues
  Source: Martin S. Feather, Stephen Fickas, Anthony Finkelstein, Axel Van Lamsweerde, Requirements and Specification Exemplars, Automated Software Engineering, v.4 n.4, p. 419-438, October 1997

 Case (case study)
  real-life example
  • industrial system, open-source system
  • context is difficult to grasp
  Source: Runeson, P. and Höst, M. 2009. Guidelines for conducting and reporting case study research in software engineering. Empirical Softw. Eng. 14, 2 (Apr. 2009), 131-164.

 Community case
  competition (tool oriented)
  • approved by community
  • comparing
  Example: Mining Software Repositories Challenge. [Yearly workshop where research tools compete against one another on a common predefined case.]

 Benchmark
  • approved by community
  • known context
  • “planted” issues
  Source: Susan Elliott Sim, Steve Easterbrook, and Richard C. Holt. Using Benchmarking to Advance Research: A Challenge to Software Engineering, Proceedings of the Twenty-fifth International Conference on Software Engineering, Portland, Oregon, pp. 74-83, 3-10 May, 2003.

Case study — definition
A case study is an empirical inquiry that investigates a
contemporary phenomenon within its real-life context, especially
when the boundaries between the phenomenon and context are not
clearly evident
[Robert K. Yin. Case Study Research: Design and Methods; p. 13]

   • empirical inquiry: yes, it is empirical research
   • contemporary: (close to) real-time observations
       + incl. interviews
   • boundaries between the phenomenon and context not clear
       + as opposed to “experiment”

   [Figure: Experiment (Treatment, Outcome) vs. Case Study]

Case Study — Counter evidence



   - many more variables than data points
   - multiple sources of evidence; triangulation
   - theoretical propositions guide data collection
     (try to confirm or refute propositions with well-selected cases)

   Case studies also look for counter evidence

Misunderstanding 2: Generalization
One cannot generalize on the basis of an individual case; therefore
the case study cannot contribute to scientific development.
               ➡ [Bent Flyvbjerg, "Five Misunderstandings About Case Study Research."]

   • Understanding
       + The power of examples
       + Formal generalization is overvalued
         - dominant research views of physics and medicine

   • Counterexamples
       + one black swan falsifies “all swans are white”
         - case studies generate deep understanding; what appears to be
           white often turns out to be black

   • sampling logic vs. replication logic
       + sampling logic: operational enumeration of entire universe
         - use statistics: generalize from “randomly selected” observations
       + replication logic: careful selection of boundary values
         - use logical reasoning: presence or absence of a property has an effect
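
The contrast between the two selection strategies can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the list of cases, the "kloc" property and both function names are hypothetical and do not come from the slides.

```python
import random

# Hypothetical pool of candidate cases; names and sizes are invented for illustration.
cases = [
    {"name": "S1", "kloc": 5},
    {"name": "S2", "kloc": 40},
    {"name": "S3", "kloc": 120},
    {"name": "S4", "kloc": 900},
    {"name": "S5", "kloc": 2500},
]


def select_random(candidates, k, seed=42):
    """Sampling logic: draw k cases at random so statistics can generalize."""
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    return rng.sample(candidates, k)


def select_boundaries(candidates, prop):
    """Replication logic: deliberately pick the cases at both extremes of a property."""
    ordered = sorted(candidates, key=lambda c: c[prop])
    return [ordered[0], ordered[-1]]


random_pick = select_random(cases, 2)
boundary_pick = select_boundaries(cases, "kloc")
print([c["name"] for c in boundary_pick])  # prints ['S1', 'S5']
```

With sampling logic the argument rests on the randomness of the draw; with replication logic it rests on the reasoning for why the smallest and the largest system were chosen, which is why the proposition has to be explicit beforehand.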

Sampling Logic vs. Replication Logic

      Sampling logic: random selection
        generalize for entire population

      Replication logic: selection of (boundary) value
        understand differences
          • propositions
          • units of analysis

Research questions for Case Studies
Exploratory
 Existence
  • Does X exist?
 Description & Classification
  • What is X like?
  • What are its properties?
  • How can it be categorized?
  • How can we measure it?
  • What are its components?
  • How does X differ from Y?
 Frequency and Distribution
  • How often does X occur?
  • What is an average amount of X?
 Descriptive-Process
  • How does X normally work?
  • By what process does X happen?
  • What are the steps as X evolves?

Explanatory
 Relationship
  • Are X and Y related?
  • Do occurrences of X correlate with occurrences of Y?
 Causality
  • What causes X?
  • What effect does X have on Y?
  • Does X cause Y?
  • Does X prevent Y?
 Causality-Comparative
  • Does X cause more Y than does Z?
  • Is X better at preventing Y than is Z?
  • Does X cause more Y than does Z under one condition but not others?
 Design
  • What is an effective way to achieve X?
  • How can we improve X?

   Source: Empirical Research Methods in Requirements Engineering.
   Tutorial given at RE'07, New Delhi, India, Oct 2007.

Proposition (a.k.a. Purpose)

    [Figure: boundary value]
    Where to expect boundaries ?
      Thorough preparation is necessary !
      You need an explicit theory.

    Exploratory
      Exploratory case studies are used as initial investigations of some phenomena to derive new hypotheses and build theories.(*)

    Confirmatory
      Confirmatory case studies are used to test existing theories. The latter are especially important for refuting theories: a detailed case study of a real situation in which a theory fails may be more convincing than failed experiments in the lab.(*)

    (*) Steve Easterbrook, Janice Singer, Margaret-Anne Storey, and Daniela Damian. Selecting empirical methods for software engineering research. In Forrest Shull, Janice Singer, and Dag I. K. Sjoberg, editors, Guide to Advanced Empirical Software Engineering, pages 285-311. Springer London, 2008.

Units of Analysis
What phenomena to analyze
 • depends on research questions
 • affects data collection & interpretation
 • affects generalizability
  • individual developer
  • a team
  • a decision
  • a process
  • a programming language
  • a tool

Example: Clone Detection, Bug Prediction
  • the tool/algorithm
      Does it work ?
  • the individual developer
      How/why does he produce bugs/clones ?
  • about the culture/process in the team
      How does the team prevent bugs/clones ?
      How successful is this prevention ?
  • about the programming language
      How vulnerable is the programming language towards clones / bugs ?
      (COBOL vs. AspectJ)

Design in advance
  • avoid “easy” units of analysis
      + cases restricted to Java because parser
        - Is the language really an issue for your research question ?
      + report size of the system (KLOC, # Classes, # Bug reports)
        - Is team composition not more important ?

Threats to Validity (Experiments)
   [Figure: the experiment objective relates a cause construct to an effect construct; the experiment operation (observation) relates a treatment (independent variable) to an outcome (dependent variable). The numbers 1-3 in the figure mark where the threats below apply.]
   1.   Conclusion validity
   2.   Internal validity
   3.   Construct validity
   4.   External validity

Threats to validity (Case Studies)
   • Source: Runeson, P. and Höst, M. 2009. Guidelines for conducting and
     reporting case study research in software engineering.

1. Construct validity
  • Do the operational measures reflect what the researcher had in mind ?
2. Internal validity
  • Are there any other factors that may affect the results ?
            ➡ Mainly when investigating causality !
3. External validity
  • To what extent can the findings be generalized ?
            ➡ Precise research question & units of analysis required
4. Reliability
  • To what extent is the data and the analysis dependent on the
     researcher (the instruments, …) ?

Other categories have been proposed as well
  • credibility, transferability, dependability, confirmability

Threats to validity — Examples (1/2)
1. Construct validity
  • Do the operational measures reflect what the researcher had in mind ?
  • Time recorded vs. time spent
  • Execution time, memory consumption, …
      + noise of operating system, sampling method
  • Human-assigned classifiers (bug severity, …)
      + risk for “default” values
  • Participants in interviews have pressure to answer positively
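
For the execution-time bullet, one way to make operating-system noise visible is to repeat the measurement and report the spread rather than a single number. The sketch below is an assumption-laden illustration: the helper names, the number of repeats and the toy workload are hypothetical, and only standard-library calls are used.

```python
import statistics
import time


def measure(func, repeats=5):
    """Run func several times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Report median and spread so that noise is part of the reported result."""
    return {
        "median": statistics.median(durations),
        "min": min(durations),
        "max": max(durations),
        "spread": max(durations) - min(durations),
    }


def workload():
    """Hypothetical stand-in for the code under study."""
    sum(i * i for i in range(10_000))


runs = measure(workload, repeats=5)
summary = summarize(runs)
print(summary)
```

A large spread relative to the median suggests that the measured time reflects the environment as much as the phenomenon, which is exactly the construct-validity concern.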

2. Internal validity
  • Are there any other factors that may affect the results ?
  • Were phenomena observed under special conditions ?
       + in the lab, close to a deadline, company risked bankruptcy, …
       + major turnover in team, contributors changed (open-source), …
  • Similar observations repeated over time (learning effects)

Threats to validity — Examples (2/2)
3. External validity
  • To what extent can the findings be generalized ?
  • Does it apply to other languages ? other sizes ? other domains ?
  • Background & education of participants
  • Simplicity & scale of the team
      + small teams & flexible roles vs. large organizations & fixed roles

4. Reliability
  • To what extent is the data and the analysis dependent on the
    researcher (the instruments, …) ?
  • How did you cope with bugs in the tool, the instrument ?
  • Classification: if others were to classify, would they obtain the same ?
  • How did you search for evidence in mailing archives, bug reports, … ?
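
The classification question can be made measurable by letting a second person classify the same items and computing an agreement statistic. The slides do not name one; the sketch below assumes Cohen's kappa, a common choice for two raters, and the labels in the example are invented.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Fraction of items on which both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical bug-severity labels assigned independently by two researchers.
first = ["major", "minor", "minor", "major"]
second = ["major", "minor", "major", "major"]
print(cohen_kappa(first, second))  # prints 0.5
```

Here the raters agree on 3 of 4 items (0.75), chance agreement is 0.5, so kappa is (0.75 - 0.5) / (1 - 0.5) = 0.5. A value near 1 indicates the classification does not depend much on who performs it.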

Threats to validity = Risk Management
No experimental design can be “perfect”
… but you can limit the chance of deriving false conclusions

   • manage the risk of false conclusions as much as possible
      + likelihood
      + impact

   • state clearly which risks you alleviated and how (replication !)
       + construct validity
          - precise metric definitions
          - GQM paradigm
       + internal & external validity
          - report the context consciously
       + reliability
          - bugs in tools: testing, usage of well-known libraries, …
          - classification: develop guidelines & others repeat classification
          - search for evidence (mailing archives, bug reports, …):
             have an explicit search procedure
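
Treating threats as risks suggests a simple register that records likelihood, impact and mitigation per threat. The sketch below is a minimal illustration, not a prescribed method: the 1-3 ordinal scales, the multiplication of the two scores and the example entries are assumptions made here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threat:
    category: str
    description: str
    likelihood: int  # hypothetical ordinal scale: 1 = unlikely, 3 = likely
    impact: int      # hypothetical ordinal scale: 1 = minor, 3 = severe
    mitigation: str

    @property
    def risk(self) -> int:
        # Likelihood times impact is a common convention, assumed here.
        return self.likelihood * self.impact


def rank_threats(threats):
    """Order threats from highest to lowest risk score."""
    return sorted(threats, key=lambda t: t.risk, reverse=True)


# Illustrative entries; the scores are invented.
threats = [
    Threat("construct validity", "metric definition is ambiguous", 2, 3,
           "precise metric definitions (GQM)"),
    Threat("reliability", "bugs in the analysis tool", 1, 3,
           "testing, usage of well-known libraries"),
    Threat("reliability", "classification depends on one researcher", 3, 3,
           "guidelines; others repeat the classification"),
]

for threat in rank_threats(threats):
    print(threat.risk, threat.category, "-", threat.description, "->", threat.mitigation)
```

The ranked list doubles as an outline for the threats-to-validity section of a paper: each entry states the risk and how it was alleviated.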

Studying a case vs. Performing a case study
1. Questions
  • most likely “How” and “Why”; also sometimes “What”

2. Propositions (a.k.a. Purpose)
  • explanatory: where to look for evidence
  • exploratory: rationale and direction
       + example: Christopher Columbus asks for sponsorship
         - Why three ships (not one, not five) ?
         - Why go westward (not south) ?
  • role of “Theories”
       + possible explanations (how, why) for certain phenomena
            ➡ Obtained through literature survey

3. Unit(s) of analysis
  • What is the case ?

4. Logic linking data to propositions

+ 5. Criteria for interpreting findings
  • Chain of evidence from multiple sources
  • When does data confirm proposition ? When does it refute ?

[Margin labels on the slide: “Low hanging fruit”; “Threats to” (truncated)]

1. Research Methods                                                                                 50

More Related Content

Viewers also liked

Migration and Refactoring - Identifying Overly Strong Conditions in Refactori...
Migration and Refactoring - Identifying Overly Strong Conditions in Refactori...Migration and Refactoring - Identifying Overly Strong Conditions in Refactori...
Migration and Refactoring - Identifying Overly Strong Conditions in Refactori...
ICSM 2011
Industry - Estimating software maintenance effort from use cases an indu...
Industry - Estimating software maintenance effort from use cases an      indu...Industry - Estimating software maintenance effort from use cases an      indu...
Industry - Estimating software maintenance effort from use cases an indu...
ICSM 2011
Components - Crossing the Boundaries while Analyzing Heterogeneous Component-...
Components - Crossing the Boundaries while Analyzing Heterogeneous Component-...Components - Crossing the Boundaries while Analyzing Heterogeneous Component-...
Components - Crossing the Boundaries while Analyzing Heterogeneous Component-...
ICSM 2011
Faults and Regression testing - Localizing Failure-Inducing Program Edits Bas...
Faults and Regression testing - Localizing Failure-Inducing Program Edits Bas...Faults and Regression testing - Localizing Failure-Inducing Program Edits Bas...
Faults and Regression testing - Localizing Failure-Inducing Program Edits Bas...
ICSM 2011
ERA - Clustering and Recommending Collections of Code Relevant to Task
ERA - Clustering and Recommending Collections of Code Relevant to TaskERA - Clustering and Recommending Collections of Code Relevant to Task
ERA - Clustering and Recommending Collections of Code Relevant to Task
ICSM 2011
Metrics - Using Source Code Metrics to Predict Change-Prone Java Interfaces
Metrics - Using Source Code Metrics to Predict Change-Prone Java InterfacesMetrics - Using Source Code Metrics to Predict Change-Prone Java Interfaces
Metrics - Using Source Code Metrics to Predict Change-Prone Java Interfaces
ICSM 2011
ICSM'01 Most Influential Paper - Rainer Koschke
ICSM'01 Most Influential Paper - Rainer KoschkeICSM'01 Most Influential Paper - Rainer Koschke
ICSM'01 Most Influential Paper - Rainer KoschkeICSM 2011
ERA - Measuring Maintainability of Spreadsheets in the Wild
ERA - Measuring Maintainability of Spreadsheets in the Wild ERA - Measuring Maintainability of Spreadsheets in the Wild
ERA - Measuring Maintainability of Spreadsheets in the Wild
ICSM 2011
Industry - Testing & Quality Assurance in Data Migration Projects
Industry - Testing & Quality Assurance in Data Migration Projects Industry - Testing & Quality Assurance in Data Migration Projects
Industry - Testing & Quality Assurance in Data Migration Projects
ICSM 2011
Industry - Evolution and migration - Incremental and Iterative Reengineering ...
Industry - Evolution and migration - Incremental and Iterative Reengineering ...Industry - Evolution and migration - Incremental and Iterative Reengineering ...
Industry - Evolution and migration - Incremental and Iterative Reengineering ...
ICSM 2011
Natural Language Analysis - Mining Java Class Naming Conventions
Natural Language Analysis - Mining Java Class Naming ConventionsNatural Language Analysis - Mining Java Class Naming Conventions
Natural Language Analysis - Mining Java Class Naming Conventions
ICSM 2011
Natural Language Analysis - Expanding Identifiers to Normalize Source Code Vo...
Natural Language Analysis - Expanding Identifiers to Normalize Source Code Vo...Natural Language Analysis - Expanding Identifiers to Normalize Source Code Vo...
Natural Language Analysis - Expanding Identifiers to Normalize Source Code Vo...
ICSM 2011
Components - Graph Based Detection of Library API Limitations
Components - Graph Based Detection of Library API LimitationsComponents - Graph Based Detection of Library API Limitations
Components - Graph Based Detection of Library API Limitations
ICSM 2011
Tutorial 2 - Practical Combinatorial (t-way) Methods for Detecting Complex Fa...
Tutorial 2 - Practical Combinatorial (t-way) Methods for Detecting Complex Fa...Tutorial 2 - Practical Combinatorial (t-way) Methods for Detecting Complex Fa...
Tutorial 2 - Practical Combinatorial (t-way) Methods for Detecting Complex Fa...
ICSM 2011
Metrics - You can't control the unfamiliar
Metrics - You can't control the unfamiliarMetrics - You can't control the unfamiliar
Metrics - You can't control the unfamiliar
ICSM 2011
Industry - Relating Developers' Concepts and Artefact Vocabulary in a Financ...
Industry -  Relating Developers' Concepts and Artefact Vocabulary in a Financ...Industry -  Relating Developers' Concepts and Artefact Vocabulary in a Financ...
Industry - Relating Developers' Concepts and Artefact Vocabulary in a Financ...
ICSM 2011
Lionel Briand ICSM 2011 Keynote
Lionel Briand ICSM 2011 KeynoteLionel Briand ICSM 2011 Keynote
Lionel Briand ICSM 2011 Keynote
ICSM 2011
Traceability - Structural Conformance Checking with Design Tests: An Evaluati...
Traceability - Structural Conformance Checking with Design Tests: An Evaluati...Traceability - Structural Conformance Checking with Design Tests: An Evaluati...
Traceability - Structural Conformance Checking with Design Tests: An Evaluati...
ICSM 2011
ERA - A Comparison of Stemmers on Source Code Identifiers for Software Search
ERA - A Comparison of Stemmers on Source Code Identifiers for Software SearchERA - A Comparison of Stemmers on Source Code Identifiers for Software Search
ERA - A Comparison of Stemmers on Source Code Identifiers for Software Search
ICSM 2011
Reliability and Quality - Predicting post-release defects using pre-release f...
Reliability and Quality - Predicting post-release defects using pre-release f...Reliability and Quality - Predicting post-release defects using pre-release f...
Reliability and Quality - Predicting post-release defects using pre-release f...
ICSM 2011

Viewers also liked (20)

Migration and Refactoring - Identifying Overly Strong Conditions in Refactori...
Migration and Refactoring - Identifying Overly Strong Conditions in Refactori...Migration and Refactoring - Identifying Overly Strong Conditions in Refactori...
Migration and Refactoring - Identifying Overly Strong Conditions in Refactori...
Industry - Estimating software maintenance effort from use cases an indu...
Industry - Estimating software maintenance effort from use cases an      indu...Industry - Estimating software maintenance effort from use cases an      indu...
Industry - Estimating software maintenance effort from use cases an indu...
Components - Crossing the Boundaries while Analyzing Heterogeneous Component-...
Components - Crossing the Boundaries while Analyzing Heterogeneous Component-...Components - Crossing the Boundaries while Analyzing Heterogeneous Component-...
Components - Crossing the Boundaries while Analyzing Heterogeneous Component-...
Faults and Regression testing - Localizing Failure-Inducing Program Edits Bas...
Faults and Regression testing - Localizing Failure-Inducing Program Edits Bas...Faults and Regression testing - Localizing Failure-Inducing Program Edits Bas...
Faults and Regression testing - Localizing Failure-Inducing Program Edits Bas...
ERA - Clustering and Recommending Collections of Code Relevant to Task
ERA - Clustering and Recommending Collections of Code Relevant to TaskERA - Clustering and Recommending Collections of Code Relevant to Task
ERA - Clustering and Recommending Collections of Code Relevant to Task
Metrics - Using Source Code Metrics to Predict Change-Prone Java Interfaces
Metrics - Using Source Code Metrics to Predict Change-Prone Java InterfacesMetrics - Using Source Code Metrics to Predict Change-Prone Java Interfaces
Metrics - Using Source Code Metrics to Predict Change-Prone Java Interfaces
ICSM'01 Most Influential Paper - Rainer Koschke
ICSM'01 Most Influential Paper - Rainer KoschkeICSM'01 Most Influential Paper - Rainer Koschke
ICSM'01 Most Influential Paper - Rainer Koschke
ERA - Measuring Maintainability of Spreadsheets in the Wild
ERA - Measuring Maintainability of Spreadsheets in the Wild ERA - Measuring Maintainability of Spreadsheets in the Wild
ERA - Measuring Maintainability of Spreadsheets in the Wild
Industry - Testing & Quality Assurance in Data Migration Projects
Industry - Testing & Quality Assurance in Data Migration Projects Industry - Testing & Quality Assurance in Data Migration Projects
Industry - Testing & Quality Assurance in Data Migration Projects
Industry - Evolution and migration - Incremental and Iterative Reengineering ...
Industry - Evolution and migration - Incremental and Iterative Reengineering ...Industry - Evolution and migration - Incremental and Iterative Reengineering ...
Industry - Evolution and migration - Incremental and Iterative Reengineering ...
Natural Language Analysis - Mining Java Class Naming Conventions
Natural Language Analysis - Mining Java Class Naming ConventionsNatural Language Analysis - Mining Java Class Naming Conventions
Natural Language Analysis - Mining Java Class Naming Conventions
Natural Language Analysis - Expanding Identifiers to Normalize Source Code Vo...
Natural Language Analysis - Expanding Identifiers to Normalize Source Code Vo...Natural Language Analysis - Expanding Identifiers to Normalize Source Code Vo...
Natural Language Analysis - Expanding Identifiers to Normalize Source Code Vo...
Components - Graph Based Detection of Library API Limitations
Components - Graph Based Detection of Library API LimitationsComponents - Graph Based Detection of Library API Limitations
Components - Graph Based Detection of Library API Limitations
Tutorial 2 - Practical Combinatorial (t-way) Methods for Detecting Complex Fa...
Tutorial 2 - Practical Combinatorial (t-way) Methods for Detecting Complex Fa...Tutorial 2 - Practical Combinatorial (t-way) Methods for Detecting Complex Fa...
Tutorial 2 - Practical Combinatorial (t-way) Methods for Detecting Complex Fa...
Metrics - You can't control the unfamiliar
Metrics - You can't control the unfamiliarMetrics - You can't control the unfamiliar
Metrics - You can't control the unfamiliar
Industry - Relating Developers' Concepts and Artefact Vocabulary in a Financ...
Industry -  Relating Developers' Concepts and Artefact Vocabulary in a Financ...Industry -  Relating Developers' Concepts and Artefact Vocabulary in a Financ...
Industry - Relating Developers' Concepts and Artefact Vocabulary in a Financ...
Lionel Briand ICSM 2011 Keynote
Lionel Briand ICSM 2011 KeynoteLionel Briand ICSM 2011 Keynote
Lionel Briand ICSM 2011 Keynote
Traceability - Structural Conformance Checking with Design Tests: An Evaluati...
Traceability - Structural Conformance Checking with Design Tests: An Evaluati...Traceability - Structural Conformance Checking with Design Tests: An Evaluati...
Traceability - Structural Conformance Checking with Design Tests: An Evaluati...
ERA - A Comparison of Stemmers on Source Code Identifiers for Software Search
ERA - A Comparison of Stemmers on Source Code Identifiers for Software SearchERA - A Comparison of Stemmers on Source Code Identifiers for Software Search
ERA - A Comparison of Stemmers on Source Code Identifiers for Software Search
Reliability and Quality - Predicting post-release defects using pre-release f...
Reliability and Quality - Predicting post-release defects using pre-release f...Reliability and Quality - Predicting post-release defects using pre-release f...
Reliability and Quality - Predicting post-release defects using pre-release f...

Similar to Tutorial 3 - Research methods - Part 1

Figuring out Computer Science
Figuring out Computer ScienceFiguring out Computer Science
Figuring out Computer Science
nTier Custom Solutions
Scientific software engineering methods and their validity
Scientific software engineering methods and their validityScientific software engineering methods and their validity
Scientific software engineering methods and their validity
Daniel Mendez
Grid07 7 Gagliardi
Grid07 7 GagliardiGrid07 7 Gagliardi
Grid07 7 Gagliardiimec.archive
Empirical Software Engineering - What is it and why do we need it?
Empirical Software Engineering - What is it and why do we need it?Empirical Software Engineering - What is it and why do we need it?
Empirical Software Engineering - What is it and why do we need it?
Daniel Mendez
science _engineering__and_the_liberal_arts
science _engineering__and_the_liberal_artsscience _engineering__and_the_liberal_arts
science _engineering__and_the_liberal_arts
Informatics is a natural science
Informatics is a natural scienceInformatics is a natural science
Informatics is a natural science
Frank van Harmelen
Foundations of Machine Learning
Foundations of Machine LearningFoundations of Machine Learning
Foundations of Machine Learning
2005: Natural Computing - Concepts and Applications
2005: Natural Computing - Concepts and Applications2005: Natural Computing - Concepts and Applications
2005: Natural Computing - Concepts and Applications
Leandro de Castro
The Epistemology of Software Engineering
The Epistemology of Software EngineeringThe Epistemology of Software Engineering
The Epistemology of Software Engineering
Research in Computer Science and Engineering
Research in Computer Science and EngineeringResearch in Computer Science and Engineering
Research in Computer Science and Engineering
ppt_ids-data science.pdf
ppt_ids-data science.pdfppt_ids-data science.pdf
ppt_ids-data science.pdf
computer science engineering spe ialized in artificial Intelligence
computer science engineering spe ialized in artificial Intelligencecomputer science engineering spe ialized in artificial Intelligence
computer science engineering spe ialized in artificial Intelligence
Introduction to Artificial Intelligences
Introduction to Artificial IntelligencesIntroduction to Artificial Intelligences
Introduction to Artificial Intelligences
Meenakshi Paul
Is Computer Science Science?
Is Computer Science Science?Is Computer Science Science?
Is Computer Science Science?
Daniel Cukier
Foundations of Intelligence Agents
Foundations of Intelligence AgentsFoundations of Intelligence Agents
Foundations of Intelligence Agents

Tutorial 3 - Research methods - Part 1

  • 1. Research Methods in Computer Science (Serge Demeyer — University of Antwerp) AnSyMo Antwerp Systems and software Modelling Universiteit Antwerpen
  • 2. Helicopter View (Ph.D.) Research How to perform research ? How to write research ? (and get “empirical” results) (and get papers accepted) How many of you have done / will do a case-study ? 1. Research Methods 2
  • 3. Zürich Kunsthaus Antwerp Middelheim
  • 4. 1. Research Methods Introduction • Origins of Computer Science • Research Philosophy Research Methods • 1. Feasibility study • 2. Pilot Case • 3. Comparative study • 4. Observational Study [a.k.a. Ethnography] • 5. Literature survey • 6. Formal Model • 7. Simulation Conclusion • Studying a Case vs. Performing a Case Study + Proposition + Unit of Analysis + Threats to Validity
  • 5. What is (Ph.D.) Research ? Human Knowledge: Elementary School, High School, Bachelor, Master, Ph.D. (early stages), Ph.D. (finished)
  • 6. Computer Science All science is either physics or stamp collecting (E. Rutherford) We study artifacts produced by humans Computer science is no more about computers than astronomy is about telescopes. (E. Dijkstra) Computer science Computer engineering Informatics Software Engineering
  • 7. Science vs. Engineering Science: Physics, Chemistry, Biology, Mathematics, Geography Engineering: Civil Engineering, Electronics, Chemistry and Materials, Electro-Mechanical Engineering In between (???): Computer Science, Software Engineering
  • 8. Mathematical Origins Turing Machines • Halting problem Algorithmic Complexity • P = ? NP Compilers • Chomsky hierarchy Databases • Relational model (inductive) Reasoning • logical argumentation + formal models, theorem proving, … + axioms & lemmas + foo, bar type of examples • “deep” and generic universal knowledge Gödel theorem: consistency of the system is not provable in the system. A complete and consistent set of axioms for all of mathematics is impossible
  • 9. Engineering Origins Computer Engineering • Moore’s law: “the number of transistors on a chip will double about every two years” + Self-fulfilling prophecy • Hardware technology + RISC vs. CISC + MPSoC • Compiler optimization + peephole optimization + branch prediction Empirical Approach • Tom DeMarco: “you cannot control what you cannot measure” + quantify + mathematical model • Pareto principle + 80 % - 20 % rule (80% of the effects come from 20% of the causes) As good as your next observation. Premise: The sun has risen in the east every morning up until now. Conclusion: The sun will also rise in the east tomorrow. … Or Not ?
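
The two quantitative rules of thumb on the Engineering Origins slide can be written down in a few lines. The sketch below is only an illustration: the function names, the starting count, and the effect values are hypothetical, and the doubling period is fixed at exactly two years although the quoted law says "about".

```python
def projected_count(initial_count: float, years: float, doubling_period: float = 2.0) -> float:
    """Moore's law as quoted: the count doubles once per doubling period."""
    return initial_count * 2 ** (years / doubling_period)


def share_of_top_causes(effects: list, top_fraction: float = 0.2) -> float:
    """Fraction of the total effect contributed by the largest `top_fraction` of causes."""
    ranked = sorted(effects, reverse=True)
    top_n = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:top_n]) / sum(ranked)


if __name__ == "__main__":
    # Hypothetical chip with 1,000 transistors, projected ten years ahead.
    print(projected_count(1_000, 10))
    # Hypothetical defect counts per module: how much do the top 20% of modules hold?
    print(share_of_top_causes([50, 30, 5, 5, 4, 2, 2, 1, 1, 0]))
```

Measuring the share on real data, rather than assuming 80/20, is in line with the "quantify" point of the empirical approach.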
  • 10. Influence of Society Software became ubiquitous … it's not a hobby anymore • Lives are at stake (e.g., automatic pilot, nuclear power plants) • Huge amounts of money are at stake (e.g., Ariane V crash, Denver Airport Baggage) • Corporate success or failure is at stake (e.g., telephone billing, VTM launching 2nd channel)
  • 11. Interdisciplinary Nature “Hard” Sciences • Engineering • Computer Science • Action Research • “Soft” Sciences: Economics, Sociology, Psychology
  • 12. The Oak Forest Robert Zünd - 1882
  • 13. Franz and Luciano Franz Gertsch - 1973
  • 14. Objective Subjective • Plato’s cave • Scientific Paradigm (Kuhn) + Dominant paradigm / Competing paradigms / Paradigm shift ➡ Normal science vs. Revolutionary science
  • 15. Dominant view on Research Methods Physics (“The” Scientific method) • form hypothesis about a phenomenon • design experiment • collect data • compare data to hypothesis • accept or reject hypothesis + … publish (in Nature) • get someone else to repeat experiment (replication) Medicine (Double-blind treatment) • form hypothesis about a treatment • select experimental and control groups that are comparable except for the treatment • collect data • commit statistics on the data • treatment difference (statistically significant) Cannot answer the “big” questions … in timely fashion • smoking is unhealthy • climate change • Darwin's theory vs. intelligent design • … • agile methods
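
The Medicine column ends with a statistically significant treatment difference, without naming a test. As one possible illustration (a permutation test is swapped in here; the slide does not prescribe it), the sketch below estimates how often a difference in group means at least as large as the observed one arises when the group labels are shuffled. The function name, the group data, and the number of shuffles are assumptions.

```python
import random


def permutation_p_value(control, treatment, shuffles=2000, seed=0):
    """Estimate a two-sided p-value for the difference in group means."""
    def mean(values):
        return sum(values) / len(values)

    observed = abs(mean(treatment) - mean(control))
    pooled = list(control) + list(treatment)
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    hits = 0
    for _ in range(shuffles):
        rng.shuffle(pooled)
        shuffled_control = pooled[:len(control)]
        shuffled_treatment = pooled[len(control):]
        # Count shuffles whose difference is at least as extreme as the observed one.
        if abs(mean(shuffled_treatment) - mean(shuffled_control)) >= observed:
            hits += 1
    return hits / shuffles


if __name__ == "__main__":
    control = [12, 15, 11, 14, 13]    # hypothetical outcomes without the treatment
    treatment = [18, 21, 19, 22, 20]  # hypothetical outcomes with the treatment
    print(permutation_p_value(control, treatment))
```

A small value suggests the difference is unlikely to come from the group assignment alone; it says nothing about whether the groups were comparable in the first place, which is the job of the selection step on the slide.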
  • 16. Experiment principles source: C. Wohlin, P. Runeson, M. Höst, M. Ohlsson, B. Regnell, and A. Wesslén. Experimentation in Software Engineering - An Introduction. Kluwer Academic Publishers, 2000 “Boring to read” syndrome • Too much focus on proper research procedure [Diagram: Experiment objective. THEORY: Cause construct → Effect construct (cause-effect construct). OBSERVATION: Treatment → Outcome (treatment-outcome construct). Experiment operation: Independent variable → Dependent variable]
  • 17. Different Sources • Marvin V. Zelkowitz and Dolores R. Wallace, "Experimental Models for Validating Technology", IEEE Computer, May 1998. • Easterbrook, S. M., Singer, J., Storey, M, and Damian, D. Selecting Empirical Methods for Software Engineering Research. Appears in F. Shull and J. Singer (eds) "Guide to Advanced Empirical Software Engineering", Springer, 2007. • Gordana Dodig-Crnkovic, “Scientific Methods in Computer Science” • Andreas Höfer, Walter F. Tichy, Status of Empirical Research in Software Engineering, Empirical Software Engineering Issues, p. 10-19, Springer, 2007. [Figure: percentage of papers per validation method (Static analysis, Lessons learned, Legacy data, Literature search, Field study, Assertion, Case study, Project monitoring, Simulation, Dynamic analysis, Synthetic, Replicated, No experimentation) for 1985 (243 papers), 1990 (217 papers), and 1995 (152 papers)]
  • 18. Case studies - Spectrum Case studies are widely used in computer science “studying a case” vs. “doing a case study” 7. Simulation • what if ? 6. Formal Model • underlying concepts ? 5. Literature survey • what is known/unknown ? 4. Observational Study • What is “it” ? 3. Comparative study • is it better ? 2. Pilot Case, Demonstrator • is it appropriate ? 1. Feasibility study • is it possible ? Source: Personal experience (Guidelines for Master Thesis Research – University of Antwerp)
  • 19. The Sixteenth of September – René Magritte
  • 20. Feasibility Study Here is a new idea, is it possible ? ➡ Metaphor: Christopher Columbus and western route to India • Is it possible to solve a specific kind of problem … effectively ? + computer science perspective (P = NP, Turing test, …) + engineering perspective (build efficiently; fast — small) + economic perspective (cost effective; profitable) • Is the technique new / novel / innovative ? + compare against alternatives ➡ See literature survey; comparative study • Proof by construction + build a prototype + often by applying on a “CASE” • Conclusions + primarily qualitative; "lessons learned" + quantitative - economic perspective: cost - benefit - engineering perspective: speed - memory footprint
  • 22. Pilot Case (a.k.a. Demonstrator) Here is an idea that has proven valuable; does it work for us ? ➡ Metaphor: Portugal (Amerigo Vespucci) explores western route • proven valuable + accepted merits (e.g. “lessons learned” from feasibility study) + there is some (implicit) theory explaining why the idea has merit • does it work for us + context is very important • Demonstrated on a simple yet representative “CASE” + “Pilot case” ≠ “Pilot Study” • Proof by construction + build a prototype + apply on a “case” • Conclusions + primarily qualitative; "lessons learned" + quantitative; preferably with predefined criteria ➡ compare to context before applying the idea !!
  • 23. Walking man Standing Figure – Alberto Giacometti
  • 24. Comparative Study Here are two techniques, which one is better ? • for a given purpose ! + (Not necessarily absolute ranking) • Where are the differences ? What are the tradeoffs ? • Criteria check-list + predefined - should not favor one technique + qualitative and quantitative - qualitative: how to remain unbiased ? - quantitative: represent what you want to know ? + Criteria check-list should be complete and reusable ! ➡ If done well, most important contribution (replication !) ➡ See literature survey • Score criteria check-list + Often by applying the technique on a “CASE” • Compare + typically in the form of a table
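
The scoring and comparison steps of the Comparative Study slide can be sketched as a small table builder. This is a minimal sketch under stated assumptions: the technique names, criteria, and scores are hypothetical placeholders, and the function name `comparison_table` is not from the slides.

```python
def comparison_table(criteria, scores):
    """Render one row per predefined criterion and one column per technique."""
    techniques = list(scores)
    for technique in techniques:
        missing = [c for c in criteria if c not in scores[technique]]
        if missing:
            # Every technique must be scored on the complete, predefined check-list.
            raise ValueError(f"{technique} has no score for: {', '.join(missing)}")
    width = max([len("criterion")] + [len(c) for c in criteria])
    lines = [" | ".join(["criterion".ljust(width)] + techniques)]
    for criterion in criteria:
        cells = [str(scores[t][criterion]).ljust(len(t)) for t in techniques]
        lines.append(" | ".join([criterion.ljust(width)] + cells))
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical check-list mixing qualitative and quantitative criteria.
    criteria = ["precision", "learning curve", "run time (s)"]
    scores = {
        "Technique A": {"precision": 0.91, "learning curve": "steep", "run time (s)": 12},
        "Technique B": {"precision": 0.84, "learning curve": "gentle", "run time (s)": 3},
    }
    print(comparison_table(criteria, scores))
```

Fixing the criteria list before any scores are entered mirrors the "predefined" requirement, and keeping it separate from the scores makes the check-list reusable for a replication.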
  • 26. Observational Study [Ethnography] Understand phenomena through observations ➡ Metaphor: Dian Fossey “Gorillas in the Mist” • systematic collection of data derived from direct observation of the everyday life + a phenomenon is best understood in the fullest possible context ➡ observation & participation ➡ interviews & questionnaires • Observing a series of cases “CASE” + observation vs. participation ? • example: Action Research + Action research is carried out by people who usually recognize a problem or limitation in their workplace situation and, together, devise a plan to counteract the problem, implement the plan, observe what happens, reflect on these outcomes, revise the plan, implement it, reflect, revise and so on. • Conclusions + primarily qualitative: classifications/observations/…
  • 27. Torben Giehler: Matterhorn • Paul Klee: Niesen
  • 28. Literature Survey What is known ? What questions are still open ? • source: B. A. Kitchenham, “Procedures for Performing Systematic Reviews”, Keele University Technical Report EBSE-2007-01, 2007 Systematic • “comprehensive” ➡ precise research question is prerequisite + defined search strategy (rigor, completeness, replication) + clearly defined scope - criteria for inclusion and exclusion + specify information to be obtained - the “CASES” are the selected papers • outcome is organized: classification, taxonomy, conceptual model, table, tree, frequency
  • 29. Literature survey - example Source: Bas Cornelissen, Andy Zaidman, Arie van Deursen, Leon Moonen, Rainer Koschke. A Systematic Survey of Program Comprehension through Dynamic Analysis. IEEE Transactions on Software Engineering (TSE): 35(5): 684-702, 2009. [Figure from the paper: Fig. 1. Overview of the systematic survey process.] [Figure from the paper: bar chart; labels not recoverable]
  • 30. Klee: Bergbahn • Vojin Bakić: Bull
  • 31. Formal Model How can we understand/explain the world ? • make a mathematical abstraction of a certain problem + analytical model, stochastic model, logical model, re-write system, ... + often explained using a “CASE” • prove some important characteristics + based on inductive reasoning, axioms & lemmas, … Motivate • which factors are irrelevant (excluded) and which are not (included) ? • which properties are worthwhile (proven) ? ➡ See literature survey [Diagram relating Problem, Mathematical Abstraction, Properties, and Problem Properties ?]
  • 32. Hodler: Eiger, Mönch and Jungfrau in the Morning Sun • Seurat: Eiffel Tower
  • 33. Simulation What would happen if … ? • study circumstances of phenomena in detail + simulated because real world too expensive; too slow or impossible • make prognoses about what can happen in certain situations + test using real observations, typically obtained via a “CASE” Motivate • which circumstances are irrelevant (excluded) and which are not (included) ? • which properties are worthwhile (to be observed/predicted) ? ➡ See literature survey Examples • distributed systems (grid); network protocols + too expensive or too slow to test in real life • embedded systems — simulating hardware platforms + impossible to observe real clock-speed / memory footprint / … ➡ Heisenberg uncertainty principle
  • 34. Case studies - Revisited Case studies are widely used in computer science “studying a case” vs. “doing a case study” 7. Simulation: test prognoses with real observations obtained via a “CASE” 6. Formal Model: often explained using a “CASE” 5. Literature survey: “CASES” = selected papers 4. Observational Study: observing a series of “CASES” 3. Comparative study: score criteria check-list; often by applying on a “CASE” 2. Pilot Case, Demonstrator: demonstrated on a simple yet representative “CASE” 1. Feasibility study: proof by construction; often by applying on a “CASE”
  • 35. Case Study Research Introduction • Origins of Computer Science • Research Philosophy Research Methods • 1. Feasibility study • 2. Pilot Case • 3. Comparative study • 4. Observational Study [a.k.a. Ethnography] • 5. Literature survey • 6. Formal Model • 7. Simulation Conclusion • Studying a Case vs. Performing a Case Study + Proposition + Unit of Analysis + Threats to Validity Sources • Robert K. Yin. Case Study Research: Design and Methods. 3rd Edition. SAGE Publications. California, 2009. • Bent Flyvbjerg, "Five Misunderstandings About Case Study Research." Qualitative Inquiry, vol. 12, no. 2, April 2006, pp. 219-245. • Runeson, P. and Höst, M. 2009. Guidelines for conducting and reporting case study research in software engineering. Empirical Softw. Eng. 14, 2 (Apr. 2009), 131-164.
  • 36. Spectrum of cases Toy-example: created for explanation • foo, bar examples • simple model; illustrates differences Exemplar: accepted teaching vehicle • “textbook example” • simple but illustrates relevant issues [Martin S. Feather, Stephen Fickas, Anthony Finkelstein, Axel Van Lamsweerde, Requirements and Specification Exemplars, Automated Software Engineering, v.4 n.4, p. 419-438, October 1997] Case: real-life example • industrial system, open-source system • context is difficult to grasp [Case study: Runeson, P. and Höst, M. 2009. Guidelines for conducting and reporting case study research in software engineering. Empirical Softw. Eng. 14, 2 (Apr. 2009), 131-164.] Community case: competition (tool oriented) • approved by community • comparing [Mining Software Repositories Challenge. Yearly workshop where research tools compete against one another on a common predefined case.] Benchmark • approved by community • known context • “planted” issues [Susan Elliott Sim, Steve Easterbrook, and Richard C. Holt. Using Benchmarking to Advance Research: A Challenge to Software Engineering, Proceedings of the Twenty-fifth International Conference on Software Engineering, Portland, Oregon, pp. 74-83, 3-10 May, 2003.]
  • 37. Case study — definition A case study is an empirical inquiry that investigates a contemporary phenomenon within its real-life context, especially when the boundaries between the phenomenon and context are not clearly evident [Robert K. Yin. Case Study Research: Design and Methods; p. 13] • empirical inquiry: yes, it is empirical research • contemporary: (close to) real-time observations + incl. interviews • boundaries between the phenomenon and context not clear + as opposed to “experiment” [Diagram: Experiment (Treatment → Outcome) vs. Case Study (Phenomenon within its Context)]
  • 38. Case Study — Counter evidence Context Phenomenon - many more variables than data points - multiple sources of evidence; triangulation - theoretical propositions guide data collection (try to confirm or refute propositions with well-selected cases) Case studies also look for counter evidence
  • 39. Misunderstanding 2: Generalization One cannot generalize on the basis of an individual case; therefore the case study cannot contribute to scientific development. ➡ [Bent Flyvbjerg, "Five Misunderstandings About Case Study Research."] • Understanding + The power of examples + Formal generalization is overvalued - dominant research views of physics and medicine • Counterexamples + one black swan falsifies “all swans are white” - case studies generate deep understanding; what appears to be white often turns out to be black • sampling logic vs. replication logic + sampling logic: operational enumeration of entire universe - use statistics: generalize from “randomly selected” observations + replication logic: careful selection of boundary values - use logical reasoning: presence or absence of property has effect
  • 40. Sampling Logic vs. Replication Logic Sampling logic: random selection; generalize for entire population Replication logic: selection of (boundary) value; understand differences • propositions • units of analysis
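
The contrast between the two selection strategies can be made concrete with a short sketch. The function names, the candidate systems, and the choice of size in KLOC as the property of interest are hypothetical; the slides do not prescribe any of them.

```python
import random


def select_by_sampling(cases, k, seed=0):
    """Sampling logic: draw k cases at random, aiming to generalize to the population."""
    return random.Random(seed).sample(cases, k)


def select_by_replication(cases, prop):
    """Replication logic: deliberately pick the cases at the boundary values of a property."""
    ordered = sorted(cases, key=prop)
    return [ordered[0], ordered[-1]]


if __name__ == "__main__":
    # Hypothetical candidate systems, described by their size in KLOC.
    systems = [{"name": f"system-{i}", "kloc": kloc}
               for i, kloc in enumerate([5, 40, 120, 800, 2500])]
    print(select_by_sampling(systems, 2))
    print(select_by_replication(systems, prop=lambda s: s["kloc"]))
```

Under replication logic the property has to be chosen before the cases are, which is why the slide lists propositions and units of analysis next to it.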
  • 41. Research questions for Case Studies Exploratory Existence: • Does X exist? Description & Classification • What is X like? • What are its properties? • How can it be categorized? • How can we measure it? • What are its components? Descriptive-Comparative • How does X differ from Y? Frequency and Distribution • How often does X occur? • What is an average amount of X? Descriptive-Process • How does X normally work? • By what process does X happen? • What are the steps as X evolves? Explanatory Relationship • Are X and Y related? • Do occurrences of X correlate with occurrences of Y? Causality • What causes X? • What effect does X have on Y? • Does X cause Y? • Does X prevent Y? Causality-Comparative • Does X cause more Y than does Z? • Is X better at preventing Y than is Z? • Does X cause more Y than does Z under one condition but not others? Design • What is an effective way to achieve X? • How can we improve X? Source: Empirical Research Methods in Requirements Engineering. Tutorial given at RE'07, New Delhi, India, Oct 2007.
  • 42. Proposition (a.k.a. Purpose)
    Where to expect boundaries ? Thorough preparation is necessary ! You need an explicit theory.
    • Exploratory
      "Exploratory case studies are used as initial investigations of some phenomena to derive new hypotheses and build theories." (*)
    • Confirmatory
      "Confirmatory case studies are used to test existing theories. The latter are especially important for refuting theories: a detailed case study of a real situation in which a theory fails may be more convincing than failed experiments in the lab." (*)
    (*) Steve Easterbrook, Janice Singer, Margaret-Anne Storey, and Daniela Damian. Selecting empirical methods for software engineering research. In Forrest Shull, Janice Singer, and Dag I. K. Sjoberg, editors, Guide to Advanced Empirical Software Engineering, pages 285-311. Springer London, 2008.
  • 43. Units of Analysis
    What phenomena to analyze
    • depends on research questions
    • affects data collection & interpretation
    • affects generalizability
    Possibilities
    • individual developer
    • a team
    • a decision
    • a process
    • a programming language
    • a tool
    Example: Clone Detection, Bug Prediction
    • the tool/algorithm
      Does it work ?
    • the individual developer
      How/why does he produce bugs/clones ?
    • about the culture/process in the team
      How does the team prevent bugs/clones ? How successful is this prevention ?
    • about the programming language
      How vulnerable is the programming language towards clones / bugs ? (COBOL vs. AspectJ)
    Design in advance
    • avoid “easy” units of analysis
      + cases restricted to Java because a parser was available
        - Is the language really an issue for your research question ?
      + report size of the system (KLOC, # Classes, # Bug reports)
        - Is team composition not more important ?
  • 44. Threats to Validity (Experiments)
    [Figure: experiment objective vs. experiment operation. Theory level: cause construct ➡ effect construct (cause-effect construct). Observation level: treatment (independent variable) ➡ outcome (dependent variable) (treatment-outcome construct). The numbers 1 to 4 in the figure mark where each threat applies.]
    1. Conclusion validity
    2. Internal validity
    3. Construct validity
    4. External validity
  • 45. Threats to validity (Case Studies)
    • Source: Runeson, P. and Höst, M. 2009. Guidelines for conducting and reporting case study research in software engineering.
    1. Construct validity
      • Do the operational measures reflect what the researcher had in mind ?
    2. Internal validity
      • Are there any other factors that may affect the results ?
      ➡ Mainly when investigating causality !
    3. External validity
      • To what extent can the findings be generalized ?
      ➡ Precise research question & units of analysis required
    4. Reliability
      • To what extent do the data and the analysis depend on the researcher (the instruments, …) ?
    Other categories have been proposed as well
    • credibility, transferability, dependability, confirmability
  • 46. Threats to validity: Examples (1/2)
    1. Construct validity
      • Do the operational measures reflect what the researcher had in mind ?
      • Time recorded vs. time spent
      • Execution time, memory consumption, …
        + noise of operating system, sampling method
      • Human-assigned classifiers (bug severity, …)
        + risk for “default” values
      • Participants in interviews feel pressure to answer positively
    2. Internal validity
      • Are there any other factors that may affect the results ?
      • Were phenomena observed under special conditions ?
        + in the lab, close to a deadline, company risked bankruptcy, …
        + major turnover in team, contributors changed (open-source), …
      • Similar observations repeated over time (learning effects)
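One common way to reduce the operating-system noise mentioned under construct validity is to repeat the measurement and report a robust summary rather than a single run. The sketch below is an illustration only: the helper name `measure` and the choice of the median are assumptions, not something the slides prescribe.

```python
import statistics
import time

def measure(func, repetitions=10):
    """Run func several times and summarize the wall-clock timings.

    Reporting the median together with min and max makes the spread
    caused by scheduler and cache noise visible instead of hiding it.
    """
    timings = []
    for _ in range(repetitions):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return {
        "runs": repetitions,
        "median": statistics.median(timings),
        "min": min(timings),
        "max": max(timings),
    }

# Hypothetical workload standing in for the tool under study.
result = measure(lambda: sum(range(1000)), repetitions=5)
```

Reporting the number of runs and the spread alongside the median lets readers judge how much of an observed difference could be measurement noise.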
  • 47. Threats to validity: Examples (2/2)
    3. External validity
      • To what extent can the findings be generalized ?
      • Does it apply to other languages ? other sizes ? other domains ?
      • Background & education of participants
      • Simplicity & scale of the team
        + small teams & flexible roles vs. large organizations & fixed roles
    4. Reliability
      • To what extent do the data and the analysis depend on the researcher (the instruments, …) ?
      • How did you cope with bugs in the tool, the instrument ?
      • Classification: if others were to classify, would they obtain the same result ?
      • How did you search for evidence in mailing archives, bug reports, … ?
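The classification question under reliability ("if others were to classify, would they obtain the same result ?") is often quantified with an inter-rater agreement statistic. The sketch below computes Cohen's kappa for two raters; choosing kappa is an assumption for illustration (the slides do not name a statistic), and the severity labels are made up.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two raters, corrected for
    the agreement expected by chance."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(labels_a)
    # Fraction of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each rater's own label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:  # both raters used one and the same label throughout
        return 1.0
    return (observed - expected) / (1 - expected)

# Hypothetical bug-severity labels assigned independently by two researchers.
rater_1 = ["major", "minor", "minor", "major", "minor", "minor"]
rater_2 = ["major", "minor", "major", "major", "minor", "minor"]
kappa = cohen_kappa(rater_1, rater_2)
```

Here the raters agree on 5 of 6 items, chance agreement is 0.5, and kappa comes out at about 0.67; a value near 0 would mean the agreement is no better than chance.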
  • 48. Threats to validity = Risk Management
    No experimental design can be “perfect” … but you can limit the chance of deriving false conclusions
    • manage the risk of false conclusions as much as possible
      + likelihood
      + impact
    • state clearly which risks you alleviated and how (replication !)
      + construct validity
        - precise metric definitions
        - GQM paradigm
      + internal & external validity
        - report the context consciously
      + reliability
        - bugs in tools: testing, usage of well-known libraries, …
        - classification: develop guidelines & have others repeat the classification
        - search for evidence (mailing archives, bug reports, …): have an explicit search procedure
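Treating threats as risks suggests keeping a small risk register. The sketch below ranks threats by exposure, taken here as likelihood times impact. That formula is a common risk-management convention and an assumption on top of the slide, which only names the two factors; the `Threat` class, the scales, and all numbers are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Threat:
    category: str      # e.g. "construct", "internal", "external", "reliability"
    description: str
    likelihood: float  # hypothetical estimate between 0 and 1
    impact: int        # hypothetical scale from 1 (minor) to 5 (invalidates result)
    mitigation: str

    @property
    def exposure(self) -> float:
        # Assumed convention: exposure = likelihood * impact.
        return self.likelihood * self.impact

threats = [
    Threat("construct", "metric does not capture the intended concept", 0.5, 4,
           "precise metric definitions (GQM)"),
    Threat("reliability", "bug in the analysis tool", 0.3, 5,
           "tests, well-known libraries"),
    Threat("external", "only one programming language studied", 0.8, 2,
           "report the context consciously"),
]

# Highest exposure first: these threats deserve the most explicit mitigation.
ranked = sorted(threats, key=lambda t: t.exposure, reverse=True)
```

The ranked list maps directly onto a threats-to-validity section: each entry states the threat and what was done to alleviate it.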
  • 49. 1. Research Methods
    Introduction
    • Origins of Computer Science
    • Research Philosophy
    Research Methods
    • 1. Feasibility study
    • 2. Pilot Case
    • 3. Comparative study
    • 4. Observational Study [a.k.a. Ethnography]
    • 5. Literature survey
    • 6. Formal Model
    • 7. Simulation
    Conclusion
    • Studying a Case vs. Performing a Case Study
      + Proposition
      + Unit of Analysis
      + Threats to Validity
  • 50. Studying a case vs. Performing a case study
    1. Questions
      • most likely “How” and “Why”; also sometimes “What”
    2. Propositions (a.k.a. Purpose)
    --------- Low hanging fruit ---------
      • explanatory: where to look for evidence
      • exploratory: rationale and direction
        + example: Christopher Columbus asks for sponsorship
          - Why three ships (not one, not five) ?
          - Why going westward (not south ?)
      • role of “Theories”
        + possible explanations (how, why) for certain phenomena
        ➡ Obtained through literature survey
    3. Unit(s) of analysis
      • What is the case ?
    4. Logic linking data to propositions + 5. Criteria for interpreting findings [side label: Threats to validity]
      • Chain of evidence from multiple sources
      • When does data confirm proposition ? When does it refute ?