This document discusses research methods in computer science. It begins by exploring the origins of computer science in mathematics and engineering. It then describes several common research methods: feasibility studies, pilot cases, comparative studies, observational studies, literature surveys, formal models, and simulations. For each method, it provides an overview and examples of how it has been applied in computer science research. It emphasizes the importance of empirical, quantitative methods and validation when evaluating new technologies.
Tutorial 3 - Research methods - Part 1
1. Research Methods in Computer Science
(Serge Demeyer — University of Antwerp)
AnSyMo: Antwerp Systems and software Modelling
http://ansymo.ua.ac.be/
Universiteit Antwerpen
2. Helicopter View
(Ph.D.) Research
• How to perform research ? (and get "empirical" results)
• How to write research ? (and get papers accepted)
How many of you have done / will do a case-study ?
4. 1. Research Methods
Introduction
• Origins of Computer Science
• Research Philosophy
Research Methods
• 1. Feasibility study
• 2. Pilot Case
• 3. Comparative study
• 4. Observational Study [a.k.a. Ethnography]
• 5. Literature survey
• 6. Formal Model
• 7. Simulation
Conclusion
• Studying a Case vs. Performing a Case Study
+ Proposition
+ Unit of Analysis
+ Threats to Validity
5. What is (Ph.D.) Research ?
http://gizmodo.com/5613794/what-is-exactly-a-doctorate
[Illustration: human knowledge as an expanding circle, growing from elementary school through high school, bachelor and master to the Ph.D. (early stages and finished).]
6. Computer Science
"All science is either physics or stamp collecting." (E. Rutherford)
We study artifacts produced by humans.
"Computer science is no more about computers than astronomy is about telescopes." (E. Dijkstra)
The discipline goes by several names: computer science, computer engineering, informatics, software engineering.
7. Science vs. Engineering
Science: Physics, Chemistry, Biology, Mathematics, Geography
Engineering: Civil Engineering, Electronics, Chemistry and Materials Engineering, Electro-Mechanical Engineering
Computer Science and Software Engineering sit in between; the slide marks their position on either side with "???".
8. Mathematical Origins
Turing Machines
• Halting problem (sketch below)
Algorithmic Complexity
• P = ? NP
Compilers
• Chomsky hierarchy
Databases
• Relational model
(inductive) Reasoning
• logical argumentation
+ formal models, theorem proving, …
+ axioms & lemmas
+ foo, bar type of examples
• "deep" and generic universal knowledge
Gödel's theorem: the consistency of the system is not provable in the system. A complete and consistent set of axioms for all of mathematics is impossible.
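An aside on the halting problem named above: its impossibility argument is short enough to sketch in code. A minimal Python sketch of Turing's diagonal argument; the halts oracle is hypothetical, which is exactly the point of the proof.

    # Sketch of Turing's halting-problem argument.
    # Assume, for contradiction, an oracle that decides termination:
    def halts(program, data):
        """Hypothetical: returns True iff program(data) terminates."""
        raise NotImplementedError  # provably cannot be implemented

    def paradox(program):
        # Do the opposite of whatever the oracle predicts.
        if halts(program, program):
            while True:       # oracle said "halts" -> loop forever
                pass
        return "halted"       # oracle said "loops" -> halt

    # paradox(paradox) contradicts any answer halts() could give:
    # if halts(paradox, paradox) is True, paradox(paradox) loops forever;
    # if it is False, paradox(paradox) halts. So no such halts() exists.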
9. Engineering Origins
Computer Engineering
• Moore's law: "the number of transistors on a chip will double about every two years"
+ self-fulfilling prophecy
• Hardware technology
+ RISC vs. CISC
+ MPSoC
• Compiler optimization
+ peephole optimization
+ branch prediction
Empirical Approach
• Tom De Marco: "you cannot control what you cannot measure"
+ quantify
+ mathematical model
• Pareto principle
+ 80 % - 20 % rule (80% of the effects come from 20% of the causes; sketch below)
As good as your next observation.
Premise: The sun has risen in the east every morning up until now.
Conclusion: The sun will also rise in the east tomorrow. … Or Not ?
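The Pareto principle is easy to check on measured data. A minimal Python sketch; all module names and bug counts are invented for illustration.

    # Minimal sketch: checking the Pareto (80/20) principle on made-up data.
    # bug_counts maps (hypothetical) modules to the bugs found in them.
    bug_counts = {"parser": 120, "ui": 15, "db": 90, "net": 8, "core": 200,
                  "logging": 3, "auth": 40, "cli": 5, "cache": 12, "docs": 2}

    total = sum(bug_counts.values())
    ranked = sorted(bug_counts.values(), reverse=True)

    cumulative, modules = 0, 0
    for count in ranked:
        cumulative += count
        modules += 1
        if cumulative >= 0.8 * total:
            break

    print(f"{modules}/{len(ranked)} modules account for "
          f"{cumulative/total:.0%} of the {total} bugs")
    # With these invented numbers, 3/10 modules account for ~83% of all bugs.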
10. Influence of Society
• Lives are at stake (e.g., automatic pilot, nuclear power plants)
• Huge amounts of money are at stake (e.g., Ariane V crash, Denver Airport Baggage)
• Corporate success or failure is at stake (e.g., telephone billing, VTM launching 2nd channel)
Software became ubiquitous … it's not a hobby anymore.
11. Interdisciplinary Nature
[Diagram: Computer Science positioned between the "hard" sciences (science, engineering) and the "soft" sciences (economics, sociology, psychology), with action research bridging the two.]
14. Objective vs. Subjective
• Plato's cave
• Scientific Paradigm (Kuhn)
+ dominant paradigm / competing paradigms / paradigm shift
➡ normal science vs. revolutionary science
15. Dominant view on Research Methods
Physics ("The" Scientific method)
• form hypothesis about a phenomenon
• design experiment
• collect data
• compare data to hypothesis
• accept or reject hypothesis
+ … publish (in Nature)
• get someone else to repeat the experiment (replication)
Medicine (Double-blind treatment)
• form hypothesis about a treatment
• select experimental and control groups that are comparable except for the treatment
• collect data
• apply statistics to the data (sketch below)
• treatment difference (statistically significant)
Cannot answer the "big" questions … in timely fashion:
• smoking is unhealthy
• climate change
• Darwin's theory vs. intelligent design
• …
• agile methods
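To make the "medicine" column concrete: a minimal Python sketch of a treatment-vs-control comparison with a two-sample t-test. It requires SciPy, and all measurements are invented for illustration.

    # Minimal sketch: compare a treatment group against a control group
    # and test whether the difference is statistically significant.
    from scipy.stats import ttest_ind  # two-sample t-test

    control   = [38, 42, 35, 40, 37, 41, 39, 36]   # e.g., minutes to fix a bug
    treatment = [31, 29, 34, 30, 33, 28, 32, 35]   # same task, with a new tool

    t_stat, p_value = ttest_ind(treatment, control)
    alpha = 0.05
    print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
    if p_value < alpha:
        print("difference is statistically significant at the 5% level")
    else:
        print("cannot reject the null hypothesis of 'no difference'")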
16. Experiment principles
Source: C. Wohlin, P. Runeson, M. Höst, M. Ohlsson, B. Regnell, and A. Wesslén. Experimentation in Software Engineering - An Introduction. Kluwer Academic Publishers, 2000.
[Diagram: at the THEORY level, a cause construct is linked to an effect construct (the cause-effect construct). At the OBSERVATION level, a treatment (the independent variable) is linked to an outcome (the dependent variable) through the experiment operation (the treatment-outcome construct). The experiment objective connects observation back to theory. Diagonal annotation: the "boring to read" syndrome: too much focus on proper research procedure.]
17. !"""!"#$%&'()*+%$,(-&./%$(01(23456"
Research Methods in Computer Science
Different Sources Static analysis
• Marvin V. Zelkowitz and Dolores R. Lesso ns learned
Wallace, "Experimental Models for Legacy data
Validating Technology", IEEE Literat ure search
Computer, May 1998. Field st u dy
Validation method
Assertio n
Case st u dy
• Easterbrook, S. M., Singer, J., Storey,
Project mo nit orin g
M, and Damian, D. Selecting Empirical Simulatio n 1995 (152 papers)
Methods for Software Engineering Dynamic analysis
1990 (217 papers)
1985 (243 papers)
Research. Appears in F. Shull and J. Syn t hetic
Singer (eds) "Guide to Advanced Replicated
Empirical Software Engineering", No experimen tatio n
"
Springer, 2007. 0371(81"#$%&'"()*+,-$.&,/"0&+1"/23+$%&'/"*.,"1&-14&-1+,5"&6"$.*6-,78"
0 5 10 15 20 25 30 35 40
• Gordona Dodif-Crnkovic, “Scientific "
Percen tage o f papers
Methods in Computer Science”
lection method that conforms to any one of the 12 validate the claims in the paper. For completeness we
• Andreas Höfer, Walter F. Tichy, Status given data collection methods.
Our 12 methods are not the only ways to classify
added the following two classifications:
of Empirical Research in Software data collection, although we believe they are the most 1. Not applicable. Some papers did not address some
comprehensive. For example, Victor Basili6 calls an new technology, so the concept of data collection does
Engineering, Empirical Software experiment in vivo when it is run at a development loca- not apply. For example, a paper summarizing a recent
tion and in vitro when it is run in an isolated, controlled
Engineering Issues, p. 10-19,
conference or workshop wouldn’t be applicable.
setting. According to Basili, a project may involve one 2. No experiment. Some papers describing a new
Springer, 2007. team of developers or multiple teams, and an experi-
ment may involve one project or multiple projects. This
technology contained no experimental validations.
variability permits eight different experiment classifi- In our survey, we were interested in the data col-
cations. On the other hand, Barbara Kitchenham7 con- lection methods employed by the authors of the papers
siders nine classifications of experiments divided into in order to determine our classification scheme’s com-
three general categories: a quantitative experiment to prehensiveness. We tried to distinguish between data
identify measurable benefits of using a method or tool, used as a demonstration of concept (which may
a qualitative experiment to assess the features provided involve some measurements as a “proof of concept,”
by a method or tool, and a benchmarking experiment but not a full validation of the method) and a true
to determine performance. attempt at validation of their results.
As in the study by Walter Tichy,8 we considered a
1. Research Methods MODEL VALIDATION
To test whether the classification presented here
17
demonstration of technology via example as part of
the analytical phase. The paper had to go beyond that "
18. Case studies - Spectrum
Case studies are widely used in computer science: "studying a case" vs. "doing a case study".
7. Simulation
• what if ?
6. Formal Model
• underlying concepts ?
5. Literature survey
• what is known/unknown ?
4. Observational Study
• what is "it" ?
3. Comparative study
• is it better ?
2. Pilot Case, Demonstrator
• is it appropriate ?
1. Feasibility study
• is it possible ?
Source: Personal experience (Guidelines for Master Thesis Research – University of Antwerp)
20. Feasibility Study
Here is a new idea, is it possible ?
➡ Metaphor: Christopher Columbus and western route to India
• Is it possible to solve a specific kind of problem … effectively ?
+ computer science perspective (P = NP, Turing test, …)
+ engineering perspective (build efficiently; fast — small)
+ economic perspective (cost effective; profitable)
• Is the technique new / novel / innovative ?
+ compare against alternatives
➡ See literature survey; comparative study
• Proof by construction
+ build a prototype
+ often by applying on a “CASE”
• Conclusions
+ primarily qualitative; "lessons learned"
+ quantitative
- economic perspective: cost - benefit
- engineering perspective: speed - memory footprint
22. Pilot Case (a.k.a. Demonstrator)
Here is an idea that has proven valuable; does it work for us ?
➡ Metaphor: Portugal (Amerigo Vespucci) explores western route
• proven valuable
+ accepted merits (e.g. “lessons learned” from feasibility study)
+ there is some (implicit) theory explaining why the idea has merit
• does it work for us
+ context is very important
• Demonstrated on a simple yet representative “CASE”
+ “Pilot case” ≠ “Pilot Study”
• Proof by construction
+ build a prototype
+ apply on a “case”
• Conclusions
+ primarily qualitative; "lessons learned"
+ quantitative; preferably with predefined criteria
➡ compare to context before applying the idea !!
23. [Slide: photographs of Alberto Giacometti's sculptures "Walking Man" and "Standing Figure".]
24. Comparative Study
Here are two techniques, which one is better ?
• for a given purpose !
+ (Not necessarily absolute ranking)
• Where are the differences ? What are the tradeoffs ?
• Criteria check-list
+ predefined
- should not favor one technique
+ qualitative and quantitative
- qualitative: how to remain unbiased ?
- quantitative: represent what you want to know ?
+ Criteria check-list should be complete and reusable !
➡ If done well, most important contribution (replication !)
➡ See literature survey
• Score criteria check-list
+ Often by applying the technique on a “CASE”
• Compare
+ typically in the form of a table
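To make the criteria check-list concrete: a minimal Python sketch that scores two hypothetical techniques against predefined criteria and prints the comparison as a table. All names and numbers are invented for illustration.

    # Minimal sketch: a predefined criteria check-list scored for two
    # (hypothetical) techniques, printed as a comparison table.
    criteria = ["precision", "recall", "runtime (s)", "setup effort (h)"]
    scores = {
        "Technique A": [0.91, 0.62, 12.0, 2.0],
        "Technique B": [0.78, 0.85, 45.0, 0.5],
    }

    header = f"{'criterion':<18}" + "".join(f"{t:>14}" for t in scores)
    print(header)
    for i, criterion in enumerate(criteria):
        row = f"{criterion:<18}" + "".join(f"{vals[i]:>14}" for vals in scores.values())
        print(row)
    # Neither technique "wins" on all criteria: the table exposes the
    # tradeoffs (A is more precise, B finds more and is cheaper to set up).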
26. Observational Study [Ethnography]
Understand phenomena through observations
➡ Metaphor: Dian Fossey, "Gorillas in the Mist"
• systematic collection of data derived from direct observation of everyday life
+ phenomena are best understood in the fullest possible context
➡ observation & participation
➡ interviews & questionnaires
• Observing a series of cases "CASE"
+ observation vs. participation ?
• example: Action Research
+ Action research is carried out by people who usually recognize a problem or limitation in their workplace situation and, together, devise a plan to counteract the problem, implement the plan, observe what happens, reflect on these outcomes, revise the plan, implement it, reflect, revise and so on.
• Conclusions
+ primarily qualitative: classifications/observations/…
28. Literature Survey
What is known ? What questions are still open ?
• Source: B. A. Kitchenham, "Procedures for Performing Systematic Reviews", Keele University Technical Report EBSE-2007-01, 2007
Systematic
• "comprehensive"
➡ a precise research question is a prerequisite
+ defined search strategy (rigor, completeness, replication)
+ clearly defined scope
- criteria for inclusion and exclusion (executable sketch below)
+ specify information to be obtained
- the "CASES" are the selected papers
• outcome is organized
+ classification (table), taxonomy (tree), conceptual model (frequency)
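One way to keep inclusion/exclusion criteria rigorous and replicable is to write them down as an executable predicate. A minimal Python sketch; the candidate records and field names are invented.

    # Minimal sketch: making a systematic review's selection replicable by
    # expressing the inclusion criteria as code (all records hypothetical).
    candidates = [
        {"title": "...", "year": 2009, "peer_reviewed": True,  "empirical": True},
        {"title": "...", "year": 1998, "peer_reviewed": True,  "empirical": False},
        {"title": "...", "year": 2011, "peer_reviewed": False, "empirical": True},
    ]

    def included(paper):
        # Criteria fixed *before* searching (scope, rigor, replication):
        return (paper["year"] >= 2000        # clearly defined scope
                and paper["peer_reviewed"]   # peer-reviewed venues only
                and paper["empirical"])      # must report empirical data

    selected = [p for p in candidates if included(p)]
    print(f"{len(selected)} of {len(candidates)} candidate papers included")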
29. Literature survey - example
Bas Cornelissen, Andy Zaidman, Arie van Deursen, Leon Moonen, Rainer Koschke. "A Systematic Survey of Program Comprehension through Dynamic Analysis". IEEE Transactions on Software Engineering (TSE) 35(5): 684-702, 2009.
[Fig. 1 of the survey: overview of the systematic survey process: article selection (initial selection from relevant and other venues, reference checking, final selection), attribute identification (initial attribute identification, attribute framework, attribute generalization), article characterization (attribute assignment, summarization of similar work, overview of characterized articles), and interpretation (recommendations).]
[The slide also embeds a histogram characterizing the selected articles, and an excerpt on the benefits of dynamic analysis for program comprehension: its preciseness with regard to the actual behavior of the software system (for example, late binding in object-oriented software), and the fact that a goal-oriented strategy can be used, defining an execution scenario such that only the parts of interest of the software system are analyzed.]
31. Formal Model
How can we understand/explain the world ?
• make a mathematical abstraction of a certain problem
+ analytical model, stochastic model, logical model, re-write system, ...
+ often explained using a "CASE"
• prove some important characteristics (toy sketch below)
+ based on inductive reasoning, axioms & lemmas, …
Motivate
• which factors are irrelevant (excluded) and which are not (included) ?
• which properties are worthwhile (proven) ?
➡ See literature survey
[Diagram: the problem and its properties on one side, the mathematical abstraction and its proven properties on the other; a "?" marks whether properties proven about the abstraction really transfer back to the problem.]
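A toy illustration of the idea: model a system as a finite transition system and prove a property by exhaustive enumeration, which only becomes possible because the abstraction made the state space finite. A minimal Python sketch; the traffic-light model is invented for illustration.

    # Minimal sketch of a formal model: a traffic light as a transition
    # system, with an exhaustive check of a safety property.
    transitions = {"red": "green", "green": "yellow", "yellow": "red"}

    def reachable(start="red"):
        seen, frontier = set(), [start]
        while frontier:
            state = frontier.pop()
            if state not in seen:
                seen.add(state)
                frontier.append(transitions[state])
        return seen

    # Property: the system never reaches an undefined state, and all three
    # states are reachable. Proven here by enumeration, which works only
    # because the abstraction keeps the state space finite.
    states = reachable()
    assert states == {"red", "green", "yellow"}
    assert all(s in transitions for s in states)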
33. Simulation
What would happen if … ?
• study circumstances of phenomena in detail
+ simulated because the real world is too expensive, too slow, or impossible to observe
• make prognoses about what can happen in certain situations (sketch below)
+ test using real observations, typically obtained via a "CASE"
Motivate
• which circumstances are irrelevant (excluded) and which are not (included) ?
• which properties are worthwhile (to be observed/predicted) ?
➡ See literature survey
Examples
• distributed systems (grid); network protocols
+ too expensive or too slow to test in real life
• embedded systems — simulating hardware platforms
+ impossible to observe real clock-speed / memory footprint / …
➡ Heisenberg uncertainty principle
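A minimal Monte Carlo sketch in Python of the network-protocol example: what would happen if the network dropped a given fraction of packets? The loss rates, retry limit, and run count are invented; fixing and reporting the random seed supports replication.

    # Minimal sketch of a simulation study (all numbers invented): too
    # expensive to try on a real network, cheap to explore with a model.
    import random

    def send_with_retries(loss_rate, max_retries=3):
        for attempt in range(1, max_retries + 2):
            if random.random() > loss_rate:   # packet got through
                return attempt
        return None                           # gave up: message lost

    random.seed(42)  # replication: fix the seed and report it
    runs = 100_000
    for loss_rate in (0.1, 0.3, 0.5):
        outcomes = [send_with_retries(loss_rate) for _ in range(runs)]
        delivered = sum(1 for o in outcomes if o is not None)
        print(f"loss={loss_rate:.0%}: delivered {delivered/runs:.1%}")
    # Prognoses like these should then be tested against real
    # observations (the "CASE"), as the slide suggests.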
34. Case studies - Revisited
Case studies are widely used in computer science: "studying a case" vs. "doing a case study".
7. Simulation: test prognoses with real observations obtained via a "CASE"
6. Formal Model: often explained using a "CASE"
5. Literature survey: the "CASES" are the selected papers
4. Observational Study: observing a series of "CASES"
3. Comparative study: score criteria check-list, often by applying on a "CASE"
2. Pilot Case, Demonstrator: demonstrated on a simple yet representative "CASE"
1. Feasibility study: proof by construction, often by applying on a "CASE"
35. Case Study Research
Introduction
• Origins of Computer Science
• Research Philosophy
Research Methods
• 1. Feasibility study
• 2. Pilot Case
• 3. Comparative study
• 4. Observational Study [a.k.a. Ethnography]
• 5. Literature survey
• 6. Formal Model
• 7. Simulation
Conclusion
• Studying a Case vs. Performing a Case Study
+ Proposition
+ Unit of Analysis
+ Threats to Validity
Sources
• Robert K. Yin. Case Study Research: Design and Methods. 3rd Edition. SAGE Publications, California, 2009.
• Bent Flyvbjerg, "Five Misunderstandings About Case Study Research". Qualitative Inquiry, vol. 12, no. 2, April 2006, pp. 219-245.
• Runeson, P. and Höst, M. 2009. Guidelines for conducting and reporting case study research in software engineering. Empirical Softw. Eng. 14, 2 (Apr. 2009), 131-164.
36. Spectrum of cases
• Toy-example: created for explanation; "foo, bar" examples; a simple model that illustrates differences.
• Exemplar: accepted teaching vehicle; a "textbook example"; simple but illustrates relevant issues. [Martin S. Feather, Stephen Fickas, Anthony Finkelstein, Axel van Lamsweerde, "Requirements and Specification Exemplars", Automated Software Engineering, v.4 n.4, pp. 419-438, October 1997]
• Case study: real-life example. [Runeson, P. and Höst, M. 2009. Guidelines for conducting and reporting case study research in software engineering. Empirical Softw. Eng. 14, 2 (Apr. 2009), 131-164]
• Case: industrial system, open-source system; context is difficult to grasp.
• Community case: competition (tool oriented); approved by community; comparing using a benchmark. [Mining Software Repositories Challenge: a yearly workshop where research tools compete against one another on a common predefined case]
• Benchmark: approved by community; known context; "planted" issues. [Susan Elliott Sim, Steve Easterbrook, and Richard C. Holt. "Using Benchmarking to Advance Research: A Challenge to Software Engineering", Proceedings of the Twenty-fifth International Conference on Software Engineering, Portland, Oregon, pp. 74-83, 3-10 May, 2003]
37. Case study — definition
"A case study is an empirical inquiry that investigates a contemporary phenomenon within its real-life context, especially when the boundaries between the phenomenon and context are not clearly evident."
[Robert K. Yin. Case Study Research: Design and Methods; p. 13]
• empirical inquiry: yes, it is empirical research
• contemporary: (close to) real-time observations
+ incl. interviews
• boundaries between the phenomenon and context not clear
+ as opposed to an "experiment"
[Diagram: an experiment isolates a treatment and its outcome from the context; in a case study the phenomenon remains embedded in its context.]
38. Case Study — Counter evidence
[Diagram: a phenomenon embedded in its context]
- many more variables than data points
- multiple sources of evidence; triangulation
- theoretical propositions guide data collection (try to confirm or refute propositions with well-selected cases)
Case studies also look for counter evidence.
39. Misunderstanding 2: Generalization
"One cannot generalize on the basis of an individual case; therefore the case study cannot contribute to scientific development."
➡ [Bent Flyvbjerg, "Five Misunderstandings About Case Study Research."]
• Understanding
+ the power of examples
+ formal generalization is overvalued
- the dominant research views of physics and medicine
• Counterexamples
+ one black swan falsifies "all swans are white"
- case studies generate deep understanding; what appears to be white often turns out to be black
• sampling logic vs. replication logic
+ sampling logic: operational enumeration of the entire universe
- use statistics: generalize from "randomly selected" observations
+ replication logic: careful selection of boundary values
- use logical reasoning: the presence or absence of a property has an effect
40. Sampling Logic vs. Replication Logic
• Sampling logic: random selection; generalize for the entire population.
• Replication logic: selection of (boundary) values; understand the differences (contrast sketched below).
+ propositions
+ units of analysis
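The contrast can be illustrated in a few lines of Python on an invented population: sampling logic draws randomly and generalizes statistically, while replication logic deliberately picks boundary cases to study in depth.

    # Minimal sketch contrasting the two logics (all numbers invented).
    import random, statistics

    random.seed(1)
    population = [random.gauss(100, 15) for _ in range(10_000)]  # e.g., KLOC

    # Sampling logic: random selection, then generalize statistically.
    sample = random.sample(population, 50)
    print("estimated population mean:", round(statistics.mean(sample), 1))

    # Replication logic: deliberately pick boundary cases and reason
    # about why they differ (careful selection instead of statistics).
    smallest, largest = min(population), max(population)
    print("boundary cases to study in depth:", round(smallest), round(largest))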
41. Research questions for Case Studies
Exploratory
• Existence: Does X exist?
• Description & Classification: What is X like? What are its properties? How can it be categorized? How can we measure it? What are its components?
• Descriptive-Comparative: How does X differ from Y?
• Frequency and Distribution: How often does X occur? What is an average amount of X?
• Descriptive-Process: How does X normally work? By what process does X happen? What are the steps as X evolves?
Explanatory
• Relationship: Are X and Y related? Do occurrences of X correlate with occurrences of Y?
• Causality: What causes X? What effect does X have on Y? Does X cause Y? Does X prevent Y?
• Causality-Comparative: Does X cause more Y than does Z? Is X better at preventing Y than is Z? Does X cause more Y than does Z under one condition but not others?
• Design: What is an effective way to achieve X? How can we improve X?
Source: Empirical Research Methods in Requirements Engineering. Tutorial given at RE'07, New Delhi, India, Oct 2007.
42. Proposition (a.k.a. Purpose)
Where to expect boundaries ?
Thorough preparation is necessary ! You need an explicit theory.
Boundary value: Exploratory vs. Confirmatory
• Exploratory case studies are used as initial investigations of some phenomena to derive new hypotheses and build theories.(*)
• Confirmatory case studies are used to test existing theories. The latter are especially important for refuting theories: a detailed case study of a real situation in which a theory fails may be more convincing than failed experiments in the lab.(*)
(*) Steve Easterbrook, Janice Singer, Margaret-Anne Storey, and Daniela Damian. "Selecting empirical methods for software engineering research". In Forrest Shull, Janice Singer, and Dag I. K. Sjoberg, editors, Guide to Advanced Empirical Software Engineering, pages 285-311. Springer London, 2008.
43. Units of Analysis
What phenomena to analyze ?
• depends on research questions
• affects data collection & interpretation
• affects generalizability
Possibilities
• individual developer
• a team
• a decision
• a process
• a programming language
• a tool
Example: Clone Detection, Bug Prediction
• the tool/algorithm: does it work ?
• the individual developer: how/why does he produce bugs/clones ?
• the culture/process in the team: how does the team prevent bugs/clones ? How successful is this prevention ?
• the programming language: how vulnerable is the programming language towards clones / bugs ? (COBOL vs. AspectJ)
Design in advance
• avoid "easy" units of analysis
+ cases restricted to Java because of the parser
- is the language really an issue for your research question ?
+ report the size of the system (KLOC, # Classes, # Bug reports); a counting sketch follows below
- is team composition not more important ?
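Reporting context such as system size is straightforward to automate. A minimal Python sketch that counts KLOC for a case; the path and file suffix are hypothetical.

    # Minimal sketch: measuring and reporting system size (KLOC) so the
    # unit of analysis is documented. Path and suffix are hypothetical.
    from pathlib import Path

    def kloc(root, suffix=".java"):
        lines = sum(len(p.read_text(errors="ignore").splitlines())
                    for p in Path(root).rglob(f"*{suffix}"))
        return lines / 1000

    # Example (hypothetical path):
    # print(f"case size: {kloc('path/to/system'):.1f} KLOC")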
45. Threats to validity (Case Studies)
• Source: Runeson, P. and Höst, M. 2009. Guidelines for conducting and reporting case study research in software engineering.
1. Construct validity
• Do the operational measures reflect what the researcher had in mind ?
2. Internal validity
• Are there any other factors that may affect the results ?
➡ Mainly when investigating causality !
3. External validity
• To what extent can the findings be generalized ?
➡ Precise research question & units of analysis required
4. Reliability
• To what extent are the data and the analysis dependent on the researcher (the instruments, …) ?
Other categories have been proposed as well
• credibility, transferability, dependability, confirmability
46. Threats to validity — Examples (1/2)
1. Construct validity
• Do the operational measures reflect what the researcher had in mind ?
• Time recorded vs. time spent
• Execution time, memory consumption, …
+ noise of the operating system, sampling method (mitigation sketch below)
• Human-assigned classifiers (bug severity, …)
+ risk of "default" values
• Participants in interviews feel pressure to answer positively
2. Internal validity
• Are there any other factors that may affect the results ?
• Were phenomena observed under special conditions ?
+ in the lab, close to a deadline, company risked bankruptcy, …
+ major turnover in the team, contributors changed (open-source), …
• Similar observations repeated over time (learning effects)
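For the execution-time example: repeated measurements summarized with a robust statistic reduce the operating-system noise the slide warns about. A minimal Python sketch.

    # Minimal sketch: mitigating OS noise when "execution time" is the
    # operational measure. Repeat the measurement and report a robust
    # statistic instead of a single (possibly unlucky) run.
    import statistics, time

    def measure(fn, repetitions=30):
        samples = []
        for _ in range(repetitions):
            start = time.perf_counter()
            fn()
            samples.append(time.perf_counter() - start)
        # median + spread is more honest than a single number
        return statistics.median(samples), statistics.stdev(samples)

    median_s, spread_s = measure(lambda: sum(range(1_000_000)))
    print(f"median {median_s*1000:.2f} ms (stdev {spread_s*1000:.2f} ms)")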
47. Threats to validity — Examples (2/2)
3. External validity
• To what extent can the findings be generalized ?
• Does it apply to other languages ? other sizes ? other domains ?
• Background & education of participants
• Simplicity & scale of the team
+ small teams & flexible roles vs. large organizations & fixed roles
4. Reliability
• To what extent are the data and the analysis dependent on the researcher (the instruments, …) ?
• How did you cope with bugs in the tool, the instrument ?
• Classification: if others were to classify, would they obtain the same results ?
• How did you search for evidence in mailing archives, bug reports, … ?
48. Threats to validity = Risk Management
No experimental design can be "perfect" … but you can limit the chance of deriving false conclusions.
• manage the risk of false conclusions as much as possible
+ likelihood
+ impact
• state clearly which risks you alleviated and how (replication !)
+ construct validity
- precise metric definitions
- GQM paradigm
+ internal & external validity
- report the context consciously
+ reliability
- bugs in tools: testing, usage of well-known libraries, …
- classification: develop guidelines & have others repeat the classification
- search for evidence (mailing archives, bug reports, …): have an explicit search procedure
49. 1. Research Methods
Introduction
• Origins of Computer Science
• Research Philosophy
Research Methods
• 1. Feasibility study
• 2. Pilot Case
• 3. Comparative study
• 4. Observational Study [a.k.a. Ethnography]
• 5. Literature survey
• 6. Formal Model
• 7. Simulation
Conclusion
• Studying a Case vs. Performing a Case Study
+ Proposition
+ Unit of Analysis
+ Threats to Validity
50. Studying a case vs. Performing a case study
1. Questions
• most likely "How" and "Why"; sometimes also "What"
2. Propositions (a.k.a. Purpose): the "low hanging fruit"
• explanatory: where to look for evidence
• exploratory: rationale and direction
+ example: Christopher Columbus asks for sponsorship
- Why three ships (not one, not five) ?
- Why going westward (not south) ?
• role of "Theories"
+ possible explanations (how, why) for certain phenomena
➡ obtained through a literature survey
3. Unit(s) of analysis
• What is the case ?
4. Logic linking data to propositions + 5. Criteria for interpreting findings (together: the threats to validity)
• chain of evidence from multiple sources
• When does the data confirm a proposition ? When does it refute it ?