Survey Research In Empirical Software Engineering

Survey Research in
Software Engineering
Alessio Ferrari, CNR-ISTI, Pisa, Italy

alessio.ferrari@isti.cnr.it
LM Rea and RA Parker, 2014. Designing and conducting survey research: A comprehensive guide
Barbara A. Kitchenham and Shari L. Pfleeger, 2008, https://doi.org/10.1007/978-1-84800-044-5_3
April, 2020

Survey
• A survey is a method to systematically gather qualitative and quantitative data related to
certain constructs of interest from a group of individuals that are representative of a
population of interest

• Constructs of interest: concepts that I want to evaluate, e.g., usability of a certain tool,
developers’ habits, etc.

• Population of interest (target population or population): the group of individuals that
is the focus of the survey, e.g., Python developers, companies in a certain area, Python
developers from University A vs Python developers from University B, potential users

• NOTE: I also have qualitative data, but I am normally oriented to present statistics, and
therefore the output is normally quantitative

• NOTE: In principle, individuals of the population of interest can also be objects, but here
we mainly focus on surveying subjects

Survey
In this context, a survey is synonymous with QUESTIONNAIRE
In practice, a survey can also be carried out with structured interviews

Why Surveys in
Software Engineering (SE)?
• SE Practice: Surveys are important to gather users’
needs, which are the trigger for any software
development endeavour (e.g., understanding the
typical linguistic problems in official documents from the
viewpoint of citizens, and building a tool to prevent these
problems)

• SE Research: Surveys are also important to gather
information about the practice of software engineering,
in a company or across companies, and to build general
theories (e.g., 80% of problems in SE are due to poorly
written requirements)
Here we are mainly concerned with this second type.
However, many considerations apply to both cases.

[Figure, from "The ABC of Software Engineering Research", Fig. 1: The ABC framework: eight research strategies as categories of research methods for software engineering. Metaphors shown: Jungle, Natural Reserve, Flight Simulator, In Vitro Experiment, Courtroom, Referendum, Mathematical Model, Forecasting System]

Survey in SE: Examples
• In a company: I have derived a set of requirements
issues from interviews, and I want to see how relevant they
are for the whole company (the units of analysis are the
employees)

• In a cross-domain population: I want to understand the
requirements engineering problems of a large
population; I recruit representatives from different
companies and ask them to fill in the survey about their
company (the unit of analysis is the company)

• In an open-source population: I want to understand the
reasons why developers follow other people; I recruit respondents
from GitHub, ask open-ended questions, and code the answers
Surveys are mostly deductive, but inductive approaches
are needed for open-ended questions

Which Roles to Survey in SE?
• Board of Directors: A group of people, elected by stockholders, to establish corporate policies, and make
management decisions (can also be a single person in case the co)
• Managers: three different levels of management may be present in a large company (low, middle, top)

• Top-level managers (e.g., Organisational Managers) are responsible for controlling and overseeing the entire
organization.

• Middle-level managers (e.g., Functional Managers) are responsible for executing organizational plans which
comply with the company’s policies. These managers act as an intermediary between top-level management and
low-level management.

• Low-level managers focus on controlling and directing (e.g., Project Managers). They serve as role models for
the employees they supervise.

• Customers: the ones who buy the system
• Users: the ones who use the system
• Requirements/Business Analysts: the ones that gather requirements from customers and users
• Designers and Architects: the ones that design the system at the high level
• Developers: the ones who code
• Testers: the ones who test the code

The roles may depend on the adopted software process!
Companies may include only a subset of the roles
Some roles may be covered by the same person

Survey Process
[Figure: survey process in three phases: Planning, Execution and Analysis, Reporting. Steps shown: Research Questions, Define Measures, Sampling (Characterise Target Population, Sampling Procedure), Design Questionnaire, Pilot Questionnaire, Finalise Questionnaire, Deal with Ethics and GDPR, Recruit and Deliver Questionnaire, Set Deadline for Reply (if online/email), Collect Answers, Data Coding and Editing, Imputation and Adjustments, Data Analysis and Interpretation, Research Questions, Questionnaire Design, Results and Analysis, Discussion in Relation to RQs, Threats to Validity (Validity and Reliability)]
Surveys are a hybrid between qualitative and quantitative studies

Terminology
• Population: the universe of units from which the sample is to be selected. The
term ‘units’ is employed because it is not necessarily people who are being
sampled: the researcher may want to sample from a universe of nations,
cities, regions, firms, etc.

• Sample: the segment of the population that is selected for investigation. It is a
subset of the population. The method of selection may be based on a
probability or a non-probability approach (next slide).

• Sampling frame: the listing of all units in the population from which the
sample will be selected. It is an explicit list of units —sometimes it is not
possible to match it with the actual population, e.g., if the population is “all
Python developers”.

• Representative sample: a sample that reflects the population accurately so
that it is a microcosm of the population.

• Respondents: the subjects who responded to the survey

Sampling
[Figure: nested sets, from outermost to innermost: Population (e.g., developers), Sampling Frame (e.g., e-mails), Sample, Respondents]

Probability Sampling:
Sampling Frame
• The optimal sampling frame has the following qualities:

• all units have a logical, numerical identifier

• all units can be found – their contact information, map location or other relevant information is
present

• the frame is organized in a logical, systematic fashion

• the frame has additional information about the units that allow the use of more advanced
sampling frames (e.g., age or expertise of developers to have stratified samples; this may be
collected afterwards)

• every element of the population of interest is present in the frame (it is not always possible…)

• every element of the population is present only once in the frame

• no elements from outside the population of interest are present in the frame

• the data is 'up-to-date'
https://en.wikipedia.org/wiki/Sampling_frame
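
Some of the frame qualities listed above can be checked mechanically before sampling. The following is a minimal sketch, assuming a hypothetical frame stored as a list of dictionaries with `id` and `email` keys; the field names and the helper `check_frame` are illustrative and not part of the original material.

```python
from collections import Counter

def check_frame(frame):
    """Report simple quality problems in a sampling frame.

    `frame` is a list of dicts with the hypothetical keys 'id' and 'email'.
    """
    ids = Counter(unit["id"] for unit in frame)
    emails = Counter(
        unit["email"].strip().lower() for unit in frame if unit.get("email")
    )
    return {
        # every unit should have its own identifier
        "duplicate_ids": sorted(i for i, n in ids.items() if n > 1),
        # every element should be present only once in the frame
        "duplicate_emails": sorted(e for e, n in emails.items() if n > 1),
        # every unit should be reachable through its contact information
        "missing_contact": sorted(u["id"] for u in frame if not u.get("email")),
    }

frame = [
    {"id": 1, "email": "ada@example.org"},
    {"id": 2, "email": "ADA@example.org"},  # same person listed twice
    {"id": 3, "email": ""},                 # cannot be contacted
]
report = check_frame(frame)
print(report)
```

Coverage of the whole population and absence of outsiders cannot be verified this way, since they require knowledge that is external to the frame.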

Terminology
• Probability sample: a sample that has been selected using random selection so that
each unit in the sampling frame has a known chance of being selected.

• Non-probability sample: a sample that has not been selected using a random
selection method. This implies that some units are more likely to be selected than
others.

• Sampling error: error in the findings deriving from research due to the difference
between a sample and the population from which it is selected.

• Non-sampling error: error in the findings deriving from research due to the
differences between the population and the sample that arise either from deficiencies
in the sampling approach, such as an inadequate sampling frame or non-response
(see below), or from such problems as poor question wording, poor interviewing, or
flawed processing of data.

• Non-response: it occurs whenever some members of the sample refuse to
cooperate, cannot be contacted, or for some reason cannot supply the required data

Probability Sampling
• Random sampling: select n units from the sampling
frame, in a random manner (e.g., use the “=RAND()” function in
Excel, order the list of subjects by random number, select the first
n)

• Stratified sampling: select s units for each identified
stratum (e.g., developer vs tester) of the sampling frame
Typical for market analysis and user studies
Used for large SE studies
Purposive sampling (non-probability) was used in
interviews; here, random sampling is preferred
cf. De Mello and Travassos, 2016 https://doi.org/10.1145/2961111.2962632
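
The two procedures above can be sketched in a few lines of Python as an alternative to the spreadsheet approach. This is an illustrative sketch only: the frame is a made-up list of `(name, role)` pairs, and the function names and the `seed` parameter are assumptions introduced here to make the selection reproducible.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Select n distinct units from the sampling frame at random."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

def stratified_sample(frame, stratum_of, s, seed=None):
    """Select up to s units at random from each stratum of the frame.

    `stratum_of` maps a unit to the name of its stratum.
    """
    rng = random.Random(seed)
    strata = {}
    for unit in frame:
        strata.setdefault(stratum_of(unit), []).append(unit)
    return {
        name: rng.sample(units, min(s, len(units)))
        for name, units in strata.items()
    }

# Hypothetical frame: 8 developers and 4 testers.
frame = [(f"dev{i}", "developer") for i in range(8)]
frame += [(f"tester{i}", "tester") for i in range(4)]

sample = simple_random_sample(frame, 5, seed=42)
strata = stratified_sample(frame, lambda unit: unit[1], 2, seed=42)
print(sample)
print(strata)
```

With simple random sampling the small stratum (testers) may be under-represented by chance; the stratified variant guarantees s units per stratum.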

Probability Sampling: Formula
How to compute the sample size?
• Recommended when working with probabilistic sampling designs

$SS = \frac{Z^2 \cdot p \cdot (1 - p)}{c^2}$

• SS: sample size

• Z: Z-value, established through a specific table (Z=2.58 for a 99% confidence
level, Z=1.96 for a 95% confidence level)

• p: sample proportion (percentage selecting a choice), expressed as a decimal; the
conservative approach is 0.5, the worst case, which leads to the largest SS

• c: confidence interval, expressed as a decimal (e.g., 0.04 for ± 4%)
cf. Torchiano et al. https://www.slideshare.net/mendezfe/surveys-in-software-engineering
Example
- Confidence level: 95%
- Confidence interval: ± 4%
- If the result of a survey answer is, e.g., 50% of subjects responding X,
and I repeat the survey, the actual result can be expected to lie between 46% and 54% of
people, with a confidence level of 95%.
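
As a quick check of the quantities defined above, the sample size can be computed directly. This is a minimal sketch assuming the standard formula SS = Z² · p · (1 − p) / c²; the function name and the small Z table are illustrative choices, with only the two Z-values quoted in the slide.

```python
import math

# Z-values quoted above for the two usual confidence levels.
Z_VALUES = {0.95: 1.96, 0.99: 2.58}

def sample_size(confidence_level=0.95, c=0.05, p=0.5):
    """Sample size for a (practically) infinite population.

    c is the confidence interval as a decimal (0.04 means +/- 4%),
    p is the expected proportion (0.5 is the conservative default).
    """
    z = Z_VALUES[confidence_level]
    return (z ** 2) * p * (1 - p) / (c ** 2)

# Round up: a fractional respondent does not exist.
print(math.ceil(sample_size(0.95, 0.05)))  # 385
print(math.ceil(sample_size(0.95, 0.04)))  # 601
```

Narrowing the confidence interval from ± 5% to ± 4% raises the required sample from 385 to 601 respondents, which shows how quickly precision becomes expensive.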

Probability Sampling: Formula
Sample Size Formula
• Correction formula based on a finite population of size pop

$SS_{corrected} = \frac{SS}{1 + \frac{SS - 1}{pop}}$

Population | Confidence Level | Confidence Interval | Sample Size
10,000 | 95% | 0.01 | 4,899
10,000 | 95% | 0.05 | 370
500 | 95% | 0.01 | 475
500 | 95% | 0.05 | 217
In SE, it may be convenient to increase the
confidence interval, as we can tolerate some imprecision
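
The table values can be reproduced with a short script. This sketch assumes the usual finite-population correction, SS / (1 + (SS − 1) / pop), applied to the uncorrected size Z² · p · (1 − p) / c²; the function name is hypothetical, and the defaults (Z = 1.96, p = 0.5) correspond to the 95% rows of the table.

```python
def corrected_sample_size(pop, c, z=1.96, p=0.5):
    """Sample size corrected for a finite population of size pop."""
    ss = (z ** 2) * p * (1 - p) / (c ** 2)  # uncorrected sample size
    return ss / (1 + (ss - 1) / pop)        # finite-population correction

rows = [(10_000, 0.01), (10_000, 0.05), (500, 0.01), (500, 0.05)]
sizes = [round(corrected_sample_size(pop, c)) for pop, c in rows]
for (pop, c), size in zip(rows, sizes):
    print(f"population={pop:>6}  interval={c:.2f}  sample size={size}")
# sample sizes: 4899, 370, 475, 217
```

For a population of 500, a ± 1% interval requires surveying 475 people, almost everyone, while relaxing it to ± 5% more than halves the effort.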

Probability Sampling in SE Practice
• Select the population from a certain portal:

• GitHub (for developers)

• check most active GitHub users here: https://gist.github.com/paulmillr/2657075;

• try to copy-paste this in your browser: https://api.github.com/search/users?q=followers:100+sort:followers&per_page=100 (the GitHub API
can help you to identify users)

• Check GHTorrent project: https://ghtorrent.org

• LinkedIn (for other types of professionals, you need to enter groups and
contact people personally, or create polls in groups)

• Expect only about 10% of the contacted subjects to respond (about 20% in
GitHub), so to gather enough data, contact as many people as is
possible and reasonable
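
The GitHub query mentioned above can also be prepared programmatically. The sketch below only builds a search URL and parses a response body; it assumes the public user search endpoint with the `followers:>N` qualifier and the `sort`, `order`, `per_page`, and `page` parameters, and that results arrive in an `items` array whose entries carry a `login` field. The helper names are hypothetical, and no network call is made here.

```python
import json
from urllib.parse import urlencode

GITHUB_SEARCH_USERS = "https://api.github.com/search/users"

def build_search_url(min_followers=100, per_page=100, page=1):
    """Build a user search URL for accounts above a follower threshold."""
    query = {
        "q": f"followers:>{min_followers}",
        "sort": "followers",
        "order": "desc",
        "per_page": per_page,
        "page": page,
    }
    return f"{GITHUB_SEARCH_USERS}?{urlencode(query)}"

def extract_logins(response_text):
    """Return the user names contained in a search response body."""
    payload = json.loads(response_text)
    return [item["login"] for item in payload.get("items", [])]

url = build_search_url()
# Made-up response body with the assumed shape, for illustration only.
example_body = '{"total_count": 2, "items": [{"login": "alice"}, {"login": "bob"}]}'
logins = extract_logins(example_body)
print(url)
print(logins)
```

Fetching the URL (for example with `urllib.request.urlopen`) is subject to rate limits and paging, so collecting a large frame usually takes several requests.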

• My population is the world of developers.

• …Well, open source developers. …Well, open source developers using GitHub.

• My sampling frame is the open source developers in GitHub: I can identify their email and contact them.

• I have identified that in GitHub there are 44,735,158 users. I can’t send a questionnaire to all of them.

• I decide to select a sample of the most active users, as I think they represent my population better: HOW
MANY?

• Go to: https://www.surveymonkey.com/mp/sample-size-calculator/ (with a 95% confidence level and a 5% confidence interval, the required sample size is 385)
cf. Blincoe et al. http://kblincoe.github.io/publications/2015_IST_Blincoe.pdf

• Since normally just 10% of the people respond, I need to
consider at least 385 * 10 people if I want a representative
sample, so about 4,000 emails.

• In the end, I get answers from 800 people (20%), not too bad.
This is my actual sample, 800 instead of 44,000,000. I can say
that it is representative, as it is clearly above 385.

• Actually, I can even reduce my confidence interval now to 4%
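
The arithmetic behind this example can be made explicit. The sketch below assumes the same sample size formula solved for the confidence interval, c = Z · sqrt(p · (1 − p) / n), with Z = 1.96 and p = 0.5; the function names are illustrative.

```python
import math

def invitations_needed(required_responses, response_rate_percent):
    """Number of people to contact to expect the required responses."""
    return math.ceil(required_responses * 100 / response_rate_percent)

def confidence_interval(n, z=1.96, p=0.5):
    """Confidence interval (as a decimal) achieved with n respondents."""
    return z * math.sqrt(p * (1 - p) / n)

emails = invitations_needed(385, 10)
achieved = confidence_interval(800)
print(emails)              # 3850, i.e. about 4,000 emails
print(round(achieved, 4))  # 0.0346, i.e. about +/- 3.5%
```

With 800 answers the achieved interval is about ± 3.5%, which is why the interval can be reported as 4% instead of the 5% used for planning.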

Non-probability Sampling:
Convenience Sampling
• In SE research, it is also typical to have non-probability samples

• Specific expertise is normally required by the respondents (e.g., developers but also domain experts), and it may not be
straightforward to collect a sufficiently large sample, unless you work with GitHub or other networks.

• If you are sampling in a specific company (e.g., to make a survey in a multi-national company, in which the unit of
analysis is the employee) it is unlikely that you have access to the list of all employees

• If you are sampling the companies in a certain area (e.g., to make a survey on startups in Italy or in Tuscany, where the unit of
analysis is the company), it is again unlikely that you have access to the list of startups in the area

• Convenience sampling is often adopted: I gather information from all the people that I can contact through my social and
professional links; I collect relevant demographic information (e.g., age, number of years at company X, role, number of
years in a certain role) together with the responses; I check to which extent the demographic information is related to the
responses

• Often, surveys are performed at specific software engineering conferences, and may not reflect the reality—only
companies interested in research may participate, some sectors may not be covered at all

• It is more difficult to have surveys on different companies and performed online — an example will be given at the end of
the presentation

• In these cases you have to rely on personal contacts, that you personally have with companies, and that your colleagues
(other academics in other areas) have with other companies — still, some companies will never be reached

• Little, biased information is better than NO information at all, if the context is clearly explained

What to Ask? Depends on
the Unit of Analysis
• Individuals: experience in the research context,
experience in SE, current professional role, location
and highest academic degree, ...
• Project teams: team size, client/product domain
(avionics, finance, health, telecommunications, etc.)
and physical distribution, ...
• Organisations: size, industry segment, location, type
(government, private company, university, etc.), ...
Demographic information

What to Ask? Depends on
your Research Questions
• RQ1: Which are the most frequent requirements defects?

• RQ2: Which requirements defects are more difficult to identify?

• …

• Question: How frequently do you encounter these types of
requirements defects (Never, Seldom, Sometimes, Often, Very
Often): ambiguity, incompleteness, grammar error, etc.

• Question: How difficult is it to identify these types of defects (Very
Difficult, Moderately Difficult, Neither Easy Nor Difficult, Moderately
Easy, Very Easy): ambiguity, incompleteness, grammar error, etc.
To identify the types of defects, and the choices in general,
I need to refer to the literature, or to experts in the field
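
To see how such a question feeds back into RQ1, the answers can be coded on the ordinal scale and summarised per defect type. This is a minimal sketch with made-up answers; the function name, the data, and the choice of the median as summary are assumptions for illustration.

```python
from collections import Counter
from statistics import median_low

FREQUENCY_SCALE = ["Never", "Seldom", "Sometimes", "Often", "Very Often"]

def summarise(answers, scale=FREQUENCY_SCALE):
    """Distribution and median level of the answers for one defect type."""
    codes = [scale.index(answer) for answer in answers]
    counts = Counter(answers)
    return {
        "distribution": {level: counts[level] for level in scale},
        # median_low keeps the summary on an actual level of the scale
        "median": scale[median_low(codes)],
    }

# Made-up answers from four respondents.
responses = {
    "ambiguity": ["Often", "Very Often", "Sometimes", "Often"],
    "incompleteness": ["Sometimes", "Often", "Sometimes", "Seldom"],
    "grammar error": ["Never", "Seldom", "Seldom", "Sometimes"],
}
summary = {defect: summarise(a) for defect, a in responses.items()}
ranking = sorted(
    summary,
    key=lambda d: FREQUENCY_SCALE.index(summary[d]["median"]),
    reverse=True,
)
print(ranking)  # ['ambiguity', 'incompleteness', 'grammar error']
```

The median is used instead of the mean because the distance between neighbouring levels (e.g., Seldom and Sometimes) is not defined.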

What to Ask? Organise Focus
Groups and Interviews
• Sometimes it is useful to organise a focus group to identify the
relevant questions (or a draft for them, you will need more time to
revise the formulation…)

• Gather participants with different viewpoints, give them 5-10 minutes
to write a set of relevant questions on a piece of paper, ask them to
read them aloud, and brainstorm on the proposals

• Sometimes you can refer to the literature to identify your options (e.g.,
phases of a certain software process), or to experts' opinion

• If you are dealing with a somewhat unknown public—e.g., in a specific
domain—it may be useful to first interview people to identify
terminology and relevant questions, and then create the questionnaire

What to Ask? Types of Questions
• Personal factual questions: what is your role in the organisation? How many
years of experience do you have in your current role?
• Factual questions about others: how old are, on average, the developers in your
company?
• Informant factual questions: does your company employ external suppliers?
• Questions about attitudes: my job is typically interesting [Disagree…Agree]
(judgments)

• Questions about beliefs: incorrect requirements tend to result in code errors
[Never … Always] (attitudes and beliefs are different, use different Likert
scales!)

• Questions about normative standards and values: is it considered
appropriate to have casual dressing in your office?
• Questions about knowledge: which is the most common cause of software
project failure according to research? (rare, to check if the person is informed)

Qualities of a Questionnaire
• Clarity: Will respondents understand the questions? The researchers
may find that certain ambiguities exist that confuse respondents. Are
the response choices sufficiently clear to elicit the desired information?

• Comprehensiveness: Are the questions and response choices
sufficiently comprehensive to cover a reasonably complete range of
alternatives? The researchers may find that certain questions are
irrelevant, incomplete, or redundant and that the stated questions do
not generate all of the important information required for the study.

• Acceptability: Such potential problems as excessive questionnaire
length or questions that are perceived to invade the privacy of the
respondents, as well as those that may abridge ethical or moral
standards, must be identified and addressed by the researchers.

Structure of the
Questionnaire
• Introductory questions: easy to answer, demographic, NOT sensitive

• Sensitive/personal questions: only if needed, and only late in the questionnaire, after the
(virtual) rapport is established

• Related questions: group by topic

• Logical sequence: topics shall be logically connected

• Filter/Screening Questions: questions to qualify or disqualify respondents (to make
them eligible to respond to other questions, or evaluate their confidence)

• Nested Structures: try to avoid large blocks that are responded only by certain
participants —very hard to elaborate and compare afterwards

• Reliability Checks: reformulate and present questions that you consider particularly
relevant to be responded accurately (Do you like writing code? When thinking about
writing code you feel…)

Types of Questions
• Open-ended Questions: the respondent can write free
text (long or short)

• Close-ended Questions: set of alternatives; multiple
choice (with minimum and maximum choices), exclusive
choices, Likert Scale.

Open-ended vs Close-ended
Criterion | Open-ended | Close-ended
Allow usage of personal words | 🙂 | ☹
Unusual answers can be identified | 🙂 | 😐
Typically not leading | 🙂 | 😐
Useful to explore new areas | 🙂 | ☹
Time effective | ☹ | 🙂
Answers need to be coded | ☹ | 🙂
Clear answers | ☹ | 🙂
Easy to process | ☹ | 🙂
Compatible answers | ☹ | 🙂
Answers clarify questions | ☹ | 🙂
Spontaneous Answers | 🙂 | ☹
Exhaustive Answers | 🙂 | ☹
Different perception of scales | 🙂 | ☹

Formulating Questions: Tips
• Given a question, how would YOU answer it?

• Given a question, test it with peers (for initial draft)

• Pilot the set of questions with a group of respondents
from which you can get feedback (e.g., colleagues,
subjects from company)

• Remember that you may not know the terminology
typically used by your respondents, so you may have to
perform preliminary unstructured interviews to understand
the typical terminology

PILOT, PILOT, PILOT

• Avoid vague/ambiguous questions and answers:
• How often does your group have meetings? [Often…Never]
• How frequently does your group have meetings? [Once a day, Once per week, …]

• Avoid double negatives: Do you consider not appropriate to avoid testing?

• Avoid long questions: Which types of defects are typically encountered by developers whose
relevance is normally difficult to communicate to managers?
• Avoid general questions: What is the general, physical, intellectual, and moral condition of
men and women employed in your group?
• Avoid double-barrelled questions: How satisfied are you with the space and the colleagues?
What testing environment do you normally use? (there could be no testing environment in use)

• Avoid technical terms: What is the Six-sigma Maturity Level of your process?
• Prefer forced choice answers instead of “all that apply” (for each choice: YES, NO)

What Types of Responses?
Questionnaire Design
• Free-text: open questions; allow coding; content analysis; high effort on data
analysis

• Numeric values: open questions; allow a wide range of statistical analysis

• Interval Scale: closed questions; not necessarily equally distributed intervals;
significantly restricts statistical analysis

• Ordinal/Likert scale: intervals are considered equally distributed; statistical
analysis is less restrictive than Interval Scale

• Nominal: statistical analysis based on frequency
Response Formats: Examples
Questionnaire Design
• Ordinal/Likert scale: How much experience do you have in Java programming?
a) Very high experience
b) High experience
c) Little experience
d) Very little experience

• Interval scale: How much experience do you have in Java programming?
a) Less than one year
b) 1 year to 3 years
c) 3 years to 5 years
d) More than 5 years

• Numeric value: How much experience do you have in Java programming?
__5__ years

• Free-text: How much experience do you have in Java programming?
I have been working with Java programming at
companies since 2011. Before, I got my first
Java certification in 2009, when I started
working on personal projects. But I have
difficulty with object-oriented parts…_________

• Nominal: Do you have experience in Java programming?
( ) Yes ( ) No

Tip: Standardised Answers
• When possible, use statements and standardised Likert-
scale answers indicating agreement (more answers can
be gathered):

• Strongly Agree, Agree, Disagree, Strongly Disagree

Not Just Questions…
• The questionnaire must be accompanied by various administrative information
including:

• An explanation of the purpose of the study.

• A description of who is sponsoring the study (and perhaps why).

• A cover letter using letterhead paper, dated to be consistent with the mail shot

• Provide a contact name and phone number. Personalize the salutation if possible.

• An explanation of how the respondents were chosen and why.

• An explanation of how to return the questionnaire.

• A realistic estimate of the time required to complete the questionnaire. Note that an
unrealistic estimate will be counter-productive.
And privacy issues (later)

Tips for a
Successful Survey

Recruiting
• Send individual but standard invitation messages

• It is expected that the great majority of the individual messages sent will be read

• Avoid a "spreading spree": mailing lists, forum invitation messages, crowdsourcing
tools (such as Amazon Mechanical Turk)

• You will have little or no control over who reads the invitation. So, who was effectively
recruited?

• Never allow forwarding (which is different from snowballing)! It will violate the
sample

• Send an individual questionnaire token to each subject

• Establish a finite and short period to answer the survey (one to two weeks)
• Offer rewards (raffles, donations, payments, sharing results)

Reminding
• Reminders should be used with care.

• Avoid reminding those who have already participated

• Avoid reminding more than once

• The invitation message should clearly characterize the
involved researchers, the research context and present the
recruitment parameters

• Include in the invitation message a compliment and an
observation regarding the relevance of subject participation

Piloting
• Pilot the population and sampling activities

• Use a (smaller) sample of the sampling frame, reproducing all planned steps.
This will allow you to check the adequacy of the frame population to your survey.

• Pilot the questionnaire

• Is it clear, unambiguous, did you maybe miss some questions?

• Is it too long/too short?

• Pilot the recruitment

• Is it working effectively?

• Pilot the data analysis

• Have you planned for the proper data analysis techniques? What is the
necessary data quantity and quality?

Privacy Policy and
General Data Protection
Regulation (GDPR)
cf. https://www.slideshare.net/alanmcsweeney/gdpr-context-principles-implementation-operation-impact-on-outsourcing-data-governance-and-data-ethics

General Data Protection Regulation
• General Data Protection Regulation (GDPR) applies to any task dealing with
personal data (not just research surveys)
• Personal Data: means any information relating to an identified or identifiable
natural person ('data subject'); an identifiable natural person is one who can
be identified, directly or indirectly, in particular by reference to an identifier
such as a name, an identification number, location data, an online identifier or
to one or more factors specific to the physical, physiological, genetic, mental,
economic, cultural or social identity of that natural person
• If you distribute your surveys anonymously and you do not process
personal data, you can disregard the GDPR. But, be careful, the GDPR has
an extremely broad view of what personal data is (basically, most
demographic data are personal)! 
• If you use contacts or ask for an email address, name or any other personal
data in your surveys, then make sure to read the GDPR, as it imposes a
number of responsibilities on you.
Any individual who can be distinguished from others is considered identifiable.
If you want to ensure that one person answers
one form only, you have to identify them!

General Data Protection Regulation
• If you are creating forms or surveys for a business which is based in the
European Union (EU), or if you collect and process the personal data of
individuals in the EU, the General Data Protection Regulation (GDPR) affects you.

• The GDPR (General Data Protection Regulation) law basically says
that:

• you must obtain freely given, specific, informed, and unambiguous
consent from your respondents when you collect their personal data.
In other words, you shall not force people to respond to or fill out
your surveys or forms, or somehow trick them to collect their
personal data.

• Additionally, you must explain how you plan to use their personal data, in
a clear and easy to understand way.

• Also, as individuals have the right to be forgotten, you must delete
information that you have collected from them if they request.

Privacy Policy: Content (1)
• What you collect and how
• In your text, explain what type of personal data you are collecting and how. Is it the respondents’ email,
name, or IP address? Is it simply by asking them questions,
or are you collecting data automatically (for example their geo-location or IP address)?

• Why you collect
• Your privacy policy text must clarify your reasons for collecting personal data. Explain for instance why
you need their email.  
Do you have good reasons for collecting their name or address?

• How will you use their data
• Are you going to share it with third parties? In that case, say who these 3rd parties are and why you
need to share their data with them.  
If you ask for their contact info for instance, are you going to use it to contact them, or send them
something?

• How long will you keep their data
• The GDPR requires you to define a so-called “data retention” period when you collect personal data.
Thus your privacy policy text should explain how long you will retain the data.

Privacy Policy: Content (2)
• How secure is the data in your possession
• Your privacy policy must also explain what security measures are applied when you collect, export, share, and
store personal data of your respondents, what tools you are using, and whether your data processors are also taking the
security of the data seriously.

• Clarify your respondents’ rights
• The GDPR clearly defines individuals’ rights over their own data. You must also make sure to reflect these rights in
your privacy policy text, and inform your respondents about their rights, which are as follows:

• Right to access, view, and edit their own information in a timely manner

• Right to be forgotten, which means being deleted from your survey results

• Right to opt out from your future messages (e.g., if you use their data to send them ads or
marketing messages)

• Keep in mind that data is owned by the respondents, not you or your company or organization.

• Who to contact
• Organizations that process personal data of individuals in the EU may be required to appoint a Data Protection Officer (DPO), for example public authorities. The DPO is a
person in the organization who can represent the organization with respect to data and privacy issues. Including the
DPO’s contact information in your privacy policy helps your respondents, in case they need to ask
questions or exercise their rights.
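
Several of the obligations above (consent, data retention, right to be forgotten) translate into simple operations on the collected data. The class below is a hypothetical in-memory sketch meant only to show those operations; the class name, its methods, and the storage layout are assumptions, and real survey data would live in a survey tool or a database.

```python
from datetime import date

class ResponseStore:
    """Toy store for survey answers that contain personal data."""

    def __init__(self, retention_until):
        self.retention_until = retention_until  # declared retention period
        self._records = {}

    def add(self, respondent_id, answers, consent_given):
        # Personal data is stored only with explicit consent.
        if not consent_given:
            raise ValueError("consent is required before storing personal data")
        self._records[respondent_id] = answers

    def forget(self, respondent_id):
        """Right to be forgotten: delete one respondent's answers."""
        return self._records.pop(respondent_id, None) is not None

    def purge_if_expired(self, today):
        """Delete everything once the retention period is over."""
        if today > self.retention_until:
            self._records.clear()

    def __len__(self):
        return len(self._records)

store = ResponseStore(retention_until=date(2020, 12, 31))
store.add("r1", {"email": "ada@example.org", "q1": "Often"}, consent_given=True)
store.add("r2", {"email": "bob@example.org", "q1": "Never"}, consent_given=True)
store.forget("r1")                       # r1 asked to be deleted
remaining = len(store)                   # 1
store.purge_if_expired(date(2021, 1, 1))
print(remaining, len(store))             # 1 0
```

Keeping answers keyed by a respondent identifier is what makes deletion on request possible; fully anonymous answers could not be traced back and removed.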

Example: Privacy Notice
What to write in your survey entry page (with a link to the policy)
• Why and How: We want to understand the typical problems of SE students.
For this, we need your contribution with this survey.
The survey takes 5 to 10 minutes to complete.

• Transparency: Together with your opinion, we will also ask for personal data,
such as your email address, to ask you follow-up questions

• Data Retention: We securely store this data until the end of 2020

• Share or Sale of Data: We respect your privacy and therefore we will not share
your data with any third party

• Link to Policy: By filling in this form, you agree that we will process
your data according to our privacy policy

• Contact Person: If you have any questions regarding your data, contact
our data protection officer: Mr. John Doe, j.doe@survey.com

Threats to Validity in
Survey Research

Reliability and Validity
• Reliability and Validity are the two main criteria used in
survey research to evaluate threats to validity

• Reliability is concerned with how well we can reproduce
the survey data, as well as the extent of measurement
error. That is, a survey is reliable if we get the same kinds
and distribution of answers when we administer the
survey to two similar groups of respondents.

• Validity is concerned with how well the instrument
measures what it is supposed to measure.
Focus groups and pilot tests shall be performed
to ensure reliability and validity

Reliability Types
• Test-retest (intra-observer) Reliability: how likely is it that the person responds
in the same way if surveyed twice?

• How to ensure: during the pilot, survey twice; if the correlation is greater than 0.7,
reliability is good; for some questions, include alternate forms, and ensure that
Cronbach’s alpha is greater than 0.7
• Inter-rater Reliability: to what extent do different observers give similar answers
when they assess the same situation? (not so common)

• How to ensure: use two pilots with different samples, and check the correlation
between the distributions of answers

• Inter-coder Reliability: (in case of open questions) how reliable is the coding
procedure?

• How to ensure: two coders, joint selection of a master code list, and
application of the master codes to the data; check agreement with
Krippendorff’s alpha
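
The three checks above can be sketched in a few lines of Python. This is a minimal illustration, not a validated statistics package: the function names and the pilot data are hypothetical, the Krippendorff's alpha shown here covers only the simplest case (two coders, nominal codes, no missing values), and the 0.7 threshold is the rule of thumb quoted above.

```python
from collections import Counter
from math import sqrt
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cronbach_alpha(items):
    """Cronbach's alpha. `items` holds one list of scores per question,
    with respondents in the same order in every list."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))


def krippendorff_alpha_nominal(coder_a, coder_b):
    """Krippendorff's alpha for two coders, nominal codes, no missing data.
    Undefined (division by zero) if only one code is ever used."""
    n = 2 * len(coder_a)  # total number of assigned codes
    # Each disagreeing unit adds two off-diagonal coincidences.
    observed = 2 * sum(a != b for a, b in zip(coder_a, coder_b))
    counts = Counter(coder_a) + Counter(coder_b)
    expected = sum(counts[c] * counts[k]
                   for c in counts for k in counts if c != k)
    return 1 - (n - 1) * observed / expected


# Hypothetical pilot data: five respondents, answers on a 1-5 scale.
first_round = [4, 5, 3, 2, 4]
second_round = [4, 4, 3, 2, 5]
print("test-retest ok:", pearson(first_round, second_round) > 0.7)

# Hypothetical alternate forms of the same question (one list per form).
forms = [[4, 5, 3, 2, 4], [5, 5, 3, 1, 4], [4, 4, 2, 2, 5]]
print("Cronbach's alpha:", round(cronbach_alpha(forms), 3))

# Hypothetical codes assigned by two coders to six open answers.
codes_a = ["time", "tools", "time", "team", "tools", "time"]
codes_b = ["time", "tools", "team", "team", "tools", "time"]
print("Krippendorff's alpha:",
      round(krippendorff_alpha_nominal(codes_a, codes_b), 3))
```

For ordinal or interval codes, more than two coders, or missing answers, a dedicated statistics library is the safer choice.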

Validity Types
• Content Validity: how appropriate does the instrument seem
to a group of reviewers (i.e., a focus group) with
knowledge of the subject matter?

• How to ensure: perform a focus group

• Construct Validity: to what extent are the constructs
related to the measured variables?

• How to ensure: provide sound arguments that show
the relationship between constructs and questions
Other types of validity shall be considered when the survey is repeated
Barbara A. Kitchenham and Shari L. Pfleeger, 2008, https://doi.org/10.1007/978-1-84800-044-5_3

Example Survey in SE: Napire
(Naming the Pain in
Requirements Engineering)
Contemporary Problems, Causes, and Effects in Practice

cf. http://re-survey.org/#/explore
cf. Mendez Fernandez et al. https://arxiv.org/pdf/1611.10288.pdf

Napire
3.1 Research Questions
Our objective is to get a better understanding of which problems practitioners
encounter in RE, and how those problems relate to the overall project setting
(causes and problems). To this end, we formulate three research questions, shown
in Table 2, to steer the design of our study.
Table 2 Research questions.
RQ 1 Which contemporary problems exist in RE?
RQ 2 What are observable patterns of problems and context characteristics?
RQ 3 What are their perceived causes and effects?
The first question aims at understanding which problems practitioners experience
in general in their RE and what their criticality is w.r.t. project failure. This
more descriptive view is complemented by the second research question, which
aims at understanding whether there exist problems that relate to specific context
factors, such as the company size or the type of used process model. Once we
understand whether there exist specific patterns in the problems, we want to know
what their perceived causes and implications are going beyond project failure.
3.2 Instrument
The overall instrument used in NaPiRE constitutes in total 35 questions used to
collect data on (a) the demographics, (b) how practitioners elicit and document
requirements, (c) how requirements are changed and aligned with tests, (d) what
and how RE standards are applied and tailored, (e) how RE is improved, and
finally (f) what problems practitioners experience in their RE. In the study at
hands, we focus on the problems practitioners experience in their RE while using
Table 3 Questions (simplified and condensed excerpt).
Parts | No. | Question | Type
Demographics | Q 1 | What is the size of your company? | Closed(SC)
 | Q 2 | Please describe the main business area and application domain. | Open
 | Q 3 | Does your company participate in globally distributed projects? | Closed(SC)
 | Q 4 | In which country are you personally located? | Open
 | Q 5 | To which project role are you most frequently assigned? | Closed(SC)
 | Q 6 | How do you rate your experience in this role? | Closed(SC)
 | Q 7 | Which organisational role does your company take most frequently in your projects? | Closed(MC)
 | Q 8 | Which process model do you follow (or a variation of it)? | Closed(MC)
Status Quo | Q 9 | How do you elicit requirements? | Closed(MC)
 | Q 10 | How do you document functional requirements? | Closed(SC)
 | Q 11 | How do you document non-functional requirements? | Closed(SC)
 | Q 12 | How do you deal with changing requirements after the initial release? | Closed(SC)
 | ... | ... | ...
 | Q 16 | What requirements engineering company standard have you established at your company? | Closed(MC)
 | ... | ... | ...
Problems | Q 28 | Considering your personal experiences, how do the following (more general) problems in requirements engineering apply to your projects? | Likert
 | Q 29 | Considering your personally experienced problems (stated in the previous question), which ones would you classify as the five most critical ones (ordered by their relevance). | Closed
 | Q 30 | Considering your personally experienced most critical problems (selected in the previous question), which causes do they have? | Open
 | ... | ... problems (selected in the previous question), which implications do they have? | Open
 | ... | ... problems (selected in the previous question), which mitigations do you define (if at all)? | Open
 | Q 33 | Considering your personally experienced most critical ... | Closed(MC)
Research Questions
Questions
(Example)

Results
To analyse the influence of the most cited causes on the most cited problems
and, in turn, of those problems to project failure (as reported by the survey
respondents), we visualise the relationships via an alluvial diagram. This diagram is
shown in Figure 3. The decision to relate only the most cited causes to the most
cited RE problems was taken to enhance the visualisation.
Causes:
• Communication flaws between project team and the customer
• Customer does not know what he wants
• Lack of a well-defined RE process
• Lack of experience of RE team members
• Lack of time
• Missing direct communication to customer
• Requirements remain too abstract
• Too high team distribution
• Unclear roles and responsibilities at customer side
• Weak qualification of RE team members
Problems:
• Communication flaws between project team and the customer
• Communication flaws within the project team
• Incomplete and / or hidden requirements
• Inconsistent requirements
• Insufficient support by customer
• Moving targets (changing goals, business processes and / or requirements)
• Stakeholders with difficulties in separating requirements from previously known solution designs
• Time boxing / Not enough time in general
• Underspecified requirements that are too abstract and allow for various interpretations
• Weak access to customer needs and / or (internal) business information
Project impact:
• Project Completed
• Project Failed
Fig. 3 Relation of top 10 causes, top 10 problems, and the project impact.
causes vs problems

Summary
• Surveys are a hybrid method between qualitative and
quantitative research

• Sampling is crucial to have good data

• Piloting is crucial (you have one shot only)

• Clarity of questions and time to answer are key

• Don’t forget about privacy issues
