Context: Small and non-probabilistic samples are a relevant issue when discussing the external validity of empirical studies in Software Engineering. Goal: To investigate alternatives for improving the quality of samples (size, heterogeneity and level of confidence). Method: We replicated a survey on characteristics of agility in software processes, applying a systematic recruitment strategy over a professional social network. Results: The strategy produced a sampling frame composed of 19 groups, stratified according to two perspectives: the sharing of members among groups and the main software engineering skills reported by the subjects. In total, 7,745 subjects were randomly recruited, resulting in 291 contributions. Conclusions: This sample was significantly larger and more heterogeneous than the samples from previous trials, and some of its strata present higher confidence levels.
214 - Sampling Improvement in Software Engineering Surveys
1. Sampling Improvement in Software Engineering Surveys
Rafael Maiani de Mello
rmaiani@cos.ufrj.br
Pedro Correa da Silva
pedrorez@poli.ufrj.br
Guilherme Horta Travassos
ght@cos.ufrj.br
ese.cos.ufrj.br
2. 2
Motivation
• Small and non-probabilistic samples usually:
• reduce external validity;
• make replication difficult;
• limit possibilities of aggregation; and
• hamper the evaluation of SE technologies.
• In particular, the results of SE surveys are affected
when inadequate samples are used.
3. 3
Context
• Survey on Agility Characteristics and Practices in
Software Processes
– 2 previous trials
– 158 invited subjects (P1)
– 25 participants (S1)
– Only 7 participants declared high or very high
experience in applying agile approaches in software
projects
4. 4
Recruitment Strategy (RS)
Search Question
“Who are the groups from LinkedIn interested in Agility
characteristics and practices concerned with Software
Engineering?”
5. 5
RS Execution
• 289 distinct groups were selected; 62 groups were
included after analysis.
Exclusion Criteria                     #     % of Excluded Groups
Local groups                           97    42.73%
Organizations, publicity and events    66    29.07%
Out of scope                           33    14.54%
Vague description                      25    11.01%
Single member groups                   18    7.93%
Headhunting and job offering groups    8     3.52%
LinkedIn subgroups                     3     1.32%
Non-English                            1     0.44%
Total of Excluded Groups               227   78.55% (of the 289 selected groups)
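The percentages in the exclusion table can be checked with a few lines of arithmetic. The sketch below only reuses the counts reported on the slide; the variable and function names are illustrative. It shows that each row is a share of the 227 excluded groups, while the total row is a share of the 289 selected groups.

```python
# Counts copied from the exclusion table on this slide.
exclusions = {
    "Local groups": 97,
    "Organizations, publicity and events": 66,
    "Out of scope": 33,
    "Vague description": 25,
    "Single member groups": 18,
    "Headhunting and job offering groups": 8,
    "LinkedIn subgroups": 3,
    "Non-English": 1,
}
SELECTED = 289                   # distinct groups returned by the search
EXCLUDED = 227                   # groups removed after analysis
INCLUDED = SELECTED - EXCLUDED   # groups kept in the sampling frame


def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100 * part / whole, 2)


# Each row is expressed relative to the excluded groups.
row_pct = {name: pct(count, EXCLUDED) for name, count in exclusions.items()}
# The total row is expressed relative to all selected groups.
total_pct = pct(EXCLUDED, SELECTED)
# Number of (group, criterion) matches across all rows.
criteria_hits = sum(exclusions.values())
```

The row counts add up to 251, which is more than the 227 excluded groups. This suggests that some groups matched more than one exclusion criterion, which would also explain why the row percentages sum to more than 100%.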
6. 6
Stratification Based on the
Overlapping Rates
[Figure: overlapping Groups A, B, C and D, with the intersection labeled "Shared Members"]
7. 7
Overlapping Matrix
     A        B        C        D        E        F        H        I
A    100.00%  4.22%    25.95%   27.60%   25.42%   3.91%    4.35%    3.77%
B    3.48%    100.00%  3.43%    4.00%    2.25%    24.19%   3.01%    2.33%
C    20.04%   3.21%    100.00%  18.65%   20.13%   2.97%    2.62%    2.20%
D    15.49%   2.73%    13.56%   100.00%  16.38%   2.71%    2.35%    2.39%
E    11.32%   1.21%    11.61%   12.99%   100.00%  1.10%    1.52%    1.23%
F    1.70%    12.72%   1.66%    2.09%    1.07%    100.00%  1.62%    1.91%
H    0.90%    0.75%    0.70%    0.87%    0.71%    0.77%    100.00%  45.66%
I    0.64%    0.48%    0.49%    0.73%    0.47%    0.76%    37.73%   100.00%
...
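A minimal sketch of how such an overlapping matrix could be computed from group membership lists. The slides do not state which group size is used as the denominator, so the code assumes that cell (row, column) is the share of the row group's members who also belong to the column group. The function name `overlap_matrix` and the toy member IDs are hypothetical.

```python
def overlap_matrix(groups):
    """Return {row: {col: percent}} for a dict mapping group name -> set of member IDs.

    Assumption (not confirmed by the slides): each cell is
    100 * |row ∩ col| / |row|, so the matrix is asymmetric whenever
    the two groups differ in size.
    """
    names = sorted(groups)
    matrix = {}
    for row in names:
        if not groups[row]:
            raise ValueError(f"group {row!r} has no members")
        matrix[row] = {}
        for col in names:
            shared = len(groups[row] & groups[col])
            matrix[row][col] = 100.0 * shared / len(groups[row])
    return matrix


# Toy data: A and B share two members, C is disjoint from both.
toy_groups = {
    "A": {1, 2, 3, 4},
    "B": {3, 4, 5, 6, 7, 8, 9, 10},
    "C": {11, 12},
}
toy_matrix = overlap_matrix(toy_groups)
```

Under this assumption the diagonal is always 100%, and the same pair of groups yields two different rates, one per direction, which matches the asymmetric shape of the matrix shown on the slide.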
14. 14
New Strata- St1
• “Agilists”
• Composed of agility groups
• Personal Skills, Social Skills, SW Analysis and Design
15. 15
New Strata- St2
• Testing Professionals
• Composed mainly of LinkedIn groups devoted to
Software Testing
• Software Testing is the most relevant skill group
16. 16
New Strata- St3
• Programmers
• Composed mainly of LinkedIn groups devoted to
agile practices
• Programming is the most relevant skill group
17. 17
New Strata- St4
• Configuration Managers
• Composed of three LinkedIn groups concerned with
Configuration Management (CM)
• CM is the most relevant skill group, closely followed
by Programming and Personal Skills
18. 18
New Strata- St5
• “System Analysts”
• Composed of a single LinkedIn group devoted to
software architecture
• Main skill groups: personal skills and SW analysis and
design
19. 19
Hypothesis Testing
Heterogeneity
• S2 is more heterogeneous than S1
Region          S1 (12 subjects, 9 countries)   S2 (289 subjects, 43 countries)
USA+Canada      41.7%                           38.1%
Europe          33.3%                           41.2%
Asia            25%                             11.8%
Latin America   -                               5.9%
Oceania         16.7%                           2.1%
Africa          -                               0.1%
20. 20
Confidence Level
• S1 and S2 have similar confidence levels
Sample   Size    Mean    Normal (KS)   Student t-test   Mann-Whitney test
S1       25(1)   0.375   Yes           -                -
S2       291     0.418   Yes           0.055            -
S2-St1   97      0.460   Yes           0.011(0)         -
S2-St2   81(3)   0.382   Yes           0.424(3)         -
S2-St3   57(7)   0.470   Yes           0.003(7)         -
S2-St4   24      0.333   No            -                0.415(0)
S2-St5   31      0.403   No            -                0.171(0)
21. 21
Conclusion
• This study and previous studies suggest that we can
improve sample quality by following a systematic
sampling approach
• It is feasible to better characterize the subject profile
through open and simple questions
• We found evidence of heterogeneity among members
of the same group on social networks
– However, in the big picture, there is a trend!