Sampling Improvement in Software 
Engineering Surveys 
Rafael Maiani de Mello 
rmaiani@cos.ufrj.br 
Pedro Correa da Silva 
pedrorez@poli.ufrj.br 
Guilherme Horta Travassos 
ght@cos.ufrj.br 
ese.cos.ufrj.br
2 
Motivation 
• Small and non-probabilistic samples usually: 
• reduce external validity; 
• make replication difficult; 
• limit the possibilities for aggregation; and 
• hamper the evaluation of SE technologies. 
• In particular, the results of SE surveys are affected 
when inadequate samples are used.
3 
Context 
• Survey on Agility Characteristics and Practices in 
Software Processes 
– 2 previous trials 
– 158 invited subjects (P1) 
– 25 participants (S1) 
– Only 7 participants declared high or very high 
experience in applying agile approaches in software 
projects
4 
Recruitment Strategy (RS) 
Search Question 
“Who are the groups from LinkedIn interested in Agility 
characteristics and practices concerned with Software 
Engineering?”
5 
RS Execution 
• 289 distinct groups were selected; 62 groups were 
included after analysis. 
Exclusion Criteria                     #     % of Excluded Groups
Local Groups                           97    42.73%
Organizations, publicity and events    66    29.07%
Out of scope                           33    14.54%
Vague description                      25    11.01%
Single member groups                   18    7.93%
Headhunting and job offering groups    8     3.52%
LinkedIn subgroups                     3     1.32%
Non-English                            1     0.44%
Total of Excluded Groups               227   78.55% (of the 289 selected groups)
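The percentages in the table can be re-derived from the counts. The sketch below assumes, since the slides do not state it, that each criterion is expressed relative to the 227 excluded groups and that the total row is relative to the 289 selected groups. Under that reading the figures match, and the per-criterion counts add up to more than 227, which suggests that a group could meet more than one exclusion criterion. The names `SELECTED`, `EXCLUDED`, `COUNTS` and `pct` are illustrative only.

```python
# Counts copied from the exclusion table; variable names are illustrative.
SELECTED = 289
EXCLUDED = 227
COUNTS = {
    "Local Groups": 97,
    "Organizations, publicity and events": 66,
    "Out of scope": 33,
    "Vague description": 25,
    "Single member groups": 18,
    "Headhunting and job offering groups": 8,
    "LinkedIn subgroups": 3,
    "Non-English": 1,
}


def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# Assumption: per-criterion shares are relative to the excluded groups.
shares = {criterion: pct(n, EXCLUDED) for criterion, n in COUNTS.items()}
# Assumption: the total row is relative to all selected groups.
total_share = pct(EXCLUDED, SELECTED)
included = SELECTED - EXCLUDED
# Exclusions counted more than once (groups matching several criteria).
multiple_matches = sum(COUNTS.values()) - EXCLUDED

for criterion, share in shares.items():
    print(f"{criterion}: {share}%")
print(f"Excluded: {total_share}% of selected, included: {included}")
print(f"Criterion matches beyond one per group: {multiple_matches}")
```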
6 
Stratification Based on the 
Overlapping Rates 
[Figure: Groups A, B, C and D and their shared members]
7 
Overlapping Matrix 
     A        B        C        D        E        F        H        I
A    100.00%  4.22%    25.95%   27.60%   25.42%   3.91%    4.35%    3.77%
B    3.48%    100.00%  3.43%    4.00%    2.25%    24.19%   3.01%    2.33%
C    20.04%   3.21%    100.00%  18.65%   20.13%   2.97%    2.62%    2.20%
D    15.49%   2.73%    13.56%   100.00%  16.38%   2.71%    2.35%    2.39%
E    11.32%   1.21%    11.61%   12.99%   100.00%  1.10%    1.52%    1.23%
F    1.70%    12.72%   1.66%    2.09%    1.07%    100.00%  1.62%    1.91%
H    0.90%    0.75%    0.70%    0.87%    0.71%    0.77%    100.00%  45.66%
I    0.64%    0.48%    0.49%    0.73%    0.47%    0.76%    37.73%   100.00%
...  ...
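A matrix of this kind can be sketched from group membership sets. The slides do not state which denominator is used in each cell or how overlap rates are turned into strata, so the code below makes two assumptions: a cell is the share of the row group's members who also belong to the column group, and groups are merged into one candidate stratum when the rate reaches a threshold in either direction. The functions `overlap_rates` and `candidate_strata`, the threshold value and the member IDs are all hypothetical.

```python
from itertools import combinations


def overlap_rates(groups):
    """Percentage of each row group's members who also belong to each column group.

    Assumes every group has at least one member.
    """
    return {
        a: {
            b: 100.0 * len(members_a & members_b) / len(members_a)
            for b, members_b in groups.items()
        }
        for a, members_a in groups.items()
    }


def candidate_strata(rates, threshold):
    """Merge groups whose overlap rate reaches the threshold in either direction."""
    parent = {name: name for name in rates}

    def find(name):
        # Follow parent links until the representative of the component is found.
        while parent[name] != name:
            name = parent[name]
        return name

    for a, b in combinations(rates, 2):
        if max(rates[a][b], rates[b][a]) >= threshold:
            parent[find(a)] = find(b)

    strata = {}
    for name in rates:
        strata.setdefault(find(name), []).append(name)
    return sorted(strata.values())


# Hypothetical toy memberships; member IDs are invented for illustration.
groups = {
    "A": {1, 2, 3, 4},
    "B": {3, 4, 5, 6, 7, 8, 9, 10},
    "C": {11, 12},
}
rates = overlap_rates(groups)
print(rates["A"]["B"], rates["B"]["A"])  # asymmetric, like the matrix above
print(candidate_strata(rates, threshold=20.0))
```

The asymmetry comes from the differing group sizes: the same two shared members are half of group A but only a quarter of group B.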
8 
Stratified Sampling 
Recruitment and Effective Sample Size 
Strata  Name                      #Distinct Members  Sample Size  Respondents  CI for CL=95%
E1      Agility                   114,827            1,031        57           12.98%
E2      Project Management        5,488              874          40           15.44%
E3      Agile Practices 1         11,633             955          56           13.06%
E4      Agile Practices 2         3,864              820          35           16.49%
E5      Software Testing 1        56,400             1,021        26           19.22%
E6      Software Testing 2        5,791              882          22           20.86%
E7      Configuration Management  17,234             981          23           20.88%
E8      SW Architecture           7,335              911          31           17.57%
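The slides do not say how the last column was computed. One plausible reading, offered here only as a sketch, is the margin of error for a proportion at a 95% confidence level, using the most conservative proportion (p = 0.5) and a finite population correction. Under those assumptions the function below gives 12.98% for E1 and 15.44% for E2, in line with the table. The name `margin_of_error` and its parameters are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(population, respondents, confidence=0.95, p=0.5):
    """Margin of error for a proportion, with finite population correction.

    Assumptions (not stated in the slides): normal approximation,
    worst-case proportion p = 0.5, simple random sampling within the stratum.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    standard_error = sqrt(p * (1 - p) / respondents)
    correction = sqrt((population - respondents) / (population - 1))
    return z * standard_error * correction


# Population and respondent counts copied from the table for E1 and E2.
for name, population, respondents in [("E1", 114_827, 57), ("E2", 5_488, 40)]:
    print(f"{name}: {margin_of_error(population, respondents):.2%}")
```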
9 
Skill Analysis 
“What come into your mind when you think about your 
five main skills in software engineering?” 
3.49%
10 
Skill Analysis 
• 277 participants answered (95.19%) 
• 1,320 reported skills 
• 325 coded skills 
• 88 skill groups 
11 
Skill Group               Skill Examples                              %
Personal Skill            Creativity, Detailing, Learning, Planning   10.56%
Programming               Algorithms, Programming Languages           8.80%
SW Analysis and Design    OO Design, Design Patterns                  8.25%
Social Skill              Communication, Leadership                   7.78%
SW Testing                Testing, Debugging                          7.71%
Thinking and Reasoning    Abstraction, Analytical Thinking            6.24%
Agile Practices           Refactoring, TDD                            5.05%
Agile Characteristic      Adaptability, Being Collaborative           5.00%
SW Requirements           Req. Analysis, Requirements Elicitation     4.52%
SW Quality                Quality, Quality Assurance                  3.65%
SW Architecture           SW Architecture                             3.63%
Problem Solving           Problem Solving                             3.31%
Agile Methods             Kanban, Scrum, XP                           2.71%
Business Analysis         Business understanding, Business Analysis   2.66%
Project Management        Project Management                          2.21%
Technical Expertise       Technical Knowledge                         2.06%
Configuration Management  Change Management, Release Management       2.01%
Agile                     Agile coaching, Agile thinking, Agility     1.91%
SW Development Process    SW Process Improvement, SW Dev. Life-Cycle  1.27%
SW Development            Development, SW Development                 1.12%
12 
Skill Distribution by Strata 
Stratum  Personal Skill  Programming  SW Analysis and Design  Social Skill  SW Testing
E1       16.34%          7.48%        11.81%                  22.31%        5.30%
E2       9.81%           10.62%       16.87%                  16.73%        5.71%
E3       12.03%          8.46%        6.94%                   20.49%        17.40%
E4       9.25%           20.56%       14.10%                  14.81%        9.13%
E5       11.77%          2.83%        6.85%                   14.80%        34.36%
E6       9.13%           22.01%       9.21%                   3.88%         20.24%
E7       12.48%          14.36%       12.50%                  0.00%         2.93%
E8       19.20%          13.67%       21.72%                  6.98%         4.93%
13 
Skill Distribution 
Similarity Analysis
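The slides do not name the similarity measure used to compare the strata. As one possible illustration, the sketch below applies cosine similarity to the skill shares from the "Skill Distribution by Strata" table, restricted to the five skill groups shown there. The choice of measure and the names `SKILL_SHARES`, `cosine_similarity` and `ranked_pairs` are assumptions, not the authors' method.

```python
from itertools import combinations
from math import sqrt

# Shares (%) per stratum, in the order: Personal Skill, Programming,
# SW Analysis and Design, Social Skill, SW Testing (copied from the table).
SKILL_SHARES = {
    "E1": [16.34, 7.48, 11.81, 22.31, 5.30],
    "E2": [9.81, 10.62, 16.87, 16.73, 5.71],
    "E3": [12.03, 8.46, 6.94, 20.49, 17.40],
    "E4": [9.25, 20.56, 14.10, 14.81, 9.13],
    "E5": [11.77, 2.83, 6.85, 14.80, 34.36],
    "E6": [9.13, 22.01, 9.21, 3.88, 20.24],
    "E7": [12.48, 14.36, 12.50, 0.00, 2.93],
    "E8": [19.20, 13.67, 21.72, 6.98, 4.93],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two skill-share vectors (1.0 = same profile)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ranked_pairs(shares):
    """All stratum pairs, most similar first."""
    pairs = [
        (cosine_similarity(shares[a], shares[b]), a, b)
        for a, b in combinations(shares, 2)
    ]
    return sorted(pairs, reverse=True)


for similarity, a, b in ranked_pairs(SKILL_SHARES):
    print(f"{a}-{b}: {similarity:.3f}")
```

Pairs near the top of the ranking are candidates for merging into a common stratum; any cut-off would be a further assumption.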
14 
New Stratum - St1 
• “Agilists” 
• Composed of agility groups 
• Personal Skills, Social Skills, SW Analysis and Design
15 
New Stratum - St2 
• Testing Professionals 
• Composed mainly of LinkedIn groups devoted to 
Software Testing 
• Software Testing is the most relevant skill group
16 
New Stratum - St3 
• Programmers 
• Composed mainly of LinkedIn groups devoted to 
agile practices 
• Programming is the most relevant skill group
17 
New Stratum - St4 
• Configuration Managers 
• Composed of three LinkedIn groups concerned with 
Configuration Management (CM) 
• CM is the most relevant skill group, closely followed 
by Programming and Personal Skills
18 
New Stratum - St5 
• “System Analysts” 
• Composed of a single LinkedIn group devoted to 
software architecture 
• Main skill groups: Personal Skills and SW Analysis and 
Design
19 
Hypothesis Testing 
Heterogeneity 
• S2 is more heterogeneous than S1 
Region         S1 (12 subjects, 9 countries)  S2 (289 subjects, 43 countries)
USA+Canada     41.7%                          38.1%
Europe         33.3%                          41.2%
Asia           25%                            11.8%
Latin America  -                              5.9%
Oceania        16.7%                          2.1%
Africa         -                              0.1%
20 
Confidence Level 
• S1 and S2 have similar confidence levels 
Sample  Size   Mean   Normal (KS)  Student t-Test  Mann-Whitney test
S1      25(1)  0.375  Yes          -               -
S2      291    0.418  Yes          0.055           -
S2-St1  97     0.460  Yes          0.011(0)        -
S2-St2  81(3)  0.382  Yes          0.424(3)        -
S2-St3  57(7)  0.470  Yes          0.003(7)        -
S2-St4  24     0.333  No           -               0.415 (0)
S2-St5  31     0.403  No           -               0.171 (0)
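In the table, samples that pass the KS normality check are compared with a Student t-test, and the remaining ones with a Mann-Whitney test. The slides show neither the underlying data nor the tooling, so the following is only a sketch of the rank-based test, using the normal approximation without tie correction. The function name `mann_whitney_u` and the two score lists are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, no tie correction).

    Returns the smaller U statistic and an approximate p-value.
    """
    # Pool both samples, remembering which sample each value came from.
    pooled = sorted((value, label) for label, sample in enumerate((x, y)) for value in sample)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of ranks i+1 .. j for tied values
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j

    n1, n2 = len(x), len(y)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    p_value = 2 * NormalDist().cdf((u - mean_u) / sd_u)
    return u, min(p_value, 1.0)


# Invented scores for two small samples; not data from the study.
reference_sample = [0.2, 0.3, 0.4, 0.4, 0.5]
stratum_sample = [0.1, 0.3, 0.3, 0.5, 0.6, 0.2]
u, p = mann_whitney_u(reference_sample, stratum_sample)
print(f"U = {u}, p = {p:.3f}")
```

For small samples or many ties, an exact test or a tie-corrected variance would be more appropriate than this approximation.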
21 
Conclusion 
• This study and previous ones suggest that sample 
quality can be improved by following a systematic 
sampling approach 
• It is feasible to better characterize the subjects' profile 
through simple, open questions 
• We found evidence of heterogeneity among members 
of the same group on social networks 
– However, in the big picture, there is a trend!

MYjobs Presentation Django-based project
 

214 - Sampling Improvement in Software Engineering Surveys

  • 1. Sampling Improvement in Software Engineering Surveys
      Rafael Maiani de Mello, rmaiani@cos.ufrj.br
      Pedro Correa da Silva, pedrorez@poli.ufrj.br
      Guilherme Horta Travassos, ght@cos.ufrj.br
      ese.cos.ufrj.br
  • 2. Motivation
      • Small and non-probabilistic samples usually:
          • reduce external validity;
          • make replication difficult;
          • limit the possibilities for aggregation; and
          • hamper the evaluation of SE technologies.
      • In particular, the results of SE surveys are affected when inadequate samples are used.
  • 3. Context
      • Survey on Agility Characteristics and Practices in Software Processes
          – 2 previous trials
          – 158 invited subjects (P1)
          – 25 participants (S1)
          – Only 7 participants declared high or very high experience in applying agile approaches in software projects
  • 4. Recruitment Strategy (RS)
      Search Question: “Who are the groups from LinkedIn interested in Agility characteristics and practices concerned with Software Engineering?”
  • 5. RS Execution
      • 289 distinct groups were selected; 62 groups were included after analysis.

      Exclusion Criteria                        #    % of Total
      Local Groups                             97      42.73%
      Organizations, publicity and events      66      29.07%
      Out of scope                             33      14.54%
      Vague description                        25      11.01%
      Single member groups                     18       7.93%
      Headhunting and job offering groups       8       3.52%
      LinkedIn subgroups                        3       1.32%
      Non-English                               1       0.44%
      Total of Excluded Groups                227      78.55%

      (Per-criterion percentages are relative to the 227 excluded groups; the total row is relative to the 289 selected groups.)
  • 6. Stratification Based on the Overlapping Rates
      [Figure: overlapping Groups A, B, C, and D with their shared members]
  • 7. Overlapping Matrix

               A        B        C        D        E        F        H        I
      A  100.00%    4.22%   25.95%   27.60%   25.42%    3.91%    4.35%    3.77%
      B    3.48%  100.00%    3.43%    4.00%    2.25%   24.19%    3.01%    2.33%
      C   20.04%    3.21%  100.00%   18.65%   20.13%    2.97%    2.62%    2.20%
      D   15.49%    2.73%   13.56%  100.00%   16.38%    2.71%    2.35%    2.39%
      E   11.32%    1.21%   11.61%   12.99%  100.00%    1.10%    1.52%    1.23%
      F    1.70%   12.72%    1.66%    2.09%    1.07%  100.00%    1.62%    1.91%
      H    0.90%    0.75%    0.70%    0.87%    0.71%    0.77%  100.00%   45.66%
      I    0.64%    0.48%    0.49%    0.73%    0.47%    0.76%   37.73%  100.00%
      ...
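The overlapping matrix is not symmetric (A against B is 4.22%, while B against A is 3.48%), which is consistent with each rate being normalized by the size of one of the two groups. The slides do not state the exact definition, so the sketch below assumes that each cell is the share of the row group's members who also belong to the column group. The function name `overlap_matrix` and the member IDs are hypothetical and serve only as illustration.

```python
from typing import Dict, Set


def overlap_matrix(groups: Dict[str, Set[str]]) -> Dict[str, Dict[str, float]]:
    """For each ordered pair (row, col), return the percentage of the row
    group's members who also belong to the col group (assumed definition)."""
    matrix: Dict[str, Dict[str, float]] = {}
    for row, row_members in groups.items():
        matrix[row] = {}
        for col, col_members in groups.items():
            shared = len(row_members & col_members)
            # Guard against empty groups to avoid division by zero.
            matrix[row][col] = 100.0 * shared / len(row_members) if row_members else 0.0
    return matrix


# Hypothetical member IDs, for illustration only (not data from the study).
groups = {
    "A": {"u1", "u2", "u3", "u4"},
    "B": {"u3", "u4", "u5", "u6", "u7", "u8", "u9", "u10"},
}

matrix = overlap_matrix(groups)
for row, cols in matrix.items():
    print(row, {col: f"{pct:.2f}%" for col, pct in cols.items()})
```

Because the two groups differ in size, the same two shared members yield different rates in each direction, which is the kind of asymmetry visible in the matrix.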
  • 8. Stratified Sampling: Recruitment and Effective Sample Size

      Stratum  Name                       #Distinct Members  Sample Size  Respondents  CI for CL=95%
      E1       Agility                              114,827        1,031           57         12.98%
      E2       Project Management                     5,488          874           40         15.44%
      E3       Agile Practices 1                     11,633          955           56         13.06%
      E4       Agile Practices 2                      3,864          820           35         16.49%
      E5       Software Testing 1                    56,400        1,021           26         19.22%
      E6       Software Testing 2                     5,791          882           22         20.86%
      E7       Configuration Management              17,234          981           23         20.88%
      E8       SW Architecture                        7,335          911           31         17.57%
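The slides do not say how the "CI for CL=95%" column was computed. One plausible reading is the margin of error of a proportion at 95% confidence (z = 1.96), taking the most conservative proportion p = 0.5 and applying a finite population correction. The sketch below works under that assumption; the function name `margin_of_error` is hypothetical. With the member and respondent counts of E1, E2, E4, and E8 it yields the values reported for those strata, which suggests, but does not confirm, that a similar formula was used.

```python
import math


def margin_of_error(population: int, respondents: int,
                    z: float = 1.96, p: float = 0.5) -> float:
    """Half-width of the confidence interval for a proportion, with
    finite population correction (assumed formula, not stated in the slides)."""
    standard_error = math.sqrt(p * (1.0 - p) / respondents)
    fpc = math.sqrt((population - respondents) / (population - 1))
    return z * standard_error * fpc


# (distinct members, respondents) copied from the stratified sampling table.
strata = {
    "E1": (114_827, 57),
    "E2": (5_488, 40),
    "E4": (3_864, 35),
    "E8": (7_335, 31),
}

for name, (members, respondents) in strata.items():
    print(f"{name}: {100 * margin_of_error(members, respondents):.2f}%")
```

The margin shrinks with the number of respondents far more than with the sample size invited, which is why the strata with about 20 to 30 respondents end up near 20%.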
  • 9. Skill Analysis
      “What come into your mind when you think about your five main skills in software engineering?”
  • 10. Skill Analysis
      • 277 participants answered (95.19%)
      • 1,320 reported skills
      • 325 coded skills
      • 88 skill groups
  • 11.

      Skill Group                Skill Examples                                  %
      Personal Skill             Creativity, Detailing, Learning, Planning      10.56%
      Programming                Algorithms, Programming Languages               8.80%
      SW Analysis and Design     OO Design, Design Patterns                      8.25%
      Social Skill               Communication, Leadership                       7.78%
      SW Testing                 Testing, Debugging                              7.71%
      Thinking and Reasoning     Abstraction, Analytical Thinking                6.24%
      Agile Practices            Refactoring, TDD                                5.05%
      Agile Characteristic       Adaptability, Being Collaborative               5.00%
      SW Requirements            Req. Analysis, Requirements Elicitation         4.52%
      SW Quality                 Quality, Quality Assurance                      3.65%
      SW Architecture            SW Architecture                                 3.63%
      Problem Solving            Problem Solving                                 3.31%
      Agile Methods              Kanban, Scrum, XP                               2.71%
      Business Analysis          Business understanding, Business Analysis       2.66%
      Project Management         Project Management                              2.21%
      Technical Expertise        Technical Knowledge                             2.06%
      Configuration Management   Change Management, Release Management           2.01%
      Agile                      Agile coaching, Agile thinking, Agility         1.91%
      SW Development Process     SW Process Improvement, SW Dev. Life-Cycle      1.27%
      SW Development             Development, SW Development                     1.12%
  • 12. Skill Distribution by Strata

      Stratum  Personal Skill  Programming  SW Analysis and Design  Social Skill  SW Testing
      E1               16.34%        7.48%                  11.81%        22.31%       5.30%
      E2                9.81%       10.62%                  16.87%        16.73%       5.71%
      E3               12.03%        8.46%                   6.94%        20.49%      17.40%
      E4                9.25%       20.56%                  14.10%        14.81%       9.13%
      E5               11.77%        2.83%                   6.85%        14.80%      34.36%
      E6                9.13%       22.01%                   9.21%         3.88%      20.24%
      E7               12.48%       14.36%                  12.50%         0.00%       2.93%
      E8               19.20%       13.67%                  21.72%         6.98%       4.93%
  • 13. Skill Distribution Similarity Analysis
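The slides do not name the similarity measure used to compare the strata. As one possible illustration only, the sketch below ranks pairs of strata by cosine similarity over the five skill groups listed in the "Skill Distribution by Strata" table. The choice of cosine similarity and the names `profiles` and `cosine_similarity` are assumptions, not the authors' method, and the remaining skill groups are not taken into account.

```python
import math
from itertools import combinations

# Skill shares (%) per stratum from the "Skill Distribution by Strata" table,
# in the order: Personal Skill, Programming, SW Analysis and Design,
# Social Skill, SW Testing.
profiles = {
    "E1": [16.34, 7.48, 11.81, 22.31, 5.30],
    "E2": [9.81, 10.62, 16.87, 16.73, 5.71],
    "E3": [12.03, 8.46, 6.94, 20.49, 17.40],
    "E4": [9.25, 20.56, 14.10, 14.81, 9.13],
    "E5": [11.77, 2.83, 6.85, 14.80, 34.36],
    "E6": [9.13, 22.01, 9.21, 3.88, 20.24],
    "E7": [12.48, 14.36, 12.50, 0.00, 2.93],
    "E8": [19.20, 13.67, 21.72, 6.98, 4.93],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two skill profiles (1.0 = same shape)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Rank every pair of strata from most to least similar.
pairs = sorted(
    combinations(profiles, 2),
    key=lambda pair: cosine_similarity(profiles[pair[0]], profiles[pair[1]]),
    reverse=True,
)

for first, second in pairs:
    score = cosine_similarity(profiles[first], profiles[second])
    print(f"{first} vs {second}: {score:.3f}")
```

Pairs near the top of such a ranking are candidates for merging into a single stratum; any cut-off for merging would be a further assumption.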
  • 14. New Strata - St1
      • “Agilists”
      • Composed of agility groups
      • Personal Skills, Social Skills, SW Analysis and Design
  • 15. New Strata - St2
      • Testing Professionals
      • Composed mainly of LinkedIn groups devoted to Software Testing
      • Software Testing is the most relevant skill group
  • 16. New Strata - St3
      • Programmers
      • Composed mainly of LinkedIn groups devoted to agile practices
      • Programming is the most relevant skill group
  • 17. New Strata - St4
      • Configuration Managers
      • Composed of three LinkedIn groups concerned with Configuration Management (CM)
      • CM is the most relevant skill group, closely followed by Programming and Personal Skills
  • 18. New Strata - St5
      • “System Analysts”
      • Composed of a single LinkedIn group devoted to software architecture
      • Main skill groups: Personal Skills and SW Analysis and Design
  • 19. Hypothesis Testing: Heterogeneity
      • S2 is more heterogeneous than S1

      Region          S1 (12 subjects, 9 countries)   S2 (289 subjects, 43 countries)
      USA+Canada                              41.7%                             38.1%
      Europe                                  33.3%                             41.2%
      Asia                                      25%                             11.8%
      Latin America                               -                              5.9%
      Oceania                                 16.7%                              2.1%
      Africa                                      -                              0.1%
  • 20. Confidence Level
      • S1 and S2 have similar confidence levels

      Sample   Size    Mean    Normal (KS)  Student t-Test  Mann-Whitney test
      S1       25(1)   0.375   Yes          -               -
      S2       291     0.418   Yes          0.055           -
      S2-St1   97      0.460   Yes          0.011(0)        -
      S2-St2   81(3)   0.382   Yes          0.424(3)        -
      S2-St3   57(7)   0.470   Yes          0.003(7)        -
      S2-St4   24      0.333   No           -               0.415 (0)
      S2-St5   31      0.403   No           -               0.171 (0)
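In the table, samples that pass the KS normality check are compared with the Student t-test, and the two that do not (S2-St4 and S2-St5) are compared with the Mann-Whitney test. As a rough illustration of the non-parametric branch, the sketch below computes the Mann-Whitney U statistic with a two-sided p-value from the normal approximation. It is a simplified version (no tie or continuity correction), the function name `mann_whitney_u` is hypothetical, and the input values are made up, since the slides do not include the raw confidence scores. In practice a statistics library would be the safer choice.

```python
import math
from statistics import NormalDist


def mann_whitney_u(x, y):
    """Return (U, two-sided p-value) using the normal approximation.
    Simplified sketch: average ranks for ties, but no tie or
    continuity correction in the variance."""
    combined = sorted([(value, 0) for value in x] + [(value, 1) for value in y])
    total = len(combined)
    ranks = [0.0] * total
    i = 0
    while i < total:
        j = i
        # Extend j over a run of tied values and give them the average rank.
        while j + 1 < total and combined[j + 1][0] == combined[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u  # never positive, because u is the smaller U
    return u, min(1.0, 2 * NormalDist().cdf(z))


# Hypothetical confidence scores, for illustration only.
baseline = [0.30, 0.35, 0.40, 0.25, 0.45, 0.38]
stratum = [0.33, 0.41, 0.36, 0.29, 0.44, 0.37]
u_stat, p_value = mann_whitney_u(baseline, stratum)
print(f"U = {u_stat}, p = {p_value:.3f}")
```

A p-value above the chosen significance level, as reported for S2-St4 and S2-St5, means the test finds no significant difference between the two samples.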
  • 21. Conclusion
      • This study and previous studies suggest that sample quality can be improved by following a systematic sampling approach
      • It is feasible to better characterize the subject profile through simple, open questions
      • We found evidence of heterogeneity among members of the same group on social networks
          – However, in the big picture, there is a trend!