Early Detection and Guidelines to Improve Unanswered Questions on Stack Overflow

Saikat Mondal, Software Research Lab, University of Saskatchewan, Canada
Chanchal Roy, Software Research Lab, University of Saskatchewan, Canada
C M Khaled Saifullah, Software Research Lab, University of Saskatchewan, Canada
Avijit Bhattacharjee, Software Research Lab, University of Saskatchewan, Canada
Masud Rahman, RAISE Lab, Dalhousie University, Canada

ISEC 2021 Main Track || February 2021, Bhubaneswar, India
2
Most popular among the 174 sites of Stack Exchange!
53rd most-visited site in the world, and 49th in the USA!
Gets indexed at the top by Google!
3
Research Problem
Fig: Unanswered questions (%) per year, 2008-2018; the rate rises to 29% in 2018.
4
Methodology
Fig: Overall methodology: Stack Overflow Questions → Data Filtration → Questions of Interest → Question Separation (Answered Questions / Unanswered Questions) → Feature Extraction → Comparative Analysis (Quantitative) → Comparative Analysis (Qualitative) → Selection of Dimensions → Question Prediction → Result Analysis
Research Questions
5
RQ1: How do the unanswered questions differ from the answered questions in terms of their texts and their submitters' activities?
RQ2: How do the unanswered and answered questions differ qualitatively? Do the qualitative findings agree with the quantitative findings?
RQ3: Can we predict the unanswered questions of Stack Overflow during their submission?
Comparison between Unanswered and Answered
Questions using Quantitative Features
6
RQ1
Text Readability
Topic Response Ratio
Topic Entropy
Metric Entropy
Average Terms Entropy
Text-code Ratio
User Reputation
Received Response Ratio
Text Readability
7
RQ1
Fig: Text readability, measured as readability grade level (TAQ=Total Answered Questions, RSAQ=Randomly Sampled Answered Questions, UAQ=Unanswered Questions)
Extract the title and textual description from
the body of each question
Discard the code elements, code segments,
and stack traces
Compute five popular readability metrics: Automated Readability Index (ARI), Coleman-Liau Index, SMOG Grade, Gunning Fog Index, and Readability Index (RIX)
Readability of the answered questions is significantly higher than that of the unanswered questions, regardless of the programming language
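To make the computation concrete, here is a minimal sketch of one of the five metrics (the Automated Readability Index) applied to a question's prose. The function name and the simple tokenization are illustrative assumptions, not the paper's exact implementation.

```python
import re

def automated_readability_index(text: str) -> float:
    """Illustrative ARI score for a question's prose (code and stack traces removed).

    The paper averages five such grade-level metrics per question; a lower
    grade level means easier-to-read text.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z0-9']+", text)
    chars = sum(len(w) for w in words)
    if not words or not sentences:
        return 0.0
    return 4.71 * (chars / len(words)) + 0.5 * (len(words) / len(sentences)) - 21.43

# Example: a short, simple question text yields a low grade level.
print(automated_readability_index("My list is empty. How can I avoid the exception?"))
```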
Topic Response Ratio
8
RQ1
Fig: Topic response ratio (TAQ=Total Answered Questions, RSAQ=Randomly Sampled Answered Questions, UAQ=Unanswered Questions)
First, calculate the answering ratio of each
topic
Then, measure the topic response ratio of
each question
Topic response ratio of the answered
questions is significantly higher than
that of the unanswered questions.
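A minimal sketch of the two-step computation described above, assuming a question's tags stand in for its topics; the data layout (tag sets paired with an answered flag) is a simplifying assumption.

```python
from collections import defaultdict

def tag_answering_ratios(questions):
    """questions: iterable of (tags, is_answered). Returns {tag: answered / total}."""
    answered, total = defaultdict(int), defaultdict(int)
    for tags, is_answered in questions:
        for tag in tags:
            total[tag] += 1
            answered[tag] += int(is_answered)
    return {tag: answered[tag] / total[tag] for tag in total}

def topic_response_ratio(tags, ratio_by_tag):
    """Average answering ratio over a question's tags (its topics)."""
    ratios = [ratio_by_tag[t] for t in tags if t in ratio_by_tag]
    return sum(ratios) / len(ratios) if ratios else 0.0

corpus = [({"java", "swing"}, True), ({"java"}, True), ({"lisp"}, False)]
print(topic_response_ratio({"java", "lisp"}, tag_answering_ratios(corpus)))  # 0.5
```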
Topic Entropy
9
RQ1
We consider the probability of a question discussing a certain topic as a random variable and determine the topic entropy of each question
Topic entropy of answered questions is significantly
lower than that of the unanswered questions
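The measure itself is standard Shannon entropy; the sketch below assumes the per-question topic probabilities are already available (e.g., from tags or a topic model), which is a simplification of the paper's setup.

```python
import math

def topic_entropy(topic_probs):
    """Shannon entropy of a question's topic distribution (probabilities sum to 1).

    Higher entropy indicates a more ambiguous question topic.
    """
    return -sum(p * math.log2(p) for p in topic_probs if p > 0)

print(topic_entropy([0.9, 0.05, 0.05]))          # focused question -> low entropy
print(topic_entropy([0.25, 0.25, 0.25, 0.25]))   # ambiguous question -> high entropy (2.0)
```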
Metric Entropy and Average Terms Entropy
10
RQ1
Metric Entropy is the Shannon entropy divided by the character length of the
question’s text.
We find a relatively lower metric entropy and average term entropy for the unanswered questions than for the answered questions.
To compute Average Term Entropy, we first calculate the entropy of each term in the Stack Overflow data dump and determine the equivalent entropy of each term. Finally, we calculate the Average Term Entropy of each question by summing the entropy of every term in the question and dividing by the character length of the question's text.
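A rough sketch of both measures, assuming whitespace tokenization and a precomputed per-term entropy table built over the data dump; both assumptions are for illustration only.

```python
import math
from collections import Counter

def metric_entropy(text: str) -> float:
    """Shannon entropy of the text's characters divided by its character length."""
    if not text:
        return 0.0
    counts = Counter(text)
    n = len(text)
    shannon = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return shannon / n

def average_term_entropy(text: str, term_entropy: dict) -> float:
    """Sum the (precomputed, corpus-wide) entropy of each term in the question,
    then divide by the question text's character length."""
    if not text:
        return 0.0
    return sum(term_entropy.get(term, 0.0) for term in text.split()) / len(text)

print(metric_entropy("how to parse json in python"))
```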
Text-code Ratio
11
RQ1
Large code segments with little explanation might
make a question hard to comprehend
Text-code ratio of the unanswered questions is
lower than that of the answered questions
We calculate the text-code ratio by dividing the character length of the text by the character length of the code segments.
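A sketch of the ratio computed on a question's HTML body; treating <pre><code> blocks as the code segments and the sentinel value for code-free questions are illustrative assumptions.

```python
import re

NO_CODE_SENTINEL = 10_000  # assumed stand-in for the paper's "large number"

def text_code_ratio(body_html: str) -> float:
    """Character length of the prose divided by the character length of code blocks."""
    code_blocks = re.findall(r"<pre><code>(.*?)</code></pre>", body_html, flags=re.S)
    code_len = sum(len(block) for block in code_blocks)
    prose = re.sub(r"<pre><code>.*?</code></pre>|<[^>]+>", "", body_html, flags=re.S)
    return len(prose) / code_len if code_len else NO_CODE_SENTINEL

print(text_code_ratio("<p>Why does this fail?</p><pre><code>x = foo()</code></pre>"))
```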
User Reputation
12
RQ1
Fig: User reputation at question submission time (TAQ=Total Answered Questions, RSAQ=Randomly Sampled Answered Questions, UAQ=Unanswered Questions)
Stack Overflow stores all the activities (e.g., votes, acceptances, bounties) of users with their dates. We thus use this snapshot of user activities to calculate each user's reputation at the time of question submission.
For the reputation calculation, we use the standard equation provided by Stack Overflow.
Reputation of the authors of unanswered questions is
lower than that of the authors of answered questions
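A sketch of replaying a user's activity snapshot up to the question's submission time; the event weights shown are illustrative placeholders, not the official Stack Overflow reputation formula used in the paper.

```python
from datetime import datetime

# Illustrative weights only; the paper uses Stack Overflow's official reputation formula.
WEIGHTS = {"answer_upvote": 10, "question_upvote": 5, "answer_accepted": 15, "downvote": -2}

def reputation_at(events, submission_time: datetime) -> int:
    """events: (timestamp, event_type) pairs from the data-dump snapshot."""
    return sum(WEIGHTS.get(kind, 0) for ts, kind in events if ts <= submission_time)

events = [(datetime(2016, 1, 5), "answer_upvote"), (datetime(2017, 3, 1), "answer_accepted")]
print(reputation_at(events, datetime(2016, 12, 31)))  # only pre-submission events count: 10
```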
Received Response Ratio
13
RQ1
The past question-answering history of a user might
provide useful information as to whether a question
submitted by the user will be answered or not.
A user with a higher percentage of receiving answers in the
past has a higher chance of receiving answers in the future
We thus measure the percentage of a user's previously asked questions that were answered, as of the new question's submission.
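A minimal sketch of that percentage, assuming access to the asker's earlier questions and whether each was answered; the data layout is assumed for illustration.

```python
from datetime import datetime

def received_response_ratio(prior_questions, submission_time):
    """Share of the asker's earlier questions that had been answered by submission time.

    prior_questions: iterable of (asked_at, was_answered) pairs for the same user.
    """
    history = [answered for asked_at, answered in prior_questions if asked_at < submission_time]
    return sum(history) / len(history) if history else 0.0

history = [(datetime(2015, 2, 1), True), (datetime(2016, 6, 1), False)]
print(received_response_ratio(history, datetime(2017, 1, 1)))  # 0.5
```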
Agreement between Quantitative and Qualitative Findings
14
RQ2
Study Setup
Manual Analysis
Results
Agreement between Quantitative and Qualitative Findings
15
RQ2
1. Problem description and code explanation
2. Topic assignment
3. Question formatting and
4. Appropriateness of the code segments
included with questions.
Study Setup
Manual Analysis
Results
Agreement between Quantitative and Qualitative Findings
16
RQ2
Fig: Comparison (qualitative) between unanswered and
answered questions.
Study Setup
Manual Analysis
Results
Prediction Models and Result Analysis
17
RQ3
Algorithms
Metrics of Success
Model Configuration
Dataset Selection
Prediction of Unanswered Questions
Feature Strength Analysis
Comparison with Baseline Models
Generalizability of the Models’ Performance
• Decision Trees (CART)
• Random Forest (RF)
• K-Nearest Neighbors (KNN)
• Artificial Neural Network (ANN)
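A sketch of how the four classifiers could be instantiated with scikit-learn; the hyperparameters are placeholders (the paper tunes them via grid search, see Model Configuration), and MLPClassifier stands in for the ANN.

```python
from sklearn.tree import DecisionTreeClassifier        # CART
from sklearn.ensemble import RandomForestClassifier    # RF
from sklearn.neighbors import KNeighborsClassifier     # KNN
from sklearn.neural_network import MLPClassifier       # ANN (stand-in)

models = {
    "CART": DecisionTreeClassifier(),
    "RF": RandomForestClassifier(n_estimators=100),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "ANN": MLPClassifier(hidden_layer_sizes=(32,), max_iter=500),
}
# Each model is later fit on the eight question features and the answered/unanswered label.
```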
Prediction Models and Result Analysis
18
RQ3
Algorithms
Metrics of Success
Model Configuration
Dataset Selection
Prediction of Unanswered Questions
Feature Strength Analysis
Comparison with Baseline Models
Generalizability of the Models’ Performance
• Precision
• Recall
• F1-score, and
• Classification accuracy
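A sketch of reporting the four metrics with scikit-learn, treating "unanswered" as the positive class; the label encoding is an assumption.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

def evaluate(y_true, y_pred):
    """Four metrics used in the paper; label 1 = unanswered (assumed positive class)."""
    return {
        "precision": precision_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred),
        "f1": f1_score(y_true, y_pred),
        "accuracy": accuracy_score(y_true, y_pred),
    }

print(evaluate([1, 0, 1, 1], [1, 0, 0, 1]))
```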
Prediction Models and Result Analysis
19
RQ3
Algorithms
Metrics of Success
Model Configuration
Dataset Selection
Prediction of Unanswered Questions
Feature Strength Analysis
Comparison with Baseline Models
Generalizability of the Models’ Performance
• We use a grid search algorithm, namely GridSearchCV, from the scikit-learn library to select the best model configuration.
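For illustration, a sketch of tuning one of the models with GridSearchCV; the parameter grid and scoring choice are hypothetical, since the slide does not list the exact search space.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

param_grid = {"n_estimators": [100, 300], "max_depth": [None, 10, 20]}  # hypothetical grid

search = GridSearchCV(RandomForestClassifier(), param_grid, cv=5, scoring="f1")
# search.fit(X_train, y_train)          # X_train, y_train: question features and labels
# print(search.best_params_, search.best_score_)
```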
Prediction Models and Result Analysis
20
RQ3
Algorithms
Metrics of Success
Model Configuration
Dataset Selection
Prediction of Unanswered Questions
Feature Strength Analysis
Comparison with Baseline Models
Generalizability of the Models’ Performance
• We use the attributes of the questions submitted on Stack Overflow in 2016 or earlier for training and the questions submitted in 2017 for testing.
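A sketch of that chronological split with pandas; the column names are assumed, not taken from the paper's artifacts.

```python
import pandas as pd

def temporal_split(questions: pd.DataFrame):
    """Train on questions from 2016 or earlier, test on questions from 2017.

    Assumes a 'creation_year' column, the feature columns, and an 'answered' label.
    """
    train = questions[questions["creation_year"] <= 2016]
    test = questions[questions["creation_year"] == 2017]
    features = [c for c in questions.columns if c not in ("creation_year", "answered")]
    return (train[features], train["answered"]), (test[features], test["answered"])
```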
Prediction Models and Result Analysis
23
RQ3
Algorithms
Metrics of Success
Model Configuration
Dataset Selection
Prediction of Unanswered Questions
Feature Strength Analysis
Comparison with Baseline Models
Generalizability of the Models’ Performance
R. K. Saha, A. K. Saha, and D. E. Perry. Toward understanding the causes of unanswered questions in software information sites: A case study of Stack Overflow. FSE 2013.
F. Calefato, F. Lanubile, and N. Novielli. How to ask for technical help? Evidence-based guidelines for writing questions on Stack Overflow. IST 94 (2018).
Key Findings & Guidelines
26
1 Question format matters
2 Attach code segment when required
3 Improper and non-technical description hurts
4 Stack Overflow is not suitable for all topics
5 Multiple issues in a single question hurt
6 Unanswered questions are not always really unanswered
7 Miscellaneous findings
Key Findings & Guidelines
27
1 Question format matters
2 Attach code segment when required
3 Improper and non-technical description hurts
4 Stack Overflow is not suitable for all topics
5 Multiple issues in a single question hurts
6 Unanswered questions are not always really unanswered
7 Miscellaneous findings
Key Findings & Guidelines
28
1 Question format matters
2 Attach code segment when required
3 Improper and non-technical description hurts
4 Stack Overflow is not suitable for all topics
5 Multiple issues in a single question hurts
6 Unanswered questions are not always really unanswered
7 Miscellaneous findings
Key Findings & Guidelines
29
1 Question format matters
2 Attach code segment when required
3 Improper and non-technical description hurts
4 Stack Overflow is not suitable for all topics
5 Multiple issues in a single question hurts
6 Unanswered questions are not always really unanswered
7 Miscellaneous findings
Key Findings & Guidelines
30
1 Question format matters
2 Attach code segment when required
3 Improper and non-technical description hurts
4 Stack Overflow is not suitable for all topics
5 Multiple issues in a single question hurts
6 Unanswered questions are not always really unanswered
7 Miscellaneous findings
Key Findings & Guidelines
31
1 Question format matters
2 Attach code segment when required
3 Improper and non-technical description hurts
4 Stack Overflow is not suitable for all topics
5 Multiple issues in a single question hurts
6 Unanswered questions are not always really unanswered
7 Miscellaneous findings
Key Findings & Guidelines
32
1 Question format matters
2 Attach code segment when required
3 Improper and non-technical description hurts
4 Stack Overflow is not suitable for all topics
5 Multiple issues in a single question hurts
6 Unanswered questions are not always really unanswered
7 Miscellaneous findings
Acknowledgement
33
This research is supported by the Natural Sciences and Engineering Research Council of
Canada (NSERC), and by a Canada First Research Excellence Fund (CFREF) grant
coordinated by the Global Institute for Food Security (GIFS).
34
THANKS!
Any questions?
Early Detection and Guidelines to Improve
Unanswered Questions on Stack Overflow
Analyze 4.8 million questions
Investigate generalizability of models
Build models to predict unanswered questions
Guide question submitters
@saikatcsebd
saikatcsebd@gmail.com
Editor's Notes
  1. Hello everyone, I am Saikat Mondal, a Ph.D. student at the Software Research Lab, University of Saskatchewan, Canada. I would like to present our paper entitled Early Detection and Guidelines to Improve Unanswered Questions on Stack Overflow. This is a collaborative study with Khaled Saifullah, Avijit Bhattacharjee and Masud Rahman, conducted under the supervision of Professor Dr. Chanchal Roy.
  2. Nowadays, developers everywhere are connected through Stack Overflow. Stack Overflow is one of the largest and most popular Q&A sites for programming problems and solutions. + It is the most popular site among the 174 sites of Stack Exchange, with + 19 million questions, + 29 million answers, + 12 million users, + 9.8 million visits per day, and + 8.6 thousand questions posted per day. + It is the 53rd most-visited site in the world, and 49th in the USA. + Moreover, Stack Overflow often gets indexed at the top of Google search results for programming-related queries.
  3. However, despite such popularity and a large, engaged community, a huge number of questions are not answered at all. + Up to 29% of the questions remained unanswered in the year 2018. + Unfortunately, this percentage has been increasing gradually over the years. + Unanswered questions waste developers' time, + impede normal development progress, + and hurt the quality and purpose of this community-oriented knowledge base. In this study, we thus attempt to detect unanswered questions during their submission and guide the question submitters to improve their questions.
  4. First, let me show the overall methodology of our study. + We collect 4.8 million questions related to four popular programming languages - C#, Java, JavaScript, and Python - that were posted on or before 2017. + We then separate the answered and unanswered questions and + extract eight features such as Text Readability, Topic Response Ratio, and Topic Entropy from both unanswered and answered questions. + We conduct a comparative analysis (quantitative) between the attributes of these two categories. + We then manually analyze 400 questions (200 unanswered + 200 answered) to investigate the qualitative difference between unanswered and answered questions. In particular, we attempt to determine whether the qualitative findings agree with the quantitative findings or not. + We build four machine learning models to predict whether Stack Overflow questions might be answered or not. + Finally, we analyze the results of our models and also compare them with the state-of-the-art models.
  5. In particular, we attempt to answer three research questions in this study. + First, How do the unanswered questions differ from the answered questions in terms of their texts and their submitters’ activities? + Second, How do the unanswered and answered questions differ qualitatively? Do the qualitative findings agree with the quantitative findings? And + Third, Can we predict the unanswered questions of Stack Overflow during their submission?
  6. To answer the first research question, we contrast the unanswered and answered questions using eight metrics that capture different aspects of a question. The metrics are + Text Readability, Topic Response Ratio, Topic Entropy, Metric Entropy, Average Term Entropy, Text-code Ratio, User Reputation, and Received Response Ratio.
  7. First, text readability. The readability of a text document depends on the complexity of its vocabulary and syntax and on its presentation style, such as line lengths, word lengths, sentence lengths, and syllable counts. We extract the title and textual description from the body of each question to measure its text readability. We discard the code elements, code segments, and stack traces. We compute five popular readability metrics: Automated Readability Index (ARI), Coleman-Liau Index, SMOG Grade, Gunning Fog Index, and Readability Index (RIX). These metrics are widely used to estimate the comprehension difficulty of English-language texts. First, we calculate the individual score (i.e., grade level) for each of the five metrics, and then determine the average readability of each question's text. The lower the grade level, the easier the text is to read, and conversely, the higher the grade level, the more difficult the text is to read. + As shown in the box plot, we find a significant difference in text readability between answered and unanswered questions. Answered questions have higher text readability (i.e., lower grade level) than the unanswered questions. We see that unanswered questions often contain long, multi-syllable words whereas the answered questions generally contain small, simple words. We also address the data imbalance issue during our comparative analysis. Since the number of answered questions is higher than that of the unanswered questions in our dataset, we randomly under-sample the answered questions to balance the dataset. Then we compare the readability scores between unanswered and answered questions again with the sampled data. Interestingly, we were able to reproduce the same finding as above. We also perform statistical significance tests such as Mann-Whitney-Wilcoxon and Cliff's Delta and found a statistically significant difference with a small effect size. + That is, the readability of the answered questions is significantly higher than that of the unanswered questions regardless of the programming language.
  8. On Stack Overflow, questions often might not receive answers due to their topical difficulty. Questions related to certain topics, such as particular software libraries, platforms, or IDEs (e.g., jquery-selectors, lisp), usually get less attention than others on SO. We first examine which topics are more likely to receive answers and which are not. In SO, tags often capture the topics of a submitted question. We first calculate the answering ratio of each topic by dividing the number of answered questions by the total number of questions associated with the topic. Since each question can have multiple topics (tags), we then measure the topic response ratio of each question: we sum the response ratios of all its topics and divide by the number of topics. + See the box plot: the topic response ratio of the unanswered questions is lower than that of the answered questions. It indicates that the topics of the unanswered questions are of little interest to the SO user community. It is also possible that there are not sufficient experts to answer the topics discussed in the unanswered questions. We also found this difference to be statistically significant with a medium effect size using the Mann-Whitney-Wilcoxon and Cliff's Delta statistical tests (i.e., p-value = 0 < 0.01 and Cliff's d = 0.40 (medium)). + That is, the topic response ratio of the answered questions is significantly higher than that of the unanswered questions.
  9. Single tags, for example Java, assigned to a question often fail to explain the topics or subject matter of the question. On the contrary, multiple tags attached to a question could help the users better understand the topics and thus help them find the questions of their interest. We analyze the ambiguity of question topics using topic entropy and attempt to determine whether the unanswered questions use more ambiguous topics than those of the answered questions. + We see that topic entropy of answered questions is significantly lower than that of the unanswered questions. Such a result indicates that the unanswered questions are more ambiguous than the answered questions, which possibly left them with no answers.
  10. We use Metric Entropy and Average Term Entropy to estimate the randomness of terms used in the textual part of the unanswered and answered questions. Metric Entropy is the Shannon entropy divided by the character length of the question's text. It represents the randomness of the terms of a question. Average Term Entropy also estimates the randomness of each term in the question's text. However, in this case, the entropy of each term is calculated against all the questions on SO. + As shown in the table, we find a relatively lower metric entropy and average term entropy for the unanswered questions than for the answered questions. That is, the unanswered questions use more uncommon terms in their texts than the answered questions do. It indicates that the unanswered questions might use inappropriate terms that do not describe the problem correctly. Moreover, they might touch such topics that often remain unanswered. According to our analysis, the differences are statistically significant (i.e., p = 0 < 0.01) using the Mann-Whitney-Wilcoxon test. However, the effect size is either small or negligible.
  11. The ratio between text and code of a question is often considered an important dimension of question quality. Large code segments with little explanation might make a question hard to comprehend. We extract the code segments and texts from the body of a question. Then we calculate the text-code ratio by dividing the character length of the text by the character length of the code segments. When the code segment is absent from a question, we assign a large number as the text-code ratio. + According to our analysis, the text-code ratio of the unanswered questions is lower than that of the answered questions. That is, insufficient explanation of the code segment could be a potential factor that might explain why the questions remain unanswered. We also measure whether the difference in text-code ratio between unanswered and answered questions is statistically significant. From the Mann-Whitney-Wilcoxon test, we find a significant p-value (p-value = 0 < 0.01). However, the effect size from the Cliff's Delta test was found to be negligible.
  12. The reputation system of Stack Overflow is designed to incentivize contributions and to allow assessment of the users. An existing study shows that a question submitted by a user with a higher reputation has a higher possibility of getting answers. We thus consider the users' reputation as a distinctive feature of Stack Overflow questions. We calculate the reputation of users at their question submission time using a standard equation provided by Stack Overflow. + According to our analysis, the reputation of the authors of unanswered questions is lower than that of the authors of answered questions. It confirms that a high reputation score increases the chance of getting answers. Using the Mann-Whitney-Wilcoxon test, we find a significant p-value (i.e., p < 0.01), although the effect size is small (i.e., Cliff's d = 0.24).
  13. The past question-answering history of a user might provide useful information as to whether a question submitted by the user will be answered or not. Thus, we investigate each user's question-answering statistics. In particular, we measure the percentage of a user's past questions that received answers, computed at the time the new question is submitted. + According to our analysis, a user with a higher percentage of answered questions in the past has a higher chance of receiving answers in the future. As shown in the table, the authors of the answered questions received answers for about 71% of their previously submitted questions. On the contrary, the same percentage is about 60% for the authors of the unanswered questions. The difference is statistically significant (i.e., p < 0.01) with a small effect size (i.e., Cliff's d = 0.30).
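  A small illustrative helper for this statistic, assuming each of the user's earlier questions is a dict with 'creation_date' and 'is_answered' fields (field names are assumptions, not the study's schema):

    def received_response_ratio(user_questions, submission_time):
        # Fraction of the user's earlier questions that had received an answer
        # by the time the new question was submitted.
        past = [q for q in user_questions if q["creation_date"] < submission_time]
        if not past:
            return 0.0
        answered = sum(1 for q in past if q["is_answered"])
        return answered / len(past)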
  14. In RQ2, we examine the agreement between the quantitative and qualitative findings. We first find the top 10 topics that frequently remained unanswered and another top 10 topics that were frequently answered. Finally, we randomly choose 20 questions for each of these 20 topics, which gives us 400 questions for manual analysis.
  15. Two of the authors conduct a manual analysis to see the different quality aspects of unanswered questions, contrasting them with answered questions. We first examine the randomly sampled questions with no particular viewpoints in mind. We then discuss together to share our understanding and set a few dimensions to proceed with our analysis. The authors rank the questions against four major quality aspects from 0 (lowest) to 5 (highest). + The aspects are: i) problem description and code explanation, ii) topic assignment, iii) question formatting, and iv) appropriateness of the code segments included with the questions.
  16. The figure shows the results of our manual analysis. For each of the four aspects, the average ranking of the unanswered questions is lower than that of the answered questions. + For example, the appropriateness of the code segments is rated 2.7 (on a scale of 5) for the unanswered questions, whereas the corresponding rating is 3.63 for the answered questions.
  17. We use four algorithms for building the prediction models: + Decision Trees (CART), Random Forest (RF), K-Nearest Neighbors (KNN), and Artificial Neural Network (ANN). We choose these algorithms for two reasons. First, they are capable of building reliable models even when the relationship between the predictors and the class is nonlinear or complex. Second, these algorithms are widely used in the relevant studies.
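  For illustration, the four classifiers could be instantiated with scikit-learn roughly as follows; the hyper-parameters shown here are placeholders, not the tuned values from the study:

    from sklearn.tree import DecisionTreeClassifier
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.neural_network import MLPClassifier

    # Placeholder configurations for the four prediction models
    models = {
        "CART": DecisionTreeClassifier(),
        "RF": RandomForestClassifier(n_estimators=100),
        "KNN": KNeighborsClassifier(n_neighbors=5),
        "ANN": MLPClassifier(hidden_layer_sizes=(50,), max_iter=500),
    }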
  18. We evaluate our prediction models using four appropriate performance metrics: + precision, recall, F1-score, and classification accuracy.
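  A small sketch of how these four metrics could be computed with scikit-learn, given the true labels and the labels predicted by a model:

    from sklearn.metrics import precision_score, recall_score, f1_score, accuracy_score

    def evaluate(y_true, y_pred):
        # The four metrics used to evaluate the prediction models
        return {
            "precision": precision_score(y_true, y_pred),
            "recall": recall_score(y_true, y_pred),
            "f1": f1_score(y_true, y_pred),
            "accuracy": accuracy_score(y_true, y_pred),
        }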
  19. We use a grid search algorithm, + namely GridSearchCV from the scikit-learn library, to select the best model configuration. GridSearchCV works by training and evaluating the model for all possible combinations of the specified parameters. We also examine the accuracy on both the training and testing datasets and adjust the parameters so that the models do not over-fit.
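  A hedged example of such a grid search for the Random Forest model; the parameter grid below is hypothetical, since the actual grids are not listed on the slide, and X_train/y_train stand for the training features and labels:

    from sklearn.model_selection import GridSearchCV
    from sklearn.ensemble import RandomForestClassifier

    # Hypothetical parameter grid; the study's actual search space may differ
    param_grid = {"n_estimators": [100, 200, 500], "max_depth": [None, 10, 20]}
    search = GridSearchCV(RandomForestClassifier(), param_grid, scoring="accuracy", cv=5)
    search.fit(X_train, y_train)        # fit over all parameter combinations
    best_rf = search.best_estimator_    # best configuration found by the grid search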
  20. We keep the training and testing data separate. + We use the attributes of the questions submitted on Stack Overflow in the year 2016 or earlier for training and the questions submitted in the year 2017 for testing. We have several time-dependent features (e.g., user reputation). Thus, we make sure that we predict the unseen questions based on past questions only.
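  Assuming the question attributes live in a pandas DataFrame with a 'year' column and an 'answered' label (illustrative names), the time-based split could look like this:

    import pandas as pd

    def time_based_split(df, feature_cols):
        # Train on questions from 2016 or earlier; test on questions from 2017
        train = df[df["year"] <= 2016]
        test = df[df["year"] == 2017]
        return (train[feature_cols], train["answered"],
                test[feature_cols], test["answered"])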
  21. To predict unanswered questions, we build four machine learning models. + We see that all four models can predict the question classes with 61%-79% accuracy. + Random Forest performs the best among the four models with a precision of about 78% and a recall of 81%. + Decision Tree is the second best with a precision of about 73% and a recall of 67%, followed by + KNN with a precision of about 68% and a recall of 67%, and + ANN with a precision of about 59% and a recall of 72%. One clear distinction can be drawn from the results: individual feature processing algorithms such as Random Forest and Decision Tree are comparatively better suited to our dataset than collective feature processing algorithms such as ANN and KNN. Moreover, Random Forest employs ensemble learning and considers multiple trees to reach a decision; it can thus avoid over-fitting and is better suited to our dataset.
  22. While answering RQ1, we showed that there are several attributes whose values differ between unanswered and answered questions. Such findings indicate that these attributes might be the key estimators in predicting whether a question will be answered or not. A ranking of these attributes might help to select the top estimators for the prediction task. We thus use ExtraTreesClassifier, a built-in class of scikit-learn, to rank the features. + As shown in the figure, topic response ratio has the highest classification strength. It indicates that the selection of appropriate tags (or topics) during question submission at Stack Overflow is important. User reputation is the second most effective feature. That is, users trusted by the Stack Overflow community are more likely to get answers to their questions. The next two important features are metric entropy and text readability.
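  A minimal sketch of this ranking step with scikit-learn's ExtraTreesClassifier, reusing the X_train, y_train, and feature_cols names assumed in the earlier split example:

    from sklearn.ensemble import ExtraTreesClassifier

    # Rank the features by their classification strength
    clf = ExtraTreesClassifier(n_estimators=100)
    clf.fit(X_train, y_train)
    ranking = sorted(zip(feature_cols, clf.feature_importances_),
                     key=lambda pair: pair[1], reverse=True)
    for name, score in ranking:
        print(f"{name}: {score:.3f}")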
  23. Our proposed models are able to predict question classes correctly about 60–80% of the time. However, we are interested in comparing our models with baseline models. Thus, we choose two state-of-the-art studies, by + Saha et al. and Calefato et al., that are related to our study.
  24. The figure shows the comparative results between the proposed models and the baseline models. The performance of the models by Calefato et al. is found to be comparatively poorer than the others; their accuracy and precision are in the range of 50–52% for all four algorithms. The strength of the features used by Saha et al. is relatively higher than that of the features of Calefato et al. We see that the models trained with the features of Saha et al. outperform the models of Calefato et al.; their performance is about 1–7% higher. The overall accuracy of the models by Saha et al. is 53–55% and the precision is 53–59%. However, the proposed machine learning models outperform both baselines. The precision of the proposed models is about 6–25% higher than the models by Saha et al. and about 9–27% higher than the models by Calefato et al. The proposed models also improve the overall accuracy by 5–25% over Saha et al. and by 7–27% over Calefato et al.
  25. To confirm the generalizability of our models' performance and the strength of the features, we test them on questions from two programming languages that were not used to train the models. In particular, we collect 100K questions (50K unanswered + 50K answered) of the C++ programming language and 8K questions (4K unanswered + 4K answered) of the Go programming language. Go is an emerging programming language, and thus we could not collect more questions. We select Random Forest and Decision Trees to test with C++ and Go. + The table shows the precision and accuracy of the models for C++ and Go. Results show that the models perform reasonably well, although they lose some precision and accuracy (less than 15%). Our further investigation reveals a few causes that might explain why the performance declines. We see that the code segments included in the questions related to the Go language are comparatively smaller than those of the Java and C# languages. On the contrary, the C++ code segments are fairly long but self-explanatory. These scenarios might affect attributes such as text-code ratio and text readability. Despite such discrepancies, the proposed model can predict the unanswered questions with a maximum of about 66% precision and 67% accuracy. Such findings confirm the generalizability of our features and models in predicting the unanswered questions.
  26. Now, we report the key findings and guidelines. First, proper formatting of the description in a question is essential. In particular, the different elements (e.g., text, code, hyperlinks) should be placed inside the appropriate HTML tags to improve their visibility and readability. Otherwise, the question might be hard to read, which could cause the expert users to leave the question without answering.
  27. Second, if a question describes a code-related issue, it should include an example code segment to support the problem statement.
  28. Third, a clear description might prompt the expert users to answer a question. According to our investigation, we also find that the usage of appropriate technical terms is important. An ambiguous and non-technical description of a technical problem might mislead the users and thus hurt its chance of getting a solution.
  29. Fourth, several topics are more likely to remain unanswered in Stack Overflow. According to our analysis, the top unanswered topics are primarily related to programming IDEs, libraries, and frameworks. Many unanswered questions on these topics concern internal configurations and algorithms. To answer such questions, the creators and maintainers of these tools would need to be active users of Stack Overflow, which they often are not. During the manual analysis, we see that 8% of unanswered questions are directed to other sites. Thus, the users should (1) avoid submitting off-topic questions and (2) carefully select the tags for their questions.
  30. Fifth, some users often discuss multiple issues in a single question, which is not recommended by the community members. According to our investigation, questions with multiple issues or concerns often fail to receive answers.
  31. Sixth, we find that several questions received working solutions in their comment threads. According to our manual investigation, 5% of unanswered questions receive answers in the comment threads. Thus, although many questions are seen as unanswered, they are actually answered.
  32. We also find several other causes that might leave questions unanswered. First, several questions remain unanswered because they discuss outdated technology. Second, many unanswered questions have misleading titles. Third, users sometimes request suggestions instead of describing their problems clearly and asking for particular solutions. Fourth, some questions are perceived as duplicates of other questions.
  33. This research is supported by the Natural Sciences and Engineering Research Council of Canada (NSERC), and by a Canada First Research Excellence Fund (CFREF) grant coordinated by the Global Institute for Food Security (GIFS).
  34. That is all I have for this presentation. I sincerely appreciate your attention. Thank you all. Feel free to ask any questions.