Factors Impacting Speed of Answers on Technical Q&A Sites
1. Understanding the Factors for Fast Answers in
Technical Q&A Websites: An Empirical Study of Four
Stack Exchange Websites
Journal First Presentation - Empirical Software Engineering
Shaowei Wang
Tse-Hsun (Peter) Chen
Ahmed E. Hassan
10. We study the four most popular Q&A websites in the
Stack Exchange network
11. Selection criteria for studied questions:
• Questions that have an accepted answer
• Questions that have a score of at least 1
• Questions that are not self-answered
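The three selection criteria above can be sketched as a simple filter. This is only an illustration: the `Question` record and its field names are hypothetical and do not reflect the schema or tooling used in the study.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Question:
    # Hypothetical record; field names are illustrative assumptions.
    question_id: int
    asker_id: int
    score: int
    accepted_answer_id: Optional[int]
    accepted_answerer_id: Optional[int]


def is_studied(q: Question) -> bool:
    """Apply the three selection criteria listed on the slide."""
    has_accepted = q.accepted_answer_id is not None
    has_min_score = q.score >= 1
    not_self_answered = q.accepted_answerer_id != q.asker_id
    return has_accepted and has_min_score and not_self_answered


questions = [
    Question(1, 100, 3, 11, 200),     # kept
    Question(2, 101, 0, 12, 201),     # dropped: score below 1
    Question(3, 102, 5, None, None),  # dropped: no accepted answer
    Question(4, 103, 2, 14, 103),     # dropped: self-answered
]
studied = [q for q in questions if is_studied(q)]
print([q.question_id for q in studied])
```

Only the first made-up question passes all three checks.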
12. We study the four most popular Q&A websites in the
Stack Exchange network
55,853 questions
70,336 questions
7,134 questions
10,776 questions
13. We study the relationship between the studied factors and
the speed of getting an accepted answer
Metrics calculation
Model construction
Model interpretation
Model assessment
14. We study the relationship between the studied factors and
the speed of getting an accepted answer
Studied factors:
Question (16 factors)
Answer (4 factors)
Asker (20 factors)
Answerer (6 factors)
16. We study the relationship between the studied factors and
the speed of getting an accepted answer
Model assessment: AUC
17. We study the relationship between the studied factors and
the speed of getting an accepted answer
Model interpretation: explanatory power (Wald χ2 test) and relationship visualization
18. Our models achieve an AUC of 0.85-0.95
AUC=0.95
AUC=0.94
AUC=0.85
AUC=0.86
19. Our models achieve an AUC of 0.85-0.95
Our models have a good enough
fit for interpretation.
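As a rough illustration of the AUC metric reported above, the sketch below computes AUC as the probability that a randomly chosen slow question receives a higher predicted score than a randomly chosen fast one. The function name, labels, and scores are made up for illustration; they are not the study's data or tooling.

```python
def auc(labels, scores):
    """AUC via pairwise comparison: fraction of (positive, negative) pairs
    in which the positive has the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# 1 = slow accepted answer, 0 = fast accepted answer (illustrative values).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
print(auc(labels, scores))
```

An AUC of 0.5 corresponds to random guessing and 1.0 to a perfect ranking, which is why values of 0.85 and above are read as a good fit.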
20. Top 1 factor: past speed of answering questions of an
answerer
21. A question tends to receive a fast accepted answer from
answerers who previously answered questions fast
[Plot: probability of getting a slow accepted answer (y-axis) vs. past speed of answering questions of an answerer, in hours on a logarithmic scale (x-axis)]
22. A question tends to receive a fast accepted answer from
answerers who previously answered questions fast
A wide confidence interval indicates
that the relationship is less clear due
to the lack of data points in
that data range.
[Plot: probability of getting a slow accepted answer (y-axis) vs. past speed of answering questions of an answerer, in hours on a logarithmic scale (x-axis)]
23. A question tends to receive a fast accepted answer from
answerers who previously answered questions fast
[Plot: probability of getting a slow accepted answer (y-axis) vs. past speed of answering questions of an answerer, in hours on a logarithmic scale (x-axis)]
24. Top 2 factor: length of body of a question
26. Top 3 factor: past speed of getting accepted answers
of tags of a question
27. A question with tags that received accepted answers fast
tends to receive a fast accepted answer
[Plot: probability of getting a slow accepted answer (y-axis) vs. time of getting accepted answers of tags of a question in the past, in hours on a logarithmic scale (x-axis)]
28. Fast accepted answers rely heavily on the answerer
[Bar chart: % of explanatory power (y-axis, 0 to 70) of the Question, Asker, Answer, and Answerer dimensions for Stack Overflow, Mathematics, Ask Ubuntu, and Super User]
30. Suggestions for Technical Q&A website designers
Deliver questions to the right answerers and
motivate them to answer questions faster.
31. 86% - 96% of the accepted answers are posted by
answerers who had answered more than 5 questions before
32. Non-frequent answerers vs. frequent answerers
• Non-frequent answerers (<= 5 answers)
• People who posted no more than 5 answers in the past
• Frequent answerers (> 5 answers)
• People who posted more than 5 answers in the past
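The split between the two groups can be sketched as follows. The threshold of 5 comes from the slide; the function names and the sample counts are hypothetical and only illustrate how the share of accepted answers posted by frequent answerers could be computed.

```python
from collections import Counter

FREQUENT_THRESHOLD = 5  # from the slide: more than 5 past answers = frequent


def answerer_group(past_answer_count: int) -> str:
    """Classify an answerer by the number of answers posted in the past."""
    if past_answer_count > FREQUENT_THRESHOLD:
        return "frequent"
    return "non-frequent"


def share_by_frequent(past_counts) -> float:
    """Fraction of accepted answers whose answerer was a frequent answerer.

    `past_counts` holds, for each accepted answer, the number of answers
    its answerer had posted before (made-up input for illustration).
    """
    groups = Counter(answerer_group(c) for c in past_counts)
    return groups["frequent"] / len(past_counts)


past_counts = [0, 5, 6, 20, 100]
print(share_by_frequent(past_counts))
```

Note that an answerer with exactly 5 past answers still falls into the non-frequent group.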
34. The current incentive system only motivates frequent answerers
well, but not non-frequent answerers
35. Non-frequent answerers are answering questions that are
as important as ones answered by frequent answerers
[Plot: mean score of questions]
36. Suggestions for Technical Q&A website designers
Deliver questions to the right answerers and
motivate them to answer questions faster.
Improve the incentive system to attract the
non-frequent answerers to become more active.
38. Frequent answerers probably game the incentive system
Yeah, some folks are going to specialize in super-fast answers
to easy questions and get more rep points than deserved,…
The bigger problem is that this has the side effect of causing
interesting but more difficult questions to get ignored. Typical
example: someone asks a question that gets a lot of views and two or more upvotes,
but it's hard enough that no one can answer within an hour or so.
39. Suggestions for Technical Q&A website designers
Deliver questions to the right answerers and
motivate them to answer questions faster.
Improve the incentive system to attract the
non-frequent answerers to become more active.
Improve the incentive system to factor in the value
and difficulty of questions.
Hi, thanks for the introduction and for coming.
I am Shaowei, a postdoc from Queen's University. Today I will present our paper, "Understanding the Factors for Fast Answers in Technical Q&A Websites". This work was done together with Peter from Concordia and Ahmed from Queen's.
Developers keep facing problems whenever they do development, testing, or maintenance. Problems fill developers' lives.
To help
Developers spend 58% of their time on comprehension activities.
~50 million monthly visitors
In other words, developers ask questions very frequently.
The median waiting time for a question to get an answer is 0.5 hours in general. How to shorten the waiting time to get an accepted answer is an interesting question to study.
Our goal is to understand the factors that impact the speed of getting an accepted answer, and to provide insights for users and website designers to improve their systems.
To achieve this goal, we study the four most popular websites in the Stack Exchange network.
We select the questions that have a score of at least 1, because we want to make sure that the question has received enough attention from the community and that its quality is reasonable.
55k of the questions are from Stack Overflow.
To understand the factors that may impact the speed of getting an accepted answer for a question.
The reason we only select the top 20% and bottom 20% is that we want to find the factors that really distinguish the very slow questions from the very fast ones.
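The top/bottom 20% selection can be sketched as below. This is a minimal illustration under stated assumptions: the function name and the sample waiting times are hypothetical, and the exact cut-off procedure used in the study may differ.

```python
def label_fast_slow(hours, fraction=0.2):
    """Label the fastest `fraction` of questions "fast" and the slowest
    `fraction` "slow"; questions in the middle are left out.

    `hours` is a list of waiting times (in hours) until the accepted answer.
    Returns a dict mapping the index of a question to its label.
    """
    n = len(hours)
    k = int(n * fraction)  # number of questions in each extreme group
    order = sorted(range(n), key=lambda i: hours[i])
    labels = {}
    for i in order[:k]:
        labels[i] = "fast"
    for i in order[n - k:]:
        labels[i] = "slow"
    return labels


# Made-up waiting times for ten questions.
hours = [0.1, 0.3, 0.5, 1, 2, 4, 8, 24, 72, 400]
labels = label_fast_slow(hours)
print(labels)
```

Dropping the middle 60% keeps only the questions whose answers were clearly fast or clearly slow, which gives the model a sharper contrast to explain.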
A wide gray area means a larger confidence interval, so the relationship is less clear.
The probability of getting a slow answer increases significantly when the value of A Median Speed Answer (the past speed of answering questions of an answerer) increases, up until an inflection point, with a small confidence interval (i.e., the gray bands are narrow).
After the inflection point, the curve goes down gradually but with a wide confidence interval, i.e., with larger uncertainty (the relationship is less clear due to the lack of data points in that data range).
More importantly, this finding holds across the different sites.
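One way to picture the A Median Speed Answer factor discussed in these notes is as the median time an answerer took to answer earlier questions, put on a logarithmic scale. The sketch below is an assumption-laden illustration: the function name, the input layout, and the use of log(1 + x) to avoid taking the log of zero are all choices made here, not details confirmed by the study.

```python
import math
from datetime import datetime
from statistics import median


def past_median_speed_log_hours(past_pairs, before):
    """Median hours an answerer needed for past answers, on a log scale.

    `past_pairs` is a list of (question_time, answer_time) tuples for one
    answerer; only answers posted before `before` are considered.
    Returns None when the answerer has no earlier answers.
    """
    hours = [
        (answered - asked).total_seconds() / 3600.0
        for asked, answered in past_pairs
        if answered < before
    ]
    if not hours:
        return None
    # log1p(x) = log(1 + x); the +1 is an assumption made for this sketch.
    return math.log1p(median(hours))


before = datetime(2015, 1, 10)
past = [
    (datetime(2015, 1, 1, 8), datetime(2015, 1, 1, 9)),    # 1 hour
    (datetime(2015, 1, 2, 8), datetime(2015, 1, 2, 11)),   # 3 hours
    (datetime(2015, 1, 3, 8), datetime(2015, 1, 3, 15)),   # 7 hours
    (datetime(2015, 1, 12, 8), datetime(2015, 1, 12, 9)),  # ignored: too late
]
value = past_median_speed_log_hours(past, before)
print(value)
```

Restricting the history to answers posted before the current question keeps the factor free of information from the future.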
Speed for an answerer to answer questions in the past
Length of an answer body (controlling factor)
Length of a question body
Tags also matter.
In general, fast accepted answers rely on the people who answer the questions.
We look at the improvement in reputation score for people with different reputation scores.
There are non-frequent answerers.
The questions that are answered by non-frequent answerers are as important as those answered by frequent answerers. However, non-frequent answerers are the bottleneck for fast answers. A possible explanation is that some new questions require certain knowledge that only such non-frequent answerers have. Such non-frequent answerers do not actively stay on Stack Overflow, which delays the answers.
To find a possible reason for this, we explored the posts on Stack Overflow Meta.
-- Accepted answers posted more than a week after their question (T-SQL)
select top 5 * from posts as a join posts as q on q.acceptedanswerid = a.id
where DATEDIFF(week, q.creationdate, a.creationdate) > 1