Mining Stack Overflow to Turn the IDE into a Self-confident Programming Prompter

Mining Stack Overflow
to Turn the IDE into a
Self-confident Programming Prompter
Luca Ponzanelli, Gabriele Bavota,
Massimiliano Di Penta, Rocco Oliveto,
and Michele Lanza
http://prompter.inf.usi.ch

Recommender Systems for
Software Engineering
M. P. Robillard, R. J. Walker, and T. Zimmermann
Recommender systems for software engineering
IEEE Software, 2010
“RSSEs are software applications that
provide information estimated to be
valuable for a software engineering task
in a given context”

No Spontaneous
Recommendation
No
Self-Confidence

Prompter

[Figure: box plot of Completeness (axis 20 to 100) by Treatment for the Development Task. NP = Without Prompter, P = With Prompter.]
Prompter is effective in development tasks

[Diagram: Prompter (Eclipse) sends the Code Context to the Query Generation Service.]

Code Context

Entity Code
    @Override
    public List<String> filter(final List<String> tokens) {
        final List<String> stemmed = new ArrayList<String>();
        for (final String t : tokens) {
            SnowballStemmer stemmer = new englishStemmer();
            stemmer.setCurrent(t);
            stemmer.stem();
            stemmed.add(stemmer.getCurrent());
        }
        return stemmed;
    }

API Types
    org.tartarus.snowball.SnowballStemmer
    org.tartarus.snowball.ext.englishStemmer
    java.util.List
    java.util.ArrayList
    String

API Method Names
    setCurrent, getCurrent, stem, add
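
The entity code above can be broken into terms by splitting identifiers on camel case and counting how often each term occurs. The following is a minimal sketch of that idea, not Prompter's actual implementation: the function names, the regular expressions, and the small keyword list are assumptions made for illustration.

```python
import re
from collections import Counter

# Entity code copied from the slide, kept as a plain string.
ENTITY_CODE = """
@Override
public List<String> filter(final List<String> tokens) {
    final List<String> stemmed = new ArrayList<String>();
    for (final String t : tokens) {
        SnowballStemmer stemmer = new englishStemmer();
        stemmer.setCurrent(t);
        stemmer.stem();
        stemmed.add(stemmer.getCurrent());
    }
    return stemmed;
}
"""

# Hypothetical list of Java keywords to skip; the slides do not specify one.
JAVA_KEYWORDS = {"public", "final", "for", "new", "return"}


def split_identifier(identifier):
    """Split a camelCase identifier into lowercase terms."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", identifier)
    return [part.lower() for part in parts]


def term_frequencies(code):
    """Count terms over all identifiers found in the code."""
    counts = Counter()
    for identifier in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", code):
        if identifier in JAVA_KEYWORDS:
            continue
        counts.update(split_identifier(identifier))
    return counts


if __name__ == "__main__":
    freq = term_frequencies(ENTITY_CODE)
    for term in ("stemmer", "stemmed", "tokens", "list"):
        print(term, freq[term])
```

On this snippet the sketch counts stemmer 6, stemmed 3, tokens 2, and list 4. It also counts terms such as string and current; how Prompter treats those is not stated in the slides.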

[Diagram: the Query Generation Service returns a Query to Prompter (Eclipse).]

Query

Term       Frequency   Entropy   Frequency * (1 - Entropy)
stemmer    6           0.15      5.1
stemmed    3           0.15      2.55
tokens     2           0.45      1.1
list       4           0.74      1.04
snowball   1           0.11      0.89
stem       1           0.25      0.75
english    1           0.51      0.49
filter     1           0.58      0.42
array      1           0.72      0.28
set        1           0.8       0.2
add        1           0.84      0.16
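
The last column of the table above is the product of a term's frequency and one minus its entropy, so frequent, low-entropy terms rank first. A minimal sketch of that ranking follows. The rows are copied from the slide; the function names and the `top_k` cut-off are hypothetical, since the slides do not say how many terms end up in the query.

```python
# (term, frequency, entropy) rows copied from the slide's table.
TERMS = [
    ("stemmer", 6, 0.15),
    ("stemmed", 3, 0.15),
    ("tokens", 2, 0.45),
    ("list", 4, 0.74),
    ("snowball", 1, 0.11),
    ("stem", 1, 0.25),
    ("english", 1, 0.51),
    ("filter", 1, 0.58),
    ("array", 1, 0.72),
    ("set", 1, 0.8),
    ("add", 1, 0.84),
]


def rank_terms(rows):
    """Score each term as frequency * (1 - entropy), highest score first."""
    scored = [(term, freq * (1.0 - entropy)) for term, freq, entropy in rows]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def build_query(rows, top_k=3):
    """Join the top_k ranked terms into a query string (top_k is an assumption)."""
    return " ".join(term for term, _ in rank_terms(rows)[:top_k])


if __name__ == "__main__":
    for term, score in rank_terms(TERMS):
        print(f"{term:10s} {score:.2f}")
    print(build_query(TERMS))
```

Note that list occurs four times but ranks below tokens, which occurs twice, because its entropy is much higher.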

[Architecture diagram, built up over several slides]

Components
    Prompter (Eclipse)
    Query Generation Service
    Search Service: Search Engines Proxy, Ranking Model
    Search Engines: Google, Bing, Blekko
    Stack Overflow API Service

Data exchanged, in the order the slides add it
    1. Code Context: from Prompter to the Query Generation Service
    2. Query: back to Prompter, then to the Search Service together with the Code Context
    3. Query / Results: between the Search Engines Proxy and the Search Engines
    4. Discussions IDs: to the Stack Overflow API Service
    5. Documents: from the Stack Overflow API Service to the Ranking Model
    6. Ranked Results: from the Search Service to Prompter

[Figure: the code in a discussion (<code> … </code>) is compared (~) with the Entity Code.]

Similarity Features
(computed against the Code Context)

Code Related
    Textual Similarity
    Code Similarity
    API Types Similarity
    API Method Names Similarity

Community Related
    Question Score
    Accepted Answer Score
    User Reputation
    Tags Similarity
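
The slides name the similarity features but do not say how each one is computed. As one plausible illustration only, the sketch below uses cosine similarity over term counts, which could serve for a feature such as API Method Names Similarity. The choice of cosine similarity and the function name are assumptions, not the authors' stated method.

```python
import math
from collections import Counter


def cosine_similarity(terms_a, terms_b):
    """Cosine similarity between two bags of terms, in the range [0, 1]."""
    a, b = Counter(terms_a), Counter(terms_b)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty bag shares nothing with anything
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Method names from the code context on the slide.
    context_methods = ["setCurrent", "getCurrent", "stem", "add"]
    # Hypothetical method names found in a discussion.
    discussion_methods = ["stem", "setCurrent", "getCurrent"]
    print(cosine_similarity(context_methods, discussion_methods))
```

A value in [0, 1] is convenient here because it can be combined directly with other normalized features in a weighted sum.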

1. 74 Code Contexts (37 for Calibration)
2. Retrieve Discussions
3. Manual Classification
4. Model Calibration
   $S = \sum_{i=1}^{n} w_i \cdot f_i$, having $\sum_{i=1}^{n} w_i = 1$

Model Calibration
Find all $w_i$ that maximize the number of relevant discussions ranked at the top
$S = \sum_{i=1}^{n} w_i \cdot f_i$, having $\sum_{i=1}^{n} w_i = 1$
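
The calibration goal above can be read as a search over weight vectors that sum to 1. The slides do not state which search procedure was used, so the exhaustive grid search below is an assumption, and the toy data is hypothetical. Each code context is a list of `(features, is_relevant)` pairs, and the objective counts the contexts whose top-ranked discussion is relevant.

```python
def weight_grid(n, steps):
    """Yield every n-entry weight vector in multiples of 1/steps that sums to 1."""
    def compositions(parts, total):
        if parts == 1:
            yield (total,)
            return
        for first in range(total + 1):
            for rest in compositions(parts - 1, total - first):
                yield (first,) + rest

    for combo in compositions(n, steps):
        yield tuple(c / steps for c in combo)


def score(weights, features):
    """S = sum of w_i * f_i."""
    return sum(w * f for w, f in zip(weights, features))


def relevant_at_top(weights, contexts):
    """Count contexts whose highest-scoring discussion is marked relevant."""
    hits = 0
    for discussions in contexts:
        best = max(discussions, key=lambda d: score(weights, d[0]))
        if best[1]:
            hits += 1
    return hits


def calibrate(contexts, n, steps=10):
    """Return a grid weight vector that maximizes relevant_at_top."""
    return max(weight_grid(n, steps), key=lambda w: relevant_at_top(w, contexts))


if __name__ == "__main__":
    # Hypothetical data: two features per discussion, two discussions per context.
    contexts = [
        [((0.9, 0.1), True), ((0.2, 0.8), False)],
        [((0.7, 0.3), True), ((0.1, 0.9), False)],
    ]
    best = calibrate(contexts, n=2)
    print(best, relevant_at_top(best, contexts))
```

Several weight vectors can tie on this objective; the sketch simply returns the first one the grid produces.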

Model Calibration
$S = \sum_{i=1}^{n} w_i \cdot f_i$, having $\sum_{i=1}^{n} w_i = 1$

Code Related ($f_i$)            $w_i$
Textual Similarity              0.32
Code Similarity                 0.00
API Types Similarity            0.00
API Method Names Similarity     0.30

Community Related ($f_i$)       $w_i$
Question Score                  0.07
Accepted Answer Score           0.00
User Reputation                 0.13
Tags Similarity                 0.18
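
With the calibrated weights from the table above, the score S of a discussion is a plain weighted sum of its feature values. The sketch below shows that computation. The dictionary keys are hypothetical names for the slide's features, and the sketch assumes every feature value has already been normalized to [0, 1], which the slides do not state.

```python
# Weights copied from the slide; the key names are illustrative only.
WEIGHTS = {
    "textual_similarity": 0.32,
    "code_similarity": 0.00,
    "api_types_similarity": 0.00,
    "api_method_names_similarity": 0.30,
    "question_score": 0.07,
    "accepted_answer_score": 0.00,
    "user_reputation": 0.13,
    "tags_similarity": 0.18,
}


def ranking_score(features):
    """S = sum of w_i * f_i; features missing from the dict count as 0."""
    return sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())


def rank_discussions(discussions):
    """Sort (discussion_id, features) pairs by descending score."""
    return sorted(discussions, key=lambda d: ranking_score(d[1]), reverse=True)


if __name__ == "__main__":
    # Hypothetical discussions with made-up feature values.
    discussions = [
        ("A", {"textual_similarity": 0.5, "api_method_names_similarity": 1.0}),
        ("B", {"textual_similarity": 0.9, "tags_similarity": 0.2}),
    ]
    for discussion_id, features in rank_discussions(discussions):
        print(discussion_id, round(ranking_score(features), 3))
```

Because three of the weights are 0.00, the corresponding features have no effect on the ranking under this calibration.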

Validation
Study I: Evaluating Recommendation Accuracy
Study II: Evaluating Prompter with Developers

Study I
Evaluating Recommendation Accuracy
33 Participants (Online Survey)
    Industry   13
    Ph.D.       9
    Master      7
    Bachelor    2
    Faculty     2

Study I
Summary
“76% of the discussions were considered related (median 4) or
strongly related (median 5) by developers, while only 10% were
considered unrelated.”

Study II
Evaluating Prompter with Developers
12 Participants
    Industry   6
    Master     3
    Bachelor   3

Task          Treatments
Development   Prompter, Without Prompter
Maintenance   Prompter, Without Prompter

Study II
Quantitative Analysis
[Figure: box plots of Completeness by Treatment, one panel for the Maintenance Task (axis 40 to 100) and one for the Development Task (axis 20 to 100). NP = Without Prompter, P = With Prompter.]

Study II
Qualitative Analysis
11 out of 12 Participants would use
Prompter in their daily activities
Explicitly write and execute queries
