Mining StackOverflow
to Turn the IDE into a
Self-confident Programming Prompter
Luca Ponzanelli, Gabriele Bavota,
Massimiliano Di Penta, Rocco Oliveto,
and Michele Lanza
http://prompter.inf.usi.ch
The Lone Developer
Collaborating People
Pair Programming
Online Resources
Recommender Systems for
Software Engineering
M. P. Robillard, R. J. Walker, and T. Zimmermann
Recommender systems for software engineering
IEEE Software, 2010
“RSSEs are software applications that
provide information estimated to be
valuable for a software engineering task
in a given context”
No Spontaneous Recommendation
No Self-Confidence
Pair Programming
The Prompter
Programming Prompter
Prompter
http://prompter.inf.usi.ch
[Figure: box plot of Completeness (axis 20 to 100) by Treatment for the Development Task. NP = Without Prompter, P = With Prompter.]
Prompter is effective in development tasks
[Figure: Prompter (Eclipse) sends the Code Context to the Query Generation Service.]
Code Context
API Types: org.tartarus.snowball.SnowballStemmer, org.tartarus.snowball.ext.englishStemmer, java.util.List, java.util.ArrayList, String
API Method Names: setCurrent, getCurrent, stem, add
Entity Code:
    @Override
    public List<String> filter(final List<String> tokens) {
        final List<String> stemmed = new ArrayList<String>();
        for (final String t : tokens) {
            SnowballStemmer stemmer = new englishStemmer();
            stemmer.setCurrent(t);
            stemmer.stem();
            stemmed.add(stemmer.getCurrent());
        }
        return stemmed;
    }
[Figure: the Query Generation Service returns a Query to Prompter (Eclipse).]
Term      Frequency  Entropy  Frequency * (1 - Entropy)
stemmer   6          0.15     5.1
stemmed   3          0.15     2.55
tokens    2          0.45     1.1
list      4          0.74     1.04
snowball  1          0.11     0.89
stem      1          0.25     0.75
english   1          0.51     0.49
filter    1          0.58     0.42
array     1          0.72     0.28
set       1          0.8      0.2
add       1          0.84     0.16
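The term scoring in the table can be sketched in Java as follows. This is a minimal illustration, not Prompter's implementation. It assumes that terms are obtained by splitting identifiers on camel-case boundaries and lower-casing them, which reproduces the frequencies listed for the entity code. The slides do not say how entropy is computed, so entropy values are passed in as given, and terms without a known entropy are skipped (also an assumption). The class and method names (`QueryTerms`, `termFrequencies`, `rank`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QueryTerms {
    private static final Pattern IDENTIFIER = Pattern.compile("[A-Za-z]+");

    /** Splits identifiers on camel-case boundaries and counts the lower-cased parts. */
    static Map<String, Integer> termFrequencies(String code) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        Matcher m = IDENTIFIER.matcher(code);
        while (m.find()) {
            for (String part : m.group().split("(?<=[a-z])(?=[A-Z])")) {
                counts.merge(part.toLowerCase(Locale.ROOT), 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Score used in the table: frequency * (1 - entropy). */
    static double score(int frequency, double entropy) {
        return frequency * (1.0 - entropy);
    }

    /** Ranks the terms that have a known entropy, highest score first. */
    static List<String> rank(Map<String, Integer> freq, Map<String, Double> entropy) {
        List<String> terms = new ArrayList<>();
        for (String term : freq.keySet()) {
            if (entropy.containsKey(term)) {
                terms.add(term);
            }
        }
        terms.sort((a, b) -> Double.compare(
                score(freq.get(b), entropy.get(b)),
                score(freq.get(a), entropy.get(a))));
        return terms;
    }

    public static void main(String[] args) {
        // A fragment of the entity code; entropy values are copied from the table.
        String code = "SnowballStemmer stemmer = new englishStemmer();\n"
                + "stemmer.setCurrent(t);\n"
                + "stemmer.stem();";
        Map<String, Double> entropy = new LinkedHashMap<>();
        entropy.put("stemmer", 0.15);
        entropy.put("snowball", 0.11);
        entropy.put("stem", 0.25);
        System.out.println(rank(termFrequencies(code), entropy));
    }
}
```

Multiplying by (1 - entropy) favors terms that are frequent in the code at hand and carry a low entropy value: "list" occurs four times but scores below "tokens", which occurs only twice.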
[Figure: Prompter architecture, built up over several slides.
Components: Prompter (Eclipse); Query Generation Service; Search Service (Search Engines Proxy, Ranking Model); Search Engines (Google, Bing, Blekko); Stack Overflow API Service.
Data exchanged, in order of appearance: Code Context, Query, Results, Discussions IDs, Documents, Ranked Results.]
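The slides label one of the exchanged items "Discussions IDs" but do not show how the IDs are derived from the search engine results. A minimal sketch, assuming each result is a URL of the form `https://stackoverflow.com/questions/<id>/<slug>`; the class and method names are hypothetical:

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DiscussionIds {
    // Matches question URLs only; tag pages and other sites are ignored.
    private static final Pattern QUESTION_URL = Pattern.compile(
            "^https?://(?:www\\.)?stackoverflow\\.com/questions/(\\d{1,18})(?:/.*)?$");

    /** Returns the distinct question IDs found in the URLs, in first-seen order. */
    static Set<Long> fromUrls(List<String> urls) {
        Set<Long> ids = new LinkedHashSet<>();
        for (String url : urls) {
            Matcher m = QUESTION_URL.matcher(url);
            if (m.matches()) {
                ids.add(Long.parseLong(m.group(1)));
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // Hypothetical search results.
        List<String> results = Arrays.asList(
                "https://stackoverflow.com/questions/123/how-to-stem-words",
                "https://stackoverflow.com/questions/tagged/java",
                "https://example.com/questions/789/unrelated");
        System.out.println(fromUrls(results));
    }
}
```

Deduplicating at this point means a discussion returned by more than one search engine is fetched only once.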
<code> … </code> ~ <code> … </code>
Code Related
Textual Similarity
Code Similarity
API Types Similarity
API Method Names Similarity
Community Related
Question Score
Accepted Answer Score
User Reputation
Tags Similarity
Code Context
Similarity Features
Model Calibration
1. 74 Code Contexts (37 for Calibration)
2. Retrieve Discussions
3. Manual Classification
4. Find all $w_i$ that maximize the number of relevant discussions ranked at the top
$S = \sum_{i=1}^{n} w_i \cdot f_i$, having $\sum_{i=1}^{n} w_i = 1$
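The slides state the calibration objective (find the weights that maximize the number of relevant discussions ranked at the top) but not the search procedure. The sketch below assumes an exhaustive grid search over weight vectors that sum to 1, and reads "ranked at the top" as "the top-ranked discussion of a code context is relevant". Both are assumptions, and every name (`WeightCalibration`, `Candidate`, `steps`) is hypothetical.

```java
final class WeightCalibration {
    /** One retrieved discussion: its feature values f_i and its manual relevance label. */
    static final class Candidate {
        final boolean relevant;
        final double[] features;

        Candidate(boolean relevant, double... features) {
            this.relevant = relevant;
            this.features = features;
        }
    }

    /** S = sum of w_i * f_i. */
    static double score(double[] w, double[] f) {
        double s = 0.0;
        for (int i = 0; i < w.length; i++) {
            s += w[i] * f[i];
        }
        return s;
    }

    /** Counts the code contexts whose top-ranked discussion is relevant. */
    static int relevantAtTop(double[] w, Candidate[][] contexts) {
        int hits = 0;
        for (Candidate[] candidates : contexts) {
            Candidate best = null;
            double bestScore = Double.NEGATIVE_INFINITY;
            for (Candidate c : candidates) {
                double s = score(w, c.features);
                if (s > bestScore) { // ties keep the earlier candidate
                    bestScore = s;
                    best = c;
                }
            }
            if (best != null && best.relevant) {
                hits++;
            }
        }
        return hits;
    }

    /** Tries every weight vector with entries k/steps that sum to 1 (requires n >= 1). */
    static double[] calibrate(Candidate[][] contexts, int n, int steps) {
        double[] best = new double[n];
        int[] bestHits = {-1};
        search(new int[n], 0, steps, steps, contexts, best, bestHits);
        return best;
    }

    private static void search(int[] units, int index, int remaining, int steps,
                               Candidate[][] contexts, double[] best, int[] bestHits) {
        if (index == units.length - 1) {
            units[index] = remaining; // the last weight takes the remainder, so the sum is 1
            double[] w = new double[units.length];
            for (int i = 0; i < w.length; i++) {
                w[i] = (double) units[i] / steps;
            }
            int hits = relevantAtTop(w, contexts);
            if (hits > bestHits[0]) { // the first best vector found is kept
                bestHits[0] = hits;
                System.arraycopy(w, 0, best, 0, w.length);
            }
            return;
        }
        for (int u = 0; u <= remaining; u++) {
            units[index] = u;
            search(units, index + 1, remaining - u, steps, contexts, best, bestHits);
        }
    }
}
```

The number of grid points grows quickly with the number of features and with `steps`, so a search of this kind is only practical on a coarse grid or with few features.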
Code Related (f_i)            w_i
Textual Similarity            0.32
Code Similarity               0.00
API Types Similarity          0.00
API Method Names Similarity   0.30

Community Related (f_i)       w_i
Question Score                0.07
Accepted Answer Score         0.00
User Reputation               0.13
Tags Similarity               0.18
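With the calibrated weights, the score S of a discussion is a weighted sum of its eight features. A minimal sketch, assuming each feature has already been normalized to [0, 1] (the slides do not say how); the class name and the example feature values are hypothetical:

```java
final class RankingModel {
    // Weights from the calibration table, in this order: textual similarity,
    // code similarity, API types similarity, API method names similarity,
    // question score, accepted answer score, user reputation, tags similarity.
    static final double[] WEIGHTS = {0.32, 0.00, 0.00, 0.30, 0.07, 0.00, 0.13, 0.18};

    /** S = sum of w_i * f_i over the eight features. */
    static double score(double[] features) {
        if (features.length != WEIGHTS.length) {
            throw new IllegalArgumentException("expected " + WEIGHTS.length + " features");
        }
        double s = 0.0;
        for (int i = 0; i < WEIGHTS.length; i++) {
            s += WEIGHTS[i] * features[i];
        }
        return s;
    }

    public static void main(String[] args) {
        // Hypothetical feature values for one discussion.
        double[] discussion = {0.5, 0.9, 0.9, 1.0, 0.2, 0.0, 0.4, 0.5};
        System.out.println(score(discussion));
    }
}
```

Because the weights sum to 1, S stays in [0, 1] whenever every feature does. Features with weight 0.00 (code similarity, API types similarity, accepted answer score) do not influence the ranking.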
Validation
Study I: Evaluating Recommendations Accuracy
Study II: Evaluating Prompter with Developers
Study I: Evaluating Recommendations Accuracy
33 Participants (Online Survey)
Industry 13
Ph.D. 9
Master 7
Bachelor 2
Faculty 2
Study I: Summary
“76% of the discussions were considered related (median 4) or strongly related (median 5) by developers, while only 10% was considered as unrelated.”
Study II: Evaluating Prompter with Developers
12 Participants
Industry 6
Master 3
Bachelor 3
[Figure: study design crossing the two tasks (Development, Maintenance) with the two treatments (Prompter, Without Prompter).]
Study II: Quantitative Analysis
[Figure: box plots of Completeness by Treatment for the Maintenance Task (axis 40 to 100) and the Development Task (axis 20 to 100). NP = Without Prompter, P = With Prompter.]
Study II: Qualitative Analysis
11 out of 12 Participants would use Prompter in their daily activities
Explicitly write and execute queries

Mining Stack Overflow to Turn the IDE into a Self-confident Programming Prompter