RECOMMENDING RELEVANT SECTIONS
FROM A WEBPAGE ABOUT
PROGRAMMING ERRORS AND
EXCEPTIONS
Mohammad Masudur Rahman and Chanchal K. Roy
Software Research Lab, Department of Computer Science
University of Saskatchewan, Canada
25th Center for Advanced Studies Conference (CASCON 2015)
2
Exception triggering point
SOLVING EXCEPTION
(STEP I: QUERY SELECTION)
3
Selection of a traditional search query
Switching to the web browser for web search
This query may not be sufficient for most exceptions.
SOLVING EXCEPTION
(STEP II: WEB SEARCH)
4
• The browser does NOT know the context (i.e., the details) of the exception.
• The search ranking is not very helpful.
• The developer is forced to SWITCH back and forth between the IDE and the browser.
• Searching proceeds by trial and error.
• 19% of development time is spent on web search (Brandt et al., SIGCHI 2009).
Switching is often distracting.
SOLVING EXCEPTION
(STEP III: MAPPING TO PAGE SECTIONS)
5
Mapping
• Mapping between the exception and the relevant page sections is non-trivial.
• ContentSuggest automates the mapping between the exception and the relevant page sections.
• It suggests web page content for review inside the IDE.
OUTLINE OF THIS TALK
6
• ContentSuggest architecture
• Metrics & algorithm
• Empirical evaluation & validation (using web pages)
• Validation with IR techniques (using SO posts)
• Conclusion
CONTENTSUGGEST—ARCHITECTURE
7
[Figure: ContentSuggest architecture flowchart, from Start to End]
PROPOSED METRICS
• Content Density (CTD)
  • Text Density (TD)
  • Link Density (LD)
  • Code Density (CD)
  • Captures the purity of the textual content (fewer hyperlinks)
• Content Relevance (CTR)
  • Text Relevance (TR)
  • Code Relevance (CR)
  • Captures the relevance of the textual content to the exception (interesting tokens)
• Content Score (CTS) = γ × Content Density + δ × Content Relevance
• All metrics are normalized.
8
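A minimal Python sketch of the Content Score formula above. The equal weights (0.5 each), the min-max normalization, and the names `normalize` and `content_score` are illustrative assumptions; the slides give neither the trained values of γ and δ nor the way TD, LD, and CD combine into Content Density.

```python
def normalize(values):
    """Min-max normalize raw metric values into [0, 1] (an assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: no spread to normalize over.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def content_score(content_density, content_relevance, gamma=0.5, delta=0.5):
    """CTS = gamma * CTD + delta * CTR, with hypothetical default weights."""
    return gamma * content_density + delta * content_relevance


# Hypothetical raw densities and relevances for three page sections.
densities = normalize([2.0, 4.0, 6.0])
relevances = normalize([0.9, 0.1, 0.5])
scores = [content_score(d, r) for d, r in zip(densities, relevances)]
```

With weights that sum to 1 and normalized inputs, each score stays in [0, 1], which makes sections of the same page directly comparable.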
PROPOSED TECHNIQUE (CONTENTSUGGEST)
[Figure: DOM tree of an example web page. Element nodes shown: HTML, HEAD, BODY, TITLE, STYLE, SCRIPT, DIV, H1, P, B, OL, LI, TABLE, TBODY, TR, TD, with text leaf nodes.]
9
PROPOSED TECHNIQUE (CONTENTSUGGEST): SCORING
[Figure: the example DOM tree without the STYLE and SCRIPT nodes, illustrating the scoring step.]
10
PROPOSED TECHNIQUE (CONTENTSUGGEST): TAGGING
[Figure: the example DOM tree, illustrating the tagging step. Legend: Content, Noise.]
11
PROPOSED TECHNIQUE (CONTENTSUGGEST): FILTERING
[Figure: the example DOM tree, illustrating the filtering step. Legend: Content, Noise.]
12
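The scoring, tagging, and filtering steps named on these slides can be sketched roughly as follows. This is only an illustration: the `Node` class, the 0.5 threshold, the precomputed scores, and the rule that a noise node is dropped together with its whole subtree are assumptions, not details taken from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    tag: str
    score: float = 0.0  # content score, assumed precomputed in [0, 1]
    children: list = field(default_factory=list)
    label: str = ""  # set by tag_nodes: "content" or "noise"


def tag_nodes(node, threshold=0.5):
    """Label each node as content or noise by comparing its score to a threshold."""
    node.label = "content" if node.score >= threshold else "noise"
    for child in node.children:
        tag_nodes(child, threshold)


def filter_noise(node):
    """Return a copy of the tree that keeps only content-labelled nodes."""
    if node.label != "content":
        return None  # assumption: a noise node removes its whole subtree
    kept = [c for c in (filter_noise(ch) for ch in node.children) if c is not None]
    return Node(node.tag, node.score, kept, node.label)


def tags(node):
    """Pre-order list of tag names, for inspecting the filtered tree."""
    return [node.tag] + [t for c in node.children for t in tags(c)]


# Hypothetical page: one high-scoring DIV and one low-scoring DIV.
page = Node("BODY", 0.9, [
    Node("DIV", 0.8, [Node("H1", 0.7), Node("P", 0.9)]),
    Node("DIV", 0.2, [Node("P", 0.1)]),
])
tag_nodes(page)
remaining = tags(filter_noise(page))  # the low-scoring DIV and its P are gone
```

Returning a filtered copy instead of mutating the tree keeps the tagged original available for inspection.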
EXPERIMENTS
13
[Figure: experimental design. Labels: 80 exceptions + 250 web pages; manual analysis (25 hours); gold sections; evaluation; validation (Sun et al.); Stack Overflow; crowd; SO posts; ContentSuggest; IR (VSM, LSA).]
PERFORMANCE METRICS
• Precision (P): % of the retrieved content (a) that belongs to the gold content (b) of the page.
• Recall (R): % of the gold content (b) that is retrieved (a) by the technique.
• F1-measure (F1): harmonic mean of Precision (P) and Recall (R).
14
$$P = \frac{|LCS(a, b)|}{|a|} \qquad R = \frac{|LCS(a, b)|}{|b|} \qquad F_1 = \frac{2 \times P \times R}{P + R}$$
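A small Python sketch of these three measures, assuming that LCS denotes the longest common subsequence and that content is compared as whitespace-separated tokens. The slides do not state either point, and the function names and sample strings are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences (row-wise DP)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def precision_recall_f1(retrieved, gold):
    """P = |LCS(a, b)| / |a|, R = |LCS(a, b)| / |b|, F1 = 2PR / (P + R)."""
    a, b = retrieved.split(), gold.split()
    common = lcs_length(a, b)
    p = common / len(a) if a else 0.0
    r = common / len(b) if b else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


# Hypothetical retrieved and gold content for one page.
p, r, f1 = precision_recall_f1(
    "check that the file path exists before opening",
    "check the file path before opening the stream",
)
```

Here six of the eight tokens on each side form a common subsequence, so precision, recall, and F1 all come out at 0.75.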
RESEARCH QUESTIONS (4)
• RQ1: How effective is ContentSuggest in recommending relevant content from a web page?
• RQ2: How effective are the proposed metrics in identifying relevant page content?
• RQ3: Can ContentSuggest outperform the baseline technique?
• RQ4: Does ContentSuggest perform better than IR techniques (VSM, LSI) in identifying relevant content?
15
ANSWERING RQ1 & RQ2– EVALUATION OF
TECHNIQUE & METRICS
16
| Scores | Metric | SO Pages | Non-SO Pages | All Pages |
|---|---|---|---|---|
| { Content Density } | MP | 50.91% | 49.50% | 50.07% |
| | MR | 91.74% | 75.71% | 82.18% |
| | MF | 62.32% | 53.76% | 57.22% |
| { Content Relevance } | MP | 86.63% | 69.17% | 76.23% |
| | MR | 52.17% | 57.66% | 55.44% |
| | MF | 61.07% | 55.88% | 57.98% |
| { Content Density, Content Relevance } (Proposed Technique) | MP | 92.64% | 74.60% | 81.96% |
| | MR | 74.17% | 78.51% | 76.74% |
| | MF | 80.95% | 73.09% | 76.30% |

[ SO = Stack Overflow, MP = Mean Precision, MR = Mean Recall, MF = Mean F1-measure ]
ANSWERING RQ3– COMPARISON WITH
BASELINE TECHNIQUE
17
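| Content Extractor | Metric | SO Pages | Non-SO Pages | All Pages |
|---|---|---|---|---|
| Sun et al. (SIGIR 2011) | MP | 52.63% | 38.89% | 44.44% |
| | MR | 86.49% | 41.84% | 59.88% |
| | MF | 62.57% | 34.49% | 45.84% |
| ContentSuggest (Proposed Technique) | MP | 92.64% | 74.60% | 81.96% |
| | MR | 74.17% | 78.51% | 76.74% |
| | MF | 80.95% | 73.09% | 76.30% |

[ SO = Stack Overflow, MP = Mean Precision, MR = Mean Recall, MF = Mean F1-measure ]
• ContentSuggest achieves a higher mean F1-measure for all three sets of pages: SO pages, non-SO pages, and all pages.
• Over all pages it performs better on every metric (precision, recall, and F-measure); the one exception in the table is recall on SO pages, where Sun et al. score higher (86.49% vs. 74.17%).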
ANSWERING RQ3– COMPARISON WITH
BASELINE TECHNIQUE
18
ANSWERING RQ4– COMPARISON WITH IR
TECHNIQUES (VSM, LSI)
19
| Content Extractor | Metric | Accepted Posts | Most Voted Posts |
|---|---|---|---|
| Latent Semantic Analysis (Marcus et al., ICSE 2003) | MP | 19.98% | 23.02% |
| | MR | 21.78% | 23.17% |
| | MF | 18.43% | 21.07% |
| Vector Space Model (Antoniol et al., TSE 2002) | MP | 22.50% | 33.89% |
| | MR | 23.08% | 31.90% |
| | MF | 19.77% | 30.44% |
| ContentSuggest (Proposed Technique) | MP | 23.10% | 31.36% |
| | MR | 45.15% | 54.42% |
| | MF | 26.99% | 35.90% |
ANSWERING RQ4– COMPARISON WITH IR
TECHNIQUES (VSM, LSI)
20
THREATS TO VALIDITY
• Gold content preparation: despite cross-validation, the gold content may contain subjective bias.
• Limited training dataset: the metric weights were trained on a limited dataset.
• Usability concern: a full-fledged user study is required to validate the applicability of the technique. Only a limited study with 6 participants was performed.
21
TAKE-HOME MESSAGE
• 19% of development time is spent simply on web search (Brandt et al., SIGCHI 2009).
• Mapping between information in the IDE and in a web page can be non-trivial and time-consuming.
• ContentSuggest automates this mapping in the context of exception handling.
• Content Density and Content Relevance are found to be effective in identifying relevant sections of a web page.
• ContentSuggest outperforms one baseline technique and two IR techniques (VSM, LSI).
22
THANK YOU!!
23
REFERENCES
[1] J. Brandt, P. J. Guo, J. Lewenstein, M. Dontcheva, and S. R. Klemmer. Two Studies of Opportunistic Programming: Interleaving Web Foraging, Learning, and Writing Code. In Proc. SIGCHI, pages 1589-1598, 2009.
[2] G. Antoniol, G. Canfora, G. Casazza, A. De Lucia, and E. Merlo. Recovering Traceability Links between Code and Documentation. TSE, 28(10):970-983, 2002.
[3] A. Marcus and J. I. Maletic. Recovering Documentation-to-Source-Code Traceability Links Using Latent Semantic Indexing. In Proc. ICSE, pages 125-135, 2003.
[4] F. Sun, D. Song, and L. Liao. DOM Based Content Extraction via Text Density. In Proc. SIGIR, pages 245-254, 2011.
[5] L. Ponzanelli, A. Bacchelli, and M. Lanza. Seahawk: Stack Overflow in the IDE. In Proc. ICSE, pages 1295-1298, 2013.
[6] M. M. Rahman, S. Yeasmin, and C. Roy. Towards a Context-Aware IDE-Based Meta Search Engine for Recommendation about Programming Errors and Exceptions. In Proc. CSMR-WCRE, pages 194-203, 2014.
[7] ContentSuggest Web Portal. URL: http://www.usask.ca/~mor543/contentsuggest
[8] C. K. Roy and J. R. Cordy. NICAD: Accurate Detection of Near-Miss Intentional Clones Using Flexible Pretty-Printing and Code Normalization. In Proc. ICPC, pages 172-181, 2008.
24

ContentSuggest--Recommendation of Relevant Sections from a Webpage about Errors & Exceptions

  • 1. RECOMMENDING RELEVANT SECTIONS FROM A WEBPAGE ABOUT PROGRAMMING ERRORS AND EXCEPTIONS Mohammad Masudur Rahman, and Chanchal K. Roy Software Research Lab, Department of Computer Science University of Saskatchewan, Canada 25th Center for Advanced Studies Conference (CASCON 2015)
  • 3. SOLVING EXCEPTION (STEP I: QUERY SELECTION) 3 Selection of traditional search query Switching to web browser for web search This query may not be sufficient enough for most of the exceptions
  • 4. SOLVING EXCEPTION (STEP II: WEB SEARCH) 4  The browser does NOT know the context (i.e., details) of the exception.  Not much helpful ranking  Forces the developer to SWITCH back and forth between IDE and browser.  Trial and error in searching  19% of development time in web search Switching is often distracting
  • 5. SOLVING EXCEPTION (STEP III: MAPPING TO PAGE SECTIONS ) 5 Mapping  Mapping between the exception & relevant page sections non-trivial  Automated mapping between exception & relevant page sections  IDE-based web page content suggestion for review 5
  • 6. OUTLINE OF THIS TALK 6 Content Suggest Architecture Metrics & Algorithm Empirical evaluation & validation (using webpages) Validation with IR techniques (using SO posts) Conclusion
  • 8. PROPOSED METRICS  Content Density (CTD)  Text Density (TD)  Link Density (LD)  Code Density (CD)  Purity of textual content, less hyperlinks  Content Relevance (CTR)  Text Relevance (TR)  Code Relevance (CR)  Relevance of textual content with exception, interesting tokens  Content Score (CTS) = γ*Content Density + δ*Content Relevance  Normalized metrics 8
  • 9. PROPOSED TECHNIQUE (CONTENTSUGGEST) HTML HEAD BODY TITLE STYLE SCRIPT DIV DIV H1 P B OL LI LI LI H1 TABLE P TBODY TR TR TD TD TD TD Text Text Text Text Text Text Text Text Text Text Text Text Text DIV P P 9
  • 10. PROPOSED TECHNIQUE (CONTENTSUGGEST)- -SCORING HTML HEAD BODY TITLE DIV DIV H1 P B OL LI LI LI H1 TABLE P TBODY TR TR TD TD TD TD Text Text Text Text Text Text Text Text Text Text Text Text Text 10
  • 11. PROPOSED TECHNIQUE (CONTENTSUGGEST)- -TAGGING HTML HEAD BODY TITLE DIV DIV H1 P B OL LI LI LI H1 TABLE P TBODY TR TR TD TD TD TD Text Text Text Text Text Text Text Text Text Text Text Text Text 11 Content Noise
  • 12. PROPOSED TECHNIQUE (CONTENTSUGGEST)- -FILTERING HTML HEAD BODY TITLE DIV DIV H1 P B OL LI LI LI H1 TABLE P TBODY TR TR TD TD TD TD Text Text Text Text Text Text Text Text Text Text Text Text Text 12 Content Noise
  • 13. EXPERIMENTS 13 (80 exceptions + 250 web pages) Manual analysis (25 hours) Gold sections Evaluation Validation (Sun et al) Stack Overflow Crowd SO Posts ContentSuggest IR (VSM, LSA)
  • 14. PERFORMANCE METRICS  Precision (P): % of the retrieved content (a) that belong to gold content (b) of the page.  Recall (R): % of gold content (b) that is retrieved (a) by the technique.  F1-measure (F1): Combination of Precision (P) & Recall (R). 14 || |),(| a baLCS P  || |),(| b baLCS R  RP RP F    2 1
  • 15. RESEARCH QUESTIONS (4)  RQ1: How effective is ContentSuggest in recommending relevant content from a web page?  RQ2: How effective are the proposed metrics in identifying relevant page content?  RQ3: Can ContentSuggest outperform the baseline technique?  RQ4: Does ContentSuggest perform better than IR techniques (VSM, LSI) in identifying relevant content? 15
  • 16. ANSWERING RQ1 & RQ2: EVALUATION OF TECHNIQUE & METRICS
      Score metrics                                              Metric  SO Pages  Non-SO Pages  All Pages
      {Content Density}                                          MP      50.91%    49.50%        50.07%
                                                                 MR      91.74%    75.71%        82.18%
                                                                 MF      62.32%    53.76%        57.22%
      {Content Relevance}                                        MP      86.63%    69.17%        76.23%
                                                                 MR      52.17%    57.66%        55.44%
                                                                 MF      61.07%    55.88%        57.98%
      {Content Density, Content Relevance} (Proposed Technique)  MP      92.64%    74.60%        81.96%
                                                                 MR      74.17%    78.51%        76.74%
                                                                 MF      80.95%    73.09%        76.30%
      [SO = Stack Overflow, MP = Mean Precision, MR = Mean Recall, MF = Mean F1-measure]
  • 17. ANSWERING RQ3: COMPARISON WITH BASELINE TECHNIQUE
      Content extractor                    Metric  SO Pages  Non-SO Pages  All Pages
      Sun et al. (SIGIR 2011)              MP      52.63%    38.89%        44.44%
                                           MR      86.49%    41.84%        59.88%
                                           MF      62.57%    34.49%        45.84%
      ContentSuggest (Proposed Technique)  MP      92.64%    74.60%        81.96%
                                           MR      74.17%    78.51%        76.74%
                                           MF      80.95%    73.09%        76.30%
      [SO = Stack Overflow, MP = Mean Precision, MR = Mean Recall, MF = Mean F1-measure]
      Higher mean precision and F1-measure than the baseline on all 3 sets of pages: SO pages, Non-SO pages, and All Pages.
      Higher mean recall on Non-SO pages and All Pages; the baseline has higher mean recall on SO pages (86.49% vs. 74.17%).
  • 18. ANSWERING RQ3– COMPARISON WITH BASELINE TECHNIQUE 18
  • 19. ANSWERING RQ4: COMPARISON WITH IR TECHNIQUES (VSM, LSI)
      Content extractor                                    Metric  Accepted Posts  Most Voted Posts
      Latent Semantic Analysis (Marcus et al., ICSE 2003)  MP      19.98%          23.02%
                                                           MR      21.78%          23.17%
                                                           MF      18.43%          21.07%
      Vector Space Model (Antoniol et al., TSE 2002)       MP      22.50%          33.89%
                                                           MR      23.08%          31.90%
                                                           MF      19.77%          30.44%
      ContentSuggest (Proposed Technique)                  MP      23.10%          31.36%
                                                           MR      45.15%          54.42%
                                                           MF      26.99%          35.90%
  • 20. ANSWERING RQ4– COMPARISON WITH IR TECHNIQUES (VSM, LSI) 20
  • 21. THREATS TO VALIDITY
      Gold content preparation: despite cross-validation, the gold content may contain subjective bias.
      Limited training dataset: the metric weights were trained on a limited dataset.
      Usability concern: a fully fledged user study is required to validate the applicability of the technique. Only a limited study with 6 participants was performed.
  • 22. TAKE-HOME MESSAGE
      19% of development time is spent simply in web search (Brandt et al., SIGCHI 2009).
      Mapping between information in the IDE and in a web page can be non-trivial and time-consuming.
      ContentSuggest automates such mapping in the context of exception handling.
      Content Density and Content Relevance are found effective in identifying relevant sections from a web page.
      ContentSuggest outperforms one baseline technique and two IR techniques (VSM, LSI).
  • 24. REFERENCES
      [1] J. Brandt, P. J. Guo, J. Lewenstein, M. Dontcheva, and S. R. Klemmer. Two Studies of Opportunistic Programming: Interleaving Web Foraging, Learning, and Writing Code. In Proc. SIGCHI, pages 1589-1598, 2009.
      [2] G. Antoniol, G. Canfora, G. Casazza, A. De Lucia, and E. Merlo. Recovering Traceability Links between Code and Documentation. TSE, 28(10):970-983, 2002.
      [3] A. Marcus and J. I. Maletic. Recovering Documentation-to-Source-Code Traceability Links Using Latent Semantic Indexing. In Proc. ICSE, pages 125-135, 2003.
      [4] F. Sun, D. Song, and L. Liao. DOM Based Content Extraction via Text Density. In Proc. SIGIR, pages 245-254, 2011.
      [5] L. Ponzanelli, A. Bacchelli, and M. Lanza. Seahawk: Stack Overflow in the IDE. In Proc. ICSE, pages 1295-1298, 2013.
      [6] M. M. Rahman, S. Yeasmin, and C. Roy. Towards a Context-Aware IDE-Based Meta Search Engine for Recommendation about Programming Errors and Exceptions. In Proc. CSMR-WCRE, pages 194-203, 2014.
      [7] ContentSuggest Web Portal. URL http://www.usask.ca/~mor543/contentsuggest
      [8] C. K. Roy and J. R. Cordy. NICAD: Accurate Detection of Near-Miss Intentional Clones Using Flexible Pretty-Printing and Code Normalization. In Proc. ICPC, pages 172-181, 2008.
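
The performance metrics on slide 14 can be illustrated with a small Python sketch. It assumes that the retrieved content (a) and the gold content (b) are compared as whitespace-separated word sequences; the function names `lcs_length` and `lcs_precision_recall_f1` are hypothetical and are not taken from the ContentSuggest tool.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two word lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_precision_recall_f1(retrieved, gold):
    """Precision, recall and F1 based on the word-level LCS of two texts."""
    a = retrieved.split()  # assumption: simple whitespace tokenization
    b = gold.split()
    if not a or not b:
        return 0.0, 0.0, 0.0
    overlap = lcs_length(a, b)
    precision = overlap / len(a)  # P = |LCS(a, b)| / |a|
    recall = overlap / len(b)     # R = |LCS(a, b)| / |b|
    if overlap == 0:
        return 0.0, 0.0, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


p, r, f1 = lcs_precision_recall_f1(
    "close the stream before reading again",  # retrieved content (a)
    "close the stream",                       # gold content (b)
)
print(p, r, f1)
```

In this toy case the retrieved text covers all of the gold content but also returns extra words, so recall is perfect while precision is not.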

Editor's Notes

  1. Introduce yourself +introductory statements. Today, I am going to talk about how to identify and extract relevant sections automatically from a web page for programming errors and exceptions.
  2. Programming exceptions are a frequent and common experience for software developers. Once an exception is encountered, the developer identifies the source line that triggers it and may do some debugging. Often, however, they turn to a web search for a quick solution.
  3. In a web search, the first step is query selection. People often choose the exception message, the very first line of the stack trace, as the search query, although it is not sufficient most of the time. The next step is context switching: they switch from the IDE to the web browser.
  4. The second step is performing the web search itself. This search is often not very productive. Studies showed that developers spend about 19% of their development time in web search. There are also practical challenges. The browser does not know the detailed context of the exception in the IDE, so the returned pages are not very effective. There is also constant switching between the IDE and the web browser, which is time-consuming.
  5. However, the most time-consuming step is probably the mapping between the problem details in the IDE and the information in the web page. Such mapping is non-trivial. It also becomes difficult because the two sit in different, unconnected contexts: the IDE and the browser. In this paper, we provide automation support for such mapping between error details and relevant sections of the web page. Our technique provides this support inside the IDE, which resolves the context-switching problem as well.
  6. This is the outline of today's talk. I will first focus on the architecture of our proposed tool, ContentSuggest. Then we will dive into the proposed metrics and the algorithm for relevant content identification. Then we discuss two types of experiments: (1) evaluation using web pages, and (2) evaluation using SO posts. Finally, I will conclude the talk with take-home messages.
  7. This is the system architecture of our tool, ContentSuggest. Suppose we have a search engine embedded in the IDE, and it returns a list of results for an encountered exception. Now, a developer wants to explore the search results. Our process starts when the developer clicks a web page. Once clicked, the page URL is sent to the content extractor module. The extractor module then collects the page content and analyzes the DOM tree of the page. It determines content quality and relevance using the exception details from the IDE, and applies the proposed metrics, which we will discuss in a minute. The different sections of the page are ranked, and the top-ranked section in terms of content quality and relevance is returned to the IDE. The IDE then shows that section. The idea is that the developer would check only the most relevant part of a page. If satisfied, she can check the whole page for further analysis. This way, she doesn't need to go through a number of pages all the time.
  8. These are the metrics we used for ranking the different sections of a web page. We consider content density, which refers to the purity of the content. If a section contains only text, then the content is pure. But if it contains hyperlinks such as ads, widgets, or anything else that looks like noise, then the content is noisy. We also consider content relevance, which means whether the section of the page discusses the relevant exception or not. The content should contain relevant tokens, method calls, or a relevant code snippet, perhaps similar to the code in the IDE. We finally combine these two aspects, content density and content relevance, to derive a content score for each of the sections of the page.
  9. Now, let's take a look at our technique, which determines the most relevant section of a given web page. Suppose the web page has a structure like this, and our task is to identify the most relevant node. This is the DOM structure of an HTML page, which means the document can be represented as a tree. The first step is to delete the non-text items such as style, script, or img tags. The next step is to determine the content score for each of the nodes in the tree. These are the direct children of the body node. Now, we delete every child of body whose score falls below a threshold; here we use the score of body as the threshold. For example, this one is deleted.
  10. Now, we go through each of the remaining children and look for the maximum score holder. For example, this TD contains the purest and the most relevant content for the exception. What our technique does is mark that TD as content and the rest of its siblings as noise. This process backtracks up to the body node.
  11. Thus, we get a DOM tree where each node is annotated as either content or noise. Here, the bold colored nodes are content nodes.
  12. Now we just keep the content nodes and discard the noisy nodes. This way, we keep the page structure but isolate the purest and the most relevant content for the encountered exception. This is the most relevant section of the web page.
  13. Now come the experiments. We conducted experiments with 250 web pages related to 80 programming exceptions. The gold set, the most relevant sections from each page, was created through 25 hours of manual work, and it is used for both evaluation and validation purposes. We also conducted experiments with Stack Overflow posts and compared our performance with 2 information retrieval techniques.
  14. We used these three performance metrics for evaluation and validation. The key idea here is the longest common subsequence (LCS) of words. That means we collect the word overlap between the retrieved content and the gold content to derive these metrics. The same idea was used by earlier techniques from the relevant literature.
  15. We try to answer these four research questions from our experiments. How does our technique perform in general? How effective are our proposed metrics, content density and content relevance? Can our technique perform better than the baseline technique from the literature? Can it perform better than established IR techniques such as the Vector Space Model and Latent Semantic Indexing?
  16. Here we note that when only content density is considered, the performance is not very interesting. Content relevance also does not work well alone. However, when both metrics are combined, the statistics are interesting. We get about 82% precision, 77% recall and about 76% F1-measure, which are promising according to the literature.
  17. We compared with a closely related technique from the literature. We also divided the dataset into different sets and conducted experiments. Our technique performs better than the baseline technique in precision and F1-measure for all sets, and in recall for Non-SO pages and All Pages; the baseline has higher recall on SO pages (86.49% vs. 74.17%). One possible explanation for this performance is probably our content relevance paradigm.
  18. We also used box plots to analyze the comparison results. We note that our performance measures are higher than those of the competing technique. Our measures have less variance, and the medians are close to 90% or above. On the other hand, the baseline technique has relatively lower measures.
  19. Since the gold set in the first experiment is manually developed, it might contain some subjective bias. We thus also conducted experiments using Stack Overflow posts, where we try to find the accepted solution as well as the top-voted answer on the page. We also compare with two established IR techniques for the same task. We found that our technique performs relatively better than those 2 techniques, especially in recall and therefore in F-measure.
  20. We investigate further using box plots. Here, we see that our measures have relatively more variance but better performance in recall and F-measure. Locating such a post on an SO page is a real challenge since it involves a lot of factors. However, our technique is found relatively promising compared to the traditional alternatives we have so far.
  21. We identified three threats to the validity of our findings. The gold set might contain some subjective bias, and it is hard to remove it completely. However, we performed the second experiment to handle this threat. The metric weights are trained on a limited dataset. The usability of the technique is not properly evaluated yet. However, we did that evaluation on a limited scale.
  22. So, these are the take-home messages. Developers spend about 19% of their time on web search, and mapping information from the IDE to the web browser can be a real challenge. Our technique addresses that concern and maps exception information from the IDE to a web page. We consider purity and relevance of the content for extracting the most relevant sections from a page. Our technique performs better than a baseline technique and 2 IR techniques. The tool can be found online for testing.
  23. Thanks for your time. Questions!!
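
The scoring, tagging and filtering steps described in notes 8 to 12 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the ContentSuggest implementation: the `Node` class is a hypothetical stand-in for a real DOM node, density is approximated as the share of text outside hyperlinks, relevance as the share of exception tokens found in the text, and the weights γ = δ = 0.5 are arbitrary (the talk says the weights are trained but does not give their values).

```python
import re
from dataclasses import dataclass, field

NON_TEXT_TAGS = {"style", "script", "img"}


@dataclass
class Node:
    """Hypothetical, simplified DOM node (not a real HTML parser type)."""
    tag: str
    text: str = ""       # text directly inside this node
    link_chars: int = 0  # how many of those characters sit inside hyperlinks
    children: list = field(default_factory=list)


def strip_non_text(node):
    """Remove style, script and img nodes from the whole subtree."""
    node.children = [c for c in node.children if c.tag not in NON_TEXT_TAGS]
    for child in node.children:
        strip_non_text(child)


def subtree_text(node):
    parts = [node.text] + [subtree_text(c) for c in node.children]
    return " ".join(p for p in parts if p)


def subtree_links(node):
    return node.link_chars + sum(subtree_links(c) for c in node.children)


def content_score(node, tokens, gamma=0.5, delta=0.5):
    """gamma * density + delta * relevance, both in [0, 1] (assumed formulas)."""
    text = subtree_text(node)
    if not text:
        return 0.0
    density = 1.0 - min(subtree_links(node), len(text)) / len(text)
    words = set(re.findall(r"[a-z]+", text.lower()))
    relevance = len(tokens & words) / len(tokens) if tokens else 0.0
    return gamma * density + delta * relevance


def best_path(node, tokens, path=()):
    """Return (score, path from the start node) for the top-scoring node."""
    path = path + (node,)
    best = (content_score(node, tokens), path)
    for child in node.children:
        candidate = best_path(child, tokens, path)
        if candidate[0] > best[0]:
            best = candidate
    return best


def extract_relevant_section(body, tokens):
    strip_non_text(body)
    threshold = content_score(body, tokens)  # score of body is the threshold
    survivors = [c for c in body.children if content_score(c, tokens) >= threshold]
    if not survivors:
        return None
    _, path = max((best_path(c, tokens) for c in survivors), key=lambda r: r[0])
    # Nodes on the path are "content"; their siblings are "noise" and are dropped.
    body.children = [path[0]]
    for parent, child in zip(path, path[1:]):
        parent.children = [child]
    return path[-1]


# Toy page: a link-only navigation DIV, a script, and a main DIV with a table.
td_fix = Node("td", "java.io.IOException Stream closed happens when read is called after close")
td_share = Node("td", "Share this page", link_chars=15)
main = Node("div", children=[Node("p", "Welcome to our forum"),
                             Node("table", children=[Node("tr", children=[td_fix, td_share])])])
nav = Node("div", "Home Login Register", link_chars=19)
body = Node("body", children=[nav, Node("script", "var x = 1;"), main])

section = extract_relevant_section(body, {"ioexception", "stream", "closed"})
print(section.tag, "|", section.text)
```

The pruned tree keeps the chain of ancestors from body down to the selected node, which mirrors the idea of preserving page structure while isolating the purest and most relevant section.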