20th Working Conference on Reverse Engineering
(WCRE 2013), Koblenz, Germany

AN IDE-BASED CONTEXT-AWARE
META SEARCH ENGINE
Mohammad Masudur Rahman, Shamima Yeasmin, and
Chanchal K. Roy
Department of Computer Science
University of Saskatchewan
SOFTWARE MAINTENANCE, BUGS &
EXCEPTIONS
EXCEPTION HANDLING: IDE SUPPORT

EXCEPTION HANDLING: DEVELOPERS
(NOVICE & EXPERT)
EXCEPTION HANDLING: WEB SEARCH
IDE-BASED WEB SEARCH
- About 80% of effort goes to software maintenance
- Bug fixing: error and exception handling
- Developers spend about 19% of their time on web search
- Traditional web search
  - Does not consider the context of the search (no ties between the IDE and the web browser)
  - Requires context switching and is distracting
  - Time consuming
  - Often not very productive
- IDE-based context-aware search addresses those issues
EXISTING RELATED WORKS
- Cordeiro et al. (RSSE 2012): context-based recommendation system
- Ponzanelli et al. (ICSE 2013): Seahawk
- Poshyvanyk et al. (IWICSS 2007): COTS (Google Desktop) integrated into the Eclipse IDE
- Brandt et al. (SIGCHI 2010): Google web search integrated into the IDE

MOTIVATION EXPERIMENTS
- 83 exceptions
- Solutions found for at most 58 exceptions

Search Query           Common Results   Google Only   Yahoo Only   Bing Only
Content Only                 32              9            16           18
Content and Context          47              9            11           10
THE KEY IDEA !! META SEARCH ENGINE
PROPOSED IDE-BASED META SEARCH MODEL
PROPOSED IDE-BASED META SEARCH
MODEL


Distinguishing Features
- Meta search engine: collects results from multiple search engines
- More precise context: both the stack trace and the associated code serve as the exception context
- Popularity and confidence of result links
- Complete web browsing experience within the IDE
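
The meta search idea above can be sketched as merging the ranked result lists of several providers into one pool keyed by URL. This is only an illustration: the function name `merge_results`, the engine keys, and the example URLs are hypothetical, and the slides do not describe the actual merging procedure.

```python
from collections import defaultdict


def merge_results(results_by_engine):
    """Merge per-engine ranked URL lists into {url: {engine: rank}}.

    Ranks are 1-based positions in each engine's list. Hypothetical helper;
    the slides do not specify how results are merged.
    """
    merged = defaultdict(dict)
    for engine, urls in results_by_engine.items():
        for rank, url in enumerate(urls, start=1):
            # If an engine lists a URL twice, keep its first (best) position.
            merged[url].setdefault(engine, rank)
    return dict(merged)


# Made-up result lists for three providers.
results = {
    "google": ["https://example.com/a", "https://example.com/b"],
    "yahoo": ["https://example.com/b", "https://example.com/c"],
    "bing": ["https://example.com/a", "https://example.com/d"],
}
merged = merge_results(results)

# URLs returned by more than one engine.
common = sorted(url for url, ranks in merged.items() if len(ranks) > 1)
```

Keeping the per-engine rank alongside each URL means later scores that depend on a link's position in each provider's list can be computed from the same structure.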

PROPOSED METRICS & SCORES
- Title-to-Title Matching Score (Stitle): cosine similarity measurement
- Stack Trace Matching Score (Sst): SimHash-based similarity measurement
- Code Context Matching Score (Scc): SimHash-based similarity measurement
- StackOverflow Vote Score (Sso): sum of the differences between up votes and down votes over all posts in the link
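
As a rough illustration of the two similarity measures named above, the sketch below computes a term-frequency cosine similarity and a 64-bit SimHash similarity. The tokenizer, the use of MD5 as the per-token hash, and the mapping from Hamming distance to a score in [0, 1] are assumptions made for this example; the slides do not give those details.

```python
import hashlib
import math
import re
from collections import Counter

BITS = 64  # fingerprint width (assumption)


def tokens(text):
    # Simple lowercase tokenizer (assumption): keeps dotted names together.
    return re.findall(r"[a-z0-9_.]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between term-frequency vectors of two strings."""
    va, vb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def simhash(text):
    """64-bit SimHash fingerprint with term frequency as the token weight."""
    weights = [0] * BITS
    for tok, count in Counter(tokens(text)).items():
        digest = hashlib.md5(tok.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        for i in range(BITS):
            weights[i] += count if (h >> i) & 1 else -count
    return sum(1 << i for i in range(BITS) if weights[i] > 0)


def simhash_similarity(a, b):
    """1 minus the normalized Hamming distance between two fingerprints."""
    distance = bin(simhash(a) ^ simhash(b)).count("1")
    return 1.0 - distance / BITS
```

Cosine similarity suits short texts such as titles, while SimHash condenses a long stack trace or code fragment into a fixed-size fingerprint that is cheap to compare.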

PROPOSED METRICS & SCORES
- Top Ten Score (Stt): position of the result link in the top 10 of each provider
- Page Rank Score (Spr): relative popularity among all links in the corpus, computed with the PageRank algorithm
- Site Traffic Rank Score (Sstr): Alexa and Compete rank of each link
- Search Engine Weight (Ssew): relative reliability or importance of each search engine, based on experiments with 75 programming queries against the search engines
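
For the Page Rank Score, a minimal power-iteration sketch of PageRank over a small link graph is shown below. The graph, the damping factor of 0.85, and the fixed iteration count are assumptions for illustration; the slides do not state how the link corpus is built or which parameters are used.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank.

    links maps each page to the list of pages it links to. Returns a dict
    of page -> rank, with ranks summing to 1.
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by pages without out-links is spread evenly over all pages.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        for source, targets in links.items():
            if not targets:
                continue
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph among four result pages.
graph = {"a": ["b"], "b": ["c"], "c": ["a", "b"], "d": ["b"]}
ranks = pagerank(graph)
most_popular = max(ranks, key=ranks.get)
```

In this toy graph, page `b` receives links from three pages and ends up with the highest rank, while `d`, which nothing links to, ends up with the lowest.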

METRICS NORMALIZATION

$$S_{i,\text{normalized}} = \frac{S_i - \min(S_i)}{\max(S_i) - \min(S_i)}$$

- Normalization is applied to Sst, Scc, Sso, Stt, Spr, and Sstr
- Avoids bias toward any particular aspect
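
The min-max normalization above can be sketched in a few lines. The handling of a metric whose values are all equal (returning zeros) is an assumption; the slides do not cover that case.

```python
def min_max_normalize(scores):
    """Rescale a list of raw metric values to the range [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # Assumption: a constant metric carries no ranking signal.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


# Example with made-up vote scores, which can be negative.
normalized_votes = min_max_normalize([-2, 0, 8])
```

After rescaling, a metric with a wide raw range, such as vote counts, contributes on the same scale as one that already lies between 0 and 1.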

FINAL SCORE COMPONENTS
- Content Relevance: Scnt = Stitle
- Context Relevance: Scxt = (Sst + Scc) / 2
- Link Popularity: Spop = (Sso + Spr + Sstr) / 3
- Search Engine Confidence: Sser = Ssew x Stt
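
The four components above translate directly into code. In the sketch below, the `LinkMetrics` container and its field names are hypothetical, and `combined_score` assumes an unweighted sum of the selected components; the slides list the components but do not state how they are combined into a final score.

```python
from dataclasses import dataclass


@dataclass
class LinkMetrics:
    """Normalized metric values for one result link (hypothetical container)."""
    s_title: float
    s_st: float
    s_cc: float
    s_so: float
    s_pr: float
    s_str: float
    s_sew: float
    s_tt: float


def components(m):
    """Compute the four score components as defined on the slide."""
    return {
        "Scnt": m.s_title,
        "Scxt": (m.s_st + m.s_cc) / 2,
        "Spop": (m.s_so + m.s_pr + m.s_str) / 3,
        "Sser": m.s_sew * m.s_tt,
    }


def combined_score(m, use=("Scnt", "Scxt", "Sser", "Spop")):
    # Assumption: plain sum of the selected components, with no weights.
    parts = components(m)
    return sum(parts[name] for name in use)


# Made-up metric values for one link.
link = LinkMetrics(0.8, 0.6, 0.4, 0.3, 0.6, 0.9, 0.5, 1.0)
parts = components(link)
```

Passing a subset of names through `use` gives a simple way to compare score combinations against each other.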

EXPERIMENT OVERVIEW
- 25 exceptions collected from Eclipse IDE workspaces
- Related to the Eclipse plug-in framework and Java application development
- Solutions chosen from an exhaustive web search, cross-validated by peers
- Recommended results manually validated

EXPERIMENTAL RESULTS
Score                     Top 10   Rank10   Top 20   Rank20
Scnt                        10      3.60      16      8.63
Scnt, Scxt                  11      3.00      16      7.43
Scnt, Spop                  13      4.69      18      8.11
Scnt, Sser                  23      4.39      23      4.39
Scnt, Scxt, Spop            13      4.07      18      7.61
Scnt, Scxt, Sser            24      4.45      24      4.45
Scnt, Scxt, Sser, Spop      23      4.26      24      4.54

Top 10: number of test cases solved when the top 10 results are considered
Rank10: average rank of the solutions when the top 10 results are considered
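
The two measures in the legend can be computed as sketched below. The input format (one rank per test case, `None` when no solution was returned) and the choice to average only over solved cases are assumptions; the slides do not spell out these details.

```python
def evaluate(solution_ranks, k):
    """Return (number of cases solved within the top k, mean rank of those).

    solution_ranks holds, per test case, the 1-based rank of the solution
    in the recommended list, or None if no solution was returned.
    """
    solved = [r for r in solution_ranks if r is not None and r <= k]
    top_k = len(solved)
    mean_rank = sum(solved) / top_k if top_k else None
    return top_k, mean_rank


# Made-up ranks for five test cases.
example_ranks = [1, 4, None, 12, 7]
top10, rank10 = evaluate(example_ranks, 10)
top20, rank20 = evaluate(example_ranks, 20)
```

Under this reading, widening the cutoff from 10 to 20 can only raise the solved count, but it may also raise the average rank because lower-ranked solutions enter the mean.
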
USER STUDY
- Five interesting exception test cases
- Five CS graduate research students as participants
- Top 10 results from SurfClipse presented to the participants in random order, to avoid a bias toward choosing top-rated solutions
- 64.28% agreement found

USER STUDY RESULTS
Question ID   ANSR   ANSM   Agreement
Q1            2.8    2.0    71.43%
Q2            4.6    2.8    60.87%
Q3            4.6    2.4    52.17%
Q4            4.2    3.0    71.43%
Q5            5.8    3.8    65.52%
Overall       4.4    2.8    64.28%

ANSR: average number of solutions recommended by the participants
ANSM: average number of those solutions that matched the ones from our approach
Agreement: percentage of agreement between the solutions
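
The table values are consistent with Agreement = ANSM / ANSR for each question, and the overall figure matches the mean of the five per-question percentages (dividing the overall averages, 2.8 / 4.4, would give about 63.64% instead). This reading is inferred from the numbers rather than stated on the slides; the short check below reproduces it.

```python
# (ANSR, ANSM) per question, copied from the table.
rows = {
    "Q1": (2.8, 2.0),
    "Q2": (4.6, 2.8),
    "Q3": (4.6, 2.4),
    "Q4": (4.2, 3.0),
    "Q5": (5.8, 3.8),
}

# Per-question agreement as a percentage: matched / recommended.
agreement = {q: 100 * ansm / ansr for q, (ansr, ansm) in rows.items()}

# Overall agreement as the mean of the per-question percentages.
overall = sum(agreement.values()) / len(agreement)
```
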
THREATS TO VALIDITY
- Search is not real-time yet
- Different aspects need different weights

LATEST UPDATES
- A distributed model for IDE-based web search: client-server architecture with a remotely hosted web service
- Parallel processing in computation
- Two modes of operation: proactive and interactive
- Granular refinement of metrics and assignment of relative weights (i.e., importance)
- Complete IDE-based web search solution

CONCLUSION & FUTURE WORKS
- A novel IDE-based search with meta search capabilities
  - Exploits existing search service providers
  - Considers the content, context, popularity, and search engine confidence of a result
- Recommends the correct solution for 24 (96%) of 25 test cases
- 64.28% agreement in the user study
- Needs more extensive experiments and a larger user study
- Metrics need to be fine-tuned and made more granular

THANK YOU !!!

An IDE-Based Context-Aware Meta Search Engine

  • 1. 20th Working Conference on Reverse Engineering (WCRE 2013), Koblenz, Germany AN IDE-BASED CONTEXT-AWARE META SEARCH ENGINE Mohammad Masudur Rahman, Shamima Yeasmin, and Chanchal K. Roy Department of Computer Science University of Saskatchewan
  • 6. IDE-BASED WEB SEARCH About 80% effort on Software Maintenance  Bug fixation– error and exception handling  Developers spend about 19% of time in web search  Traditional web search   Does not consider context of search (No ties between IDE and web browser)  Context-switching and distracting  Time consuming  Often not much productive o IDE-Based context-aware search addresses those issues.
  • 7. EXISTING RELATED WORKS Cordeiro et al. (RSSE’ 2012)– Context-based recommendation system  Ponzanelli et al. (ICSE 2013)– Seahawk  Poshyvanyk et al. (IWICSS 2007)– COTS (Google Desktop) into Eclipse IDE  Brandt et al. (SIGCHI 2010)– Integrating Google web search into IDE 
  • 8. MOTIVATION EXPERIMENTS 83 Exceptions  Solutions found for at most 58 exceptions.  Search Query Common Results Google Only Yahoo Only Bing Only Content Only 32 09 16 18 Content and Context 47 09 11 10
  • 9. THE KEY IDEA !! META SEARCH ENGINE
  • 10. PROPOSED IDE-BASED META SEARCH MODEL
  • 11. PROPOSED IDE-BASED META SEARCH MODEL  Distinguished Features Meta search engine– captures data from multiple search engines  More precise context– both stack trace and associated code as exception context  Popularity and confidence of result links  Complete web browsing experience within the IDE 
  • 12. PROPOSED METRICS & SCORES Title to title Matching Score (Stitle)– Cosine similarity measurement  Stack trace Matching Score (Sst)– SimHash based similarity measurement  Code context Matching Score (Scc)– SimHash based similarity measurement  StackOverflow Vote Score (Sso)– Summation of differences between up and down votes for all posts in the link 
  • 13. PROPOSED METRICS & SCORES Top Ten Score (Stt)– Position of result link in the top 10 of each provider.  Page Rank Score (Spr)-- Relative popularity among all links in the corpus using Page Rank algorithm.  Site Traffic Rank Score (Sstr)-- Alexa and Compete Rank of each link  Search Engine weight (Ssew)---Relative reliability or importance of each search engine. Experiments with 75 programming queries against the search engines. 
  • 14. METRICS NORMALIZATION S i , normalized Si min( S i ) max( S i ) min( S i ) Normalization applied to -- Sst , Scc , Sso , Stt , Spr and Sstr  Avoiding bias to any particular aspect 
  • 15. FINAL SCORE COMPONENTS Content Relevance: Scnt = Stitle. Context Relevance: Scxt = (Sst + Scc)/2. Link Popularity: Spop = (Sso + Spr + Sstr)/3. Search Engine Confidence: Sser = Ssew × Stt.
  • 16. EXPERIMENT OVERVIEW 25 exceptions collected from Eclipse IDE workspaces. Related to the Eclipse plug-in framework and Java application development. Solutions chosen from exhaustive web search with cross-validation by peers. Recommended results manually validated.
  • 17. EXPERIMENTAL RESULTS
      Score                      Top 10   Rank10   Top 20   Rank20
      Scnt                       10       3.60     16       8.63
      Scnt, Scxt                 11       3.00     16       7.43
      Scnt, Spop                 13       4.69     18       8.11
      Scnt, Sser                 23       4.39     23       4.39
      Scnt, Scxt, Spop           13       4.07     18       7.61
      Scnt, Scxt, Sser           24       4.45     24       4.45
      Scnt, Scxt, Sser, Spop     23       4.26     24       4.54
      Top 10: No. of test cases solved when the top 10 results are considered. Rank10: Average rank of solutions when the top 10 results are considered.
  • 18. USER STUDY Five interesting exception test cases. Five CS graduate research students as participants. Top 10 results from SurfClipse presented to the participants in random order, to avoid the bias of choosing top-rated solutions. 64.28% agreement found.
  • 19. USER STUDY RESULTS
      Question ID   ANSR   ANSM   Agreement
      Q1            2.8    2.0    71.43%
      Q2            4.6    2.8    60.87%
      Q3            4.6    2.4    52.17%
      Q4            4.2    3.0    71.43%
      Q5            5.8    3.8    65.52%
      Overall       4.4    2.8    64.28%
      ANSR: Avg. no. of solutions recommended by the participants. ANSM: Avg. no. of solutions matched with those of our approach. Agreement: % of agreement between solutions.
  • 20. THREATS TO VALIDITY Search is not real time yet. Different aspects need different weights.
  • 21. LATEST UPDATES A distributed model for IDE-based web search: client-server architecture, remotely hosted web service. Parallel processing in computation. Two modes of operation: proactive and interactive. Granular refinement of metrics and assignment of relative weights (i.e., importance). Complete IDE-based web search solution.
  • 22. CONCLUSION & FUTURE WORKS A novel IDE-based search with meta search capabilities. Exploits existing search service providers. Considers content, context, popularity and search engine confidence of a result. Recommends the correct solution for 24 (96%) out of 25 test cases. 64.28% agreement in the user study. Needs more extended experiments and user study. Metrics need to be fine-tuned and made more granular.
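
The normalization on slide 14 and the component scores on slide 15 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the dictionary keys, the handling of a constant metric, and the choice to average the four components (the notes only say they get an "equal share") are all assumptions.

```python
def min_max_normalize(values):
    """Scale a list of raw metric values to [0, 1]: (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a metric that is constant across the corpus carries no signal.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def component_scores(m):
    """Combine already-normalized metrics into the four components of slide 15.

    `m` is a hypothetical dict with keys: title, st, cc, so, pr, str, sew, tt.
    """
    return {
        "content": m["title"],                              # Scnt = Stitle
        "context": (m["st"] + m["cc"]) / 2,                 # Scxt = (Sst + Scc)/2
        "popularity": (m["so"] + m["pr"] + m["str"]) / 3,   # Spop = (Sso + Spr + Sstr)/3
        "confidence": m["sew"] * m["tt"],                   # Sser = Ssew x Stt
    }


def final_score(m):
    """Assumption: 'equal share' is modelled as the plain average of the components."""
    parts = component_scores(m)
    return sum(parts.values()) / len(parts)


if __name__ == "__main__":
    print(min_max_normalize([2, 4, 6]))
    example = {"title": 0.8, "st": 0.6, "cc": 0.4, "so": 1.0,
               "pr": 0.5, "str": 0.0, "sew": 0.5, "tt": 1.0}
    print(component_scores(example))
    print(final_score(example))
```

Replacing the plain average with a weighted sum would be the natural place to plug in the relative weights that the talk lists as ongoing work.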

Editor's Notes

  1. Good morning, everyone. I am Masudur Rahman from the University of Saskatchewan. Welcome to my presentation. Here, I am going to present our paper titled "An IDE-Based Context-Aware Meta Search Engine". Basically, we propose an IDE-based recommendation system that works like a meta search engine; that is, it captures results from multiple search engines against a selected exception and then analyzes them to produce a better, context-relevant result set.
  2. Studies show that about 50%-80% of effort is spent on software maintenance, and one of the major concerns during maintenance is bug fixing. Software bugs are generally associated with different runtime errors and exceptions. To deal with those errors and exceptions, developers spend a lot of time searching the web: about 19% of their programming time. Why? It is because traditional web search has no ties with the IDE. It does not consider the context of the problem the developer is facing, so the developer has to include the context information in the search query. That is challenging, because it is not clear which terms are more important than others; basically, this is a trial-and-error approach for the developer, which is time-consuming. Besides, switching between the IDE and the web browser is distracting if you are trying to concentrate on a problem in the IDE. So, what is the solution? An IDE-based search engine, and it has to consider the problem context, of course.
  3. There are some existing studies that try to address the issues of traditional web search. However, they are basically based on StackOverflow, for example the first two works. StackOverflow is a big source of information, and it recently had 1.9 million users with 12 million posts. However, we cannot ignore the rest of the web as a source of information, and that is where our approach comes into play. The other two works basically try to integrate Google desktop search and Google web search into the IDE. However, we are interested in exploiting multiple search engines to get a more confident set of results for the developer. The baseline idea is to leverage the existing resources for solving technical challenges in a smart way.
  4. This is our proposed meta search model for IDE-based recommendation. It has two modules: a client module and a computation module. Once the developer selects an exception from the Error Log or Console view, the client module captures the error message, the stack trace and the context code likely responsible for the exception, and sends them to the computation module. Upon getting the search request, the computation module sends the error message to multiple search engines. We use Google, Bing, Yahoo and the StackOverflow API to collect results and use them to develop the corpus. Once the corpus is developed, we apply our proposed metrics and algorithms to produce a result set that is relevant to the encountered exception. Now, let's ask: what do the traditional search engines do? Do they consider context? No. It is the developer who has to represent the context besides the error message in the search query.
  5. So, basically, we are providing four interesting and essential things in this model. It exploits the idea of meta search. Why meta search? Let's discuss that in the Q/A session. We are considering a more precise context: both the stack trace and the context code. We are also considering the popularity and confidence of a result link. And the developer can readily browse the recommended web link within the IDE.
  6. These are the metrics we consider to determine the relevance of a result page against the query exception. Please note that we collect the exception message, and the exception context in the form of stack trace and context code, during the search request. So, title-to-title matching basically tries to determine the content similarity between the exception message and the result page title. We use a cosine similarity measurement for that. Then comes the context information. We do HTML scraping and extract the content from different tags like pre, code and blockquote, as they are likely to contain the context information about the problems discussed in the page. Once it is extracted, we use SimHash-based similarity to determine the relevance of the discussed problem to the query exception. SimHash basically produces a hash value for a block of content, and if the hash values of two blocks are close, the blocks are considered similar. We use this metric for both stack trace and context code matching.
  7. We also consider other metrics. Top Ten score: it marks whether a result is found within the top 10 results of any search engine. Page Rank score: we build an artificial network among the result links in the corpus to determine their relative importance using the PageRank algorithm. Site Traffic Rank score: we collect the Alexa rank for each result link. Search Engine weight: we calculate the support for each result from the search engines.
  8. For most of the metrics, we use this formula to perform the normalization if they are not already normalized.
  9. So, we have got different perspectives on each result link, and now we use those perspectives to determine different types of scores. We get content relevance, context relevance, popularity and search engine confidence for each result. Then, we give all of them an equal share in the final score, and we are now working on their relative weights.
  10. We designed a limited experiment with 25 exceptions related to the Eclipse plug-in framework, and we got interesting results.
  11. Here, we decompose the different component scores and show how effective they are in recommendation. We collected the top 10 and top 20 results and found that our algorithm can recommend solutions for up to 24 exceptions. More interestingly, solutions are mostly found within the top 5 positions.
  12. We also tried to test the recommendations through a user study, because the approach is all about the users' benefit. So, we collected the top 10 recommended results for five exceptions and presented the results to the participants in a randomized order. The idea is to check how developers apply their sense of relevance.
  13. Here, we got 64.28% agreement between our results and their confirmation. Basically, what we did is this: we tried to map their selections to the top 5 results of each exception and found that agreement. We also noticed that the results they marked as relevant are mostly found in the top 5 results of our recommendation. So, it shows that the tool is working quite accurately in relevance computation as an initial attempt. However, it requires a more extended experiment and an extended user study to claim something solid.
  14. Here are some of the latest updates on what we have done since then. We applied parallel processing in the computation module to make it work faster; the present version is quite slow and not yet real time. We implemented a client-server architecture for this search so that it can be platform-independent and any IDE can leverage the search as a web service. We implemented two modes of operation, proactive and interactive, where the proactive version triggers automatically upon an exception. We also did more analysis of the metrics and scores.
  15. To summarize, we proposed a novel IDE-based search approach that exploits problem context and collects results from a meta search engine. Our preliminary experiments show some interesting results. However, the idea of course needs to be further experimented with and tested to discover its potential, which is our future work, and we are working on that.
  16. So, that's all for my talk. Thanks to all for your time. Questions?
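
The two similarity measures described in note 6 can be sketched as follows. This is a rough illustration under stated assumptions rather than the tool's actual code: the tokenizer, the use of raw term counts for cosine similarity, MD5 as the per-token hash, the 64-bit fingerprint width, and Hamming distance as the notion of "close" hash values are all choices made here for the example.

```python
import hashlib
import math
import re
from collections import Counter

BITS = 64  # assumed fingerprint width


def tokens(text):
    """Hypothetical tokenizer: lowercase alphanumeric/underscore runs."""
    return re.findall(r"[A-Za-z0-9_]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity of term-count vectors, e.g. exception message vs. page title."""
    va, vb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def simhash(text):
    """64-bit SimHash: each token votes +count/-count on every bit of its hash."""
    weights = [0] * BITS
    for tok, count in Counter(tokens(text)).items():
        # Assumption: the first 8 bytes of MD5 serve as the token hash.
        h = int.from_bytes(hashlib.md5(tok.encode("utf-8")).digest()[:8], "big")
        for i in range(BITS):
            weights[i] += count if (h >> i) & 1 else -count
    return sum(1 << i for i, w in enumerate(weights) if w > 0)


def simhash_similarity(a, b):
    """1 minus the normalized Hamming distance between the two fingerprints."""
    distance = bin(simhash(a) ^ simhash(b)).count("1")
    return 1.0 - distance / BITS


if __name__ == "__main__":
    query = "java.lang.ClassNotFoundException: org.eclipse.core.runtime.Plugin"
    title = "ClassNotFoundException for org.eclipse.core.runtime.Plugin in plug-in project"
    print(cosine_similarity(query, title))
    print(simhash_similarity(query, title))
```

In this sketch the same `simhash_similarity` function would be applied twice per result page, once to the stack trace and once to the context code, against the blocks scraped from the page's pre, code and blockquote tags.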