Combining Information Retrieval and Relevance Feedback for Concept Location

              Sonia Haiduc

       - WSU CS Graduate Seminar -
              Jan 19, 2010
                                     1
Software changes
• Software maintenance: 50-90% of the global
  costs of software
• Why does software change?
  – Requirements change
• What types of changes?
  – Implement new functionality
  – Change existing functionality
  – Fix bugs, etc.

        How do we change software?
                                               2
Process of software change




                             3
Concept location
• Starts with a change request (bug report, new
  feature request)
• Ends with the location where the change is to
  start (e.g., a class or a method)
• Approaches:
  – Static: searching and navigating source code based on
    dependencies between entities (inheritance, method
    calls, etc.)
  – Dynamic: analyze execution traces
  – Combined
                                                         4
Concept location example
Bug #2373904, ATunes v1.10.0

• Title: Fullscreen doesn't work for
  Windows
• Text from bug tracker: Fullscreen looks
  like screenshot.


                                            5
Searching
• One of the main activities performed by
  developers while changing software
• Used during concept location
• IDEs support searching
• Traditional source code search: keyword
  and regular expression matching
• Modern source code search: Information
  Retrieval (IR)
                                            6
IR-based concept location
[Diagram: Query, IR Engine, Index, Ranked list of results]
                                             7
The searching paradox

• User has an information need because
  she does not know something
• Search systems are designed to satisfy
  these needs, but the user needs to know
  what she is looking for (query)
• If the user knows what she is looking for,
  there may not be a need to search in the
  first place
                                               8
Challenge in search-based concept location (CL): the query



• Text in the query needs to match the text in the
  source code
• Difficult to formulate good queries
  - Unfamiliar source code
  - Unknown target
• Some developers write better queries than others
                                                     9
Eclipse bug #13926

Bug description:
 JFace Text Editor Leaves a Black
 Rectangle on Content Assist text insertion.
 Inserting a selected completion proposal
 from the context information popup causes
 a black rectangle to appear on top of the
 display.

                                         10
Queries
• Q1: jface text editor black rectangle insert text

• Q2: jface text editor rectangle insert context
  information

• Q3: jface text editor content assist

• Q4: jface insert selected completion proposal
  context information

                                                      11
Ranked list of results

       Results
• Q1: jface text editor black rectangle insert text
   – Position of buggy method: 7496
• Q2: jface text editor rectangle insert context
  information
   – Position of buggy method: 258
• Q3: jface text editor content assist
   – Position of buggy method: 119
• Q4: jface insert selected completion proposal context
  information
   – Position of buggy method: 723
• Q5: whole change request
   – Position of buggy method: 54
                                                        12
IR-based CL in unfamiliar software

Developers:
• Rarely begin with a good query: hard to choose
  the right words
• Even after reformulation, vague idea of what to
  look for -> queries not always better
• Can recognize whether the results retrieved are
  relevant or not to the problem

                                                    13
Relevance Feedback (RF)
• Idea: you may not know how to articulate what you
  are looking for, but you will recognize it when you
  see it
• Helps users iteratively refine their query until they
  find what they are looking for
• Every consecutive query is meant to bring the
  search results closer to the target (“more like this”)
• Queries are reformulated automatically based on:
   – The initial results to the query
   – Information about whether or not these results are
     relevant to the goal, often provided by users (feedback)
                                                          14
Types of relevance feedback
• Explicit
   – Users specify which search results are relevant and
     which are not
• Implicit
   – User navigates
   – Clicked-on documents: relevant
   – Example: Google search (user logged in)
• Pseudo
   – The top n search results are automatically considered
     relevant
                                                           15
IR + RF
• RF found to be one of the most powerful methods for
  improving IR performance (40-60% increase in
  precision)
• Often you just need feedback about a few documents to
  really make your results better
• First and most common approach: Rocchio
   –   Recognize what is relevant for your goal and what is irrelevant
   –   Add terms from relevant documents to query
   –   Remove terms from irrelevant documents from query
   –   Iterative process – rounds of feedback
   –   Four parameters, representing the number of documents marked
       in a round of feedback, and the importance of: the initial query,
       relevant terms, irrelevant terms
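
A minimal sketch of one Rocchio round, assuming queries and documents are sparse term-to-weight dictionaries. The function name and the default weights are illustrative placeholders, not values from the studies; clipping negative weights to zero is a common convention that is also assumed here.

```python
from collections import Counter


def rocchio(query, relevant, irrelevant, a=1.0, b=0.5, c=0.5):
    """Reformulate a query from one round of feedback.

    query:      dict term -> weight (the current query vector)
    relevant:   list of dicts, the documents marked relevant
    irrelevant: list of dicts, the documents marked irrelevant
    a, b, c:    importance of the initial query, of relevant terms,
                and of irrelevant terms (placeholder defaults)
    """
    new_query = Counter()
    # Keep the terms of the current query, scaled by a.
    for term, weight in query.items():
        new_query[term] += a * weight
    # Add the centroid of the relevant documents, scaled by b.
    for doc in relevant:
        for term, weight in doc.items():
            new_query[term] += b * weight / len(relevant)
    # Subtract the centroid of the irrelevant documents, scaled by c.
    for doc in irrelevant:
        for term, weight in doc.items():
            new_query[term] -= c * weight / len(irrelevant)
    # Assumed convention: terms whose weight is not positive are dropped.
    return {term: w for term, w in new_query.items() if w > 0}
```

For example, starting from the query `{"java": 1.0}`, marking a document about programming as relevant and one about coffee as irrelevant adds "programming" to the query and keeps "coffee" out of it.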

                                                                      16
Search Engine
java




                17
java +coffee +flavor +buy
-programming -island -sdk




java +island +christmas +buy +vacation
-coffee -programming -sdk




java +programming +example +arraylist
-island -coffee -sdk


                                         18
19
20
21
22
IR + RF for concept location
• Developers can focus on source code rather
  than queries
• Developers writing poorer queries can get
  results as good as the ones writing good queries
• Process:
  –   Corpus creation
  –   Indexing
  –   Query formulation
  –   Ranking documents
  –   Results examination
  –   Feedback

                                                 23
Corpus creation
• First stage in using an IR technique
• Source code considered as a text corpus:
  identifiers and comments
• The source code is parsed using a developer-
  defined granularity level (i.e., methods or
  classes)
• Each method (or class) is extracted from the
  source code and has a corresponding document
  in the corpus
• Eliminating common terms from corpus
• Natural language processing (NLP) techniques
  can be applied to the corpus (e.g., stemming)
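
A rough sketch of turning one method into a corpus document, assuming the method's source text (identifiers and comments) is available as a string. The stop-word list and both function names are hypothetical, and stemming is left out to keep the example short.

```python
import re

# Illustrative stop-word list (assumption): a few English words and Java keywords.
STOP_WORDS = {"the", "a", "of", "to", "in", "is", "and", "for",
              "this", "return", "void", "public", "int"}


def split_identifier(identifier):
    """Split an identifier on underscores and camelCase boundaries."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", identifier)
    return [part.lower() for part in parts]


def method_to_document(source):
    """Return the list of terms that represents one method in the corpus."""
    words = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", source)
    terms = []
    for word in words:
        for term in split_identifier(word):
            # Drop common terms and single-character leftovers.
            if term not in STOP_WORDS and len(term) > 1:
                terms.append(term)
    return terms
```

With this splitting, `createContextInfoPopup` contributes the terms create, context, info, and popup, so a query that mentions "context" or "popup" can match it.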
                                              24
Indexing

• The corpus is indexed using the IR technique
• A mathematical representation of the corpus is
  created - Vector Space Model (VSM)
• Each document (i.e., each method or class) has a
  corresponding vector in the VSM
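
One possible way to build the vectors, sketched under the assumption that terms are weighted with tf-idf (the slides only state that a VSM is used). The function name and the exact idf formula are illustrative choices.

```python
import math
from collections import Counter


def build_index(corpus):
    """Build one sparse tf-idf vector per document.

    corpus: dict mapping a document name (e.g., a method) to its list of terms.
    Returns (vectors, idf), where vectors maps each name to a
    term -> weight dictionary.
    """
    n_docs = len(corpus)
    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for terms in corpus.values():
        df.update(set(terms))
    # Assumed idf variant: log(N / df) + 1, so no term gets weight zero.
    idf = {term: math.log(n_docs / df[term]) + 1.0 for term in df}
    vectors = {}
    for name, terms in corpus.items():
        tf = Counter(terms)
        vectors[name] = {term: tf[term] * idf[term] for term in tf}
    return vectors, idf
```

Terms that appear in few methods get larger weights than terms spread across the whole system.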




                                                25
Query formulation

• A developer selects a set of words from the change
  request which constitutes the initial query
• If filtering or NLP was used in the corpus creation,
  the query will get the same treatment
• Simpler approach: consider the entire change
  request as the initial query
• Query is also represented in VSM as a vector


                                                   26
Ranking documents
• Similarities between the query and every
  document from the source code are computed
• The documents in the corpus are ranked
  according to the similarity to the query
• The similarity measure depends on the IR
  method used
   – Cosine similarity: angle between the vector
     representations of the query and the document
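
The ranking step can be sketched as follows, assuming the query and the documents are sparse term-to-weight dictionaries. The function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two sparse vectors (dicts term -> weight)."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty vector is similar to nothing
    return dot / (norm_u * norm_v)


def rank(query, documents):
    """Return document names ordered from most to least similar to the query."""
    scored = [(cosine(query, vector), name) for name, vector in documents.items()]
    scored.sort(key=lambda pair: -pair[0])
    return [name for _, name in scored]
```

Because cosine similarity depends only on the angle, a long method is not favored just because it contains more words.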


                                                27
Results examination
• Iterative process
• Steps:
  1. The developer examines the top N documents in the
     ranked list of results
  2. For every document examined, the developer makes a
     decision on whether the document will be changed
  3. If it will be changed, the search succeeded and
     concept location ends.
  4. Else, the developer marks the document as relevant
     or irrelevant. If a decision cannot be made, examine the
     next document in the list
  5. After the N documents are marked a new query is
     automatically formulated
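
The steps above can be sketched as a loop. This is only an outline under stated assumptions: the ranking, judging, and reformulation functions are passed in as hypothetical callables, already examined documents are skipped in later rounds, and the cap of 50 examined methods is borrowed from the study methodology.

```python
def locate_concept(query, rank_fn, judge_fn, reformulate_fn,
                   n=5, max_examined=50):
    """Run rounds of feedback until a target is found or the cap is hit.

    rank_fn(query)            -> ranked list of document names
    judge_fn(doc)             -> "target", "relevant", "irrelevant", or anything
                                 else when no decision can be made
    reformulate_fn(q, rel, irr) -> new query
    Returns (target or None, number of documents examined).
    """
    examined = 0
    seen = set()
    while True:
        # Step 1: take the top n documents not examined so far (assumption).
        batch = [doc for doc in rank_fn(query) if doc not in seen][:n]
        if not batch:
            return None, examined
        relevant, irrelevant = [], []
        for doc in batch:
            if examined >= max_examined:
                return None, examined
            examined += 1
            seen.add(doc)
            verdict = judge_fn(doc)  # step 2: the developer decides
            if verdict == "target":  # step 3: the search succeeded
                return doc, examined
            if verdict == "relevant":  # step 4: mark the document
                relevant.append(doc)
            elif verdict == "irrelevant":
                irrelevant.append(doc)
        # Step 5: a new query is formulated from the marks.
        query = reformulate_fn(query, relevant, irrelevant)
```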
                                                       28
Change Request = Initial Query
JFace Text Editor Leaves a Black Rectangle on Content Assist text
insertion. Inserting a selected completion proposal from the
context information popup causes a black rectangle to appear on
top of the display.

                                  IR

1.   createContextInfoPopup() in                                    RF
     org.eclipse.jface.text.contentassist.ContextInformationPopup
2.   configure() in
     org.eclipse.jdt.internal.debug.ui.JDIContentAssistPreference
3.   showContextProposals() in
     org.eclipse.jface.text.contentassist.ContextInformationPopup

+ words in       documents             - words in      documents

                           New Query                                29
IRRF tool
• IR Engine: Lucene
  – Based on the Vector Space Model (VSM)
  – Input: methods, query
  – Output: a ranked list of methods ordered by their textual
    similarity to the query


• Relevance feedback: Rocchio algorithm
  – Models a way of incorporating relevance feedback
    information into the VSM
  – Relevance tags: “relevant”, “irrelevant”, “don’t know”
                                                             30
First study (ICSM '09)
• Goal: study whether the use of RF improves IR
• One developer providing RF
  – 7 years of programming experience (5 years in Java)
  – Not familiar with the systems used
• Concept location for 16 bug reports from 3
  systems
  – Extracted the text of the bug reports and the list of
    changed methods from bug tracking systems
  – The modified methods are called target methods
  – Task: locate one of the target methods
                                                            31
Methodology
• Consider bug textual description as initial query
• Perform concept location:
  – Using only IR
  – Using IRRF (N = 1, 3, 5 documents in a round of
    feedback)
• Metric: #methods analyzed until finding a target
  method
• IRRF: iterative – stop when:
  – Target method in top N results - Success
  – More than 50 methods analyzed - Failure
  – Position of target methods in the ranked list of results
    increases for 2 consecutive rounds (query moving away
    from target) - Failure                                   32
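
The three stopping rules can be written as a small check, sketched here with a hypothetical function name. It assumes the position of the best-ranked target method is recorded after every round, and it reads "increases for 2 consecutive rounds" as two successive increases, which needs three recorded positions.

```python
def irrf_status(target_positions, n, methods_analyzed, limit=50):
    """Decide whether an IRRF run should stop.

    target_positions: position of the best-ranked target method after each
                      round so far (1 = top of the list)
    n:                number of documents marked per round of feedback
    methods_analyzed: total number of methods examined so far
    Returns "success", "failure", or "continue".
    """
    # Target method in the top n results.
    if target_positions and target_positions[-1] <= n:
        return "success"
    # More than `limit` methods analyzed.
    if methods_analyzed > limit:
        return "failure"
    # Position got worse in two consecutive rounds: query moving away.
    if (len(target_positions) >= 3
            and target_positions[-3] < target_positions[-2] < target_positions[-1]):
        return "failure"
    return "continue"
```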
Systems


 System      Vers.   LOC         Methods   Classes
 Eclipse     2.0     2,500,000   74,996    7,500
 jEdit       4.2     300,000     5,366     750
 Adempiere   3.1.0   330,000     28,622    1,900



                                               33
Results

 System      RF improves IR   RF does not improve IR
 Eclipse     6                1
 jEdit       3                3
 Adempiere   4                1
 Total       13               5

                                           34
Conclusions of the first study

• RF is a promising technique for easing query
  formulation during concept location

• RF improves IR-based concept location in most
  cases

• In some cases RF does not improve IR (user
  misled by the poor description of the bug)

                                                  35
Some limitations of the study
• RF depends on a series of parameters
  – we chose the values based on literature and
    preliminary tests
  – we used the same values for all systems


• Only one developer provided feedback

• CL ended automatically when a target method
  was retrieved in the top N results

                                                  36
Second study - goals
• Address the limitations of the first study
• Parameters:
   – Design a methodology to automatically find the most
     suited parameters for a particular software system
   – Calibrate the parameters separately for each system
• Number of developers:
   – Collect data from a larger number of developers
• Realistic concept location process (CL):
   – Does not end automatically when target method is in
     top N results
   – Developers can change their mind about the
     relevance of a method
                                                           37
Calibration of parameters
• Implemented a tool that automatically finds the RF
  parameters
  –   Number of documents in a round of feedback (N)
  –   Weight of terms in the initial query (a)
  –   Weight of terms from relevant documents (b)
  –   Weight of terms from irrelevant documents (c)
• Idea: mimic CL on a software system using a set of
  bug reports and their target methods
• 4 change requests per system, 4 software systems

                                                       38
Calibration of parameters (contd)
• For each change request - 500 configurations:
   –   N = 1, 2, 3, 4, 5
   –   a = 0.25, 0.5, 0.75, 1
   –   b = 0, 0.25, 0.5, 0.75, 1
   –   c = 0, 0.25, 0.5, 0.75, 1
• For each configuration {N, a, b, c}:
   – Used ASTAR algorithm to simulate an ideal user: find the optimum
     set of relevance tags for the methods (relevant or irrelevant) so that
     the target methods are retrieved the fastest
   – Recorded the number of methods analyzed to find a target method
• Chose the parameter combination that led to the best
  results for all 4 change requests for a system
• Findings: each system has a different configuration of
  parameters
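
A sketch of the configuration grid and of the final selection step. The A* simulation of the ideal user is not shown; its output is assumed to be available as the number of methods analyzed per change request. Picking the configuration with the lowest total is one plausible reading of "best results for all 4 change requests", not necessarily the criterion used in the study.

```python
from itertools import product

N_VALUES = [1, 2, 3, 4, 5]
A_VALUES = [0.25, 0.5, 0.75, 1]
B_VALUES = [0, 0.25, 0.5, 0.75, 1]
C_VALUES = [0, 0.25, 0.5, 0.75, 1]


def configurations():
    """Enumerate every {N, a, b, c} combination: 5 * 4 * 5 * 5 = 500."""
    return list(product(N_VALUES, A_VALUES, B_VALUES, C_VALUES))


def best_configuration(effort):
    """Pick the configuration with the lowest total effort.

    effort: dict mapping a configuration (N, a, b, c) to the list of
            methods-analyzed counts, one per change request (hypothetical
            output of the simulated concept location runs).
    """
    return min(effort, key=lambda config: sum(effort[config]))
```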
                                                                      39
Study setup
• Participants: 18 students in a Data Mining
  class @ West Virginia University
• Java quiz
• Systems and change requests:
  – 4 Java systems, different domains and sizes
  – 4 bug fix requests for each system (different
    than the ones used for calibration)
• IRRF Tool: integrates the parameters
  determined in the calibration phase
                                                    40
Change in methodology
• Relevance marks: “relevant”, “irrelevant”,
  “don’t know”, “target”
• Concept location stops when:
  – The user marks a method as “target”
  – The user analyzed more than 50 methods
• Users can change their mind about the
  relevance of a method at any time; the query
  is reformulated accordingly
                                               41
Status


• Data collection finalized

• Data analysis in progress




                              42
Conclusions
• RF is a good addition to IR for concept
  location
  – Easier query formulation
  – Better results
• RF parameters need to be adjusted
  automatically to each software system
• More studies needed to better understand
  the situations in which RF does not improve IR
                                            43
Future work
• Relevance feedback in IDEs, to help
  developers search and navigate source
  code
• RF for other software engineering tasks
• Other RF algorithms
• Other types of relevance feedback:
  implicit, pseudo
• Study how developers decide if a
  document is relevant or not to their task
                                              44

More Related Content

What's hot

Information Retrieval
Information RetrievalInformation Retrieval
Information Retrievalrchbeir
 
Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...
Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...
Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...Lucidworks
 
Mining Product Reputations On the Web
Mining Product Reputations On the WebMining Product Reputations On the Web
Mining Product Reputations On the Webfeiwin
 
Predicting query performance and explaining results to assist Linked Data con...
Predicting query performance and explaining results to assist Linked Data con...Predicting query performance and explaining results to assist Linked Data con...
Predicting query performance and explaining results to assist Linked Data con...Rakebul Hasan
 
Semantic Annotation of Documents
Semantic Annotation of DocumentsSemantic Annotation of Documents
Semantic Annotation of Documentssubash chandra
 
Tanvi Motwani- A Few Examples Go A Long Way
Tanvi Motwani- A Few Examples Go A Long WayTanvi Motwani- A Few Examples Go A Long Way
Tanvi Motwani- A Few Examples Go A Long WayTanvi Motwani
 
Answer extraction and passage retrieval for
Answer extraction and passage retrieval forAnswer extraction and passage retrieval for
Answer extraction and passage retrieval forWaheeb Ahmed
 
Improving Semantic Search Using Query Log Analysis
Improving Semantic Search Using Query Log AnalysisImproving Semantic Search Using Query Log Analysis
Improving Semantic Search Using Query Log AnalysisStuart Wrigley
 
Evaluating Semantic Search Systems to Identify Future Directions of Research
Evaluating Semantic Search Systems to Identify Future Directions of ResearchEvaluating Semantic Search Systems to Identify Future Directions of Research
Evaluating Semantic Search Systems to Identify Future Directions of ResearchStuart Wrigley
 
Topic modeling of marketing scientific papers: An experimental survey
Topic modeling of marketing scientific papers: An experimental surveyTopic modeling of marketing scientific papers: An experimental survey
Topic modeling of marketing scientific papers: An experimental surveyICDEcCnferenece
 
Probabilistic Models of Novel Document Rankings for Faceted Topic Retrieval
Probabilistic Models of Novel Document Rankings for Faceted Topic RetrievalProbabilistic Models of Novel Document Rankings for Faceted Topic Retrieval
Probabilistic Models of Novel Document Rankings for Faceted Topic RetrievalYI-JHEN LIN
 
Reverse engineering and theory building v3
Reverse engineering and theory building v3Reverse engineering and theory building v3
Reverse engineering and theory building v3ClarkTony
 
Probabilistic retrieval model
Probabilistic retrieval modelProbabilistic retrieval model
Probabilistic retrieval modelbaradhimarch81
 
Data Science and Analytics Brown Bag
Data Science and Analytics Brown BagData Science and Analytics Brown Bag
Data Science and Analytics Brown BagDataTactics
 
Concurrent Inference of Topic Models and Distributed Vector Representations
Concurrent Inference of Topic Models and Distributed Vector RepresentationsConcurrent Inference of Topic Models and Distributed Vector Representations
Concurrent Inference of Topic Models and Distributed Vector RepresentationsParang Saraf
 
RES812 U4 Individual Project
RES812  U4 Individual ProjectRES812  U4 Individual Project
RES812 U4 Individual ProjectThienSi Le
 

What's hot (20)

Sybrandt Thesis Proposal Presentation
Sybrandt Thesis Proposal PresentationSybrandt Thesis Proposal Presentation
Sybrandt Thesis Proposal Presentation
 
Information Retrieval
Information RetrievalInformation Retrieval
Information Retrieval
 
Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...
Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...
Implementing Conceptual Search in Solr using LSA and Word2Vec: Presented by S...
 
2011 EASE - Motivation in Software Engineering: A Systematic Review Update
2011 EASE - Motivation in Software Engineering: A Systematic Review Update2011 EASE - Motivation in Software Engineering: A Systematic Review Update
2011 EASE - Motivation in Software Engineering: A Systematic Review Update
 
Mining Product Reputations On the Web
Mining Product Reputations On the WebMining Product Reputations On the Web
Mining Product Reputations On the Web
 
Predicting query performance and explaining results to assist Linked Data con...
Predicting query performance and explaining results to assist Linked Data con...Predicting query performance and explaining results to assist Linked Data con...
Predicting query performance and explaining results to assist Linked Data con...
 
Semantic Annotation of Documents
Semantic Annotation of DocumentsSemantic Annotation of Documents
Semantic Annotation of Documents
 
Text categorization
Text categorizationText categorization
Text categorization
 
Tanvi Motwani- A Few Examples Go A Long Way
Tanvi Motwani- A Few Examples Go A Long WayTanvi Motwani- A Few Examples Go A Long Way
Tanvi Motwani- A Few Examples Go A Long Way
 
Answer extraction and passage retrieval for
Answer extraction and passage retrieval forAnswer extraction and passage retrieval for
Answer extraction and passage retrieval for
 
Improving Semantic Search Using Query Log Analysis
Improving Semantic Search Using Query Log AnalysisImproving Semantic Search Using Query Log Analysis
Improving Semantic Search Using Query Log Analysis
 
Evaluating Semantic Search Systems to Identify Future Directions of Research
Evaluating Semantic Search Systems to Identify Future Directions of ResearchEvaluating Semantic Search Systems to Identify Future Directions of Research
Evaluating Semantic Search Systems to Identify Future Directions of Research
 
Topic modeling of marketing scientific papers: An experimental survey
Topic modeling of marketing scientific papers: An experimental surveyTopic modeling of marketing scientific papers: An experimental survey
Topic modeling of marketing scientific papers: An experimental survey
 
Probabilistic Models of Novel Document Rankings for Faceted Topic Retrieval
Probabilistic Models of Novel Document Rankings for Faceted Topic RetrievalProbabilistic Models of Novel Document Rankings for Faceted Topic Retrieval
Probabilistic Models of Novel Document Rankings for Faceted Topic Retrieval
 
Reverse engineering and theory building v3
Reverse engineering and theory building v3Reverse engineering and theory building v3
Reverse engineering and theory building v3
 
Probabilistic retrieval model
Probabilistic retrieval modelProbabilistic retrieval model
Probabilistic retrieval model
 
Data Science and Analytics Brown Bag
Data Science and Analytics Brown BagData Science and Analytics Brown Bag
Data Science and Analytics Brown Bag
 
Haystacks slides
Haystacks slidesHaystacks slides
Haystacks slides
 
Concurrent Inference of Topic Models and Distributed Vector Representations
Concurrent Inference of Topic Models and Distributed Vector RepresentationsConcurrent Inference of Topic Models and Distributed Vector Representations
Concurrent Inference of Topic Models and Distributed Vector Representations
 
RES812 U4 Individual Project
RES812  U4 Individual ProjectRES812  U4 Individual Project
RES812 U4 Individual Project
 

Similar to Combining IR with Relevance Feedback for Concept Location

Concept Location using Information Retrieval and Relevance Feedback
Concept Location using Information Retrieval and Relevance FeedbackConcept Location using Information Retrieval and Relevance Feedback
Concept Location using Information Retrieval and Relevance FeedbackSonia Haiduc
 
Irrf Presentation
Irrf PresentationIrrf Presentation
Irrf Presentationgregoryg
 
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...Joaquin Delgado PhD.
 
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
 RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning... RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...S. Diana Hu
 
Personalized Search and Job Recommendations - Simon Hughes, Dice.com
Personalized Search and Job Recommendations - Simon Hughes, Dice.comPersonalized Search and Job Recommendations - Simon Hughes, Dice.com
Personalized Search and Job Recommendations - Simon Hughes, Dice.comLucidworks
 
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...Lucidworks
 
Data council sf amundsen presentation
Data council sf    amundsen presentationData council sf    amundsen presentation
Data council sf amundsen presentationTao Feng
 
Machine Learned Relevance at A Large Scale Search Engine
Machine Learned Relevance at A Large Scale Search EngineMachine Learned Relevance at A Large Scale Search Engine
Machine Learned Relevance at A Large Scale Search EngineSalford Systems
 
Improved Search With Lucene 4.0 - NOVA Lucene/Solr Meetup
Improved Search With Lucene 4.0 - NOVA Lucene/Solr MeetupImproved Search With Lucene 4.0 - NOVA Lucene/Solr Meetup
Improved Search With Lucene 4.0 - NOVA Lucene/Solr Meetuprcmuir
 
Personalized Re-Ranking of Documents
Personalized Re-Ranking of DocumentsPersonalized Re-Ranking of Documents
Personalized Re-Ranking of Documentskswapna9
 
Measuring the quality of web search engines
Measuring the quality of web search enginesMeasuring the quality of web search engines
Measuring the quality of web search enginesDirk Lewandowski
 
CS6007 information retrieval - 5 units notes
CS6007   information retrieval - 5 units notesCS6007   information retrieval - 5 units notes
CS6007 information retrieval - 5 units notesAnandh Arumugakan
 
Evolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.comEvolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.comSimon Hughes
 
Enhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.com
Enhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.comEnhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.com
Enhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.comSimon Hughes
 
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...Lucidworks
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalLionel Briand
 
Lucene/Solr Revolution 2015: Where Search Meets Machine Learning
Lucene/Solr Revolution 2015: Where Search Meets Machine LearningLucene/Solr Revolution 2015: Where Search Meets Machine Learning
Lucene/Solr Revolution 2015: Where Search Meets Machine LearningJoaquin Delgado PhD.
 
Lucene/Solr Revolution 2015: Where Search Meets Machine Learning
Lucene/Solr Revolution 2015: Where Search Meets Machine LearningLucene/Solr Revolution 2015: Where Search Meets Machine Learning
Lucene/Solr Revolution 2015: Where Search Meets Machine LearningS. Diana Hu
 

Similar to Combining IR with Relevance Feedback for Concept Location (20)

Concept Location using Information Retrieval and Relevance Feedback
Concept Location using Information Retrieval and Relevance FeedbackConcept Location using Information Retrieval and Relevance Feedback
Concept Location using Information Retrieval and Relevance Feedback
 
Irrf Presentation
Irrf PresentationIrrf Presentation
Irrf Presentation
 
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
RecSys 2015 Tutorial - Scalable Recommender Systems: Where Machine Learning m...
 
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
 RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning... RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
RecSys 2015 Tutorial – Scalable Recommender Systems: Where Machine Learning...
 
Personalized Search and Job Recommendations - Simon Hughes, Dice.com
Personalized Search and Job Recommendations - Simon Hughes, Dice.comPersonalized Search and Job Recommendations - Simon Hughes, Dice.com
Personalized Search and Job Recommendations - Simon Hughes, Dice.com
 
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
 
Data council sf amundsen presentation
Data council sf    amundsen presentationData council sf    amundsen presentation
Data council sf amundsen presentation
 
Machine Learned Relevance at A Large Scale Search Engine
Machine Learned Relevance at A Large Scale Search EngineMachine Learned Relevance at A Large Scale Search Engine
Machine Learned Relevance at A Large Scale Search Engine
 
Improved Search With Lucene 4.0 - NOVA Lucene/Solr Meetup
Improved Search With Lucene 4.0 - NOVA Lucene/Solr MeetupImproved Search With Lucene 4.0 - NOVA Lucene/Solr Meetup
Improved Search With Lucene 4.0 - NOVA Lucene/Solr Meetup
 
Personalized Re-Ranking of Documents
Personalized Re-Ranking of DocumentsPersonalized Re-Ranking of Documents
Personalized Re-Ranking of Documents
 
Measuring the quality of web search engines
Measuring the quality of web search enginesMeasuring the quality of web search engines
Measuring the quality of web search engines
 
CS6007 information retrieval - 5 units notes
CS6007   information retrieval - 5 units notesCS6007   information retrieval - 5 units notes
CS6007 information retrieval - 5 units notes
 
Unit 1
Unit 1Unit 1
Unit 1
 
Evolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.comEvolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.com
 
Enhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.com
Enhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.comEnhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.com
Enhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.com
 
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive Goal
 
Lucene/Solr Revolution 2015: Where Search Meets Machine Learning
Lucene/Solr Revolution 2015: Where Search Meets Machine LearningLucene/Solr Revolution 2015: Where Search Meets Machine Learning
Lucene/Solr Revolution 2015: Where Search Meets Machine Learning
 
Lucene/Solr Revolution 2015: Where Search Meets Machine Learning
Lucene/Solr Revolution 2015: Where Search Meets Machine LearningLucene/Solr Revolution 2015: Where Search Meets Machine Learning
Lucene/Solr Revolution 2015: Where Search Meets Machine Learning
 
Final presentation
Final presentationFinal presentation
Final presentation
 

Combining IR with Relevance Feedback for Concept Location

  • 1. Combining Information Retrieval and Relevance Feedback for Concept Location Sonia Haiduc - WSU CS Graduate Seminar - Jan 19, 2010 1
  • 2. Software changes • Software maintenance: 50-90% of the global costs of software • Why does software change? – Requirements change • What types of changes? – Implement new functionality – Change existing functionality – Fix bugs, etc. How do we change software? 2
  • 4. Concept location • Starts with a change request (bug report, new feature request) • Ends with the location where the change is to start (e.g., a class or a method) • Approaches: – Static: searching and navigating source code based on dependencies between entities (inheritance, method calls, etc.) – Dynamic: analyzing execution traces – Combined
  • 5. Concept location example Bug #2373904, ATunes v1.10.0 • Title: Fullscreen doesn't work for Windows • Text from bug tracker: Fullscreen looks like screenshot. 5
  • 6. Searching • One of the main activities performed by developers while changing software • Used during concept location • IDEs support searching • Traditional source code search: keyword and regular expression matching • Modern source code search: Information Retrieval (IR) 6
  • 7. IR-based concept location Query IR Engine Ranked list of results Index 7
  • 8. The searching paradox • User has an information need because she does not know something • Search systems are designed to satisfy these needs, but the user needs to know what she is looking for (query) • If the user knows what she is looking for, there may not be a need to search in the first place 8
  • 9. Challenge in search-based CL: the query • Text in the query needs to match the text in the source code • Difficult to formulate good queries - Unfamiliar source code - Unknown target • Some developers write better queries than others 9
  • 10. Eclipse bug #13926 Bug description: JFace Text Editor Leaves a Black Rectangle on Content Assist text insertion. Inserting a selected completion proposal from the context information popup causes a black rectangle to appear on top of the display. 10
  • 11. Queries • Q1: jface text editor black rectangle insert text • Q2: jface text editor rectangle insert context information • Q3: jface text editor content assist • Q4: jface insert selected completion proposal context information 11
  • 12. Results • Q1: jface text editor black rectangle insert text – Position of buggy method: 7496 • Q2: jface text editor rectangle insert context information – Position of buggy method: 258 • Q3: jface text editor content assist – Position of buggy method: 119 • Q4: jface insert selected completion proposal context information – Position of buggy method: 723 • Q5: whole change request – Position of buggy method: 54
  • 13. IR-based CL in unfamiliar software Developers: • Rarely begin with a good query: hard to choose the right words • Even after reformulation, vague idea of what to look for -> queries not always better • Can recognize whether the results retrieved are relevant or not to the problem 13
  • 14. Relevance Feedback (RF) • Idea: you may not know how to articulate what you are looking for, but you will recognize it when you see it • Helps users iteratively refine their query until they find what they are looking for • Every consecutive query is meant to bring the search results closer to the target (“more like this”) • Queries are reformulated automatically based on: – The initial results to the query – Information about whether or not these results are relevant to the goal, often provided by users (feedback) 14
  • 15. Types of relevance feedback • Explicit – Users specify which search results are relevant and which are not • Implicit – User navigates – Clicked-on documents: relevant – Example: Google search (user logged in) • Pseudo – The top n search results are automatically considered relevant 15
  • 16. IR + RF • RF found to be one of the most powerful methods for improving IR performance (40-60% increase in precision) • Often you just need feedback about a few documents to really make your results better • First and most common approach: Rocchio – Recognize what is relevant for your goal and what is irrelevant – Add terms from relevant documents to query – Remove terms from irrelevant documents from query – Iterative process – rounds of feedback – Four parameters, representing the number of documents marked in a round of feedback, and the importance of: the initial query, relevant terms, irrelevant terms 16
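
A minimal sketch of one Rocchio feedback round, assuming the query and the documents are sparse term-weight dictionaries. The function name `rocchio_update` is hypothetical; `a`, `b` and `c` stand for the importance of the initial query, the relevant terms and the irrelevant terms. The default weights, the averaging over marked documents and the dropping of non-positive weights are common textbook choices, not necessarily what the tool in this talk does.

```python
from collections import defaultdict


def rocchio_update(query, relevant, irrelevant, a=1.0, b=0.75, c=0.25):
    """Return a new query vector: a*q + b*mean(relevant) - c*mean(irrelevant).

    Vectors are dicts mapping term -> weight. The default weights are
    illustrative placeholders, not values taken from the talk.
    """
    new_query = defaultdict(float)
    for term, weight in query.items():
        new_query[term] += a * weight
    for docs, sign, factor in ((relevant, 1.0, b), (irrelevant, -1.0, c)):
        for doc in docs:
            for term, weight in doc.items():
                # Average the contribution over the documents marked this round.
                new_query[term] += sign * factor * weight / len(docs)
    # Assumption: terms whose weight becomes non-positive leave the query.
    return {t: w for t, w in new_query.items() if w > 0}


query = {"java": 1.0}
relevant = [{"java": 1.0, "coffee": 1.0}]    # marked relevant by the user
irrelevant = [{"java": 1.0, "sdk": 1.0}]     # marked irrelevant by the user
updated = rocchio_update(query, relevant, irrelevant)
```

In this toy round the term "coffee" enters the query, "sdk" is pushed out, and "java" is reinforced because the relevant weight outweighs the irrelevant one.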
  • 18. • java +coffee +flavor +buy -programming -island -sdk • java +island +christmas +buy +vacation -coffee -programming -sdk • java +programming +example +arraylist -island -coffee -sdk
  • 23. IR + RF for concept location • Developers can focus on source code rather than queries • Developers writing poorer queries can get results as good as the ones writing good queries • Process: – Corpus creation – Indexing – Query formulation – Ranking documents – Results examination – Feedback 23
  • 24. Corpus creation • First stage in using an IR technique • Source code is treated as a text corpus: identifiers and comments • The source code is parsed at a developer-defined granularity level (e.g., methods or classes) • Each method (or class) is extracted from the source code and has a corresponding document in the corpus • Common terms are eliminated from the corpus • Natural language processing (NLP) techniques can be applied to the corpus (e.g., stemming)
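
The corpus creation step can be sketched roughly as follows, assuming each method arrives as a plain string. The function name `method_to_document`, the tiny stop list and the camelCase splitting are assumptions made for illustration; the slide only states that identifiers and comments are used and that common terms are eliminated. Stemming is left out.

```python
import re

# Hypothetical stop list; a real one would be much longer.
STOP_WORDS = {"the", "a", "of", "in", "to", "public", "void", "int", "return"}


def method_to_document(source_text):
    """Turn the text of one method into a list of corpus terms."""
    # Keep alphabetic runs only (drops punctuation, operators, digits).
    words = re.findall(r"[A-Za-z]+", source_text)
    terms = []
    for word in words:
        # Assumption: split camelCase identifiers into their parts.
        parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", word)
        terms.extend(p.lower() for p in parts)
    # Eliminate common terms.
    return [t for t in terms if t not in STOP_WORDS]


doc = method_to_document(
    "public void createContextInfoPopup() { // show the popup }"
)
```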
  • 25. Indexing • The corpus is indexed using the IR technique • A mathematical representation of the corpus is created - Vector Space Model (VSM) • Each document (i.e., each method or class) has a corresponding vector in the VSM 25
  • 26. Query formulation • A developer selects a set of words from the change request which constitutes the initial query • If filtering or NLP was used in the corpus creation, the query will get the same treatment • Simpler approach: consider the entire change request as the initial query • Query is also represented in VSM as a vector 26
  • 27. Ranking documents • Similarities between the query and every document from the source code are computed • The documents in the corpus are ranked according to the similarity to the query • The similarity measure depends on the IR method used – Cosine similarity: angle between the vector representations of the query and the document 27
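
The indexing and ranking steps can be illustrated with a small vector space model. This is a sketch under stated assumptions: it uses a plain TF-IDF weighting and cosine similarity written from scratch, which is not the scoring formula of any particular IR engine. The method `paintBorder` and all term lists are invented for the example.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """documents: dict name -> list of terms. Returns (vectors, idf)."""
    n_docs = len(documents)
    doc_freq = Counter(t for terms in documents.values() for t in set(terms))
    idf = {t: math.log(n_docs / df) + 1.0 for t, df in doc_freq.items()}
    vectors = {
        name: {t: tf * idf[t] for t, tf in Counter(terms).items()}
        for name, terms in documents.items()
    }
    return vectors, idf


def cosine(u, v):
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(query_terms, documents):
    """Return document names ordered by decreasing similarity to the query."""
    vectors, idf = tfidf_vectors(documents)
    # Query terms that never occur in the corpus get weight 0.
    query = {t: tf * idf.get(t, 0.0) for t, tf in Counter(query_terms).items()}
    scores = {name: cosine(query, vec) for name, vec in vectors.items()}
    return sorted(scores, key=scores.get, reverse=True)


corpus = {
    "createContextInfoPopup": ["create", "context", "info", "popup"],
    "configure": ["configure", "content", "assist", "preference"],
    "paintBorder": ["paint", "border", "color"],  # hypothetical method
}
ranking = rank(["context", "information", "popup"], corpus)
```

Note that the query word "information" contributes nothing here because the corpus only contains "info", which is the vocabulary mismatch the talk describes.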
  • 28. Results examination • Iterative process • Steps: 1. The developer examines the top N documents in the ranked list of results. 2. For every document examined, the developer decides whether the document will be changed. 3. If it will be changed, the search has succeeded and concept location ends. 4. Otherwise, the developer marks the document as relevant or irrelevant; if a decision cannot be made, the developer moves on to the next document in the list. 5. After the N documents are marked, a new query is automatically formulated.
  • 29. Change Request = Initial Query: JFace Text Editor Leaves a Black Rectangle on Content Assist text insertion. Inserting a selected completion proposal from the context information popup causes a black rectangle to appear on top of the display. -> IR -> ranked results: 1. createContextInfoPopup() in org.eclipse.jface.text.contentassist.ContextInformationPopup 2. configure() in org.eclipse.jdt.internal.debug.ui.JDIContentAssistPreference 3. showContextProposals() in org.eclipse.jface.text.contentassist.ContextInformationPopup -> RF (+ words in documents, - words in documents) -> New Query
  • 30. IRRF tool • IR Engine: Lucene – Based on the Vector Space Model (VSM) – Input: methods, query – Output: a ranked list of methods ordered by their textual similarity to the query • Relevance feedback: Rocchio algorithm – Models a way of incorporating relevance feedback information into the VSM – Relevance tags: “relevant”, “irrelevant”, “don’t know” 30
  • 31. First study (ICSM ’09) • Goal: study whether the use of RF improves IR • One developer providing RF – 7 years of programming experience (5 years in Java) – Not familiar with the systems used • Concept location for 16 bug reports from 3 systems – Extracted the text of the bug reports and the list of changed methods from bug tracking systems – The modified methods are called target methods – Task: locate one of the target methods
  • 32. Methodology • Consider the textual description of the bug as the initial query • Perform concept location: – Using only IR – Using IRRF (N = 1, 3, 5 documents in a round of feedback) • Metric: number of methods analyzed until finding a target method • IRRF: iterative – stop when: – Target method in top N results - Success – More than 50 methods analyzed - Failure – Position of target methods in the ranked list of results increases for 2 consecutive rounds (query moving away from target) - Failure
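
The three stopping rules can be written down as a small function. This is only a sketch: `irrf_outcome` is a hypothetical name, the input is assumed to be the best rank of a target method after each round, and the way the methods analyzed are counted on success (earlier rounds plus the target's rank in the last one) is an assumption, not a detail given in the talk.

```python
def irrf_outcome(target_positions, n, max_methods=50):
    """Apply the stopping rules to a sequence of target ranks.

    target_positions[i] is the best rank of a target method after round i.
    Each round the developer is assumed to analyze n methods.
    Returns (outcome, methods_analyzed).
    """
    analyzed = 0
    worse_streak = 0
    previous = None
    for position in target_positions:
        if position <= n:
            # Target method is in the top N results.
            return "success", analyzed + position
        analyzed += n
        if analyzed > max_methods:
            return "failure: more than 50 methods analyzed", analyzed
        got_worse = previous is not None and position > previous
        worse_streak = worse_streak + 1 if got_worse else 0
        if worse_streak == 2:
            # Rank worsened in 2 consecutive rounds: query moving away.
            return "failure: query moving away from target", analyzed
        previous = position
    return "undecided", analyzed


outcome = irrf_outcome([54, 20, 3], n=5)
```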
  • 33. Systems
      System      Vers.   LOC         Methods   Classes
      Eclipse     2.0     2,500,000   74,996    7,500
      jEdit       4.2     300,000     5,366     750
      Adempiere   3.1.0   330,000     28,622    1,900
  • 34. Results
      System      RF improves IR   RF does not improve IR
      Eclipse     6                1
      jEdit       3                3
      Adempiere   4                1
      Total       13               5
  • 35. Conclusions of the first study • RF is a promising technique for easing query formulation during concept location • RF improves IR-based concept location in most cases • In some cases RF does not improve IR (user misled by the poor description of the bug)
  • 36. Some limitations of the study • RF depends on a series of parameters – we chose the values based on literature and preliminary tests – we used the same values for all systems • Only one developer provided feedback • CL ended automatically when a target method was retrieved in the top n results 36
  • 37. Second study - goals • Address the limitations of the first study • Parameters: – Design a methodology to automatically find the most suited parameters for a particular software system – Calibrate the parameters separately for each system • Number of developers: – Collect data from a larger number of developers • Realistic concept location process (CL): – Does not end automatically when target method is in top N results – Developers can change their mind about the relevance of a method 37
  • 38. Calibration of parameters • Implemented a tool that automatically finds the RF parameters – Number of documents in a round of feedback (N) – Weight of terms in the initial query (a) – Weight of terms from relevant documents (b) – Weight of terms from irrelevant documents (c) • Idea: mimic CL on a software system using a set of bug reports and their target methods • 4 change requests per system, 4 software systems 38
  • 39. Calibration of parameters (contd) • For each change request - 500 configurations: – N = 1, 2, 3, 4, 5 – a = 0.25, 0.5, 0.75, 1 – b = 0, 0.25, 0.5, 0.75, 1 – c = 0, 0.25, 0.5, 0.75, 1 • For each configuration {N, a, b, c}: – Used ASTAR algorithm to simulate an ideal user: find the optimum set of relevance tags for the methods (relevant or irrelevant) so that the target methods are retrieved the fastest – Recorded the number of methods analyzed to find a target method • Chose the parameter combination that led to the best results for all 4 change requests for a system • Findings: each system has a different configuration of parameters
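
The 500 configurations follow from the value lists: 5 × 4 × 5 × 5 = 500. A sketch of the grid and of picking one configuration per system is shown below. The selection rule (smallest total number of methods analyzed over the change requests) and the toy cost function are assumptions; in the study the cost came from simulating an ideal user, which is not reproduced here.

```python
from itertools import product

N_VALUES = [1, 2, 3, 4, 5]
A_VALUES = [0.25, 0.5, 0.75, 1]
B_VALUES = [0, 0.25, 0.5, 0.75, 1]
C_VALUES = [0, 0.25, 0.5, 0.75, 1]

# Every {N, a, b, c} combination.
configurations = list(product(N_VALUES, A_VALUES, B_VALUES, C_VALUES))


def best_configuration(configs, change_requests, methods_analyzed):
    """Pick the configuration with the lowest total cost.

    methods_analyzed(config, request) -> number of methods analyzed
    before a target method is found (lower is better).
    """
    return min(
        configs,
        key=lambda cfg: sum(methods_analyzed(cfg, cr) for cr in change_requests),
    )


def toy_cost(cfg, request):
    """Hypothetical stand-in for the simulated concept location run."""
    n, a, b, c = cfg
    return abs(n - 3) + abs(a - 0.5) + abs(b - 0.75) + abs(c - 0.25) + request


best = best_configuration(configurations, [1, 2, 3, 4], toy_cost)
```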
  • 40. Study setup • Participants: 18 students in a Data Mining class at West Virginia University • Java quiz • Systems and change requests: – 4 Java systems, different domains and sizes – 4 bug fix requests for each system (different from the ones used for calibration) • IRRF Tool: integrates the parameters determined in the calibration phase
  • 41. Change in methodology • Relevance marks: “relevant”, “irrelevant”, “don’t know”, “target” • Concept location stops when: – The user marks a method as “target” – The user has analyzed more than 50 methods • Users can change their mind about the relevance of a method at any time; the query is reformulated accordingly
  • 42. Status • Data collection finalized • Data analysis in progress 42
  • 43. Conclusions • RF is a good addition to IR for concept location – Easier query formulation – Better results • RF parameters need to be adjusted automatically to each software system • More studies are needed to better understand the situations in which RF does not improve IR
  • 44. Future work • Relevance feedback in IDEs, to help developers search and navigate source code • RF for other software engineering tasks • Other RF algorithms • Other types of relevance feedback: implicit, pseudo • Study how developers decide if a document is relevant or not to their task 44