Evaluating How Developers Use General-Purpose Web-Search for Code Retrieval
Date: May 29, 2018
Md Masudur Rahman, Jed Barson, Sydney Paul, Joshua Kayan, Federico Andres Lois, Sebastian Fernandez Quezada, Christopher Parnin, Kathryn T. Stolee, Baishakhi Ray
Coding Task
Convert a date string to a time object

Search Log
string to time
string to time using java (adds the language: Java)
date string to DateTime using java (adds the target type: DateTime)
date string to DateTime using Joda Time library (adds the library: Joda Time)
world cup fixtures
place to visit in gothenburg
Code Query
string to time
string to time using java
date string to DateTime using java
date string to DateTime using Joda Time library

Search Task
Convert a date string to a DateTime object using Joda Time library

Code vs Non-code
Code queries are the four listed under Code Query above.
Non-code queries:
world cup fixtures
place to visit in gothenburg
hotel in gothenburg

General Purpose Search Engine for Code Retrieval
Research Goal
Compare code and non-code queries on:
๏ Query characteristics
๏ User behavior
Dataset
Query search log collected with a Chrome plugin
Users: 310 (mostly developers)
Total queries: 150K
Consists of code and non-code queries
The log carries no label (code or non-code), so a query classifier is needed
Intent-based Query Classification

Code Intent Analysis
Query: javascript function to get mp3 play length
Token Code Intent (S = set of code-related tags, n = popularity of a tag):
javascript 17
function 7
to 0
get 6
mp3 5
play 8
length 3
Query Code Intent (CodeScore): 17 + 7 + 0 + 6 + 5 + 8 + 3 = 46
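
The worked example (token scores 17, 7, 0, 6, 5, 8, 3 and a CodeScore of 46) is consistent with the query score being the sum of its token scores. A minimal Python sketch under that assumption follows; the function and variable names are hypothetical, and the token scores are taken as given inputs because the slide text does not show how they are derived from tag popularity.

```python
# Per-token code-intent scores copied from the worked example.
# How each score is derived from tag popularity is not shown in the slides,
# so the scores are treated as given inputs (an assumption of this sketch).
TOKEN_CODE_INTENT = {
    "javascript": 17,
    "function": 7,
    "to": 0,
    "get": 6,
    "mp3": 5,
    "play": 8,
    "length": 3,
}


def query_code_score(query, token_scores):
    """Sum the code-intent scores of the whitespace-separated query tokens.

    Tokens missing from the score table contribute 0 (assumption).
    """
    return sum(token_scores.get(token, 0) for token in query.lower().split())


score = query_code_score(
    "javascript function to get mp3 play length", TOKEN_CODE_INTENT
)
print(score)  # 46
```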
Query Code Score
Query                                 Code Score   Label
string to time                        12           Code
string to time using java             20           Code
date string to DateTime using java    22.5         Code
world cup fixtures                    0            Non-code
messi curly goal                      2.6          Non-code
place to visit in gothenburg          0            Non-code

Annotated Data
Code: 89K (59%)
Non-code: 61K (41%)

Classifier Evaluation
Manually annotated 380 queries
Threshold = 10
Precision: 87%
Recall: 86%
F1-score: 87%
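
The labelling step can be sketched as a simple cut-off on the query code score. The Python below uses the scores and the threshold of 10 from the slides; treating a score exactly equal to the threshold as code is an assumption, since the slides do not say whether the comparison is strict, and the function names are hypothetical. The `f1_score` helper is the standard harmonic mean of precision and recall.

```python
THRESHOLD = 10  # threshold reported on the slide

# Query code scores copied from the slide's table.
QUERY_SCORES = {
    "string to time": 12,
    "string to time using java": 20,
    "date string to DateTime using java": 22.5,
    "world cup fixtures": 0,
    "messi curly goal": 2.6,
    "place to visit in gothenburg": 0,
}


def label_query(code_score, threshold=THRESHOLD):
    """Label a query from its code score.

    Assumption: a score at or above the threshold counts as code.
    """
    return "Code" if code_score >= threshold else "Non-code"


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


labels = {query: label_query(score) for query, score in QUERY_SCORES.items()}
for query, label in labels.items():
    print(f"{query}: {label}")

# With the rounded precision and recall from the slide:
print(round(f1_score(0.87, 0.86), 3))  # 0.865
```

The rounded inputs give an F1 of about 0.865, which agrees with the reported 87% to within rounding.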
Research Questions
Query Characteristics
RQ1. How do query characteristics differ for code and non-code queries?
User Behavior
RQ2. How do search behaviors vary for code and non-code related queries?
RQ3. How do task sessions vary for code and non-code related search tasks?

Results
RQ1: Query Characteristics
Code queries are often longer (more tokens) than non-code queries
Code:
date string to DateTime using java
date string to DateTime using Joda Time library
javascript function to get mp3 play length
Non-code:
world cup fixtures
messi curly goal
hotel in gothenburg

[Figure: vocabulary of code vs non-code queries; labels 16K, 12K, 33K]
Code queries contain a smaller vocabulary (fewer unique tokens) than non-code queries
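
Both RQ1 measures, query length in tokens and vocabulary size in unique tokens, can be computed along the following lines. This is a sketch only: whitespace tokenization and lowercasing are assumptions, the helper names are hypothetical, and the sample is just the six example queries from the slides.

```python
def tokenize(query):
    """Split a query into lowercase whitespace-separated tokens (assumption)."""
    return query.lower().split()


def average_length(queries):
    """Mean number of tokens per query."""
    return sum(len(tokenize(q)) for q in queries) / len(queries)


def vocabulary(queries):
    """Set of unique tokens across all queries."""
    return {token for q in queries for token in tokenize(q)}


code_queries = [
    "date string to DateTime using java",
    "date string to DateTime using Joda Time library",
    "javascript function to get mp3 play length",
]
non_code_queries = [
    "world cup fixtures",
    "messi curly goal",
    "hotel in gothenburg",
]

print(average_length(code_queries))      # 7.0
print(average_length(non_code_queries))  # 3.0

code_vocab = vocabulary(code_queries)
non_code_vocab = vocabulary(non_code_queries)
shared_vocab = code_vocab & non_code_vocab
print(len(code_vocab), len(shared_vocab), len(non_code_vocab))  # 15 0 9
```

The length gap already shows in this small sample. The vocabulary finding does not: it is a property of the full query log, and six queries are far too few to reproduce it.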
RQ2: Query Search Behavior
Query                                             # terms added   # terms deleted
Code
string to time                                    -               -
string to time using java                         2               -
date string to DateTime using Joda Time library   4               2
Non-code
hotel in gothenburg                               -               -
best hotel in gothenburg                          1               -

Users often add or delete more terms when editing a code query (avg. 2) than when editing a non-code query (avg. 1)

Query                                             # terms added   # terms deleted   Code Score
Code
string to time                                    -               -                 12
string to time using java                         2               -                 20
date string to DateTime using Joda Time library   4               2                 30.5

Users edit queries to increase code intent
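
Counting the terms added and deleted between two consecutive queries can be sketched with set differences. The tokenization (lowercased, whitespace-separated, duplicates ignored) is an assumption and the function name is hypothetical; counts for other query pairs depend on these choices and need not match the figures in the table.

```python
def term_edits(previous_query, edited_query):
    """Return (terms_added, terms_deleted) between two consecutive queries.

    Assumption: terms are compared as lowercase sets, so word order and
    repeated words are ignored.
    """
    previous_terms = set(previous_query.lower().split())
    edited_terms = set(edited_query.lower().split())
    added = edited_terms - previous_terms
    deleted = previous_terms - edited_terms
    return len(added), len(deleted)


print(term_edits("string to time", "string to time using java"))     # (2, 0)
print(term_edits("hotel in gothenburg", "best hotel in gothenburg"))  # (1, 0)
```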
RQ3: Task Search Behavior
Code task
Task intent: Converting a date string to a Time object
Queries: string to time; string to time using java; date string to DateTime using Joda Time library
# queries: 4
Non-code task
Task intent: Hotel booking in Gothenburg
Queries: hotel in gothenburg; best hotel in gothenburg
# queries: 2
More queries are required to complete a code task

RQ3: Task Search Behavior
Code task
Task intent: Converting a date string to a Time object
Queries: string to time; string to time using java; date string to DateTime using Joda Time library
Search duration: 6 minutes
# web visits: 15
Non-code task
Task intent: Hotel booking in Sweden
Queries: hotel in gothenburg; hotel in stockholm
Search duration: 2 minutes
# web visits: 5
More time and more website visits are required to complete code-related tasks
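
Per-task effort can be compared with a small aggregation such as the one below. This is an illustrative sketch: the `SearchTask` record and `average_effort` function are hypothetical, the values are the example figures from the slides rather than dataset results, and tasks are assumed to be grouped already, since the slides do not show how sessions are split into tasks.

```python
from dataclasses import dataclass


@dataclass
class SearchTask:
    """One search task (hypothetical record layout)."""
    intent: str
    is_code: bool
    query_count: int
    duration_minutes: float
    web_visits: int


def average_effort(tasks, is_code):
    """Average query count, duration and web visits for code or non-code tasks."""
    selected = [task for task in tasks if task.is_code == is_code]
    count = len(selected)
    return {
        "avg_queries": sum(t.query_count for t in selected) / count,
        "avg_duration_minutes": sum(t.duration_minutes for t in selected) / count,
        "avg_web_visits": sum(t.web_visits for t in selected) / count,
    }


# Example values taken from the slides' illustrative tasks.
tasks = [
    SearchTask("Converting a date string to a Time object", True, 4, 6, 15),
    SearchTask("Hotel booking", False, 2, 2, 5),
]

print(average_effort(tasks, is_code=True))
print(average_effort(tasks, is_code=False))
```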
Summary
Code vs Non-Code on a General Search Engine
Code queries are linguistically different
Users modify code queries more often
Users spend significantly more effort on code tasks
Code search is less effective
Special treatment is required to improve code retrieval

Questions?
