Code-tagging and similarity-based retrieval with myCBR - Presentation Transcript
CAMBRIDGE, UK, 10 DEC 2008
Code-tagging and similarity-based retrieval with myCBR
Thomas Roth-Berghofer & Daniel Bahls
Senior researcher, trb@dfki.de
German Research Centre for Artificial Intelligence DFKI GmbH
Saturday, 18 July 2009
Programmer’s dilemma
• Where is the code fragment I used to solve a
similar problem in the past?
• Is this piece of code still available?
• Is it worth the effort to search for it?
• If so, what would be the right search term?
Personalised approach
• Personal vocabulary: tags
• Linking tags
• Case-based retrieval
• Work context
• Social dimension: tag exchange
CBR cycle
Agnar Aamodt and Enric Plaza. Case-based reasoning: Foundational issues,
methodological variations, and system approaches. AI Communications, 7(1):39–59, 1994.
Case structure
Legend: Set by user / Set by coTag

Attribute       Value type          Category
Tags            String (multiple)   Problem description
Context items   String (multiple)   Problem description
Code snippet    String              Solution
Document type   String              Provenance
Project name    String              Provenance
File path       String              Provenance
Author ID       String              Provenance
Creation date   Long                Provenance
Rating          Float               Maintenance
Rating count    Integer             Maintenance
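
As a rough illustration, the case structure above could be modelled as a plain record. The sketch is written in Python for brevity (coTag itself is an Eclipse plug-in). The class name SnippetCase, the field names, the default values, and the reading of the Long creation date as epoch milliseconds are all assumptions made for this example, not coTag's actual data model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SnippetCase:
    """Hypothetical record mirroring the attributes listed on the slide."""

    # Problem description
    tags: List[str] = field(default_factory=list)
    context_items: List[str] = field(default_factory=list)
    # Solution
    code_snippet: str = ""
    # Provenance
    document_type: str = ""
    project_name: str = ""
    file_path: str = ""
    author_id: str = ""
    creation_date: int = 0  # "Long" on the slide; epoch milliseconds is an assumption
    # Maintenance
    rating: float = 0.0
    rating_count: int = 0


# Illustrative case using the tags from the matching example in the talk.
example = SnippetCase(
    tags=["init", "Logger"],
    code_snippet="Logger log = Logger.getLogger(Main.class);",
    document_type="java",
)
```

Grouping the fields by category (problem description, solution, provenance, maintenance) keeps the split between what is searched and what is returned visible in the code.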
Acquiring case
Query view
• Search for tags: init, logging, config
• Include context => take the currently selected code into account
Retrieval
• Result for: init, logging, config
• Ranked list of code snippets
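
To make the idea of a ranked result list concrete, here is a minimal Python sketch of similarity-based ranking over tagged snippets. It is not myCBR's retrieval engine: the scoring rule (average, over the query tags, of the best similarity to any case tag), the function names, and the toy case base are assumptions chosen for illustration. The tag similarity is pluggable and defaults to a plain exact match.

```python
from typing import Callable, Dict, List, Tuple

TagSimilarity = Callable[[str, str], float]


def exact_match(a: str, b: str) -> float:
    """Baseline tag similarity: 1.0 for equal tags (ignoring case), else 0.0."""
    return 1.0 if a.lower() == b.lower() else 0.0


def case_score(query_tags: List[str], case_tags: List[str],
               sim: TagSimilarity = exact_match) -> float:
    """Average, over the query tags, of the best similarity to any case tag."""
    if not query_tags or not case_tags:
        return 0.0
    best = [max(sim(q, t) for t in case_tags) for q in query_tags]
    return sum(best) / len(best)


def retrieve(query_tags: List[str], case_base: Dict[str, List[str]],
             sim: TagSimilarity = exact_match) -> List[Tuple[str, float]]:
    """Return (case name, score) pairs, best match first."""
    scored = [(name, case_score(query_tags, tags, sim))
              for name, tags in case_base.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical case base: snippet names mapped to their tags.
case_base = {
    "snippet_a": ["init", "Logger"],
    "snippet_b": ["init", "logging", "config"],
    "snippet_c": ["xml", "parser"],
}
ranking = retrieve(["init", "logging", "config"], case_base)
```

With exact matching, snippet_a only gets credit for the tag init. Passing a softer tag similarity as sim would let a tag such as Logger contribute to a query for logging as well.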
Presentation of cases
Situations in which explanations play a role
• Instructing explanations:
  • Novice users want to know how tagging and (similarity-based) retrieval work.
• Convincing explanations:
  • Regular users want to check when the retrieval does not meet their expectations.
• Improving explanations:
  • Regular users want to correct coTag’s behaviour.
Explanation of matching
• Search terms:
  • init, logging, config
• Case tags:
  • init, Logger
Graphical explanation of trigram matching
• Syntactical similarity
• Typos
• Stemming
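
A small Python sketch may help show why trigram matching tolerates typos and word-form variations. It compares the sets of character trigrams of two tags using the Jaccard coefficient. The padding scheme, the lower-casing, and the choice of Jaccard are assumptions for this example; the exact trigram measure used by myCBR may differ.

```python
from typing import Set


def trigrams(text: str) -> Set[str]:
    """Character trigrams of the lower-cased text, padded so that
    word boundaries also contribute trigrams (padding is an assumption)."""
    padded = "  " + text.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def trigram_similarity(a: str, b: str) -> float:
    """Jaccard overlap of the two trigram sets, in the range 0.0 to 1.0."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


# Search term vs. case tag from the matching example.
sim_word_forms = trigram_similarity("logging", "Logger")
# A transposition typo still shares some trigrams with the intended tag.
sim_typo = trigram_similarity("config", "cofnig")
# Unrelated tags share none.
sim_unrelated = trigram_similarity("config", "parser")
```

Because the shared prefix of logging and Logger produces common trigrams, the two tags get a score above zero without any stemming rules being defined.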
Similarity customisation
• Tag similarities:
  unsimilar        0%
  partly similar  25%
  similar         50%
  very similar    75%
  identical      100%
• Updates personal and community similarity measure
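
The five similarity levels map naturally onto numeric values. The following Python sketch shows one possible way to record a user's choice in both a personal and a community table. The table layout (unordered tag pairs as keys), the function name set_similarity, and the way the community table is updated are assumptions for illustration, not coTag's implementation.

```python
from typing import Dict, FrozenSet

# Labels and percentages as shown on the slide, expressed as fractions.
SIMILARITY_LEVELS = {
    "unsimilar": 0.0,
    "partly similar": 0.25,
    "similar": 0.5,
    "very similar": 0.75,
    "identical": 1.0,
}

SimilarityTable = Dict[FrozenSet[str], float]


def set_similarity(table: SimilarityTable, tag_a: str, tag_b: str,
                   level: str) -> float:
    """Store the chosen level for an unordered pair of tags (hypothetical helper)."""
    value = SIMILARITY_LEVELS[level]
    table[frozenset((tag_a.lower(), tag_b.lower()))] = value
    return value


personal: SimilarityTable = {}
community: SimilarityTable = {}

# The user states that "logging" and "Logger" are very similar;
# both measures are updated, as described on the slide.
for table in (personal, community):
    set_similarity(table, "logging", "Logger", "very similar")
```

Using an unordered pair as the key keeps the relation symmetric, so looking up (Logger, logging) returns the same value as (logging, Logger).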
Three levels of similarity calculation
Personal
Imported
Trigram
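
One way to read the three levels is as a lookup chain. The Python sketch below assumes that a personal entry takes precedence over an imported one, and that the trigram measure is only used when neither table knows the tag pair. This precedence follows the order on the slide but is an assumption, as are all names in the code. The syntactic measure is passed in as a function; the example uses a trivial stand-in instead of a real trigram measure.

```python
from typing import Callable, Dict, FrozenSet, Tuple

SimilarityTable = Dict[FrozenSet[str], float]


def pair_key(a: str, b: str) -> FrozenSet[str]:
    """Unordered, case-insensitive key for a pair of tags."""
    return frozenset((a.lower(), b.lower()))


def layered_similarity(a: str, b: str,
                       personal: SimilarityTable,
                       imported: SimilarityTable,
                       syntactic: Callable[[str, str], float]) -> Tuple[float, str]:
    """Return (similarity, level used); assumed order: personal, imported, trigram."""
    key = pair_key(a, b)
    if key in personal:
        return personal[key], "personal"
    if key in imported:
        return imported[key], "imported"
    return syntactic(a, b), "trigram"


# Hypothetical tables and a stand-in for the trigram measure.
personal = {pair_key("logging", "Logger"): 0.75}
imported = {pair_key("logging", "Logger"): 0.5,
            pair_key("config", "settings"): 0.75}


def stand_in(a: str, b: str) -> float:
    return 1.0 if a.lower() == b.lower() else 0.0


first = layered_similarity("logging", "Logger", personal, imported, stand_in)
second = layered_similarity("config", "settings", personal, imported, stand_in)
third = layered_similarity("init", "init", personal, imported, stand_in)
```

Returning the level alongside the value is a design choice made here so that an explanation can say where a similarity came from.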
Customised (personal) and imported similarity
Client-side architecture
Tag and exchange code snippets
Take home messages
• Re-finding information is a typical task in knowledge work.
• Tagging is a helpful and well-known technique.
• Similarity-based retrieval can improve searches.
• Explanation-aware development of applications helps you deal with the increased complexity of similarity-based retrieval.
Thank you!
This paper describes the code-tagging plug-in coTag, which allows annotating code snippets in the Eclipse integrated development environment. coTag offers an easy-to-use interface for tagging and searching. Using the similarity-based search engine of the open-source tool myCBR, the user can search not only for exactly matching tags, as offered by other code-tagging extensions, but also for similar tags and thus for similar code snippets. coTag provides means for context-based adding of new similarity links between tags and changing of existing ones, supported by myCBR’s explanation component.