Live API Documentation

Subramanian, S., Inozemtseva, L., & Holmes, R. (2014, May). Live API documentation.
In ICSE (pp. 643-652).
Presenter: Hossein Mobasher
Course: Software Evolution

Contents
• Introduction
• Previous works
• What’s new?
• Scenario
• Oracle Generation
• Problems
• Approach
• Example
• Evaluation
• Conclusion

Introduction
• APIs enable complex functionality to be used by client programs 
• Understanding how to use an API can be difficult 
• API documentation is often insufficient on its own.
• Writing documentation and keeping it up to date is very difficult 
• Developers ignore the documentation that does exist and declare that “code
is king”

Introduction (continued)
• Online sites fill the gap between traditional API documentation and
more example-based resources 
• StackOverflow
• GitHub Gists
• Unfortunately, these two important classes of documentation are
independent 
• Baker links source code examples to API documentation 

Previous works
• Identify source code references within non-code resources.
• These approaches have several limitations:
• Some systems explicitly ignored external references.
• Others only returned partially qualified names (PQN).
• PQNs are insufficient for documentation linking.
• None of them worked for dynamically-typed languages.

What’s new?
• A constraint-based, iterative approach for determining the fully
qualified names of code elements in source code snippets.
• A prototype tool that implements this approach and uses the results
to automatically create bidirectional links between documentation
and source code examples.

Scenario 1
• Code snippet posted to StackOverflow to assist a developer who
didn’t understand how to manipulate the state of History objects.
• Baker can uniquely link bolded elements to the API.
• These are the elements for which it can determine a fully qualified name (FQN).

Scenario 2
• Code snippet from a developer who is trying to build a web app that can
take a photo and inject it into an element in an HTML document.
• Baker can also identify the API that the bolded elements come from.

Oracle Generation
• Baker’s oracle is key to its success.
• It is generally impossible to identify the FQNs of the code elements in a
snippet using the snippet alone.
• FQNs are essential to documentation linking tasks.
• The oracles are implemented as web services.
• This allows the Java/JS oracles to be updated dynamically by any user or program.

Oracle Generation (continued)
• Java Oracle
• Containing class, method and field signatures.
• Using Neo4j to represent the hierarchies between code elements that an
object-oriented language like Java offers.
• Oracle includes full type information in the database.
• The types of classes, fields, return types, and parameters.
• The Java oracle can be dynamically updated by adding an appropriate JAR file.
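A minimal in-memory sketch of what such an oracle stores, assuming a plain Python dictionary in place of Neo4j. The class names, fields, and the demo.* types are hypothetical and only illustrate the idea of mapping a simple name to candidate FQNs and following the type hierarchy to find a method signature:

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TypeEntry:
    fqn: str                           # fully qualified name of the type
    superclass: Optional[str] = None   # FQN of the parent type, if recorded
    # method name -> (parameter types, return type)
    methods: dict = field(default_factory=dict)


class Oracle:
    """Toy stand-in for the graph database; not Baker's actual schema."""

    def __init__(self):
        self.by_fqn = {}      # FQN -> TypeEntry
        self.by_simple = {}   # simple name -> list of FQNs

    def add(self, entry):
        self.by_fqn[entry.fqn] = entry
        simple = entry.fqn.rsplit(".", 1)[-1]
        self.by_simple.setdefault(simple, []).append(entry.fqn)

    def candidates(self, simple_name):
        # Every type in the oracle that shares this simple name.
        return list(self.by_simple.get(simple_name, []))

    def find_method(self, fqn, name):
        # Walk up the hierarchy until some type declares the method.
        while fqn is not None and fqn in self.by_fqn:
            entry = self.by_fqn[fqn]
            if name in entry.methods:
                return entry.fqn, entry.methods[name]
            fqn = entry.superclass
        return None


# Hypothetical entries, as if loaded from a JAR file.
oracle = Oracle()
oracle.add(TypeEntry("demo.base.Widget", None, {"show": ([], "void")}))
oracle.add(TypeEntry("demo.ui.Button", "demo.base.Widget", {"click": ([], "void")}))
oracle.add(TypeEntry("demo.other.Button", None, {"press": (["int"], "boolean")}))
```

Here oracle.candidates("Button") returns both Button types, which is the ambiguity the linker later has to resolve.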

Oracle Generation (continued)
• JS Oracle
• Is built by statically analyzing the source files of the libraries to be included.
• Uses Esprima to parse the source code of each library.
• Esprima returns a JSON representation of the AST.
• The JS oracle identifies all of the ‘FunctionExpression’ and ‘FunctionDeclaration’
nodes.
• Parsing problems:
• JavaScript is a dynamically typed language, so it is difficult to identify all method
declarations by static analysis of source code.
• JavaScript code is not annotated with visibility (e.g. public and private).
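The function-collection step can be sketched as a recursive walk over the JSON AST. This assumes the parser output has already been loaded into Python dictionaries and lists; the sample tree below is hand-written in the same node shape and is not real parser output. The helper name collect_functions is hypothetical:

```python
def collect_functions(node, found=None):
    """Collect names of FunctionDeclaration / FunctionExpression nodes."""
    if found is None:
        found = []
    if isinstance(node, dict):
        if node.get("type") in ("FunctionDeclaration", "FunctionExpression"):
            ident = node.get("id")  # None for anonymous function expressions
            found.append(ident["name"] if ident else "<anonymous>")
        # Recurse into every child value, whatever the node kind.
        for value in node.values():
            collect_functions(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_functions(item, found)
    return found


# Hand-written tree for: function greet() {}  var f = function () {};
sample_ast = {
    "type": "Program",
    "body": [
        {
            "type": "FunctionDeclaration",
            "id": {"type": "Identifier", "name": "greet"},
            "params": [],
            "body": {"type": "BlockStatement", "body": []},
        },
        {
            "type": "VariableDeclaration",
            "kind": "var",
            "declarations": [
                {
                    "type": "VariableDeclarator",
                    "id": {"type": "Identifier", "name": "f"},
                    "init": {
                        "type": "FunctionExpression",
                        "id": None,
                        "params": [],
                        "body": {"type": "BlockStatement", "body": []},
                    },
                }
            ],
        },
    ],
}

names = collect_functions(sample_ast)
```

The anonymous entry illustrates the parsing problem noted above: a function assigned to a variable has no name of its own, so a purely static pass needs extra work to recover what it is called.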

Problems
• Parsing code snippets is more difficult than parsing full files.
• Code snippets can be ambiguous.
• Kinds of ambiguity:
• Declaration Ambiguity
• External Reference Ambiguity

Approach
• Deductive Linking
• Handles declaration and external reference ambiguity.
• The goal is to identify the sole FQN that a given identifier can represent.
• Generates an AST for each code snippet.
• Uses information from the oracle to deduce facts about the AST.

Example (JavaBaker)
• History: 58 candidate types are recorded for History in oracle.
• addHistoryListener: Test the 58 candidate types to see which ones contain a
method called addHistoryListener(…) that takes a single object parameter.
• This results in 4 candidate methods.
• The History node is also updated (reduced from 58 to 4).
• History.getToken: Test the 4 candidates; the set is reduced from 4 to 2.
• …
• At the end, Baker iterates again.
• Baker can identify History as
com.google.gwt.user.client.History
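The narrowing in this example can be sketched as repeated filtering of a candidate set. This is a simplified illustration, not Baker's implementation: it models a single receiver type only, the toy oracle and its a.History … d.History entries are hypothetical, and methods are matched by name and parameter count alone:

```python
# Hypothetical oracle: candidate FQN -> {method name: number of parameters}
TOY_ORACLE = {
    "a.History": {"addListener": 1, "getToken": 0},
    "b.History": {"addListener": 1},
    "c.History": {"getToken": 0, "back": 0},
    "d.History": {"clear": 0},
}


def narrow(candidates, observed_calls):
    """Filter candidates by each observed (method, arity) call in turn.

    Returns the surviving candidates and the set size after each step.
    """
    remaining = set(candidates)
    sizes = [len(remaining)]
    for name, arity in observed_calls:
        remaining = {c for c in remaining if TOY_ORACLE[c].get(name) == arity}
        sizes.append(len(remaining))
    return sorted(remaining), sizes


# Calls seen on the ambiguous identifier in a (hypothetical) snippet.
calls = [("addListener", 1), ("getToken", 0)]
result, sizes = narrow(TOY_ORACLE.keys(), calls)
```

Each observed call shrinks the set, mirroring the 58 to 4 to 2 progression on the slide. A link is only produced once a single candidate remains.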

Example (JSBaker)
• $(…).on(…)
• $ matches only with jQuery.
• on matches with jQuery’s on method (even though there are three on
methods in the oracle).
• useGetPicture
• Oracle doesn’t contain a result for that.
• Baker records this function as locally
defined, rather than as an external function.
• …
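A rough sketch of the two decisions in this example, assuming a dictionary-based oracle. The library names libA and libB are placeholders (the slides do not say which libraries define the other on methods), and resolve is a hypothetical helper:

```python
# Hypothetical JS oracle: identifier -> list of known API elements.
JS_ORACLE = {
    "$": ["jQuery"],
    "on": ["jQuery.on", "libA.on", "libB.on"],
}


def resolve(name, receiver_api=None):
    """Classify an identifier as external (with candidates) or local."""
    candidates = JS_ORACLE.get(name, [])
    if not candidates:
        # Nothing in the oracle: treat it as defined in the snippet itself.
        return ("local", [])
    if receiver_api is not None:
        # Prefer candidates that belong to the API the receiver resolved to.
        scoped = [c for c in candidates if c.split(".")[0] == receiver_api]
        if scoped:
            candidates = scoped
    return ("external", candidates)


dollar = resolve("$")
on_call = resolve("on", receiver_api="jQuery")
helper = resolve("useGetPicture")
```

Knowing that $ is jQuery is what lets on collapse from three candidates to one.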

Evaluation
• Two research questions:
• Can Baker accurately identify API elements in code snippets?
• Does Baker work on a variety of systems, or is it limited to just a few libraries?

Linker Accuracy
• Precision is much more important than recall.
• Five Java systems (libraries) were chosen for analysis.
• Android/ GWT/ Hibernate/ Joda Time/ XStream
• Manually examined 50 code elements for each system to determine whether
the result returned by Baker was a:
• True Positive (TP)
• False Positive (FP)
• False Negative (FN)
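For reference, precision and recall follow from these three counts using the standard definitions. The counts in the example are illustrative only and are not taken from the paper:

```python
def precision(tp, fp):
    """Fraction of returned links that are correct: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Fraction of true links that were found: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Illustrative counts (not from the paper).
tp, fp, fn = 8, 2, 2
p = precision(tp, fp)
r = recall(tp, fn)
```

A false positive creates a wrong link in the documentation, while a false negative only leaves a link out, which is why precision matters more for this task.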

Linker Accuracy (continued)
• Baker’s overall Java precision is 0.98 and its recall is 0.83. Only exact
matches (cardinality = 1) were considered.

Linker Accuracy (continued)
• Baker’s overall JavaScript precision is 0.97 and its recall is 0.96. Only exact
matches (cardinality = 1) were considered.

Example Diversity
• JavaBaker parsed 4,000 source code examples.
• Identified over 30,000 links to 4,500 unique elements.
• JSBaker parsed 1,000 source code examples.
• Identified over 10,000 links to 500 unique elements.

Qualifying High-cardinality Match
• Linking methods may return more than one match when there isn’t
enough information to determine the FQN of a method or type.
• This is relatively rare.
• Graph shows the cardinality of the result
for each of the 4,000 snippets.
• JDK types and methods have been removed.
• The majority (69%) of elements can be
precisely identified.

Conclusion
• Maintaining API documentation is a challenging, time-consuming task.
• The documentation is frequently out of date.
• Baker automatically generates links between API documentation and
source code examples.
• Baker has high precision (0.97).
