Live API Documentation
Subramanian, S., Inozemtseva, L., & Holmes, R. (2014, May). Live API documentation.
In ICSE (pp. 643-652).
Presenter: Hossein Mobasher
Course: Software Evolution
Contents
• Introduction
• Previous works
• What’s new?
• Scenario
• Oracle Generation
• Problems
• Approach
• Example
• Evaluation
• Conclusion
Introduction
• APIs let client programs use complex functionality.
• Understanding how to use an API can be difficult.
• API documentation is often insufficient on its own.
• Writing documentation and keeping it up to date is very difficult.
• Developers ignore the documentation that does exist and declare that “code
is king”.
Introduction (continued)
• Online sites fill the gap between traditional API documentation and
more example-based resources:
• StackOverflow
• GitHub Gists
• Unfortunately, these two important classes of documentation are
independent of each other.
• Baker links source code examples to API documentation.
Previous work
• Earlier approaches identify source code references within non-code resources.
• These approaches have several limitations:
• Some systems explicitly ignored external references.
• Others only returned partially qualified names (PQNs).
• PQNs are insufficient for documentation linking.
• None of them worked for dynamically typed languages.
What’s new?
• A constraint-based, iterative approach for determining the fully
qualified names of code elements in source code snippets.
• A prototype tool that implements this approach and uses the results
to automatically create bidirectional links between documentation
and source code examples.
Scenario 1
• A code snippet posted to StackOverflow to assist a developer who
didn’t understand how to manipulate the state of History objects.
• Baker can uniquely link the bolded elements to the API.
• These are the elements for which it can determine a fully qualified name (FQN).
Scenario 2
• A code snippet from a developer who is trying to build a web app that can
take a photo and inject it into an element in an HTML document.
• Baker can also identify the API that the bolded elements come from.
Oracle Generation
• Baker’s oracle is key to its success.
• From a snippet alone, it is generally impossible to identify the FQNs of the
code elements it contains.
• FQNs are essential to documentation linking tasks.
• The oracles are implemented as web services.
• This allows the Java and JS oracles to be updated dynamically by any user or program.
Oracle Generation (continued)
• Java Oracle
• Contains class, method, and field signatures.
• Uses Neo4j to represent the hierarchies between code elements that an
object-oriented language like Java offers.
• The oracle includes full type information in the database.
• This covers the types of classes, fields, return values, and parameters.
• The Java oracle can be dynamically updated by adding an appropriate JAR file.
Oracle Generation (continued)
• JS Oracle
• Is built by statically analyzing the source files of the libraries to be included.
• Uses Esprima to parse the source code of each library.
• Esprima returns a JSON representation of the AST.
• The JS oracle identifies all of the ‘FunctionExpression’ and ‘FunctionDeclaration’
nodes.
• Parsing problems:
• JavaScript is a dynamically typed language, so it is difficult to identify all method
declarations by static analysis of source code.
• JavaScript code is not annotated with visibility (e.g. public and private).
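The function-collection step of the JS oracle can be illustrated with a minimal sketch. It assumes an ESTree-style AST of the kind Esprima emits as JSON; the dictionary below is hand-written to mimic that shape for a two-line script, and the names `takePhoto`, `show`, `iter_nodes`, and `collect_functions` are made up for illustration rather than taken from Baker.

```python
# Hand-written stand-in for the JSON AST of this hypothetical script:
#   function takePhoto() {}
#   var show = function () {};
SAMPLE_AST = {
    "type": "Program",
    "body": [
        {
            "type": "FunctionDeclaration",
            "id": {"type": "Identifier", "name": "takePhoto"},
            "params": [],
            "body": {"type": "BlockStatement", "body": []},
        },
        {
            "type": "VariableDeclaration",
            "kind": "var",
            "declarations": [
                {
                    "type": "VariableDeclarator",
                    "id": {"type": "Identifier", "name": "show"},
                    "init": {
                        "type": "FunctionExpression",
                        "id": None,  # anonymous function expression
                        "params": [],
                        "body": {"type": "BlockStatement", "body": []},
                    },
                }
            ],
        },
    ],
}

FUNCTION_NODE_TYPES = {"FunctionDeclaration", "FunctionExpression"}


def iter_nodes(node):
    """Yield every AST node (a dict with a "type" key), depth first."""
    if isinstance(node, dict):
        if "type" in node:
            yield node
        for value in node.values():
            yield from iter_nodes(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_nodes(item)


def collect_functions(ast):
    """Return (node type, function name or None) for each function node."""
    found = []
    for node in iter_nodes(ast):
        if node["type"] in FUNCTION_NODE_TYPES:
            name = (node.get("id") or {}).get("name")
            found.append((node["type"], name))
    return found


functions = collect_functions(SAMPLE_AST)
```

The anonymous function expression has no name of its own, which hints at why static analysis alone struggles to recover every method declaration in JavaScript.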
Problems
• Parsing code snippets is more difficult than full files.
• Code snippets can be ambiguous.
• Kinds of ambiguity:
• Declaration Ambiguity
• External Reference Ambiguity
Approach
• Deductive Linking
• Handles declaration and external reference ambiguity.
• The goal is to identify the single FQN that a given identifier can represent.
• Generates an AST for each code snippet.
• Uses information from the oracle to deduce facts about the AST.
Example (JavaBaker)
• History: 58 candidate types are recorded for History in the oracle.
• addHistoryListener: Test the 58 candidate types to see which ones contain a
method called addHistoryListener(…) that takes a single object parameter.
• This results in 4 candidate methods.
• The History node is also updated (reduced from 58 to 4 candidates).
• History.getToken: Test the 4 candidates; the set is reduced from 4 to 2.
• …
• At the end, Baker iterates again.
• Baker identifies History as
com.google.gwt.user.client.History
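The narrowing described in this example can be sketched as a small fixpoint loop. This is a simplified illustration and not Baker's implementation: the four-entry `ORACLE` is a toy stand-in for the real oracle (which records 58 candidates for History), every type other than com.google.gwt.user.client.History is invented, and methods are matched only by name and parameter count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Method:
    name: str
    arity: int  # number of parameters


# Toy oracle (hypothetical): fully qualified type name -> methods it declares.
ORACLE = {
    "com.google.gwt.user.client.History": {
        Method("addHistoryListener", 1),
        Method("getToken", 0),
    },
    "org.example.browser.History": {Method("addHistoryListener", 1)},
    "org.example.audit.History": {Method("getToken", 0)},
    "org.example.shell.History": set(),
}


def simple_name(fqn):
    """Return the unqualified type name, e.g. 'History'."""
    return fqn.rsplit(".", 1)[-1]


def link(type_name, calls):
    """Narrow the candidate FQNs for type_name using the calls seen on it.

    calls is a list of (method name, argument count) pairs taken from the
    snippet. The loop repeats until a full pass removes no candidate.
    """
    candidates = {fqn for fqn in ORACLE if simple_name(fqn) == type_name}
    changed = True
    while changed:
        changed = False
        for name, arity in calls:
            narrowed = {
                fqn for fqn in candidates if Method(name, arity) in ORACLE[fqn]
            }
            if narrowed != candidates:
                candidates = narrowed
                changed = True
    return candidates


result = link("History", [("addHistoryListener", 1), ("getToken", 0)])
```

A result set of size one corresponds to a cardinality-1 match; a larger set means the snippet did not carry enough information to pick a single FQN.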
Example (JSBaker)
• $(…).on(…)
• $ matches only with jQuery.
• on matches with jQuery’s on method (even though there are three on
methods in the oracle).
• useGetPicture
• The oracle doesn’t contain a result for it.
• Baker records this function as locally
defined, rather than as an external function.
• …
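A rough sketch of the two decisions in this example follows. It assumes a toy oracle keyed by identifier; the entries `libA.Emitter.on` and `libB.Socket.on` are invented placeholders for the other two on methods, and `resolve_call` and `classify` are hypothetical helper names, not Baker's API.

```python
# Toy oracle (hypothetical): identifier -> qualified names it may refer to.
ORACLE = {
    "$": {"jQuery"},
    "on": {"jQuery.on", "libA.Emitter.on", "libB.Socket.on"},
}


def resolve_call(receiver, method):
    """Keep only the method candidates owned by a candidate of the receiver."""
    owners = ORACLE.get(receiver, set())
    methods = ORACLE.get(method, set())
    return {
        m for m in methods
        if any(m.startswith(owner + ".") for owner in owners)
    }


def classify(identifier):
    """Treat identifiers the oracle does not know as locally defined."""
    return "external" if identifier in ORACLE else "local"


on_candidates = resolve_call("$", "on")
picture_kind = classify("useGetPicture")
```

Resolving the receiver first is what lets the three candidates for on collapse to one.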
Evaluation
• Two research questions:
• Can Baker accurately identify API elements in code snippets?
• Does Baker work on a variety of systems, or is it limited to just a few libraries?
Linker Accuracy
• Precision is much more important than recall.
• Five Java systems (libraries) were chosen for analysis:
• Android / GWT / Hibernate / Joda Time / XStream
• 50 code elements per system were manually examined to classify each
result returned by Baker as:
• True Positive (TP)
• False Positive (FP)
• False Negative (FN)
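For reference, the standard definitions of the two metrics in terms of these counts (the slides do not spell them out) are:

```latex
\[
\text{Precision} = \frac{TP}{TP + FP},
\qquad
\text{Recall} = \frac{TP}{TP + FN}
\]
```

A linker that emits few links, all of them correct, therefore scores high on precision and low on recall.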
Linker Accuracy (continued)
• Baker’s overall Java precision is 0.98 and its recall is 0.83. Only exact
matches (cardinality = 1) were considered.
Precision = 98%, Recall = 83%
Linker Accuracy (continued)
• Baker’s overall JavaScript precision is 0.97 and its recall is 0.96. Only exact
matches (cardinality = 1) were considered.
Precision = 97%, Recall = 96%
Example Diversity
• JavaBaker parsed 4000 source code examples.
• Identified over 30000 links to 4500 unique elements.
• JSBaker parsed 1000 source code examples.
• Identified over 10000 links to 500 unique elements.
Qualifying High-cardinality Match
• Linking methods may return more than one match when there isn’t
enough information to determine the FQN of a method or type.
• This is relatively rare.
• The graph shows the cardinality of the result
for each of the 4,000 snippets.
• JDK types and methods have been removed.
• The majority (69%) of elements can be
precisely identified.
Conclusion
• Maintaining API documentation is a challenging, time-consuming task.
• The documentation is frequently out of date.
• Baker automatically generates links between API documentation and
source code examples.
• Baker has high precision (0.97).
Questions?
