
Analyzing APIs Documentation and Code to Detect Directive Defects


Application Programming Interface (API) documents represent one of the most important references for API users. However, it is frequently reported that the documentation is inconsistent with the source code and deviates from the API itself. Such inconsistencies inevitably confuse API users, considerably hampering their API comprehension and the quality of software built from such APIs. In this paper, we propose an automated approach to detect defects in API documents by leveraging techniques from program comprehension and natural language processing. In particular, we focus on the directives of the API documents that are related to parameter constraints and exception throwing declarations. A first-order logic based constraint solver is employed to detect such defects based on the obtained analysis results. We evaluate our approach on parts of the well-documented JDK 1.8 APIs. Experimental results show that, out of around 2000 API usage constraints, our approach can detect 1146 defective document directives, with a precision of 83.1% and a recall of 81.2%, which demonstrates its practical feasibility.


  1. Analyzing APIs Documentation and Code to Detect Directive Defects. Sebastiano Panichella, Ruihang Gu, Taolue Chen, Harald Gall, Yu Zhou, Zhiqiu Huang
  2. Outline. Context: APIs usage in OSS and industrial projects. Proposed solution: DRONE, based on NLP approaches and static analysis techniques. Case study: assessment of DRONE on the documentation and code of 8 Java libraries. [Figure: DRONE pipeline overview]
  3. Open Source (OS) and Industrial Projects: “Social networks like Facebook or Pinterest, or utilities like Google Maps or Dropbox are popular examples of APIs providers.”
  5. Open Source (OS) and Industrial Projects: “APIs are great time savers for Developers…”
  6. Application Programming Interface (API) Documents. Ideally, API documents describe source code that is difficult to understand and provide the information needed: “API documents represent one of the most important references for developers…”
  7. Coming back to the reality: software changes over time. “…as consequence the original documentation tend to be incomplete and inconsistent with the source code…” API documents end up inconsistent with the source code or incomplete, offering insufficient information about code that is difficult to understand.
  8. API Document Defect: Example 1 (incomplete documentation). Class: TextLayout. Method: getBlackBoxBounds(int firstEndpoint, int secondEndpoint). API document (JDK 1.8): https://docs.oracle.com/javase/8/docs/api/java/awt/font/TextLayout.html
  13. API Document Defect: Example 2 (inconsistent documentation). Class: InputEvent. Method: getMaskForButton(int button). API document (JDK 1.8): https://docs.oracle.com/javase/8/docs/api/java/awt/event/InputEvent.html
  15. API Document Defects are Frequent “…and tend to be discovered and fixed after long time…” Example discussion: http://stackoverflow.com/questions/2967303/inconsistency-in-java-util-concurrent-future
  16. DRONE (DetectoR of dOcumentatioN dEfects). Pipeline: the software artifacts are the code and the API document. Code: AST Parsing, then Control Flow-Based Constraint Analysis, then Code Constraint FOL Generating. API document: Pre-Process and POS Tagging, then Dependency Parsing and Pattern Analysis (driven by heuristics), then Doc Constraint FOL Generating. Both sets of FOL constraints go to the SMT Solver, which produces the Defect Reports.
  18. DRONE: “…we consider 4 cases of parameter usage constraints”: 1) “Nullness not allowed”; 2) “Nullness allowed”; 3) “Type restriction”; 4) “Range limitation”.
  20. Step 1. Construct AST. From the source code, DRONE constructs the AST and, for each method “m”, extracts its call graph G, where c = { call(m, c) } are the calls made by m.
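The callee set c = { call(m, c) } from Step 1 can be illustrated with a small sketch. This is not DRONE's implementation: the call graph is supplied as a plain dictionary instead of being extracted from an AST, following calls transitively is an assumption, and all method names are hypothetical.

```python
from collections import deque


def reachable_callees(call_graph, method):
    """Return every method reachable from `method` via call edges (breadth-first)."""
    seen = set()
    queue = deque(call_graph.get(method, ()))
    while queue:
        callee = queue.popleft()
        if callee in seen:
            continue  # already visited; also guards against recursion cycles
        seen.add(callee)
        queue.extend(call_graph.get(callee, ()))
    return seen


# Hypothetical call graph: keys are callers, values are their direct callees.
CALL_GRAPH = {
    "addItem": ["addItemNoInvalidate", "invalidateIfValid"],
    "addItemNoInvalidate": ["checkNotNull"],
    "checkNotNull": [],
}

callees = reachable_callees(CALL_GRAPH, "addItem")
```

Collecting callees transitively matters because an exception may be thrown several calls below the documented method.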
  25. Step 2. Extract the Exception Information. The exception information is collected in the form of (m; P; t; c) tuples.
  31. Step 3. Classify the Exception Information. The (m; P; t; c) tuples are classified into: 1) “Nullness not allowed”; 2) “Nullness allowed”; 3) “Type restriction”; 4) “Range limitation”.
  32. Step 4. Constraints Generation. Constraints are generated from the classified (m; P; t; c) tuples: 1) “Nullness not allowed”; 2) “Nullness allowed”; 3) “Type restriction”; 4) “Range limitation”.
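A minimal sketch of Steps 2 and 3 follows. The slides do not define the tuple fields, so reading (m; P; t; c) as method, parameters, exception type and trigger condition is an assumption, as are the Java-like condition strings and the regular expressions. “Nullness allowed” cannot be inferred from a throw condition alone, so this toy classifier never returns it.

```python
import re
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    NULLNESS_NOT_ALLOWED = "Nullness not allowed"
    NULLNESS_ALLOWED = "Nullness allowed"
    TYPE_RESTRICTION = "Type restriction"
    RANGE_LIMITATION = "Range limitation"


@dataclass(frozen=True)
class ExceptionTuple:
    method: str      # m (assumed meaning: the analyzed method)
    params: tuple    # P (assumed meaning: the parameters involved)
    exc_type: str    # t (assumed meaning: the thrown exception type)
    condition: str   # c (assumed meaning: trigger condition, Java-like text)


def classify(info):
    """Map a trigger condition to one of the constraint categories, or None."""
    cond = info.condition
    if re.search(r"==\s*null", cond):
        return Category.NULLNESS_NOT_ALLOWED
    if re.search(r"\binstanceof\b", cond):
        return Category.TYPE_RESTRICTION
    if re.search(r"[<>]=?", cond):
        return Category.RANGE_LIMITATION
    return None  # no pattern recognised by this toy classifier


null_case = ExceptionTuple("addItem", ("item",), "NullPointerException", "item == null")
range_case = ExceptionTuple("get", ("index",), "IndexOutOfBoundsException", "index < 0")
```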
  33. Extract Constraints from Directives in API Documents. Natural language parsing relies on “Recurrent Linguistic Patterns (LPs)…” (Di Sorbo et al., ASE 2015): 1) manual analysis of LPs; 2) definition of NLP heuristics for each LP.
  37. Example. Class: java.awt.Choice. Method: addItem(String item). Directive: “@exception NullPointerException if the item’s value is equal to <code>null</code>”. From this directive, the tag {@exception NullPointerException} and the “Linguistic Pattern (LP)…” “item equal to null” are identified (Di Sorbo et al., ASE 2015).
  44. Definition of NLP Heuristic. Input: {@exception NullPointerException} with the “Linguistic Pattern (LP)…” “item equal to null” (Di Sorbo et al., ASE 2015). Steps: 1) considering the relevant details; 2) generalizing some information; 3) ignoring useless information. NLP heuristic, step by step: 1) “if the item’s value is equal to null”; 2) “if the (subj)’s value is equal to null”; 3) “if the (subj)’s value is equal to null”. In total: 64 NLP heuristics.
  49. Generation of FOL Constraints. The NLP heuristic “if the (subj)’s value is equal to null” yields the FOL constraint (subj) = null.
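One way to picture a heuristic of this kind is a pattern that captures the generalized subject and emits the constraint (subj) = null. The sketch below is an illustration only: it uses a regular expression as a stand-in for the dependency parsing used by DRONE, and the pattern and function name are hypothetical.

```python
import re

# Hypothetical encoding of one heuristic: "if the (subj)'s value is equal to null".
# The optional "'s value" part is treated as ignorable detail.
NULL_HEURISTIC = re.compile(
    r"if\s+(?:the\s+)?(?P<subj>\w+)(?:['’]s\s+value)?\s+is\s+equal\s+to\s+null",
    re.IGNORECASE,
)


def directive_to_constraint(directive):
    """Return a textual FOL constraint such as 'item = null', or None if no match."""
    text = re.sub(r"<\s*/?\s*code\s*>", "", directive)  # strip <code> markup
    text = re.sub(r"\s+", " ", text)                    # normalise whitespace
    match = NULL_HEURISTIC.search(text)
    if match is None:
        return None
    return f"{match.group('subj')} = null"


DIRECTIVE = (
    "@exception NullPointerException if the item’s value "
    "is equal to <code>null</code>"
)
constraint = directive_to_constraint(DIRECTIVE)
```

Generalizing “item” to a captured subject is what lets a single heuristic cover the same phrasing for any parameter name.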
  51. SMT Solver. The FOL constraints generated from the code and from the API document are passed to the SMT solver, which produces the defect reports.
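The comparison of the two constraint sets can be sketched as follows. A real SMT solver reasons symbolically; this sketch swaps in brute-force enumeration over a tiny hand-picked domain, and the two predicates are hypothetical examples rather than constraints taken from the JDK.

```python
def find_mismatches(doc_constraint, code_constraint, domain):
    """Return the values on which the documented and actual conditions disagree."""
    return [v for v in domain if doc_constraint(v) != code_constraint(v)]


def doc_says_throws(x):
    # Hypothetical documentation: an exception is thrown when the argument is null.
    return x is None


def code_throws(x):
    # Hypothetical code behaviour: it also throws for negative values.
    return x is None or x < 0


DOMAIN = [None, -1, 0, 1]  # hand-picked sample values, not a symbolic domain

mismatches = find_mismatches(doc_says_throws, code_throws, DOMAIN)
```

A non-empty result corresponds to a candidate defect report: here the code rejects a value that the documentation does not mention.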
  53. Case Study: documentation and code.
  54. Two Experiments. Experiment I: java.awt and javax.swing. Experiment II: the other six JDK libraries.
  55. Experiment I (java.awt, javax.swing): 0.5 million LOC and 16,379 Javadoc tags. DRONE reported 1379 potential defects; 3 validators confirmed 1146 real defects. Precision and recall > 0.81.
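The Experiment I precision can be recomputed from the counts above, assuming the 1379 potential defects are all of DRONE's reports and the 1146 real defects are the validated true positives among them. Recall is not recomputed because the number of missed defects is not given.

```python
def precision(true_positives, reported):
    """Precision = true positives / all reported items."""
    return true_positives / reported


exp1_precision = precision(1146, 1379)
print(f"Experiment I precision: {exp1_precision:.3f}")
```

The result rounds to 0.831, in line with the 83.1% precision stated in the abstract.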
  60. Experiment II (other six JDK libraries): DRONE reported 2057 potential defects; 1106 real defects. Precision > 0.58, recall > 0.84.
  62. Conclusion & Future Work. “API documents represent one of the most important references for developers…” With DRONE we analyzed over 1 million LOC and more than 30,000 Javadoc documents belonging to 8 Java libraries, detecting around 2000 API documentation defects with high precision (values between 0.58 and 0.83) and high recall (values > 0.81). Reference: “Analyzing APIs Documentation and Code to Detect Directive Defects”, ICSE 2017.
