
Automating the Generation of Benchmark Suites


Creation, Assessment, and Management of Effective Test Corpora. Presented at the National Java Resource Workshop at SPLASH'17 in Vancouver, Canada.

Published in: Technology

  1. 1. SOFTWARE TECHNIK. Automating the Generation of Benchmark Suites: Creation, Assessment, and Management of Effective Test Corpora. Ben Hermann (@benhermann). Joint work with Lisa Nguyen Quang Do, Michael Eichberg, Karim Ali, and Eric Bodden. National Java Resource Workshop @ SPLASH, Vancouver, October 23rd, 2017
  2. 2. @benhermannABM @ NJR 2017 Evaluation of Code Analyses 2
  4. 4. Evaluation of Code Analyses • Compare the results of an analysis against: • a ground truth, to show soundness • a previous analysis, to show improvement (e.g., in precision). Diagram labels: New analysis, Previous analyses, Ground truth, Evaluation corpus; edge labels: analyzes, analyzes, is based on.
  13. 13. Construction of a Corpus: Size, Content, Representativeness, Permanence. Criteria from Tempero et al. 2010. Sources, Purpose. How to determine this? How to achieve this?
  18. 18. Sourcing Projects for the Corpus (Size, Content): ABM collects projects from GitHub, BitBucket, …; applies criteria such as size, license, or programming language; and builds the selected projects into compiled projects. We currently support Maven and sbt, but are expanding (e.g., Gradle).
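The "apply criteria" step on this slide can be pictured as a set of predicates over candidate projects. The sketch below is only an illustration: `Candidate`, its fields, the license list, and the sample data are hypothetical and not taken from ABM, whose data model the slides do not show. Only the build tools (Maven and sbt) and the kinds of criteria (size, license, programming language) come from the slide.

```scala
// Hypothetical stand-in types; ABM's real data model is not shown in the slides.
object CorpusSourcing {
  final case class Candidate(
    name: String,
    language: String,
    license: String,
    sizeInKLoc: Int,
    buildTool: String
  )

  // Build tools named on the slide as currently supported.
  val supportedBuildTools: Set[String] = Set("maven", "sbt")

  // A selection criterion is a predicate over a candidate project.
  type Criterion = Candidate => Boolean

  val isJava: Criterion = _.language == "Java"
  // The accepted licenses are an arbitrary example, not an ABM default.
  val isPermissive: Criterion = c => Set("Apache-2.0", "MIT")(c.license)
  def atLeastKLoc(min: Int): Criterion = _.sizeInKLoc >= min
  val isBuildable: Criterion = c => supportedBuildTools(c.buildTool)

  // Keep only the candidates that satisfy every criterion.
  def select(candidates: Seq[Candidate], criteria: Seq[Criterion]): Seq[Candidate] =
    candidates.filter(c => criteria.forall(criterion => criterion(c)))

  // Made-up sample data for illustration.
  val sample: Seq[Candidate] = Seq(
    Candidate("alpha", "Java", "Apache-2.0", 120, "maven"),
    Candidate("beta", "Java", "GPL-3.0", 300, "maven"),
    Candidate("gamma", "Scala", "MIT", 80, "sbt"),
    Candidate("delta", "Java", "MIT", 5, "gradle")
  )

  def main(args: Array[String]): Unit = {
    val chosen = select(sample, Seq(isJava, isPermissive, atLeastKLoc(10), isBuildable))
    println(chosen.map(_.name)) // List(alpha)
  }
}
```

Modelling criteria as plain functions keeps them composable: a new criterion is one more predicate in the list, and the selection step itself never changes.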
  19. 19. How can we achieve representativeness for a corpus?
  25. 25. Representativeness in Custom Collections. Description of the Darmstadt Library Corpus (DLC) from: Michael Reif, Michael Eichberg, Ben Hermann, Johannes Lerch, and Mira Mezini. 2016. Call graph construction for Java libraries. In Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE 2016):
 "We used the three algorithms to construct respective call graphs for a large set of libraries: the 100 most used distinct Java related libraries from Maven Central Repository. The set is representative for a wide range of libraries.
 It contains very small (e.g., JUnit) to very large (e.g., Scala Library) libraries; libraries developed primarily in an industrial context (e.g., Guava) or in an open-source setting (e.g., Apache Commons); libraries from very different domains: testing (e.g., Hamcrest, Mockito), databases (e.g., HSQLDB), bytecode engineering (e.g., cglib), runtime environments (e.g., Scala Runtime), containers (e.g., Netty), and also general utility libraries (e.g., osgi.core).
 Additionally, it contains two libraries that have unusual properties: jsr305 and easymockclassextension both do not contain a single instance method call. The jsr305 project is just a collection of annotations and easymockclassextension only contains interface definitions and a few classes with static methods.
 Lastly, the set also contains libraries that are written in other languages, such as Scala (e.g., ScalaTest), whose compilers only use a subset of the JVM’s concepts. The Scala compiler, e.g., does not use package and protected visibility. This significantly limits our possibilities to identify the library-private implementation (recall that LibCHACPA identifies a library’s private implementation based on the evaluation of the code elements’ visibilities). For each library, we also downloaded all of its dependencies to build complete class hierarchies for them."
  27. 27. Representativeness in ABM: ABM builds the compiled projects; Hermes inspects them and selects projects for the corpus.
  34. 34. How Hermes Works: Hermes runs feature queries on the corpus candidates; manual or automatic selection then yields an optimal corpus. Diagram labels: OPAL (introduced at SOAP 2014), Hermes (introduced at SOAP 2017).
  39. 39. Feature Queries
 trait FeatureQuery {
   // …
   def apply[S](
     // Identifier, project JAR files, library JAR files, statistics
     projectConfiguration: ProjectConfiguration,
     // Complete reified project information (classes, fields, methods, bodies, etc.)
     project: Project[S],
     // Raw class file information (e.g., for extracting information from the constant pool)
     rawClassFiles: Traversable[(da.ClassFile, S)]
   // Result: list of detected features in the codebase (id, frequency of occurrence, (opt.) locations)
   ): TraversableOnce[Feature[S]]
   // …
 }
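To make the shape of a feature query concrete, here is a minimal, self-contained sketch. It does not use the OPAL or Hermes types from the slide: `ClassInfo`, `Project`, `Feature`, and `SimpleFeatureQuery` are simplified, hypothetical stand-ins, and the query takes only a project rather than the three parameters shown above. The example query reports one feature per class file major version, in the spirit of the "Class File Versions" query the deck lists among the implemented ones.

```scala
// Simplified stand-ins for the types on the slide (assumptions, not the real OPAL/Hermes API).
object FeatureQuerySketch {
  final case class ClassInfo(name: String, majorVersion: Int)
  final case class Project(classes: Seq[ClassInfo])
  // A detected feature: id, frequency of occurrence, and (optional) locations.
  final case class Feature(id: String, count: Int, locations: Seq[String])

  trait SimpleFeatureQuery {
    def apply(project: Project): Seq[Feature]
  }

  // Reports one feature per class file major version found in the project.
  object ClassFileVersionQuery extends SimpleFeatureQuery {
    def apply(project: Project): Seq[Feature] =
      project.classes
        .groupBy(_.majorVersion)
        .toSeq
        .sortBy { case (version, _) => version }
        .map { case (version, classes) =>
          Feature(s"Class File Version $version", classes.size, classes.map(_.name))
        }
  }

  // Made-up project: two classes compiled for major version 52, one for 49.
  val demo: Project =
    Project(Seq(ClassInfo("a.A", 52), ClassInfo("a.B", 52), ClassInfo("b.C", 49)))

  def main(args: Array[String]): Unit =
    ClassFileVersionQuery(demo).foreach(println)
}
```

Each query returns plain data (id, count, locations), so the results of many queries over many projects can be collected in one table and compared.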
  41. 41. Already Implemented Queries: Existence of Bytecode Instructions; Class File Versions; Class Types; Trivial Reflection; Fan-In/Fan-Out; Field Access; Method w/o Returns; Method Types; Various Metrics; Recursive Data Structures; Size of Inheritance Tree; API Usage
  43. 43. Feature Queries for API Usage: Bytecode Instrumentation; Class Loader; GUI; Crypto; JDBC; Reflection; System; Thread; Unsafe
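One plausible way to detect API usage is to match the declaring class of each call against known package or class prefixes. The sketch below assumes exactly that; the slides do not say how Hermes implements these queries. The category names are taken from the slide, while `Call`, the prefix table, and the sample calls are illustrative assumptions.

```scala
// Illustrative sketch; not the Hermes implementation.
object ApiUsageSketch {
  // A method call, reduced to the class that declares the called method.
  final case class Call(declaringClass: String, method: String)

  // Category names come from the slide; the prefixes are assumptions.
  val categories: Seq[(String, Seq[String])] = Seq(
    "Reflection" -> Seq("java.lang.reflect."),
    "JDBC" -> Seq("java.sql.", "javax.sql."),
    "Crypto" -> Seq("javax.crypto."),
    "GUI" -> Seq("java.awt.", "javax.swing."),
    "Unsafe" -> Seq("sun.misc.Unsafe")
  )

  // Counts, per category, the calls whose declaring class matches one of
  // its prefixes; categories without any match are dropped.
  def usage(calls: Seq[Call]): Map[String, Int] =
    categories
      .map { case (category, prefixes) =>
        category -> calls.count(c => prefixes.exists(p => c.declaringClass.startsWith(p)))
      }
      .toMap
      .filter { case (_, count) => count > 0 }

  // Made-up call sites for illustration.
  val sample: Seq[Call] = Seq(
    Call("java.lang.reflect.Method", "invoke"),
    Call("java.sql.DriverManager", "getConnection"),
    Call("java.lang.reflect.Field", "get"),
    Call("java.util.List", "add")
  )

  def main(args: Array[String]): Unit =
    println(usage(sample))
}
```

For the sample calls, `usage` reports two reflection calls and one JDBC call; the `java.util.List` call matches no category and is ignored.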
  44. 44. Constructing a Minimal Corpus • Case study: Dead-Path Analysis [FSE15] • The original evaluation was conducted on the complete Qualitas Corpus • The minimal corpus consists of only 5 of the 100 projects in the Qualitas Corpus • Evaluation time dropped from 16.77 minutes to 2.82 minutes (~6x faster), while coverage is only 1.06% below that of the original corpus
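Selecting a small corpus that still covers the detected features is a set-cover problem. The slides do not say which algorithm Hermes uses for automatic selection, so the sketch below swaps in the classic greedy set-cover heuristic purely as an illustration; the project names and feature sets are made up. The last line of `main` checks the speed-up figure quoted on the slide.

```scala
import scala.annotation.tailrec

// Greedy set-cover heuristic, used here as an illustration only;
// it is not claimed to be the selection algorithm used by Hermes.
object MinimalCorpusSketch {
  // Repeatedly pick the project that covers the most still-uncovered features.
  def greedy(features: Map[String, Set[String]]): Seq[String] = {
    @tailrec
    def loop(uncovered: Set[String], chosen: Vector[String]): Vector[String] =
      if (uncovered.isEmpty) chosen
      else {
        // Iterate over sorted names so that the result does not depend on map order.
        val best = features.keys.toSeq.sorted.maxBy(p => (features(p) & uncovered).size)
        loop(uncovered -- features(best), chosen :+ best)
      }
    loop(features.values.flatten.toSet, Vector.empty)
  }

  // Made-up feature sets per project.
  val sample: Map[String, Set[String]] = Map(
    "p1" -> Set("a", "b", "c"),
    "p2" -> Set("c", "d"),
    "p3" -> Set("a"),
    "p4" -> Set("d", "e")
  )

  def main(args: Array[String]): Unit = {
    println(greedy(sample)) // Vector(p1, p4)
    // Speed-up reported on the slide: 16.77 min / 2.82 min, roughly 5.95.
    println(f"${16.77 / 2.82}%.2f")
  }
}
```

The greedy heuristic does not guarantee the smallest possible corpus, but it is simple and always terminates, because every step covers at least one new feature.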
  50. 50. Collection Permanence (ABM): We store and retain collection definitions. Download the corpus (the collected projects) and provide it on your own infrastructure. Publish the complete corpus and use the DOI for papers. We would love to see more services like this.
  51. 51. Bringing it all together: ABM collects projects from GitHub, BitBucket, … and builds them; Hermes inspects them; publish the complete corpus and use the DOI for papers.
  52. 52. SOFTWARE TECHNIK. Automating the Generation of Benchmark Suites: Creation, Assessment, and Management of Effective Test Corpora. Ben Hermann (@benhermann). Joint work with Michael Reif, Michael Eichberg, and Mira Mezini. Thank you!
