Analyzing the Evolution of Testing Library Usage in Open Source Java Projects


A presentation given at SANER 2017.



  1. 1. Analyzing the Evolution of Testing Library Usage in Open Source Java Projects 1 Ahmed ZEROUALI, Tom MENS Software Engineering Lab, University of Mons, Belgium SANER 2017 Early Research Achievements — Klagenfurt, Austria, February 2017
  2. 2. Motivation 2 http://blog.takipi.com/we-analyzed-60678-libraries-on-github-here-are-the-top-100/
  3. 3. Motivation 3 • Improving the design. • Reducing the cost of bugs. • More fun to code. • Demonstrating concrete progress. …
  4. 4. Motivation 4 Library Maintainer Software Developer
  5. 5. Dataset 5 Java: the most popular programming language [1]. GitHub: > 900k open source Java projects. Maven Central Repository: the most popular testing and mocking libraries. [1] http://www.tiobe.com/tiobe-index/
  6. 6. Dataset 6 Testing libraries: Mocking libraries: Matching and asserting libraries: AssertJ
  7. 7. Dataset 7 - Testing related technologies: • Detected by looking at Import statements and Project Object Model files • 13 different technologies considered - GitHub Java Corpus: • 4,532 Java projects retained
  8. 8. Dataset 8
     Metric: Value
     All projects: 20,688
     Projects using at least one of the considered libraries: 4,532
     Commits: 125,580
     Source files: 10,033,726
     Testing related import statements: 31,264,586
  9. 9. RQ1 Which are the most frequently used testing-related libraries? 9 JUnit is the Undisputed King of Testing Java Libraries. Total number of projects = 4,532
  10. 10. RQ1 Which are the most frequently used testing-related libraries? 10 JUnit is the Undisputed King of Testing Java Libraries.
  11. 11. RQ1 Which Are The Most Frequently Used Testing-related Libraries? 11 Total number of projects = 4,532 JUnit is the Undisputed King of Testing Java Libraries.
  12. 12. RQ2 When Are Libraries Introduced in a GitHub Project’s Lifetime? 12
  13. 13. RQ2 When Are Libraries Introduced in a GitHub Project’s Lifetime? 13
  14. 14. RQ3 Which Libraries Are Used Over a Project’s Lifetime? 14
  15. 15. RQ4 Which Libraries Are Used Simultaneously in Projects? 15 • Nearly all projects that used Hamcrest, AssertJ, Spring, Mockito or PowerMock also used JUnit (>94%). • JUnit is used much less frequently with its competitor TestNG (7%). • More than 40% of projects that use AssertJ also use Hamcrest simultaneously. • PowerMock is mostly used as an extension of Mockito (86.5%).
  16. 16. RQ5 How Frequently Are Libraries Used Over Time? 16 Testing and matching libraries -> <- Mocking libraries
  17. 17. RQ6 How Frequently do Libraries Co-occur at File Level? 17 Proportional distribution of Java files (in all projects) relating to pairs of mocking libraries.
  18. 18. RQ6 How Frequently do Libraries Co-occur at File Level? 18 Proportional distribution of Java files (in all projects) relating to pairs of testing and matching libraries.
  19. 19. RQ7 Do projects migrate to competing libraries? 19 # Permanent Migrations
  20. 20. RQ7 Do projects migrate to competing libraries? 20 # Temporary Migrations
  21. 21. Limitations 21 • Open source Java projects using Maven as the build automation tool. • Project lifetime of at least two years. • Import statements.
  22. 22. Main Findings 22 • Many libraries are used simultaneously. • JUnit is the most prominent testing library. • Many of the considered libraries complement one another. • 5% of the considered Java projects were subject to library migrations.
  23. 23. Future Work 23 • Specific library version changes. • Usage of functionalities of different libraries. • Effort of migrating between different libraries and major versions.
  25. 25. Questions ? 25
  26. 26. RQ3 Which Libraries Are Used Over a Project’s Lifetime? 26

Editor's Notes

  • According to a recent blog post, testing and mocking libraries, including JUnit, TestNG, Mockito and others, are among the most popular Java libraries on GitHub.
    That is most likely due to the importance of testing in software development.
  • There are many important reasons to write unit tests. Just type "unit testing reasons" into Google and you will be amazed by the number of developers who advise you to use unit testing, with reasons like:
    Testing improves the design.
    It demonstrates concrete progress.
    It reduces the cost of bugs.
    It’s more fun to code.
    etc.
  • So, in order to improve the way in which software developers use testing libraries, it is useful to understand how testing libraries are used in other projects, how the library evolves, and when one should upgrade to a new version or migrate to a competing library.

    For testing library developers, it is useful to assess the popularity of their libraries and take this into account when developing new versions of their library.
  • That’s why we decided to analyze the usage of testing related libraries.

    ---We decided to study open source Java projects extracted from GitHub. We chose Java projects because Java is the most popular programming language.
    ---And we chose GitHub because we require full access to the project source code history in order to carry out our analysis, and because it’s the largest host of Java source code.
    ---For the libraries, we decided to study the most popular testing and mocking libraries based on their number of usages in the Maven Central Repository.
  • And we came up with the following list:
    For testing libraries we considered…
  • So, based on a monthly analysis of the import statements in each Java file of each project,
    we retained 4,532 Java projects that used at least one of the considered Java libraries, that use Maven as their build automation tool, and that have an active lifetime of at least 2 years.
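
As an illustration of import-based detection, here is a minimal sketch in Python. The prefix-to-library mapping uses the libraries' root package names, but the mapping itself, the regular expression, and the function name `detect_libraries` are assumptions made for this example; the notes do not give the exact detection rules used in the study.

```python
import re

# Assumed mapping from import prefixes to library names (illustrative only;
# the study's actual rules are not given in these notes).
LIBRARY_PREFIXES = {
    "org.junit": "JUnit",
    "junit.framework": "JUnit",
    "org.testng": "TestNG",
    "org.mockito": "Mockito",
    "org.powermock": "PowerMock",
    "org.easymock": "EasyMock",
    "org.hamcrest": "Hamcrest",
    "org.assertj": "AssertJ",
}

# Matches "import a.b.C;", "import static a.b.C.*;" and "import a.b.*;".
IMPORT_RE = re.compile(
    r"^\s*import\s+(?:static\s+)?([\w.]+)(?:\.\*)?\s*;", re.MULTILINE
)


def detect_libraries(java_source: str) -> set[str]:
    """Return the considered libraries imported by one Java source file."""
    found = set()
    for imported in IMPORT_RE.findall(java_source):
        for prefix, library in LIBRARY_PREFIXES.items():
            # Require a package boundary so "org.junitx" does not match "org.junit".
            if imported == prefix or imported.startswith(prefix + "."):
                found.add(library)
    return found


sample = """package demo;
import static org.junit.Assert.*;
import org.mockito.Mockito;
import java.util.List;
"""
print(sorted(detect_libraries(sample)))  # ['JUnit', 'Mockito']
```

As the limitations in these notes point out, an import does not guarantee that the class is actually used, so a detector like this can produce false positives.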

  • For these projects, we analysed in total
    More than 125 thousand commits,
    More than 10 million Java source code files
    And more than 31 million import statements for the considered libraries.
  • Based on an analysis of the Java projects that used at least one of the
    considered libraries at least once during their lifetime,
    we found that JUnit was by far the most popular library. If a Java project uses a testing library, it is very likely to be JUnit.
  • In comparison, the competing library TestNG is used in only 12% of projects.

    Mockito was by far the most used mocking library.

    The matching library Hamcrest is considerably more popular than its competitor AssertJ, but this can be explained by the fact that AssertJ is much more recent.
  • So, based on these results, we decided to focus for the rest of the study only on the first 8 libraries, which are used by a sufficient number of projects.

  • So we analyzed,...

    We found that the considered libraries were introduced early: 56% of all projects started using these libraries as early as the first commit, implying that tests are introduced at the very beginning of the project.
    This can be explained either by the fact that these projects were already in development before coming to GitHub, or by the fact that they follow a test-driven development process from the beginning.
  • We analysed if projects use different libraries over their lifetime.

    We found that JUnit occurs as the only testing library in 61% of all projects.

    TestNG is used as the only testing library in only 2%.

    All projects that used either Hamcrest, Spring or AssertJ also used at least one other library during their lifetime.
    In the majority of cases, Hamcrest and AssertJ are used in projects that have used JUnit in their lifetime.
  • Of all projects that used at least two of the considered libraries at some point during their lifetime, we computed which pairs of libraries were actually used simultaneously, that is, at the same time.

    We found that nearly all projects that use Hamcrest, AssertJ, Spring, Mockito or PowerMock also used JUnit simultaneously.
    Unsurprisingly, JUnit is used much less frequently with its competitor TestNG.

    We also found that more than 40% of projects that use AssertJ also use Hamcrest simultaneously, which may be a sign that many projects that use Hamcrest are in the process of migrating to AssertJ.

    For the mocking libraries, we observed that PowerMock is mostly used as an extension of Mockito, and much less as an extension of EasyMock.
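
The co-usage percentages above are directional: they give the share of projects using one library that also use another. A minimal sketch of that computation, assuming each project has already been reduced to the set of libraries it uses at one point in time (the function name `co_usage_share` and the toy data are hypothetical, not taken from the study):

```python
def co_usage_share(projects: dict[str, set[str]], library: str, other: str) -> float:
    """Share of projects using `library` that also use `other`."""
    users = [libs for libs in projects.values() if library in libs]
    if not users:
        return 0.0
    return sum(1 for libs in users if other in libs) / len(users)


# Toy data for illustration only; these are not the study's projects.
projects = {
    "p1": {"JUnit", "AssertJ", "Hamcrest"},
    "p2": {"JUnit", "AssertJ"},
    "p3": {"JUnit"},
    "p4": {"TestNG"},
}
print(co_usage_share(projects, "AssertJ", "Hamcrest"))  # 0.5
print(co_usage_share(projects, "AssertJ", "JUnit"))     # 1.0
```

Note that the measure is not symmetric: in the toy data, all AssertJ projects use JUnit, but only two of the three JUnit projects use AssertJ.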
  • And then we analyzed how frequently libraries are used over time.

    The figure on the right shows that the usage of JUnit appears to be decreasing, whereas TestNG and Spring have a stable proportion of projects using them, and the proportional usage of Hamcrest and AssertJ is increasing over time.

    For mocking libraries, the figure on the left shows that the proportional usage of Mockito and PowerMock has increased remarkably, whereas the usage of EasyMock is slightly declining over time.
  • For projects using certain pairs of libraries simultaneously, we explored whether these library pairs are also used together within the same Java file.

    The following figure shows, for each pair of libraries, a violin plot with the distribution across projects of the ratio between the number
    of files that relate to each or both libraries, and the total number of files that relate to any of them.

    We found that, among the pairs of mocking libraries, Mockito and EasyMock are rarely used together, while the other two pairs can be used together at file level.
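
The per-project ratio behind each violin plot can be sketched as follows, assuming every Java file of a project has been reduced to the set of libraries it imports. The function name `file_level_shares` and the sample files are hypothetical; the sketch only mirrors the ratio described above.

```python
def file_level_shares(file_libs: list[set[str]], a: str, b: str) -> dict[str, float]:
    """For one project, split the files touching `a` or `b` into
    only-a, only-b and both, as shares of all files touching either."""
    only_a = only_b = both = 0
    for libs in file_libs:
        has_a, has_b = a in libs, b in libs
        if has_a and has_b:
            both += 1
        elif has_a:
            only_a += 1
        elif has_b:
            only_b += 1
    total = only_a + only_b + both
    if total == 0:
        # Neither library appears in any file of this project.
        return {"only_a": 0.0, "only_b": 0.0, "both": 0.0}
    return {"only_a": only_a / total, "only_b": only_b / total, "both": both / total}


# Toy project with five files; the last one uses neither mocking library.
files = [{"Mockito"}, {"Mockito", "PowerMock"}, {"PowerMock"}, {"Mockito"}, {"JUnit"}]
print(file_level_shares(files, "Mockito", "PowerMock"))
# {'only_a': 0.5, 'only_b': 0.25, 'both': 0.25}
```

Computing these three shares for every project and plotting their distribution gives one violin per share for each library pair.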
  • Also, for the other pairs of testing and matching libraries, we found that the considered libraries are rarely used together at file level.
  • For the last research question, we checked whether the projects that we studied replaced one of the testing-related libraries with another one; in other words, we checked whether they performed a testing-related library migration.

    We found that the highest observed number of migrations is from Hamcrest to AssertJ, even though these two libraries were used together in only 90 of all considered projects.
    No migrations were observed from AssertJ to Hamcrest, indicating the increasing use of AssertJ as a competing library.

    We also found a high number of migrations from JUnit to TestNG, while only half as many projects migrated from TestNG to JUnit. This means that half of the migrated projects didn’t use both libraries simultaneously while performing a migration from JUnit to TestNG or conversely.

    For the mocking libraries, we observe that most migrations go from EasyMock to Mockito, most likely because Mockito offers more functionality.
  • We also found cases of temporary library migration.
    Nine projects temporarily migrated from JUnit to TestNG and returned to JUnit after some time.
    Four other projects performed the opposite temporary migration.
    And four projects migrated from Hamcrest to AssertJ and returned to using Hamcrest.
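
To make the distinction between permanent and temporary migrations concrete, here is a minimal sketch that classifies a project from a chronological list of snapshots, each holding the set of libraries in use at that time. The rule applied here is an assumption for illustration: a migration is the first snapshot in which the old library has disappeared while the new one is in use, and it counts as temporary if the old library reappears later. The notes do not spell out the study's exact criteria, and `classify_migration` is a hypothetical name.

```python
def classify_migration(snapshots: list[set[str]], old: str, new: str) -> str:
    """Classify a project's move from `old` to `new` as
    'permanent', 'temporary' or 'none' (assumed definition, see lead-in)."""
    seen_old = False
    for i, libs in enumerate(snapshots):
        if old in libs:
            seen_old = True
        elif seen_old and new in libs:
            # `old` was used earlier and is now gone while `new` is in use.
            returned = any(old in later for later in snapshots[i + 1:])
            return "temporary" if returned else "permanent"
    return "none"


# Toy histories for illustration only.
permanent = [{"JUnit"}, {"JUnit", "TestNG"}, {"TestNG"}, {"TestNG"}]
temporary = [{"JUnit"}, {"TestNG"}, {"JUnit"}]
no_move = [{"JUnit"}, {"JUnit", "TestNG"}]

print(classify_migration(permanent, "JUnit", "TestNG"))  # permanent
print(classify_migration(temporary, "JUnit", "TestNG"))  # temporary
print(classify_migration(no_move, "JUnit", "TestNG"))    # none
```

This simple rule does not check whether the new library was introduced after the old one, so a stricter definition may be needed in practice.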
  • Of course, our analysis suffers from some limitations.
    Our results should not be generalized beyond Java projects, to private industrial projects, to projects that do not rely on the build automation tool Maven, or to projects that have a lifetime of less than two years.

    Also, the analysis that we have conducted may lead to false positives, since imported classes and interfaces are not necessarily used in the source code.
  • Finally, we studied the usage of eight popular testing, matching and mocking libraries in a large corpus of Java projects hosted on GitHub. And we found:
    …….

    Such empirical analysis can be useful to project developers who want to introduce an additional library or to replace an
    existing library with another one, as our analysis reveals which competing libraries are available to migrate to.

    It’s also useful for library maintainers, who can assess the popularity of their libraries and take this into account when developing new versions of their library.
  • In future work, we plan to take into account the effect of the specific library version on the migration phenomenon.

    We also aim to conduct further analysis of how frequently these libraries are used, how this evolves over time, and whether certain combinations of functionalities of different libraries are frequently used together.

    We would also like to analyse the effort of migrating between different libraries, as well as the effort of upgrading to a new major version of a library.
  • In case someone asks about it.