Analyzing the Evolution of Testing Library Usage in Open Source Java Projects

Postdoc Researcher
Jun. 25, 2019

  1. Analyzing the Evolution of Testing Library Usage in Open Source Java Projects. Ahmed Zerouali, Tom Mens, Software Engineering Lab. BENEVOL 2016 Research Seminar, Utrecht University
  2. Motivation: http://blog.takipi.com/we-analyzed-60678-libraries-on-github-here-are-the-top-100/
  3. Motivation: • Improving the design. • Reducing the cost of bugs. • More fun to code. • Demonstrating concrete progress. …
  4. Motivation: Library Maintainer, Software Developer
  5. Dataset: The most popular programming language [1]. > 900k open source Java projects. The most popular testing and mocking libraries in the Maven Central Repository. [1] http://www.tiobe.com/tiobe-index/
  6. Dataset: Testing libraries: Mocking libraries: Matching and asserting libraries: AssertJ
  7. Dataset (metrics and values):
       All projects: 20,688
       Projects using at least one of the considered libraries: 4,532
       Commits: 125,580
       Source files: 10,033,726
       Testing related import statements: 31,264,586
  8. RQ1: Which are the most frequently used testing-related libraries? JUnit is the Undisputed King of Testing Java Libraries. Total number of projects = 4,532
  9. RQ1: Which are the most frequently used testing-related libraries? Total number of projects = 4,532. JUnit is the Undisputed King of Testing Java Libraries.
  10. RQ2: When are libraries introduced in a GitHub project’s lifetime?
  11. RQ3: Which libraries are introduced first?
  12. RQ4: Which combinations of libraries “co-occur” in the projects in which they are used? In lifetime:
  13. RQ4: Which combinations of libraries “co-occur” in the projects in which they are used? Simultaneously: • All projects that used Hamcrest, AssertJ, Spring, Mockito or PowerMock used JUnit. • JUnit is used much less frequently with its competitor TestNG. • More than 40% of projects that use AssertJ also use Hamcrest simultaneously.
  14. RQ5: How does the usage frequency of libraries evolve over time? Testing and matching libraries (right figure); mocking libraries (left figure).
  15. RQ6: How frequently do different libraries “co-occur” at file level? Proportional distribution of Java files (in all projects) relating to pairs of mocking libraries.
  16. RQ6: How frequently do different libraries “co-occur” at file level? Proportional distribution of Java files (in all projects) relating to pairs of testing and matching libraries.
  17. RQ7: Do projects migrate to competing libraries?
  18. Limitations: • Maven as the build automation tool. • Project lifetime of at least two years. • Import statements.
  19. Conclusion: • Many libraries are used simultaneously. • JUnit is the most prominent testing library. • Many of the considered libraries complement one another. • 5% of the considered Java projects were subject to library migrations.
  20. Future work: • Specific library version changes. • Usage of functionalities of different libraries. • Effort of migrating between different libraries and major versions.
  23. Questions?
  24. RQ3: Which combinations of libraries “co-occur” in the projects in which they are used?
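
The slides count "testing related import statements" and list reliance on import statements as a limitation. A minimal sketch of how such a scan over one Java source file might look is given below. The class name, the regular expression, and the package-prefix table are illustrative assumptions, not the mapping the authors used; in particular, treating `org.springframework.test` as "Spring" is a guess.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: maps import statements to testing-related libraries.
class TestingImportScanner {
    // Assumed package prefixes; the study's actual mapping is not given in the slides.
    static final Map<String, String> PREFIXES = new LinkedHashMap<>();
    static {
        PREFIXES.put("org.junit", "JUnit");
        PREFIXES.put("junit.framework", "JUnit");          // JUnit 3 package
        PREFIXES.put("org.testng", "TestNG");
        PREFIXES.put("org.mockito", "Mockito");
        PREFIXES.put("org.powermock", "PowerMock");
        PREFIXES.put("org.easymock", "EasyMock");
        PREFIXES.put("org.hamcrest", "Hamcrest");
        PREFIXES.put("org.assertj", "AssertJ");
        PREFIXES.put("org.springframework.test", "Spring"); // assumption
    }

    // Matches plain, static and wildcard imports at the start of a line.
    static final Pattern IMPORT = Pattern.compile(
            "^\\s*import\\s+(?:static\\s+)?([\\w.*]+)\\s*;", Pattern.MULTILINE);

    // Returns the sorted set of libraries that the source text imports from.
    static Set<String> librariesIn(String source) {
        Set<String> found = new TreeSet<>();
        Matcher m = IMPORT.matcher(source);
        while (m.find()) {
            String name = m.group(1);
            for (Map.Entry<String, String> e : PREFIXES.entrySet()) {
                if (name.startsWith(e.getKey() + ".")) {
                    found.add(e.getValue());
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String source = "package demo;\n"
                + "import org.junit.Test;\n"
                + "import static org.mockito.Mockito.mock;\n"
                + "import java.util.List;\n"
                + "class FooTest {}\n";
        System.out.println(librariesIn(source)); // prints [JUnit, Mockito]
    }
}
```

As the Limitations slide notes, an import does not prove that the class is actually used, so a scan like this can overcount. This sketch would also match imports that appear inside block comments.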

Editor's Notes

  1. Hello everyone. My name is Ahmed Zerouali, I’m a PhD student at the University of Mons, and today I’d like to present my research on the evolution of testing library usage in open source Java projects.
  2. According to a recent blog post, testing and mocking libraries, including JUnit, TestNG, Mockito and others, are among the most popular Java libraries on GitHub. That is certainly due to the importance of testing in software development.
  3. There are many important reasons to write unit tests. Just type “unit testing reasons” into Google and you will be amazed by the number of developers who advise you to use unit testing, with reasons like: testing improves the design, it demonstrates concrete progress, it reduces the cost of bugs, it makes coding more fun, etc.
  4. So, in order to improve the way in which software developers use testing libraries, it is useful to understand how a testing library has been used in other projects, how the library evolves, and when one should upgrade to a new version or migrate to a competing library. For testing library developers, it is useful to assess the popularity of their libraries and take this into account when developing new versions of their library.
  5. That’s why we decided to analyze the usage of testing-related libraries. We decided to study open source Java projects extracted from GitHub. We chose Java because it is the most popular programming language. We chose GitHub because we require full access to the project source code history in order to carry out our analysis, and because it is the largest host of Java source code. For the libraries, we decided to study the most popular testing and mocking libraries based on their number of usages in the Maven Central Repository.
  6. And we came up with the following list: for testing libraries we selected …
  7. So, based on a monthly analysis of the import statements in each Java file of each project, we found 4,532 Java projects that used at least one of the considered libraries, that use Maven as their build automation tool, and that have an active lifetime of at least 2 years. For these projects, we analysed in total more than 125 thousand commits, more than 10 million Java source code files, and more than 31 million import statements for the considered libraries.
  8. Based on an analysis of the Java projects that used at least one of the considered libraries at least once during their lifetime, we found that JUnit was by far the most popular library. If a Java project uses a testing library, it is very likely to be JUnit. In comparison, the competing TestNG library is used in only 12% of projects. Mockito was by far the most used mocking library; it is used by 32% of all projects. The matching library Hamcrest is considerably more popular than its competitor AssertJ, but this can be explained by the fact that AssertJ is much more recent.
  9. So, based on these results, we decided to focus, for the rest of the study, only on the first 8 libraries, which are used by a sufficient number of projects.
  10. For each project, we analyzed how long it took for each used library to be introduced. We found that the considered libraries were introduced early: 56% of all projects started using these libraries as early as the first commit. This can be explained either by the fact that these projects were already in development before coming to GitHub, or by the fact that they follow a test-driven development process, implying that tests are introduced at the very beginning of the project.
  11. Unsurprisingly, we observed that JUnit and TestNG are the first libraries to be introduced in the projects in which they occur with other libraries. AssertJ was never found to be introduced first, probably because it is a much more recent library.
  12. We analysed whether projects use different libraries over their lifetime. We found that JUnit occurs as the only testing library in 61% of all projects. TestNG is used as the only testing library in only 2%. All projects that used either Hamcrest, Spring or AssertJ also used at least one other library during their lifetime. In the majority of cases, Hamcrest and AssertJ are used in projects that have used JUnit in their lifetime.
  13. For all projects that used at least two of the considered libraries at some point during their lifetime, we computed which pairs of libraries were actually being used simultaneously, at the same moment. We found that nearly all projects that use Hamcrest, AssertJ, Spring, Mockito or PowerMock also used JUnit simultaneously. Unsurprisingly, JUnit is used much less frequently with its competitor TestNG. We also found that more than 40% of projects that use AssertJ also use Hamcrest simultaneously. This could be a sign that many projects that use Hamcrest are in the process of migrating to AssertJ. For the mocking libraries, we observed that PowerMock is mostly used as an extension of Mockito, and much less as an extension of EasyMock.
  14. And now, how does…? The figure on the right shows that the usage of JUnit appears to be decreasing, whereas TestNG and Spring have a stable proportion of projects using them, while the proportional usage of Hamcrest and AssertJ is increasing over time. For mocking libraries, the figure on the left shows that the proportional usage of Mockito and PowerMock has increased remarkably, whereas the usage of EasyMock is slightly declining over time.
  15. For projects using certain pairs of libraries simultaneously, we explored whether these library pairs are also used together within the same Java files belonging to the project. For each pair of libraries, the figure shows a violin plot of the distribution across projects of the ratio between the number of files that relate to each or both libraries and the total number of files that relate to any of them. We found that, for the pairs of mocking libraries, Mockito and EasyMock are rarely used together, while the other two pairs can be used together at file level.
  16. Also, for the other pairs of testing and matching libraries, we found that the considered libraries are rarely used together at file level.
  17. For the last research question, we checked whether the projects that we studied replaced one of the testing-related libraries by another one; in other words, we checked whether they performed a testing-related library migration. We found that the highest observed number of migrations is from Hamcrest to AssertJ, even though these two libraries were used together in only 90 of all considered projects. No migrations were observed from AssertJ to Hamcrest, indicating the increasing use of AssertJ as a competing library. We also found a high number of migrations from JUnit to TestNG, while only half of this number of projects migrated from TestNG to JUnit, which means that half of the migrated projects didn’t use both libraries simultaneously while performing a migration from JUnit to TestNG or conversely. For the mocking libraries, we observe that most migrations go from EasyMock to Mockito, most likely because it offers more functionality.
  18. Of course, our analysis suffers from some limitations. Our results should not be generalized beyond Java projects, to projects that do not rely on the build automation tool Maven, or to projects that have a lifetime of less than two years. Also, the analysis that we have conducted may lead to false positives, since imported classes and interfaces are not necessarily used in the source code.
  19. Finally, we studied the usage of eight popular testing, matching and mocking libraries in a large corpus of Java projects hosted on GitHub. … Such empirical analysis can be useful to project developers who want to introduce an additional library, or to replace an existing library by another one, as our analysis reveals which competing libraries are available to migrate to. It is also useful for library maintainers, who can assess the popularity of their libraries and take this into account when developing new versions of their library.
  20. Our findings about how projects migrate between used libraries are promising, but require a more in-depth analysis. In future work we plan to take into account the effect of the specific library version on the migration phenomenon. In many cases, major releases of a library may imply significant functional changes, potentially leading to an increased migration towards this specific version. We already observed a case where the project first used JUnit 3, then started using TestNG after a transition period, and then returned to using JUnit 4. We also aim to conduct a more detailed analysis of how frequently these libraries are used, how this evolves over time, and whether certain combinations of functionalities of different libraries are frequently used together. We would also like to analyse the effort of migrating between different libraries, as well as the effort of upgrading to a new major version of a library.
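
The simultaneous co-occurrence figures quoted in note 13 (for example, the share of AssertJ projects that also use Hamcrest) can be read as a conditional proportion over per-project library sets taken at the same moment. The sketch below shows one way to compute such a share. The class and method names are hypothetical and the data is a toy example, not the study's dataset or the authors' tooling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: pairwise co-occurrence over project snapshots.
class CoOccurrence {
    // Share of projects using library a that also use library b in the
    // same snapshot; returns NaN when no project uses a.
    static double shareAlsoUsing(List<Set<String>> projects, String a, String b) {
        long withA = 0;
        long withBoth = 0;
        for (Set<String> libs : projects) {
            if (libs.contains(a)) {
                withA++;
                if (libs.contains(b)) {
                    withBoth++;
                }
            }
        }
        return withA == 0 ? Double.NaN : (double) withBoth / withA;
    }

    public static void main(String[] args) {
        // Toy snapshots: one set of libraries per project (made-up data).
        List<Set<String>> projects = new ArrayList<>();
        projects.add(new HashSet<>(Arrays.asList("JUnit", "Hamcrest", "AssertJ")));
        projects.add(new HashSet<>(Arrays.asList("JUnit", "AssertJ")));
        projects.add(new HashSet<>(Arrays.asList("JUnit", "Mockito")));
        projects.add(new HashSet<>(Arrays.asList("TestNG")));

        System.out.println(shareAlsoUsing(projects, "AssertJ", "Hamcrest")); // prints 0.5
        System.out.println(shareAlsoUsing(projects, "AssertJ", "JUnit"));    // prints 1.0
    }
}
```

The measure is asymmetric: the share of AssertJ projects that also use Hamcrest generally differs from the share of Hamcrest projects that also use AssertJ, which matters when reading co-occurrence as a possible sign of migration.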