Welcome! http://www.sevajug.org
Assessing Unit Test Quality  Matt Harrah Southeast Virginia Java Users Group 20 Nov 2007

Assessing Unit Test Quality


Transcript of "Assessing Unit Test Quality"

  1. 1. Welcome! http://www.sevajug.org
  2. 2. Assessing Unit Test Quality Matt Harrah Southeast Virginia Java Users Group 20 Nov 2007
  3. 3. About this presentation <ul><li>This presentation is more like a case study than a lecture </li></ul><ul><li>Several different technologies will be discussed </li></ul><ul><li>In particular, we will discuss the suitability of various technologies and approaches to solve a computing problem </li></ul><ul><li>We will also discuss the computing problem itself </li></ul><ul><li>Suggestions and questions are welcomed throughout </li></ul>
  4. 4. Before we begin: Some definitions <ul><li>Unit testing – testing a single class to ensure that it performs according to its API specs (i.e., its Javadoc) </li></ul><ul><li>Integration testing – testing that the units interact appropriately </li></ul><ul><li>Functional testing – testing that the integrated units meet the system requirements </li></ul><ul><li>Regression testing – testing that changes to code have not (re-)introduced unexpected changes in performance, inputs, or outputs </li></ul>
  5. 5. Part I: The Problem
  6. 6. Restating the obvious <ul><li>The Brass Ring: We generally want our code to be as free of bugs as is economically feasible </li></ul><ul><li>Testing is the only way to know how bug-free your code is </li></ul><ul><li>All four kinds of testing mentioned a minute ago can be automated with repeatable suites of tests </li></ul>
  7. 7. Restating the obvious <ul><li>The better your test suite, the more confidence you can have in your code’s correctness </li></ul><ul><li>JUnit is the most commonly used way to automate unit tests for Java code </li></ul><ul><ul><li>Suites of repeatable tests are commonly built up over time </li></ul></ul><ul><ul><li>QA involves running these suites of tests on a regular basis </li></ul></ul>
  8. 8. Brief Digression <ul><li>JUnit can also be used for integration, functional, and regression testing </li></ul><ul><ul><li>Integration tests should, in theory, create two objects and test their interactions </li></ul></ul><ul><ul><li>Functional tests can simulate a user interacting with the system and verify the outcomes </li></ul></ul><ul><ul><li>Regression testing typically means making sure that changes do not introduce test failures in the growing suite of automated tests </li></ul></ul>
  9. 9. So… if a better test suite means better code, the real question is: How good are your tests?
  10. 10. Mock Objects <ul><li>Are you isolating your objects under test? </li></ul><ul><ul><li>If a test uses two objects and the objects interact, a test failure can be attributed to either of the two objects, or to the fact that they were not meant to interact </li></ul></ul><ul><ul><li>Mock objects are a common solution </li></ul></ul><ul><ul><ul><li>One and only one real code object is tested – the other objects are “mock objects” which simulate the real objects for test purposes </li></ul></ul></ul><ul><ul><ul><li>Allows the test writer to simulate conditions that might otherwise be difficult to create </li></ul></ul></ul><ul><ul><li>This problem is well-known and amply addressed by several products (e.g., EasyMock) </li></ul></ul>
  11. 11. Code Coverage <ul><li>Do you have enough tests? What’s tested and what isn’t? </li></ul><ul><ul><li>Well-known problem with numerous tools to help, such as Emma, JCoverage, Cobertura, and Clover. These tools monitor which pieces of code under test get executed during the test suite. </li></ul></ul><ul><ul><li>All the code that executed during the test is considered covered, and the other code is considered uncovered. </li></ul></ul><ul><ul><li>This provides a numeric measurement of test coverage (e.g., “Package x has 49% class coverage”) </li></ul></ul>
  12. 12. JUnit Fallacy #1 <ul><li>“The code is just fine – all our tests pass” </li></ul><ul><li>Test success does not mean the code is fine </li></ul><ul><li>Consider the following test: </li></ul><ul><li>public void testMethod() { </li></ul><ul><li>; // Do absolutely nothing </li></ul><ul><li>} </li></ul><ul><li>This test will pass every time. </li></ul>
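The point can be illustrated without any test framework. The sketch below is only an illustration: `buggyAbs` and both "test" methods are hypothetical names invented here, not code from the talk. It shows a test that asserts nothing passing against a method that is plainly wrong, while a test with a single assertion catches the bug.

```java
// Hypothetical example: one buggy method and two "tests" for it.
class EmptyTestDemo {
    // Deliberately wrong: returns negative inputs unchanged.
    static int buggyAbs(int x) {
        return x;
    }

    // Runs the code but asserts nothing, so it cannot fail.
    static boolean emptyTestPasses() {
        try {
            buggyAbs(-5); // executed (and therefore "covered"), never checked
            return true;
        } catch (AssertionError e) {
            return false;
        }
    }

    // Makes one assertion, so it can actually fail.
    static boolean assertingTestPasses() {
        try {
            if (buggyAbs(-5) != 5) {
                throw new AssertionError("abs(-5) should be 5");
            }
            return true;
        } catch (AssertionError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("empty test passes:     " + emptyTestPasses());
        System.out.println("asserting test passes: " + assertingTestPasses());
    }
}
```

The empty test reports success and the asserting test reports failure, even though both exercise exactly the same code.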
  13. 13. What the world really needs <ul><li>Some way of measuring how rigorous each test is </li></ul><ul><ul><li>A test that makes more assertions about the behaviour of the class under test is presumably more rigorous than one that makes fewer assertions </li></ul></ul><ul><ul><li>If only we had some sort of measure of how many assertions are made per something-or-other </li></ul></ul>
  14. 14. “Assertion Density” <ul><li>Assertion Density for a test is defined by the equation shown, where </li></ul><ul><ul><li>A is the assertion density </li></ul></ul><ul><ul><li>a is the number of assertions made during the execution of the test </li></ul></ul><ul><ul><li>m is the number of method calls made during the execution of the test </li></ul></ul><ul><li>Yep, I just made this up </li></ul>
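The equation itself was an image on the slide and is not part of this transcript. From the definitions given, it is presumably the ratio A = a / m; that reading is an assumption, as are the class and method names in this small sketch of the calculation.

```java
// Sketch only: assumes A = a / m, which is inferred from the slide's
// definitions because the original equation image is not in the transcript.
class AssertionDensity {
    static double density(int assertions, int methodCalls) {
        if (methodCalls <= 0) {
            throw new IllegalArgumentException("methodCalls must be positive");
        }
        return (double) assertions / methodCalls;
    }

    public static void main(String[] args) {
        // A test that makes 3 assertions across 12 method calls.
        System.out.println(density(3, 12)); // prints 0.25
        // The do-nothing style of test: code runs, nothing is asserted.
        System.out.println(density(0, 12)); // prints 0.0
    }
}
```

Under this reading, a test that executes a lot of code but asserts nothing scores zero, no matter how much coverage it produces.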
  15. 15. JUnit Fallacy #2 <ul><li>“Our code is thoroughly tested – Cobertura says we have 95% code coverage” </li></ul><ul><li>Covered is not the same as tested </li></ul><ul><li>Many modules call other modules, which in turn call other modules. </li></ul>
  16. 16. Indirect Testing [Diagram: Test A tests Class A, which calls Class B, which calls Class C; each of the three classes is labeled &quot;Covered&quot;] <ul><li>Class A, Class B, and Class C all execute as Test A runs </li></ul><ul><li>Code coverage tools will register Class A, Class B, and Class C as all covered, even though there was no test specifically written for Class B or Class C </li></ul>
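The situation on this slide can be mimicked in a few lines. In the sketch below, the `executed` set is a hypothetical stand-in for a coverage tool's bookkeeping, and the three classes are toy versions of Class A, Class B, and Class C; none of this is code from the talk.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class IndirectCoverageDemo {
    // Stand-in for a coverage tool: records every class whose code ran.
    static final Set<String> executed = new LinkedHashSet<>();

    static class ClassC {
        static int value() { executed.add("ClassC"); return 42; }
    }

    static class ClassB {
        static int value() { executed.add("ClassB"); return ClassC.value(); }
    }

    static class ClassA {
        static int value() { executed.add("ClassA"); return ClassB.value(); }
    }

    // "Test A": written for ClassA only, with no assertions about B or C.
    static void testA() {
        if (ClassA.value() != 42) {
            throw new AssertionError("unexpected value from ClassA");
        }
    }

    public static void main(String[] args) {
        testA();
        System.out.println("Covered: " + executed); // Covered: [ClassA, ClassB, ClassC]
    }
}
```

All three classes end up in the covered set, although only ClassA has a test written for it.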
  17. 17. What the world really needs <ul><li>Some way of measuring how directly a class is tested </li></ul><ul><ul><li>A class that is tested directly and explicitly by a test designed for that class is better-tested than one that only gets run when some other class is tested </li></ul></ul><ul><ul><li>If only we had some sort of “test directness” measure… </li></ul></ul><ul><ul><li>Perhaps a reduced quality rating the more indirectly a class is tested? </li></ul></ul>
  18. 18. “Testedness” <ul><li>Testedness is defined by the formula shown, where </li></ul><ul><ul><li>t is the testedness </li></ul></ul><ul><ul><li>d is the test distance </li></ul></ul><ul><ul><li>n_d is the number of calls at test distance d </li></ul></ul><ul><li>Yep, I made this one up too </li></ul>
  19. 19. Part II: Solving the Problem (or, at least attempting to…)
  20. 20. Project PEA <ul><li>Project PEA </li></ul><ul><ul><li>Named after The Princess and the Pea </li></ul></ul><ul><li>Primary Goals: </li></ul><ul><ul><li>Collect and report test directness / testedness of code </li></ul></ul><ul><ul><li>Collect and report assertion density of tests </li></ul></ul><ul><li>Start with test directness </li></ul><ul><li>Add assertion density later </li></ul>
  21. 21. Project PEA <ul><li>Requirements: </li></ul><ul><ul><li>No modifications to source code or tests required </li></ul></ul><ul><ul><li>Test results not affected by data gathering </li></ul></ul><ul><ul><li>XML result set </li></ul></ul><ul><li>Ideals: </li></ul><ul><ul><li>Fast </li></ul></ul><ul><ul><li>Few restrictions on how the tests can be run </li></ul></ul>
  22. 22. Approach #1: Static Code Analysis <ul><li>By looking at a test’s imports, determine which classes are referenced directly by tests </li></ul><ul><li>For each class, look at what calls what </li></ul><ul><li>Assemble a call network graph </li></ul><ul><li>From each node in the graph, count the steps to a test case </li></ul>
  23. 23. Approach #1: Static Code Analysis <ul><li>Doesn’t work </li></ul><ul><li>Reflective calls defeat static detection of what is being called </li></ul><ul><li>Polymorphism defeats static detection of what is being called </li></ul>
  24. 24. Approach #1: Static Code Analysis <ul><li>Consider: class TestMyMap extends AbstractMapTestCase { public void testFoo() { /* Map m is defined in the superclass */ m.put("bar", "baz"); } } </li></ul><ul><li>Which concrete class’s put() method is being called? </li></ul><ul><li>Java’s late binding makes static code analysis unsuitable for this purpose </li></ul>
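A small, self-contained illustration of the late-binding problem follows. `LateBindingDemo` and `putTarget` are hypothetical names used only for this sketch; the same source line `m.put(...)` ends up in a different concrete class depending on the object passed in at run time, which is exactly what a static look at the source cannot resolve.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class LateBindingDemo {
    // Returns the name of the concrete class whose put() actually ran.
    static String putTarget(Map<String, String> m) {
        m.put("bar", "baz"); // identical source line for every Map implementation
        return m.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        System.out.println(putTarget(new HashMap<>())); // HashMap
        System.out.println(putTarget(new TreeMap<>())); // TreeMap
    }
}
```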
  25. 25. Approach #2: Byte Code Instrumentation <ul><li>Modify the compiled .class files to call a routine on each method’s entry </li></ul><ul><li>This routine gets a dump of the stack and looks through it until it finds a class that is a subclass of TestCase </li></ul><ul><li>Very similar to what other code coverage tools like Emma do (except those tools don’t examine the stack) </li></ul>
  26. 26. Approach #2: Byte Code Instrumentation <ul><li>There are several libraries out there for modifying .class files </li></ul><ul><ul><li>BCEL from Apache </li></ul></ul><ul><ul><li>ASM from ObjectWeb </li></ul></ul><ul><li>ASM was much easier to use than BCEL </li></ul><ul><li>I wrote an Ant task to go through the class files and add a call to a tabulating routine that examined the stack </li></ul>
  27. 27. Approach #2: Byte Code Instrumentation <ul><li>It did work, but it was unbearably slow – typically 1000x slower than the uninstrumented tests, and sometimes worse </li></ul><ul><li>This is because getting a stack dump is inherently slow </li></ul><ul><ul><li>Stack dumps are taken per thread </li></ul></ul><ul><ul><li>Threads are implemented natively </li></ul></ul><ul><ul><li>To get a dump of the stack, the thread needs to be stopped and the JVM has to pause </li></ul></ul><ul><li>To be viable, PEA cannot use thread stack dumps </li></ul>
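The kind of routine the instrumented code would call can be sketched with the standard `Thread.getStackTrace()` API. This is not PEA's code: the class names are hypothetical, and the class-name suffix check is a simplification assumed here in place of the "subclass of TestCase" check described in the slides.

```java
class StackDumpProbe {
    // Counts the frames between the caller of this method and the nearest
    // frame whose class name ends with the given suffix; returns -1 if no
    // such frame is on the stack. Every call takes a full stack dump, which
    // is the expensive step the slides describe.
    static int distanceFromTest(String testClassSuffix) {
        StackTraceElement[] frames = Thread.currentThread().getStackTrace();
        int caller = -1; // index of the frame that called this probe
        for (int i = 0; i < frames.length; i++) {
            if (caller < 0) {
                if (frames[i].getMethodName().equals("distanceFromTest")) {
                    caller = i + 1;
                }
            } else if (frames[i].getClassName().endsWith(testClassSuffix)) {
                return i - caller;
            }
        }
        return -1;
    }

    // Toy call chain: FakeTest -> Service -> Helper -> probe.
    static class Helper { static int probe() { return distanceFromTest("FakeTest"); } }
    static class Service { static int work() { return Helper.probe(); } }
    static class FakeTest { static int run() { return Service.work(); } }

    public static void main(String[] args) {
        System.out.println(FakeTest.run()); // prints 2: Helper is two calls away from the test
    }
}
```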
  28. 28. Approach #3: Aspect-Oriented Programming <ul><li>Use AOP to accomplish tasks similar to those of the byte code instrumentation </li></ul><ul><ul><li>Track method entry/exit and maintain a mirror of the stack in the app </li></ul></ul><ul><ul><li>Calculate and record distance from a test at every method entry </li></ul></ul><ul><li>Avoids the overhead of getting stack dumps from the thread </li></ul>
  29. 29. Approach #3: Aspect-Oriented Programming <ul><li>Unsatisfactory – in fact, a complete failure </li></ul><ul><li>Method exits are all over the place in a method </li></ul><ul><ul><li>Method exits in the byte code do not always correspond to the source structure, particularly where exceptions are concerned </li></ul></ul><ul><ul><li>Introducing an aspect behavior at each exit point can increase the size of the byte code by up to 30% </li></ul></ul>
  30. 30. Approach #3: Aspect-Oriented Programming <ul><li>Expanding methods by 30% can (and did) cause them to bump into Java’s 64K limit on method bytecode </li></ul><ul><ul><li>Instrumented classes would not load </li></ul></ul><ul><li>In addition, AspectJ required you to either: </li></ul><ul><ul><li>Recompile the source and tests using AspectJ’s compiler; or </li></ul></ul><ul><ul><li>Create and use your own aspecting-on-the-fly classloader </li></ul></ul><ul><ul><li>Either way you still hit the 64K barrier </li></ul></ul>
  31. 31. Approach #4: Debugger <ul><li>The idea is to write a debugger that monitors the tests in one JVM as they run in another </li></ul><ul><li>The debugger can track method entries and exits as they happen and keep the stack straight </li></ul><ul><li>The code being tested, and the tests themselves, do not need to be aware that the debugger is watching them </li></ul>
  32. 32. Approach #4: Debugger <ul><li>The Java SE JDK includes an architecture called JPDA </li></ul><ul><ul><li>Java Platform Debugger Architecture </li></ul></ul><ul><li>This architecture allows one JVM to debug another JVM over sockets, shared memory, etc. </li></ul><ul><li>It provides an Object-Oriented API for the debugging JVM </li></ul><ul><ul><li>Models the debugged JVM as a POJO </li></ul></ul><ul><ul><li>Provides call-backs for events as they occur in the debugged JVM </li></ul></ul>
  33. 33. Approach #4: Debugger <ul><li>JPDA allows you to </li></ul><ul><ul><li>Specify which events you want to be notified about </li></ul></ul><ul><ul><li>Specify which packages, etc. should be monitored </li></ul></ul><ul><ul><li>Pause and restart the other process </li></ul></ul><ul><ul><li>Inspect the variables, call stacks, etc. of the other process (as long as it’s paused) </li></ul></ul><ul><li>No additional libraries required! </li></ul>
  34. 34. Putting JPDA to work <ul><li>First, I wrote a debugger process to attach to an already running JVM </li></ul><ul><ul><li>By using the <parallel> task in Ant, I can simultaneously launch the debugger and the JUnit tests </li></ul></ul><ul><ul><li>The debugger will attach to and monitor the other JVM </li></ul></ul><ul><ul><li>It registers interest in callbacks on method entry and exit, exceptions, and JVM death </li></ul></ul>
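A minimal sketch of such an attaching debugger, written against the JDI classes in `com.sun.jdi` that ship with the JDK, might look like the following. It is not PEA's actual code. It assumes the tests' JVM was started with a JDWP socket agent (for example `-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8000` on current JDKs), that the port is passed as the first argument, and that JDK classes are the ones to ignore; exception events are left out for brevity.

```java
import com.sun.jdi.Bootstrap;
import com.sun.jdi.VirtualMachine;
import com.sun.jdi.connect.AttachingConnector;
import com.sun.jdi.connect.Connector;
import com.sun.jdi.event.Event;
import com.sun.jdi.event.EventSet;
import com.sun.jdi.event.MethodEntryEvent;
import com.sun.jdi.event.MethodExitEvent;
import com.sun.jdi.event.VMDeathEvent;
import com.sun.jdi.event.VMDisconnectEvent;
import com.sun.jdi.request.EventRequest;
import com.sun.jdi.request.EventRequestManager;
import com.sun.jdi.request.MethodEntryRequest;
import com.sun.jdi.request.MethodExitRequest;
import java.util.Map;

class AttachSketch {
    // Finds the JDK's standard socket-attach connector.
    static AttachingConnector socketConnector() {
        for (AttachingConnector c : Bootstrap.virtualMachineManager().attachingConnectors()) {
            if (c.name().equals("com.sun.jdi.SocketAttach")) {
                return c;
            }
        }
        throw new IllegalStateException("no socket attach connector available");
    }

    // Hypothetical usage: java AttachSketch 8000
    public static void main(String[] args) throws Exception {
        AttachingConnector connector = socketConnector();
        Map<String, Connector.Argument> params = connector.defaultArguments();
        params.get("hostname").setValue("localhost");
        params.get("port").setValue(args[0]);
        VirtualMachine vm = connector.attach(params);

        // Register interest in method entry and exit, skipping JDK classes.
        EventRequestManager erm = vm.eventRequestManager();
        MethodEntryRequest entry = erm.createMethodEntryRequest();
        entry.addClassExclusionFilter("java.*");
        entry.setSuspendPolicy(EventRequest.SUSPEND_NONE);
        entry.enable();
        MethodExitRequest exit = erm.createMethodExitRequest();
        exit.addClassExclusionFilter("java.*");
        exit.setSuspendPolicy(EventRequest.SUSPEND_NONE);
        exit.enable();

        vm.resume(); // in case the target JVM was started suspended
        while (true) {
            EventSet events = vm.eventQueue().remove(); // blocks until events arrive
            for (Event e : events) {
                if (e instanceof MethodEntryEvent) {
                    MethodEntryEvent m = (MethodEntryEvent) e;
                    System.out.println("enter " + m.method().declaringType().name()
                            + "." + m.method().name());
                } else if (e instanceof MethodExitEvent) {
                    MethodExitEvent m = (MethodExitEvent) e;
                    System.out.println("exit  " + m.method().declaringType().name()
                            + "." + m.method().name());
                } else if (e instanceof VMDeathEvent || e instanceof VMDisconnectEvent) {
                    return; // the monitored JVM is gone; shut down
                }
            }
            events.resume();
        }
    }
}
```

The monitored JVM runs unmodified byte code; all of the bookkeeping happens in the debugger process.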
  35. 35. Putting JPDA to work <ul><li>As methods enter and exit, push and pop entries onto a stack maintained in the debugger </li></ul><ul><ul><li>This effectively mirrors the real stack in the tests </li></ul></ul><ul><ul><li>Ignore certain packages beyond developer control (such as the JDK itself) </li></ul></ul>
  36. 36. Putting JPDA to work <ul><li>As each method is entered, calculate and record its distance in the stack from a test </li></ul><ul><li>Shut down when the other JVM dies </li></ul><ul><li>Just before shutdown, write all the recorded data to a file </li></ul>
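The mirrored stack and the distance bookkeeping can be sketched independently of the debugger plumbing. Everything in this sketch is an assumption made for illustration: the class and method names, the rule that a class is a test if its name ends in "Test", and the choice to count a method called directly by a test as distance 1.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.TreeMap;

class ShadowStack {
    private final Deque<String> frames = new ArrayDeque<>();
    // Number of method entries recorded at each test distance.
    private final Map<Integer, Integer> callsAtDistance = new TreeMap<>();

    // Hypothetical convention: a frame belongs to a test if its class name ends in "Test".
    private static boolean isTest(String className) {
        return className.endsWith("Test");
    }

    // Call on every method-entry event from the monitored JVM.
    void onMethodEntry(String className) {
        frames.push(className);
        if (isTest(className)) {
            return; // the test itself is not counted
        }
        int distance = 0;
        for (String frame : frames) { // iterates from the top of the stack down
            if (isTest(frame)) {
                callsAtDistance.merge(distance, 1, Integer::sum);
                return;
            }
            distance++;
        }
        // No test on the stack: nothing is recorded.
    }

    // Call on every method-exit event.
    void onMethodExit() {
        frames.pop();
    }

    Map<Integer, Integer> histogram() {
        return callsAtDistance;
    }

    public static void main(String[] args) {
        ShadowStack s = new ShadowStack();
        s.onMethodEntry("MapTest"); // test method starts
        s.onMethodEntry("ClassA");  // called by the test: distance 1
        s.onMethodEntry("ClassB");  // called by ClassA:   distance 2
        s.onMethodEntry("ClassC");  // called by ClassB:   distance 3
        for (int i = 0; i < 4; i++) {
            s.onMethodExit();
        }
        System.out.println(s.histogram()); // {1=1, 2=1, 3=1}
    }
}
```

Because the stack is maintained incrementally from entry and exit events, no thread stack dump is ever requested.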
  37. 37. Putting JPDA to work <ul><li>Remember – an Ant JUnit task can fork multiple JVMs – one per test if you want, so we need to monitor each one </li></ul><ul><li>Multiple JVMs mean multiple files of recorded data that need to be accumulated after all the tests are complete </li></ul><ul><li>Produce XML file of accumulated results </li></ul>
  38. 38. Results of using JPDA <ul><li>Performance is way better than using byte code instrumentation </li></ul><ul><ul><li>Running with monitoring on slows execution by 100x or less, depending on the code </li></ul></ul><ul><li>Ant script is kind of complicated </li></ul><ul><ul><li>JUnit tests and PEA must be run with forked JVMs </li></ul></ul><ul><ul><li>Special JVM parameters for the debugged process are required </li></ul></ul><ul><ul><li>JUnit and PEA must be started simultaneously using the <parallel> task (which many people don’t know about) </li></ul></ul>
  39. 39. Results of using JPDA <ul><li>The byte code being monitored is completely unchanged </li></ul><ul><ul><li>No special instrumentation or preparatory build step is required </li></ul></ul><ul><li>XML file comes out with details about how many method calls were made at what test distance </li></ul>
  40. 40. Results file example
  41. 41. Report Sample
  42. 42. Code Review Let’s roll that beautiful Java footage!
  43. 43. Future Plans <ul><li>Implement assertion density tracking </li></ul><ul><li>Tweak performance </li></ul><ul><li>Make it easier to run the tests with PEA running </li></ul><ul><ul><li>Perhaps subclass Ant’s <junit> task to run with PEA? </li></ul></ul><ul><li>Documentation (ick) </li></ul><ul><li>Eat my own dog food and use the tool to measure my own JUnit tests </li></ul><ul><li>Sell for $2.5 billion to Scott McNealy and Jonathan Schwartz and retire to Jamaica </li></ul>
  44. 44. Lessons Learned (so far) <ul><li>What seems impossible sometimes isn’t </li></ul><ul><li>Creativity is absolutely crucial to solving problems (as opposed to just implementing solutions) </li></ul><ul><li>JPDA is cool – and I had never heard of it </li></ul><ul><ul><li>I never thought I’d be able to write a debugger but JPDA made it easy </li></ul></ul><ul><li>ASM library is also cool – much nicer than BCEL </li></ul>
  45. 45. Wanna help? <ul><li>http://pea-coverage.sourceforge.net </li></ul><ul><li>I’d welcome anyone who wants to participate </li></ul><ul><li>Contributing to an open-source project looks good on your resume hint hint… </li></ul>
  46. 46. Resources <ul><li>Java Platform Debugger Architecture http://java.sun.com/javase/technologies/core/toolsapi/jpda </li></ul><ul><li>ASM – Byte code processing library http://asm.objectweb.org </li></ul><ul><li>BCEL – Byte code processing library http://jakarta.apache.org/bcel </li></ul><ul><li>AspectJ – Aspect-oriented Java extension http://eclipse.org/aspectj </li></ul><ul><li>JUnit – unit testing framework http://junit.org </li></ul>
  47. 47. Resources <ul><li>Emma – code coverage tool http://emma.sourceforge.net </li></ul><ul><li>Cobertura – code coverage tool http://cobertura.sourceforge.net </li></ul><ul><li>JCoverage – code coverage tool http://www.jcoverage.com </li></ul><ul><li>Clover – code coverage tool http://www.atlassian.com/software/clover </li></ul>Thanks for listening!