Oscon 2012 tdd_cassandra


Test-driven development with Maven, JUnit, and Apache Cassandra (and distributed systems in general)

Published in: Technology, Business

  • I’m Nate McCall, platform development lead for the Usergrid product at Apigee.
  • I’m here to talk about just two things, really...
  • What is it? Well, by itself it can (and does) take up its own talk. But for a gross generalization, we’ll start here.
  • When we talk about test-driven development, we are really talking about unit testing code. We’ll talk about integration testing too, since that’s really what we care about for our case, but we’ll treat them as separate for now.
  • Again, we don’t have much time, so I’m not going to go into this except to emphasize some of the core rules of unit testing. Another way to think about this is that unit testing is a method by which the smallest testable part of an application is validated.
  • Basically we want tests to be: Fast, Isolated, Repeatable, Self-validating (in that each test asserts a condition), and Timely (which really means we write them first so they stay in sync with the code).
  • For integration testing, we mean interfacing with external systems or processes to validate that components integrate correctly. Pretty straightforward.
  • Generally there are fewer of them and, ideally, we run them less often because they are far more expensive.
  • For us, and our requirements here, there are two problems with that.
  • Being distributed inherently makes it difficult to test locally. One of the things I really want to avoid is having developers manage their own internal or external Cassandra clusters.
  • It’s a service. Thus it’s an integration test.
  • To be specific, here are some details.
  • Every operation sends what’s called a ConsistencyLevel, essentially the level of safety of an operation.
  • Consider a write for the key “P” when the primary node is down. The remaining replicas capture a hint, which will be “replayed” to the down node when it comes back. In a smaller cluster where a quorum is not possible, these operations would return an exception to the caller. Obviously this becomes difficult to test for.
  • So we’re here to talk about applying test-driven development to Apache Cassandra. I’m going to do this in a way that’s a little different.
  • I’m doing it wrong.
  • Here’s some output from a run of ‘mvn test’ on one module of the Usergrid codebase. Seems typical. Tests ran, they passed, etc. But if we look a little closer...
  • We see some serious ugly.
  • I’ll be frank, that sucks.
  • This
  • The worst result of all of this is that you end up in this situation just to get things completed.
  • If you do this, you will miss bugs. You will tie up cycles hunting down ‘blames’ on continuous integration failures. The worst part is that it discourages test-driven development in the first place. I’ve done this and it’s turned around to bite me.
  • So we’re in this situation. How do we get out of it? Taking this to the suits and trying to get time for it is extremely difficult.
  • Say that at your next priority meeting.
  • This is a shockingly easy way to disguise a ton of refactoring work as “preparing for a conference”.
  • The first thing I needed to do was get a handle again on the test framework landscape without causing too much trauma to the code.
  • There are really two ways to approach this other than EmbeddedCassandraHelper. These two have been around for a year, are relatively stable, and track recent Cassandra releases.
  • I should know better. I really should.
  • So let’s take a look at Cassandra Unit.
  • The benefits are pretty straightforward.
  • A couple of cons, though. External instance support is not going to work with one of my main requirements.
  • Now, cassandra-maven-plugin. Precisely what it says on the tin, really: it’s a Maven plugin designed to control Cassandra instances.
  • Essentially, you can use Maven to fork one or more Cassandra instances in their own JVMs. If you’ve never seen JVM forking in the wild, cassandra-maven-plugin is a decent example of using the commons-exec utility from the Apache Commons project.
  • That’s great, but there are some things missing.
  • We need to speed up the test cycle so we don’t throw the skipTests flag and miss bugs.
  • There is some low-hanging fruit to address here that could speed things up for utility and mock-based test cases.
  • Basically, things that are not integration tests. By doing this, we can isolate those different requirements, inherently cutting time out of the test process.
  • There are two primary plugins used for test integration with Maven. One is designed for unit testing, the other for integration testing.
  • If you get these confused, you’re not alone. Who knows which is which?
  • The names are not terribly descriptive. The documentation does not help much.
  • Even better: here is the surefire page. Note the URL, as this next slide is not very different.
  • Here is the failsafe page. Note the typos. They did manage one out of three find/replaces, though. Note: I plan on submitting a patch to the site docs this evening to fix this.
  • Basically, the failsafe plugin is for integration testing.
  • Because failsafe is designed to continue after failures, here’s an easy way to remember that failsafe is for integration testing.
  • Spend some time renaming to facilitate the unit/integration test separation.
  • Once you do this, there are some additional tweaks to really boost performance on tests designed along FIRST guidelines.
  • The easy stuff’s out of the way, and we’ve shaved some time off, particularly for the utility code. What else can we do?
  • Let’s look at our goals again...
  • No. We can combine them.
  • T
  • We’ll see an example of this in a minute, but we need to attach a listener to the build process to get this to fire off correctly.
  • Here we are binding failsafe to the integration-test and verify phases.
  • This configures our naming scheme, the opposite of the surefire configuration, to run only the integration tests. We also have a regex at the bottom to exclude our RunListener from being invoked as a test case.
  • Really straightforward. All configurations are handled by default for the simple cases; we only specify start and stop in order to bind to the integration-test and verify phases above.
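
The ConsistencyLevel note above (hints with a down replica, exceptions when a quorum is not possible) comes down to simple arithmetic. The following is a minimal sketch, assuming the usual quorum rule of floor(replication factor / 2) + 1; `QuorumSketch` and its methods are hypothetical names for illustration and are not part of Cassandra or any client library.

```java
/** Hypothetical helper for illustration only; not a Cassandra or client API. */
final class QuorumSketch {

    /** Replicas that must acknowledge a QUORUM operation: floor(RF / 2) + 1. */
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /** True if enough replicas are alive for a QUORUM operation to succeed. */
    static boolean quorumPossible(int replicationFactor, int liveReplicas) {
        return liveReplicas >= quorum(replicationFactor);
    }

    public static void main(String[] args) {
        int rf = 3; // assumed replication factor for the example
        for (int live = rf; live >= 0; live--) {
            System.out.printf("RF=%d live=%d quorum=%d possible=%b%n",
                    rf, live, quorum(rf), quorumPossible(rf, live));
        }
    }
}
```

With a replication factor of 3, one replica can be down and a quorum write still succeeds (the case where a hint is kept for the down node). With a single in-process node the replication factor is effectively 1, so there is no partial-failure state to exercise, which is one reason these behaviors are hard to cover from a local test.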

    1. TDD With Apache Cassandra / Nate McCall / @zznate
    2. Hi.
    3. We are talking about two things...
    4. 1. Test Driven Development (TDD)
    5. 2. Apache Cassandra
    6. ...and how difficult it is to put them together
    7. TDD
    8. Unit testing (Integration Testing)
    9. Unit Testing: F.I.R.S.T.
    10. Fast / Isolated / Repeatable / Self-validating / Timely
    11. Integration Testing
    12. Run less often / Fewer of them
    13. Cassandra
    14. (not an acronym)
    15. Distributed, high performance database
    16. Two problems with that...
    17. 1. “Distributed”
    18. 2. “Database”
    19. Specifically
    20. Configurations and behavior tied to environment
    21. Heavyweight resource requirements
    22. Failures difficult to simulate
    23. Applying TDD
    24. But first: a dirty secret to share...
    25. I’m Doing it wrong
    26. 3 minutes for one module
    27. The Culprit: EmbeddedCassandraHelper
    28. Easy to set up / Known state per test / Client setup is easy
    29. Slowwwww / Lots of fixture code / Hacking cassandra lifecycle
    30. mvn install -DskipTests
    31. mvn install -DskipTests / Fail.
    32. How can I solve this?
    33. “Refactoring tests”
    34. Pro tip: propose it as a conference topic :)
    35. So some time has been weaseled...
    36. What other tools are available?
    37. cassandra-maven-plugin / cassandra-unit
    38. 2nd dirty secret...
    39. I’ve contributed to both projects :(
    40. Cassandra Unit
    41. Easy data loading (no fixtures) / Multiple formats / Client setup provided
    42. Cassandra instance in process / Slow on large data sets / External instance support (not a pro for my use case)
    43. Cassandra Maven Plugin
    44. Cassandra(s) in forked JVM(s) / Setup/teardown (still) in maven lifecycle / Multi-node support FTW!
    45. No easy means of loading bulk data / Code level fixtures / Manual client setup
    46. Back to original issue
    47. What to focus on first
    48. 1. Separate unit tests from integration tests
    49. Maven test plugins
    50. Maven test plugins: surefire / failsafe
    51. Maven test plugins: failfire / suresafe
    52. Maven test plugins: surefail / firesafe?
    53. Failsafe: For integration tests. Keeps going even if there are failures
    54. Failsafe: Fails to make it easy to find your mistake
    55. Configuring exclusions
    56. While I’m there...
    57. There’s got to be more to do
    58. Remember the Goals: Avoid fixture code / No per-developer configs / Not in process
    59. Is it just choosing between different tools?
    60. You got your fixtures in my plugin!
    61. Cassandra-unit + cassandra-maven-plugin = WIN
    62. Benefits: Maven lifecycle / No fixtures / RunListeners / @Rule for special case-ing
    63. Maven configuration: failsafe
    64. Maven configuration: Cassandra
    65. RunListener code
    66. Parallel *allthethings*
    67. I can has cluster!
    68. $ sudo ifconfig lo0 alias up / $ sudo ifconfig lo0 alias up / $ sudo ifconfig lo0 alias up
    69. $ mvn cassandra:start-cluster cassandra:run
    70. Next Steps...
    71. Next Steps... (am I really this psyched about testing?)
    72. Instrumentation...
    73. Gradle (or: How I sat next to Tim Berglund at lunch)
    74. Gradle... found my next proposal :)
    75. aws-java-sdk.jar + cassandra launcher = AWESOME
    76. ... yep, this one too.
    77. Continue to contribute
    78. Continue to Contribute
    79. Questions?
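
The unit/integration split from the talk (surefire for unit tests, failsafe for integration tests, separated by renaming) can be illustrated with a small classifier. This is a rough sketch, not how Maven evaluates its includes: it assumes a naming scheme modeled on the plugins' documented default patterns (`Test*`, `*Test`, `*TestCase` for surefire; `IT*`, `*IT`, `*ITCase` for failsafe), and it assumes integration names take precedence when both match. The talk does not state the exact scheme used in Usergrid, and `TestNamingSketch` and `pluginFor` are hypothetical names.

```java
import java.util.regex.Pattern;

/** Hypothetical illustration of a unit vs. integration test naming convention. */
final class TestNamingSketch {

    // Assumed unit-test names, modeled on surefire's default include patterns.
    private static final Pattern UNIT =
            Pattern.compile("Test.*|.*Test|.*TestCase");

    // Assumed integration-test names, modeled on failsafe's default include patterns.
    private static final Pattern INTEGRATION =
            Pattern.compile("IT.*|.*IT|.*ITCase");

    /** Returns which plugin would pick up the class under the assumed convention. */
    static String pluginFor(String simpleClassName) {
        // Assumption: integration naming wins, mimicking an explicit surefire exclusion.
        if (INTEGRATION.matcher(simpleClassName).matches()) {
            return "failsafe";
        }
        if (UNIT.matcher(simpleClassName).matches()) {
            return "surefire";
        }
        return "none"; // e.g. helpers or a RunListener that must not run as a test
    }

    public static void main(String[] args) {
        for (String name : new String[] {"UserServiceTest", "UserServiceIT", "CassandraRunListener"}) {
            System.out.println(name + " -> " + pluginFor(name));
        }
    }
}
```

Under this convention a class such as the RunListener matches neither pattern, which is the effect the notes describe achieving with an exclusion regex in the failsafe configuration.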