Oscon 2012 tdd_cassandra

Test driven development with Maven, JUnit and Apache Cassandra (and distributed systems in general)

  • I’m Nate McCall, platform development lead for the Usergrid product at Apigee.
  • I’m here to talk about just two things, really...
  • What is it? Well, by itself it can (and does) take up its own talk. But as a gross generalization, we’ll start here.
  • When we talk about test driven development, we are really talking about unit testing code. We’ll talk about integration testing too, since that’s really what we care about in our case, but we’ll treat them as separate for now.
  • Again, we don’t have much time, so I’m not going to go into this except to emphasize some of the core rules of unit testing. Another way to think about this is that unit testing is a method by which the smallest testable part of an application is validated.
  • Basically we want tests to be: Fast, Isolated, Repeatable, Self-validating (in that each test asserts a condition) and Timely (which really means we write them first so they stay in sync with the code).
  • By integration testing, we mean interfacing with external systems or processes to validate that components integrate correctly. Pretty straightforward.
  • Generally there are fewer of them and, ideally, we run them less often because they are far more expensive.
  • For us, and our requirements here, there are two problems with that.
  • It inherently makes it difficult to test locally. One of the things I really want to avoid is having developers manage their own internal or external Cassandra clusters.
  • It’s a service. Thus it’s an integration test.
  • To be specific, here are some details.
  • Every operation sends what’s called a ConsistencyLevel, essentially the level of safety of an op.
  • Writing for the key “P” when the primary node is down. The remaining replicas capture a hint which will be “replayed” to the down node when it comes back. In a smaller cluster where a quorum is not possible, these operations would return an exception to the caller. Obviously this becomes difficult to test for.
  • So we’re here to talk about applying test driven development to Apache Cassandra. I’m going to do this in a way that’s a little different.
  • I’m doing it wrong.
  • Here’s some output from a run of ‘mvn test’ on one module of the Usergrid codebase. Seems typical. Tests ran, they passed, etc. But if we look a little closer...
  • We see some serious ugly.
  • I’ll be frank, that sucks.
  • This
  • The worst result of all of this is that you end up in this situation just to get things completed.
  • If you do this, you will miss bugs. You will tie up cycles hunting down ‘blames’ on continuous integration failures. The worst part is that it discourages test driven development in the first place. I’ve done this and it’s turned around to bite me.
  • So we’re in this situation. How do we get out of it? Taking this to the suits and trying to get time for it is extremely difficult.
  • Say that at your next priority meeting.
  • This is a shockingly easy way to disguise a ton of refactoring work as “preparing for a conference”.
  • The first thing I needed to do was get a handle again on the test framework landscape without causing too much trauma to the code.
  • There are really two ways to approach this other than EmbeddedCassandraHelper. These two have been around for a year, are relatively stable and track recent Cassandra releases.
  • I should know better. I really should.
  • So let’s take a look at Cassandra Unit.
  • The benefits are pretty straightforward.
  • A couple of cons, though. External instances are not going to work with one of my main requirements.
  • Now the Cassandra Maven Plugin. Precisely what it says on the tin, really: it’s a Maven plugin designed to control Cassandra instances.
  • Essentially, you can use Maven to fork one or more Cassandra instances in their own JVMs. If you’ve never seen JVM forking in the wild, cassandra-maven-plugin is a decent example of using the commons-exec utility from the Apache Commons project.
  • That’s great, but there are some things missing.
  • We need to speed up the test cycle so we don’t throw the skipTests flag and miss bugs.
  • There is some low-hanging fruit to address here that could speed things up for utility and mock-based test cases.
  • Basically, things that are not integration tests. By doing this, we can isolate those different requirements, inherently cutting time out of the test process.
  • There are two primary plugins used for test integration with Maven. One is designed for unit testing, the other for integration testing.
  • If you get these confused, you’re not alone. Who knows which is which?
  • The names are not terribly descriptive. The documentation does not help much.
  • Even better: here is the surefire page. Note the URL, as this next slide is not very different.
  • Here is the failsafe page. Note the typos. They did manage one out of three find/replaces, though. Note: I plan on submitting a patch to the site docs this evening to fix this.
  • Basically, the failsafe plugin is for integration testing.
  • Because failsafe is designed to continue after failures, here’s an easy way to remember that failsafe is for integration testing.
  • Spend some time renaming to facilitate the unit/integration test separation.
  • Once you do this, there are some additional tweaks to really boost performance on tests designed along FIRST guidelines.
  • The easy stuff’s out of the way, and we’ve shaved some time off, particularly for the utility code. What else can we do?
  • Let’s look at our goals again...
  • No. We can combine them.
  • We’ll see an example of this in a minute, but we need to attach a listener to the build process to get this to fire off correctly.
  • Here we are binding failsafe to the integration-test and verify phases.
  • This configures our naming scheme, the opposite of the surefire configuration, to run only the integration tests. We also have a regex at the bottom to exclude our RunListener from being invoked as a test case.
  • Really straightforward. All configurations are handled by default for the simple cases; we only specify start and stop in order to bind to the phases above.
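
The notes above on ConsistencyLevel and on the write for key “P” with a replica down come down to simple arithmetic. Here is a minimal sketch, assuming the usual Cassandra definition of a quorum as floor(replication factor / 2) + 1. The class and method names are hypothetical illustrations, not part of the talk or of any Cassandra client API.

```java
// Hypothetical helper, for illustration only; not a Cassandra API.
public final class QuorumSketch {

    private QuorumSketch() {
    }

    /** Number of replicas that must acknowledge a QUORUM operation. */
    public static int quorumFor(int replicationFactor) {
        if (replicationFactor < 1) {
            throw new IllegalArgumentException("replication factor must be >= 1");
        }
        // Integer division gives floor(rf / 2).
        return replicationFactor / 2 + 1;
    }

    /** True when enough replicas are alive for a QUORUM operation to succeed. */
    public static boolean quorumPossible(int replicationFactor, int liveReplicas) {
        return liveReplicas >= quorumFor(replicationFactor);
    }

    public static void main(String[] args) {
        // Three replicas, one down: two remain, which still meets the quorum of two.
        System.out.println(quorumPossible(3, 2));
        // Two replicas, one down: the quorum is two, so the caller gets an exception.
        System.out.println(quorumPossible(2, 1));
    }
}
```

The second case is the one that is awkward to reproduce on a developer machine: it only shows up when a multi-node cluster loses a replica.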

Transcript

  • 1. TDD With Apache Cassandra; Nate McCall; @zznate
  • 2. Hi.
  • 3. We are talking about two things...
  • 4. 1. Test Driven Development (TDD)
  • 5. 2. Apache Cassandra
  • 6. ...and how difficult it is to put them together
  • 7. TDD
  • 8. Unit testing (Integration Testing)
  • 9. Unit Testing: F.I.R.S.T.
  • 10. Fast, Isolated, Repeatable, Self-validating, Timely
  • 11. Integration Testing
  • 12. Run less often; Fewer of them
  • 13. Cassandra
  • 14. (not an acronym)
  • 15. Distributed, high performance database
  • 16. Two problems with that...
  • 17. 1. “Distributed”
  • 18. 2. “Database”
  • 19. Specifically
  • 20. Configurations and behavior tied to environment
  • 21. Heavy weight resource requirements
  • 22. Failures difficult to simulate
  • 23. Applying TDD
  • 24. But first: a dirty secret to share...
  • 25. I’m Doing it wrong
  • 26. 3 minutes for one module
  • 27. The Culprit: EmbeddedCassandraHelper
  • 28. Easy to setup; Known state per test; Client setup is easy
  • 29. Slowwwww; Lots of fixture code; Hacking cassandra lifecycle
  • 30. mvn install -DskipTests
  • 31. mvn install -DskipTests; Fail.
  • 32. How can I solve this?
  • 33. “Refactoring tests”
  • 34. Pro tip: propose it as a conference topic :)
  • 35. So some time has been weaseled...
  • 36. What other tools are available?
  • 37. cassandra-maven-plugin; cassandra-unit
  • 38. 2nd dirty secret...
  • 39. I’ve contributed to both projects :(
  • 40. Cassandra Unit
  • 41. Easy data loading (no fixtures); Multiple formats; Client setup provided
  • 42. Cassandra instance in process; Slow on large data sets; External instance support (not a pro for my use case)
  • 43. Cassandra Maven Plugin
  • 44. Cassandra(s) in forked JVM(s); Setup/teardown (still) in maven lifecycle; Multi-node support FTW!
  • 45. No easy means of loading bulk data; Code level fixtures; Manual client setup
  • 46. Back to original issue
  • 47. What to focus on first
  • 48. 1. Separate unit tests from integration tests
  • 49. Maven test plugins
  • 50. Maven test plugins: surefire, failsafe
  • 51. Maven test plugins: failfire, suresafe
  • 52. Maven test plugins: surefail, firesafe?
  • 53. Failsafe: For integration tests. Keeps going even if there are failures
  • 54. Failsafe: Fails to make it easy to find your mistake
  • 55. Configuring exclusions
  • 56. While I’m there...
  • 57. There’s got to be more to do
  • 58. Remember the Goals: Avoid fixture code; No per-developer configs; Not in process
  • 59. Is it just choosing between different tools?
  • 60. You got your fixtures in my plugin!
  • 61. Cassandra-unit + cassandra-maven-plugin = WIN
  • 62. Benefits: Maven lifecycle; No fixtures; RunListeners; @Rule for special case-ing
  • 63. Maven configuration: failsafe
  • 64. Maven configuration: Cassandra
  • 65. RunListener code
  • 66. Parallel *allthethings*
  • 67. I can has cluster!
  • 68. $ sudo ifconfig lo0 alias 127.0.0.2 up
        $ sudo ifconfig lo0 alias 127.0.0.3 up
        $ sudo ifconfig lo0 alias 127.0.0.4 up
  • 69. $ mvn cassandra:start-cluster cassandra:run
  • 70. Next Steps...
  • 71. Next Steps... (am I really this psyched about testing?)
  • 72. Instrumentation...
  • 73. Gradle (or: How I sat next to Tim Berglund at lunch)
  • 74. Gradle... found my next proposal :)
  • 75. aws-java-sdk.jar + cassandra launcher = AWESOME
  • 76. ... yep, this one too.
  • 77. Continue to contribute
  • 78. Continue to Contribute
  • 79. Questions?
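
Slides 48 to 55 hinge on a naming convention that lets surefire and failsafe pick up different test classes. Here is a minimal sketch of that idea, assuming the `IT*`, `*IT` and `*ITCase` class-name patterns that failsafe includes by default, and treating everything else as a unit test. The class `TestNaming` and the sample class names are hypothetical. In a real build the selection is done by the plugins’ include/exclude configuration, not by code like this.

```java
// Hypothetical illustration of the unit/integration naming split; not plugin code.
public final class TestNaming {

    public enum Kind { UNIT, INTEGRATION }

    private TestNaming() {
    }

    /**
     * Classifies a simple class name using the failsafe-style patterns
     * IT*, *IT and *ITCase. Anything else is treated as a unit test.
     * Like the glob it mirrors, the IT* prefix check is purely textual.
     */
    public static Kind classify(String simpleClassName) {
        boolean integration = simpleClassName.startsWith("IT")
                || simpleClassName.endsWith("IT")
                || simpleClassName.endsWith("ITCase");
        return integration ? Kind.INTEGRATION : Kind.UNIT;
    }

    public static void main(String[] args) {
        // Sample names are made up for the example.
        System.out.println(classify("UuidUtilsTest"));
        System.out.println(classify("EntityPersistenceIT"));
    }
}
```

Renaming the Cassandra-backed classes to one of the integration patterns is what lets the fast tests run on every build while the expensive ones wait for the integration-test phase.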