Big Data Little Tests (John Heintz)

Transcript

  • 1. Big Data Little Tests •  John Heintz •  Founder, Gist Labs •  Technical Consultant, Cutter Consortium •  john@gistlabs.com, @jheintz, http://gistlabs.com
  • 2. About John Heintz •  Developer since 1995 •  Agilist since 1999 •  Founded Gist Labs in 2008 •  Developer, Mentor, Consultant •  Intuitive, Abstract, Precise •  Kool-Aids I’ve drunk: Agile/Lean/Kanban, OO, TDD, REST, Mentoring, Craftsmanship, Emergent/Progressive Design, InnovationGames®, Systems and Complexity Theory © 2012 Gist Labs, LLC
  • 3. My Goals for You •  Demystify test automation for Big Data •  Provide executable examples
  • 4. What you shouldn’t expect… •  More than a brief introduction to Big Data concepts •  Performance tuning
  • 5. Simple Code, Config •  I went as simple and clear as possible •  Java, JUnit4 •  Maven… okay maybe not simple :-
  • 6. Mostly Code •  Remember the Law of Two Feet •  If code isn’t what you were looking for, I totally respect you finding something better for your time :)
  • 7. •  Everything available from http://gistlabs.com/2012/08/big-data-little-tests/ •  The entire command script is there… so you can take notes assuming that’s available
  • 8. My Soapboxes… These are topics I’ll repeat myself on •  Fast test execution •  One-click build
  • 9. Big Data •  Too much •  Too fast •  Not trivially structured
  • 10. Map Reduce •  Map from one input to one output •  Reduce from many inputs to one output •  Can be run in parallel •  Crude, but massive
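
The map and reduce steps on slide 10 can be sketched in plain Java with no Hadoop dependency. This is an illustration only, not the code from the talk: the class name `WordCountSketch`, the word-count task, and the in-memory driver are all assumptions made for the example. The map output for one line is modelled here as a single list of (word, 1) pairs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, framework-free illustration of map and reduce (requires Java 9+).
class WordCountSketch {

    /** Map step: one input line in, one list of (word, 1) pairs out. */
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    /** Reduce step: many values for one key in, one total out. */
    static int reduce(String word, List<Integer> counts) {
        int total = 0;
        for (int count : counts) {
            total += count;
        }
        return total;
    }

    /** In-memory driver: map every line, group values by key, reduce each group. */
    static Map<String, Integer> wordCount(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            result.put(entry.getKey(), reduce(entry.getKey(), entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {big=2, data=1, little=1, tests=2}
        System.out.println(wordCount(List.of("big data", "little tests", "big tests")));
    }
}
```

Because `map` looks at one line at a time and `reduce` at one key at a time, each call is independent of the others, which is what allows a real framework to run them in parallel. It also means each step can be tested quickly on a handful of crafted lines.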
  • 11. CAP Theorem •  Consistency •  Availability •  Partition Tolerance
  • 12. Big Data Ecosystem •  Hadoop: A giant among giants (Tons of projects on this platform!!) •  Cassandra: Feels like a weird RDBMS •  Riak: An elegant key/value/search store •  MongoDB: Document store
  • 13. Let’s Run Some Code
  • 14. Hadoop Tests
  • 15. Riak tests
  • 16. Other Frameworks •  CassandraUnit https://github.com/jsevellec/cassandra-unit •  PigUnit, Hadoop Query Language http://pig.apache.org/docs/r0.8.1/pigunit.html
  • 17. Code Questions? •  Fast test execution? •  One-click build?
  • 18. What about Big Tests? •  Real test data •  Realistic cluster
  • 19. Real Test Data My favorite strategy is to: •  Develop with small, crafted data •  Build/test the same way •  Run another test on top of real prod data
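
One way to read the strategy on slide 19 is sketched below, under stated assumptions: the logic under test (`countValidRecords`), the three-field record format, and the `realDataSample` system property are all hypothetical names invented for this example. The same code is checked first against small crafted data with an exact expected value, then optionally against a real data sample with a looser invariant. In a JUnit4 project the two checks would naturally become `@Test` methods; plain Java is used here so the sketch runs without dependencies.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical sketch: one piece of logic, two levels of test data (requires Java 11+).
class SmallThenRealDataSketch {

    /** Logic under test (assumed for illustration): count non-blank lines with exactly three comma-separated fields. */
    static long countValidRecords(Stream<String> lines) {
        return lines.filter(line -> !line.isBlank())
                    .filter(line -> line.split(",", -1).length == 3)
                    .count();
    }

    /** Fast check on small, crafted data: the exact expected value is known. */
    static void checkCraftedData() {
        List<String> crafted = List.of("1,alice,10", "2,bob", "", "3,carol,30");
        long valid = countValidRecords(crafted.stream());
        if (valid != 2) {
            throw new AssertionError("expected 2 valid records, got " + valid);
        }
    }

    /** Slower check on a real data sample: same code, but only an invariant is asserted. */
    static void checkRealData(Path sample) {
        try (Stream<String> lines = Files.lines(sample)) {
            long valid = countValidRecords(lines);
            if (valid == 0) {
                throw new AssertionError("no valid records found in " + sample);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        checkCraftedData();
        // "realDataSample" is a made-up property name; the real-data check is skipped when it is unset.
        String samplePath = System.getProperty("realDataSample");
        if (samplePath != null) {
            checkRealData(Path.of(samplePath));
        }
        System.out.println("checks passed");
    }
}
```

Keeping the real-data check behind a switch means the default build stays fast and one-click, while a CI job with access to a production sample can run the heavier check with `-DrealDataSample=<path>`.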
  • 20. [Diagram. Labels: Developers, Version Control, Developer Sandboxes, Continuous Integration Servers, Continuous Deployment Servers, Build, Test1, Test2, Staging, and Production environments (three of them labelled "Cluster"), Self-service Provisioning, Virtual vs Physical Servers, Private vs Public Cloud, Network Infrastructure, Storage Infrastructure]
  • 21. Realistic Cluster •  Use a CI/DevOps environment •  Virtualize, “X as a Service” •  Virtual Machines •  Virtual Infrastructure (Network, Storage)
  • 22. Jenkins CI Server •  Master/slave clusters •  Plugins for Hadoop and VMware •  http://jenkins-ci.org/
  • 23. Big Questions?
  • 24. Thank you! •  Everything available from: http://gistlabs.com/2012/08/big-data-little-tests/ •  John Heintz, @jheintz, http://gistlabs.com
