C* Summit 2013: Hardware Agnostic - Cassandra on Raspberry Pi by Andy Cobley
The Raspberry Pi is a credit-card-sized, $25, ARM-based Linux box designed to teach children the basics of programming. The machine comes with a 700 MHz ARM processor and 512 MB of memory and boots off an SD card: not much power for running the likes of a Cassandra cluster. This presentation discusses the problems of getting Cassandra up and running on the Pi and answers the all-important question: why on Earth would you want to do this?

Published in: Technology
  • 1. Hardware Agnostic: Cassandra on Raspberry Pi
      Andy Cobley | Lecturer, University of Dundee, Scotland
  • 2. Cassandra on Raspberry Pi
      * Cassandra is hardware agnostic
      * So why not run it on a Raspberry Pi?
      * How hard can it be?
      * What can we do with it once it works?
  • 3. Who Am I?
      * Andy Cobley
      * School of Computing
      * University of Dundee
      * Twitter: @andycobley
  • 4. What’s a Raspberry Pi?
      * Single-chip Linux computer
      * 500 MB of RAM
      * Boots off an SD card
      * Ethernet port
      * (Graphics and everything else you need for a general-purpose computer)
  • 5. Pi with pound coin
  • 6. And here’s the Cassandra cluster. And here’s one for real (power permitting!)
  • 7. The Bad News
      * Cassandra is designed to be fast: fast at writing, fast at reading.
      * This laptop with one instance of Cassandra will do 12,000 write operations
      * A Raspberry Pi will do 200!
  • 8. More bad news!
      * Running from an external USB drive is actually worse!
      * Probably a hardware feature
  • 9. Raspberry Pi Schematic
  • 10. And then there’s Java!
      * Oracle Java vs OpenJDK
  • 11. And Raspbian
      * Raspbian is Debian for the Pi
      * Uses the hardware floating-point unit (hard float)
      * Much faster than Debian
      * The current Oracle JDK won’t run on it!
  • 12.
      * Java SE Embedded version 6
      * Cassandra might prefer 6
      * But
      * Preview at:
      * java
  • 13. Hard vs Soft Float
      * Actually not much difference in performance
  • 14. The Problem with compression
      * Cassandra uses compression for performance
      * Started in version 1.0
          2x-4x reduction in data size
          25-35% performance improvement on reads
          5-10% performance improvement on writes
  • 15. Compression types
      * Two types:
          Google Snappy Compressor (faster reads/writes)
          DeflateCompressor (Java zip, slower, better compression)
      * Snappy compression is not available on the Pi
          (requires native methods, so someone might get it to work!)
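To get a feel for the size reduction quoted on the compression slides, here is a minimal Python sketch using `zlib`, which implements the same DEFLATE algorithm as Java's zip classes. It is not Cassandra's DeflateCompressor itself; the sample rows and the helper name `deflate_ratio` are made up for illustration, and real tables will compress differently.

```python
import zlib


def deflate_ratio(data: bytes, level: int = 6) -> float:
    """Return original size divided by DEFLATE-compressed size."""
    compressed = zlib.compress(data, level)
    # Sanity check: compression must be lossless.
    assert zlib.decompress(compressed) == data
    return len(data) / len(compressed)


# Hypothetical, highly repetitive row data (an assumption, not from the talk).
rows = b"".join(
    b"sensor-%04d,temperature,21.5\n" % (i % 50) for i in range(2000)
)

print("uncompressed bytes:", len(rows))
print("compression ratio: %.1fx" % deflate_ratio(rows))
```

Repetitive data like this compresses far better than typical application data, so treat the printed ratio as an upper bound rather than a prediction.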
  • 16. And the startup script
      * The startup script allocates memory
      * It calculates this based on the number of processors
      * The Pi reports zero processors!
      * Boom!
      * Now fixed
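The failure on the startup-script slide can be sketched in a few lines of Python. This is only an illustration loosely modelled on the heap-sizing heuristic in Cassandra's startup script: the constants (1024 MB, 8192 MB, 100 MB per core), the function name, and the `guard` flag are assumptions, and the guard shown is one possible fix, not necessarily the one that was applied upstream.

```python
def calculate_heap_sizes(system_memory_mb, cpu_cores, guard=True):
    """Return (max_heap_mb, new_gen_mb) for a machine.

    With guard=False, a reported core count of zero yields a zero-sized
    young generation, which is the kind of value that breaks startup.
    """
    if guard:
        # Treat a nonsensical core count (zero or negative) as one core.
        cpu_cores = max(1, cpu_cores)

    # Heap: the larger of (half of RAM, capped) and (quarter of RAM, capped).
    half = min(system_memory_mb // 2, 1024)
    quarter = min(system_memory_mb // 4, 8192)
    max_heap_mb = max(half, quarter)

    # Young generation: 100 MB per core, but at most a quarter of the heap.
    new_gen_mb = min(100 * cpu_cores, max_heap_mb // 4)
    return max_heap_mb, new_gen_mb


# A 512 MB Pi that reports zero processors:
print(calculate_heap_sizes(512, 0, guard=False))  # young generation is 0
print(calculate_heap_sizes(512, 0))               # guarded: falls back to one core
```

Clamping the core count is the smallest change that keeps the rest of the calculation untouched.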
  • 17. JMX Config
      * In
      * JVM_OPTS="$JVM_OPTS -Djava.rmi.server.hostname="
      * Or else nodetool will not work between nodes
  • 18. JVM OPT UseCondCardMark
      * C* 1.2.2 added UseCondCardMark as a JVM option
      * "for better lock handling especially on hotspot with multicore processor"
      * In
          # if [ "$JVM_VERSION" \> "1.7" ] ; then
          #     JVM_OPTS="$JVM_OPTS -XX:+UseCondCardMark"
          # fi
  • 19. The Good News!
      * We’ve forgotten one thing
      * The Pi costs £25
      * You can power 4 from a USB hub (no need for a power supply on each one)
      * So:
  • 20. So, have a 64-node computer for £2000
      University of Southampton
  • 21. Or this
      * 32-node Beowulf cluster:
      * Joshua Kiepert, Boise State University
  • 22. Adding nodes is good
      * Adding nodes adds performance
      * Adding nodes adds replicas of data
      * BUT
      * Make sure your ring is balanced
      * Pis don’t like to be unbalanced.
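One way to check whether a ring is balanced is to compute the share of the token space each node owns. The sketch below assumes a ring of size 2**127 (the range used in the key calculation later in the deck) and one token per node; the function name `ownership` is made up for illustration.

```python
RING_SIZE = 2 ** 127  # assumed token range, 0 to 2**127


def ownership(tokens):
    """Map each token to the fraction of the ring its node owns.

    A node owns the range from the previous token (exclusive) up to its
    own token (inclusive), wrapping around the ring.
    """
    ordered = sorted(tokens)
    shares = {}
    for i, token in enumerate(ordered):
        previous = ordered[i - 1]  # index -1 wraps to the last token
        span = (token - previous) % RING_SIZE
        if span == 0:
            span = RING_SIZE  # a single node owns the whole ring
        shares[token] = span / RING_SIZE
    return shares


balanced = [i * RING_SIZE // 4 for i in range(4)]
unbalanced = balanced[:3]  # same tokens with the last node missing

for name, ring in (("balanced", balanced), ("unbalanced", unbalanced)):
    print(name, sorted(ownership(ring).values()))
```

In the unbalanced case one node carries half of the ring, which is exactly the situation a low-powered node handles worst.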
  • 23. Vnodes
      * Vnodes (in 1.2) would be very nice
      * However, at this point I haven’t got 1.2 running on a cluster of Pis
  • 24. Performance with 3/4 nodes
  • 25. Performance with 5/6 nodes
  • 26. Stress test commands
      * ./stress -d,, -o insert -I DeflateCompressor
      * Note: -d gives the nodes to use
      * You will get different performance if you insert to fewer nodes than you have in your ring
  • 27. Adding Nodes Procedure
      * Adding a node (in the absence of Vnodes)
          Must seed from a known node
          Use a program to calculate new keys
          Bring up the new node with the correct key in cassandra.yaml
          Use nodetool to move the other nodes
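  • 28. Calculating keys
      * Python code
          import sys

          # Number of nodes: from the command line, or prompt for it.
          if len(sys.argv) > 1:
              num = int(sys.argv[1])
          else:
              num = int(input("How many nodes? : "))

          # Evenly spaced keys on a ring of size 2**127.
          for i in range(num):
              print("node %d: %d" % (i, i * (2**127) // num))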
  • 29. Moving existing nodes
      * Use nodetool
          sudo ./nodetool -h move 42535295865117307932921825928971026432
      * And cleanup
          ./nodetool -h cleanup
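The key calculation and the move step fit together as follows. This Python sketch works out which existing nodes need a new key when the ring grows by one node. It assumes evenly spaced keys on a ring of size 2**127 and that existing node i keeps position i in the new ring (the new node takes the last position); the function names are hypothetical.

```python
RING_SIZE = 2 ** 127  # assumed token range, 0 to 2**127


def balanced_tokens(num_nodes):
    """Evenly spaced keys for a ring of num_nodes nodes."""
    return [i * RING_SIZE // num_nodes for i in range(num_nodes)]


def move_plan(old_count, new_count):
    """List (node index, old key, new key) for existing nodes that must move."""
    old = balanced_tokens(old_count)
    new = balanced_tokens(new_count)
    return [
        (i, old[i], new[i]) for i in range(old_count) if old[i] != new[i]
    ]


# Growing a three-node ring to four nodes.
for index, old_key, new_key in move_plan(3, 4):
    print("node %d: move from %d to %d" % (index, old_key, new_key))

print("new node key: %d" % balanced_tokens(4)[-1])
```

For a four-node ring, node 1 ends up at 2**127 / 4, which is the key passed to nodetool move on the slide above.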
  • 30. Getting more memory
      * On Debian, you can free memory from the graphics chip
          cd /boot
          sudo cp start.elf start.elf.old
          sudo cp arm224_start.elf start.elf
          reboot
  • 31. Getting more memory
      * Under Raspbian
      * Run with a monitor plugged in the first time
      * Set the options for screen memory
      * Perhaps disable boot to GUI
  • 32. Network address
      * I prefer static network addresses
      * Edit /etc/network/interfaces
          iface eth0 inet static
              address
              netmask
              network
              broadcast
              gateway
  • 33. Multiple nodes
      * Make a master SD card
      * Copy it!
      * Make sure the master version has no data on it.
      * Consider "Puppet" (though I don’t use it)
  • 34. Starting as a service
      * See
      * Put the file in /etc/init.d
      * update-rc.d cassandra defaults
  • 35. Pi is for teaching
      * So for £200 we get an 8-node C* cluster
      * It can be reconfigured, blown away, stress tested and generally abused
      * We can simulate data racks, data centers and, I hope, even long network delays.
      * Hopefully our upcoming MSc in Data Science will use these clusters
  • 36. C* is network aware
      * We know C* can be configured to be aware of:
          Network racks
          Data centers
      * We know replicas can be stored across these racks
      * How can we play with this cheaply?
  • 37. Proposed teaching tool (diagram labels: 10 Mb/s hub, noise injection, Switch 1, Switch 2, and Pi 1 to Pi 3 shown twice)
  • 38. Pi is discovery
      * Cassandra wouldn’t run on a Pi
      * It does now.
      * Running it on a Pi shook out some Cassandra bugs
      * You can run it in a secure lab
  • 39. Pi is for fun
      * Most important, this was pure geeky fun
  • 40. Plug
      * Data Science:
  • 41. C* is Hardware Agnostic
      * Raspberry Pi is cheap
      * C* needs some work to run on it
      * You can make clusters cheaply for experimentation
      * It’s fun!
  • 42. THANK YOU