C* Summit 2013: Hardware Agnostic - Cassandra on Raspberry Pi by Andy Cobley

The Raspberry Pi is a credit-card-sized $25 ARM-based Linux box designed to teach children the basics of programming. The machine comes with a 700 MHz ARM processor and 512 MB of memory and boots off an SD card: not much power for running the likes of a Cassandra cluster. This presentation will discuss the problems of getting Cassandra up and running on the Pi and will answer the all-important question: why on earth would you want to do this?


  1. Hardware Agnostic: Cassandra on Raspberry Pi
     Andy Cobley | Lecturer, University of Dundee, Scotland
  2. Cassandra on Raspberry Pi
     * Cassandra is hardware agnostic
     * So why not run it on a Raspberry Pi?
     * How hard can it be?
     * What can we do with it once it works?
  3. Who Am I?
     * Andy Cobley
     * School of Computing
     * University of Dundee
     * Twitter: @andycobley
  4. What’s a Raspberry Pi?
     * Single-chip Linux computer
     * 500 MB RAM
     * Boots off an SD card
     * Ethernet port
     * (graphics and all you need for a general-purpose computer)
  5. Pi with pound coin
  6. And here’s the Cassandra cluster
     * And here’s one for real
     * Power permitting!
  7. The Bad News
     * Cassandra is designed to be fast: fast at writing, fast at reading.
     * This laptop with one instance of Cassandra will do 12,000 write operations
     * Raspberry Pi will do 200!
  8. More bad news!
     * Running an external USB drive is actually worse!
     * Probably a hardware feature
  9. Raspberry Pi Schematic
  10. And then there’s Java!
      * Oracle Java vs OpenJDK
  11. And Raspbian
      * Raspbian is Debian for the Pi
      * Uses the hard floating-point accelerator
      * Much faster than Debian
      * Current Oracle JDK won’t run on it!
  12. Oracle Java
      * http://www.oracle.com/technetwork/java/embedded/downloads/javase/index.html
      * Java SE Embedded version 6
      * Cassandra might prefer 6
      * But
      * https://blogs.oracle.com/henrik/entry/oracle_releases_jdk_for_linux
      * Preview at:
      * https://jdk8.java.net/fxarmpreview/
  13. Hard vs Soft Float
      * Actually not much difference in performance
  14. The Problem with compression
      * Cassandra uses compression for performance
      * Started in version 1.0
          * 2x-4x reduction in data size
          * 25-35% performance improvement on reads
          * 5-10% performance improvement on writes
  15. Compression types
      * Two types:
          * Google Snappy Compressor (faster reads/writes)
          * DeflateCompressor (Java zip, slower, better compression)
      * Snappy compression not available on Pi (requires native methods, so someone might get it to work!)
  16. And the startup script
      * Startup script allocates memory
      * Calculates based on number of processors
      * Pi reports zero processors!
      * Boom!
      * Now fixed
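The slide does not show the startup script's formula, so the following is only a minimal sketch of how a per-processor memory calculation can collapse when the reported processor count is zero, and how clamping the count guards against it. The function names, the 100 MB per-core figure and the quarter-of-heap cap are assumptions for illustration, not the script's actual values.

```python
def unguarded_young_gen_mb(max_heap_mb, cpu_cores, per_core_mb=100):
    """Hypothetical sizing rule: the smaller of a quarter of the heap
    and per_core_mb for every reported core."""
    return min(max_heap_mb // 4, per_core_mb * cpu_cores)


def young_gen_mb(max_heap_mb, cpu_cores, per_core_mb=100):
    """Same rule, but a reported count of zero (or less) is treated as one core."""
    cores = max(1, cpu_cores)
    return min(max_heap_mb // 4, per_core_mb * cores)


if __name__ == "__main__":
    # A 256 MB heap on a machine that reports zero processors.
    print(unguarded_young_gen_mb(256, 0))  # zero-sized young generation
    print(young_gen_mb(256, 0))            # clamped to one core
```

With zero cores the unguarded rule asks for a zero-sized memory region, which is the kind of value a JVM will refuse to start with; clamping to at least one core keeps the result positive.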
  17. JMX Config
      * In cassandra-env.sh
          JVM_OPTS="$JVM_OPTS -Djava.rmi.server.hostname=192.168.1.15"
      * Or else nodetool will not work between nodes
  18. JVM OPT UseCondCardMark
      * C* 1.22. added UseCondCardMark as a JVM opt
      * "for better lock handling especially on hotspot with multicore processor"
      * In cassandra-env.sh
          #if [ "$JVM_VERSION" > "1.7" ] ; then
          #    JVM_OPTS="$JVM_OPTS -XX:+UseCondCardMark"
          #fi
  19. The Good News!
      * We’ve forgotten one thing
      * The Pi costs £25
      * You can power 4 from a USB hub (no need for a power supply on each one)
      * So:
  20. So, have a 64 node computer for £2000 (University of Southampton)
  21. Or this
      * 32 node Beowulf cluster:
      * Joshua Kiepert, Boise University
  22. Adding nodes is good
      * Adding nodes adds performance
      * Adding nodes adds replicas of data
      * BUT
      * Make sure your ring is balanced,
      * Pis don’t like to be unbalanced.
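To make "balanced" concrete, here is a small sketch that computes what share of the ring each node owns from its token. It assumes a token space of 2**127 (matching the key calculation used later in the deck) and that each node owns the range from the previous token up to its own; `RING_SIZE` and `ownership` are illustrative names, not part of Cassandra.

```python
from fractions import Fraction

RING_SIZE = 2**127  # assumed size of the token space


def ownership(tokens):
    """Map each (distinct) token to the fraction of the ring its node owns.

    Assumes a node owns the range from the previous token (exclusive)
    up to its own token (inclusive), wrapping around the ring.
    """
    ordered = sorted(tokens)
    if len(ordered) == 1:
        return {ordered[0]: Fraction(1)}
    shares = {}
    for index, token in enumerate(ordered):
        previous = ordered[index - 1]  # index -1 wraps to the last token
        span = (token - previous) % RING_SIZE
        shares[token] = Fraction(span, RING_SIZE)
    return shares


if __name__ == "__main__":
    balanced = [i * RING_SIZE // 4 for i in range(4)]
    print(ownership(balanced))
    # Three nodes left on four-node tokens: one node carries half the ring.
    print(ownership(balanced[:3]))
```

In the second case the node at token 0 ends up with twice the share of the others, which is the sort of imbalance a memory-constrained Pi is least able to absorb.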
  23. Vnodes
      * Vnodes (in 1.2) would be very nice
      * However, at this point I haven’t got 1.2 on the Pi running in a cluster
  24. Performance with 3/4 nodes
  25. Performance with 5/6 nodes
  26. Stress test commands
      * ./stress -d 192.168.1.10,192.168.1.11,192.168.1.12 -o insert -IDeflateCompressor
      * Note: nodes to use
      * You will get different performance if you insert to fewer nodes than you have in your ring
  27. Adding Nodes Procedure
      * Adding a node (in the absence of Vnodes)
          * Must seed from a known node
          * Use a program to calculate new keys
          * Bring up new node with the correct key in cassandra.yaml
          * Use nodetool to move other nodes
  28. Calculating keys
      * Python code
          import sys

          # Node count comes from the command line, or is prompted for
          if len(sys.argv) > 1:
              num = int(sys.argv[1])
          else:
              num = int(input("How many nodes? : "))
          # Print evenly spaced tokens across a ring of size 2**127
          for i in range(num):
              print("node %d: %d" % (i, i * (2**127) // num))
  29. Moving existing nodes
      * Use nodetool
          sudo ./nodetool -h 192.168.1.10 move 42535295865117307932921825928971026432
      * And cleanup
          ./nodetool -h 192.168.1.10 cleanup
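The token in the move command above is 2**127 / 4, i.e. the key for node 1 in a four-node ring. As a rough sketch, the key calculation and the move step can be combined to print one `nodetool move` line per node. The host addresses below are hypothetical, and the pairing of hosts to tokens (first host gets token 0, and so on) is an assumption; the commands are only printed, not executed.

```python
def balanced_tokens(node_count):
    """Evenly spaced tokens for node_count nodes on a ring of size 2**127."""
    return [i * 2**127 // node_count for i in range(node_count)]


def move_commands(hosts):
    """Build one 'nodetool move' command line per host, in ring order."""
    tokens = balanced_tokens(len(hosts))
    return ["./nodetool -h %s move %d" % (host, token)
            for host, token in zip(hosts, tokens)]


if __name__ == "__main__":
    # Hypothetical node addresses, for illustration only.
    hosts = ["192.168.1.20", "192.168.1.21", "192.168.1.22", "192.168.1.23"]
    for command in move_commands(hosts):
        print(command)
```

Each moved node still needs the cleanup step shown on the slide afterwards.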
  30. Getting more memory
      * On Debian, you can free memory from the graphics chip
          cd /boot
          sudo cp start.elf start.elf.old
          sudo cp arm224_start.elf start.elf
          reboot
  31. Getting more Memory
      * Under Raspbian
      * Run with a monitor plugged in for the first time
      * Set options for screen memory
      * Perhaps disable boot to GUI
  32. Network address
      * I prefer static network addresses
      * Edit /etc/network/interfaces
          iface eth0 inet static
              address 192.168.1.41
              netmask 255.255.255.0
              network 192.168.1.0
              broadcast 192.168.1.255
              gateway 192.168.1.254
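When several Pis each need their own static address, the stanza above can be generated rather than typed by hand. The following is a minimal sketch using Python's standard `ipaddress` module to derive the netmask, network and broadcast lines from an address and prefix length; the function name `static_stanza` and the list of node addresses are assumptions for illustration.

```python
import ipaddress


def static_stanza(address, prefix_len, gateway, device="eth0"):
    """Return an /etc/network/interfaces stanza for one static address."""
    iface = ipaddress.ip_interface("%s/%d" % (address, prefix_len))
    net = iface.network
    lines = [
        "iface %s inet static" % device,
        "    address %s" % iface.ip,
        "    netmask %s" % net.netmask,
        "    network %s" % net.network_address,
        "    broadcast %s" % net.broadcast_address,
        "    gateway %s" % gateway,
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical addresses for three nodes on the same /24 network.
    for last_octet in (41, 42, 43):
        print(static_stanza("192.168.1.%d" % last_octet, 24, "192.168.1.254"))
        print()
```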
  33. Multiple nodes
      * Make a master SD card
      * Copy it!
      * Make sure the master version has no data on it.
      * Consider "Puppet" (though I don’t use it)
  34. Starting as a service
      * See https://github.com/acobley/CassandraStartup
      * Put the file in /etc/init.d
      * update-rc.d cassandra defaults
  35. Pi is for teaching
      * So for £200 we get an 8 node C* cluster
      * It can be reconfigured, blown away, stress tested and generally abused
      * We can simulate data racks, data centers and, I hope, even long network delays.
      * Hopefully our upcoming MSc in Data Science will use these clusters
  36. C* is network aware
      * We know C* can be configured to be aware of:
          * Network racks
          * Data centers
      * We know we can have replicas stored across these racks
      * How can we play with this cheaply?
  37. Proposed teaching tool
      (diagram labels: 10 Mb/s hub, noise injection, Switch 1, Switch 2, and two sets of Pi 1, Pi 2, Pi 3)
  38. Pi is discovery
      * Cassandra wouldn’t run on a Pi
      * It does now.
      * Running it on a Pi shook out some Cassandra bugs
      * You can run it in a secure lab
  39. Pi is for fun
      * Most important, this was pure Geeky Fun
  40. Obligatory Plug
      * Data Science:
      * http://www.computing.dundee.ac.uk/study/postgrad/degreedetails.asp?17
  41. C* is Hardware Agnostic
      * Raspberry Pi is cheap
      * C* needs some work to run on it
      * You can make clusters cheaply for experimentation
      * It’s fun!
  42. THANK YOU
