Data stax cassandra_summit_2013_cassandra_raspberrypi-rc1
  • (Picture Credits below: Glenn Harris 2012) http://www.southampton.ac.uk/~sjc/raspberrypi/pi_supercomputer_southampton.htm

    1. Hardware Agnostic: Cassandra on Raspberry Pi
       Andy Cobley | Lecturer, University of Dundee, Scotland
    2. Cassandra on Raspberry Pi
       • Cassandra is hardware agnostic
       • So why not run it on a Raspberry Pi?
       • How hard can it be?
       • What can we do with it once it works?
    3. Who Am I?
       • Andy Cobley
       • School of Computing
       • University of Dundee
       • Twitter: @andycobley
    4. What’s a Raspberry Pi?
       • Single-chip Linux computer
       • 500 MB RAM
       • Boots off an SD card
       • Ethernet port
       • (Graphics and all you need for a general purpose computer)
    5. Pi with pound coin
    6. • And here’s the Cassandra cluster
       • And here’s one for real
       • Power permitting!
    7. The Bad News
       • Cassandra is designed to be fast: fast at writing, fast at reading.
       • This laptop with one instance of Cassandra will do 12,000 write operations
       • Raspberry Pi will do 200!
    8. More bad news!
       • Running an external USB drive is actually worse!
       • Probably a hardware feature
    9. Raspberry Pi Schematic
    10. And then there’s Java!
       • Oracle Java vs OpenJDK
    11. And Raspbian
       • Raspbian is Debian for the Pi
       • Uses the hard floating point accelerator
       • Much faster than Debian
       • Current Oracle JDK won’t run on it!
    12. Oracle Java
       • http://www.oracle.com/technetwork/java/embedded/downloads/javase/index.html
       • Java SE Embedded version 6
       • Cassandra might prefer 6
       • But: https://blogs.oracle.com/henrik/entry/oracle_releases_jdk_for_linux
       • Preview at: https://jdk8.java.net/fxarmpreview/
    13. Hard vs Soft Float
       • Actually not much difference in performance
    14. The Problem with compression
       • Cassandra uses compression for performance
       • Started in version 1.0
         2x-4x reduction in data size
         25-35% performance improvement on reads
         5-10% performance improvement on writes
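
As a rough illustration of where a reduction in data size comes from, the sketch below runs Deflate (reached here through Python's zlib module) over some made-up, repetitive row data. The sample rows and the ratio they produce are illustrative assumptions only; they are not measurements from Cassandra or from the Pi.

```python
import zlib

def deflate_ratio(data: bytes, level: int = 6) -> float:
    """Return original size divided by Deflate-compressed size."""
    compressed = zlib.compress(data, level)
    return len(data) / len(compressed)

# Hypothetical sample: 1000 similar-looking rows (not real Cassandra data)
rows = "".join(
    "user%04d,dundee,2013-06-11,reading=%d\n" % (i, i % 17) for i in range(1000)
)
sample = rows.encode("utf-8")

print("ratio: %.1fx" % deflate_ratio(sample))
```

Less repetitive data would compress less well, so the figure printed here says nothing about the 2x-4x range quoted on the slide.
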
    15. Compression types
       • Two types:
         Google Snappy Compressor (faster reads/writes)
         DeflateCompressor (Java zip, slower, better compression)
       • Snappy compression not available on Pi (requires native methods, so someone might get it to work!)
    16. And the startup script
       • Startup script allocates memory
       • Calculates based on number of processors
       • Pi reports zero processors!
       • Boom!
       • Now fixed
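
To see why a processor count of zero is fatal, here is a minimal sketch of a memory calculation driven by core count. The formula (100 MB per core, capped at a quarter of the maximum heap) and the `guard` flag are assumptions made for illustration; the real startup script and the actual fix may differ.

```python
def young_gen_mb(reported_cores: int, max_heap_mb: int, guard: bool = True) -> int:
    """Size the young generation from the core count.

    Assumed formula: 100 MB per core, capped at a quarter of the max heap.
    """
    cores = reported_cores
    if guard and cores < 1:
        # Treat a nonsensical core count as a single core
        cores = 1
    return min(100 * cores, max_heap_mb // 4)

print(young_gen_mb(0, 256, guard=False))  # unguarded: 0
print(young_gen_mb(0, 256))               # guarded: 64
```

A young generation of 0 MB is not something a JVM can start with, which is one plausible reading of the "Boom!" on the slide.
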
    17. JMX Config
       • In cassandra-env.sh:
         JVM_OPTS="$JVM_OPTS -Djava.rmi.server.hostname=192.168.1.15"
       • Or else nodetool will not work between nodes
    18. JVM OPT UseCondCardMark
       • C* 1.2.2 added UseCondCardMark as a JVM opt
       • "for better lock handling especially on hotspot with multicore processor"
       • In cassandra-env.sh:
         #if [ "$JVM_VERSION" > "1.7" ] ; then
         #    JVM_OPTS="$JVM_OPTS -XX:+UseCondCardMark"
         #fi
    19. The Good News!
       • We’ve forgotten one thing
       • The Pi costs £25
       • You can power 4 from a USB hub (no need for a power supply on each one)
       • So:
    20. University of Southampton
       • So, have a 64 node computer for £2000
    21. Or this
       • 32 node Beowulf cluster:
       • Joshua Kiepert, Boise University
    22. Adding nodes is good
       • Adding nodes adds performance
       • Adding nodes adds replicas of data
       • BUT
       • Make sure your ring is balanced,
       • Pis don’t like to be unbalanced.
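
One way to check whether a ring is balanced is to work out what fraction of the token space each node owns. The sketch below assumes the 0 to 2**127 token range that the talk's own key calculation uses, and assumes each node owns the range from the previous token up to its own; the `ownership` helper is hypothetical and is not part of any Cassandra tool.

```python
RING = 2 ** 127  # token space, assumed to match the talk's key calculation

def ownership(tokens):
    """Return {token: fraction of the ring owned by the node at that token}.

    Each node is assumed to own the range from the previous token
    (exclusive) up to its own token (inclusive), wrapping round the ring.
    """
    ordered = sorted(tokens)
    shares = {}
    for idx, tok in enumerate(ordered):
        prev = ordered[idx - 1]  # idx 0 wraps round to the last token
        span = (tok - prev) % RING or RING  # a lone node owns everything
        shares[tok] = span / RING
    return shares

balanced = [i * RING // 4 for i in range(4)]
lopsided = [0, RING // 8, RING // 2]

print(sorted(ownership(balanced).values()))  # four equal shares
print(sorted(ownership(lopsided).values()))  # uneven shares
```

With evenly spaced tokens every node owns a quarter of the ring; in the lopsided example one node owns half of it, which is the situation the slide warns against.
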
    23. Vnodes
       • Vnodes (in 1.2) would be very nice
       • However, at this point I haven’t got 1.2 on Pi running on a cluster
    24. Performance with 3/4 nodes
    25. Performance with 5/6 nodes
    26. Stress test commands
       • ./stress -d 192.168.1.10,192.168.1.11,192.168.1.12 -o insert -I DeflateCompressor
       • Note: nodes to use
       • You will get different performance if you insert to fewer nodes than you have in your ring
    27. Adding Nodes Procedure
       • Adding a node (in the absence of vnodes):
         Must seed from a known node
         Use a program to calculate new keys
         Bring up new node with the correct key in cassandra.yaml
         Use nodetool to move other nodes
    28. Calculating keys
       • Python code:
         import sys

         # Number of nodes comes from the command line, or from a prompt
         if len(sys.argv) > 1:
             num = int(sys.argv[1])
         else:
             num = int(input("How many nodes? : "))

         # Evenly spaced tokens over the 0..2**127 ring (integer division)
         for i in range(num):
             print("node %d: %d" % (i, i * (2 ** 127) // num))
    29. Moving existing nodes
       • Use nodetool:
         sudo ./nodetool -h 192.168.1.10 move 42535295865117307932921825928971026432
       • And cleanup:
         ./nodetool -h 192.168.1.10 cleanup
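
The key calculation and the move step can be combined into one small helper that prints a move command for each node. This is only a sketch: the host list is hypothetical, and handing the i-th evenly spaced token to the i-th host is an assumption, so check the output against the ring you actually have before running anything.

```python
def move_commands(hosts):
    """Return one 'nodetool move' command per host, with evenly spaced tokens."""
    num = len(hosts)
    commands = []
    for i, host in enumerate(hosts):
        token = i * (2 ** 127) // num  # same formula as the key calculation
        commands.append("./nodetool -h %s move %d" % (host, token))
    return commands

# Hypothetical four-node ring
hosts = ["192.168.1.10", "192.168.1.11", "192.168.1.12", "192.168.1.13"]
for cmd in move_commands(hosts):
    print(cmd)
```

As on the slide, a cleanup on each moved node would follow once the moves have finished.
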
    30. Getting more memory
       • On Debian, you can free memory from the graphics chip:
         cd /boot
         sudo cp start.elf start.elf.old
         sudo cp arm224_start.elf start.elf
         reboot
    31. Getting more memory
       • Under Raspbian
       • Run with a monitor plugged in for the first time
       • Set options for screen memory
       • Perhaps disable boot to GUI
    32. Network address
       • I prefer static network addresses
       • Edit /etc/network/interfaces:
         iface eth0 inet static
         address 192.168.1.41
         netmask 255.255.255.0
         network 192.168.1.0
         broadcast 192.168.1.255
         gateway 192.168.1.254
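
When setting up several nodes, the network and broadcast lines can be derived from the address and netmask instead of being typed by hand. The sketch below uses Python's standard ipaddress module; the `static_stanza` helper is hypothetical and simply prints text in the same layout as the slide.

```python
import ipaddress

def static_stanza(address: str, netmask: str, gateway: str) -> str:
    """Build an /etc/network/interfaces stanza for a static eth0 address."""
    iface = ipaddress.ip_interface("%s/%s" % (address, netmask))
    net = iface.network
    lines = [
        "iface eth0 inet static",
        "address %s" % iface.ip,
        "netmask %s" % net.netmask,
        "network %s" % net.network_address,      # derived, not typed in
        "broadcast %s" % net.broadcast_address,  # derived, not typed in
        "gateway %s" % gateway,
    ]
    return "\n".join(lines)

print(static_stanza("192.168.1.41", "255.255.255.0", "192.168.1.254"))
```
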
    33. Multiple nodes
       • Make a master SD card
       • Copy it!
       • Make sure the master version has no data on it.
       • Consider "Puppet" (though I don’t use it)
    34. Starting as a service
       • See https://github.com/acobley/CassandraStartup
       • Put the file in /etc/init.d
       • update-rc.d cassandra defaults
    35. Pi is for teaching
       • So for £200 we get an 8 node C* cluster
       • It can be reconfigured, blown away, stress tested and generally abused
       • We can simulate data racks, data centers and, I hope, even long network delays.
       • Hopefully our upcoming MSc in Data Science will use these clusters
    36. C* is network aware
       • We know C* can be configured to be aware of:
         Network racks
         Data centers
       • We know we can have replicas stored across these racks
       • How can we play with this cheaply?
    37. Proposed teaching tool
       (Diagram labels: 10mbs hub, noise injection, Switch 1, Switch 2, and two sets of Pi 1, Pi 2, Pi 3)
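
For a lab like this, one possible way to tell Cassandra which Pi sits on which switch is a topology file with lines of the form address=datacenter:rack, as read by the PropertyFileSnitch. The talk does not say which snitch would be used, so that choice is an assumption, and the addresses, rack names and data center name below are all hypothetical.

```python
def topology_lines(racks, datacenter="DC1"):
    """Return 'address=datacenter:rack' lines.

    racks maps a rack name to the list of node addresses on that switch.
    """
    lines = []
    for rack in sorted(racks):
        for host in racks[rack]:
            lines.append("%s=%s:%s" % (host, datacenter, rack))
    return lines

# Hypothetical layout: one rack per switch, three Pis on each
racks = {
    "RAC1": ["192.168.1.10", "192.168.1.11", "192.168.1.12"],
    "RAC2": ["192.168.1.13", "192.168.1.14", "192.168.1.15"],
}
print("\n".join(topology_lines(racks)))
```

Treating each switch as a rack is just one reading of the diagram; the same helper could label the two switches as separate data centers by calling it once per data center.
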
    38. Pi is discovery
       • Cassandra wouldn’t run on a Pi
       • It does now.
       • Running it on a Pi shook out some Cassandra bugs
       • You can run it in a secure lab
    39. Pi is for fun
       • Most important, this was pure geeky fun
    40. Obligatory Plug
       • Data Science:
       • http://www.computing.dundee.ac.uk/study/postgrad/degreedetails.asp?17
    41. C* is Hardware Agnostic
       • Raspberry Pi is cheap
       • C* needs some work to run on it
       • You can make clusters cheaply for experimentation
       • It’s fun!
    42. THANK YOU
