Cassandra on Ubuntu AUTOMATIC Install
This will walk you through the installation of a Cassandra (1.2) node on an Ubuntu (12.04) server.

It will teach you to configure LVM and XFS, and to set up the /var/lib/cassandra/data and /var/lib/cassandra/commitlog directories.

It will also teach you how to install OpsCenter (FREE) from Datastax to help manage it all.

This is based on a talk that I, Victor Anjos, gave at the Toronto Cassandra Meetup on August 7th, 2013.

Published in: Technology

  • U of T: Engineering, changed to Computer Science & Math with Physics Minor
  • Concordia and McGill: Helping with Research Analysis on Diabetes, Cystic Fibrosis and Pulmonary Disease. Wrote Bio-Informatics Analysis software with custom datastore in C with storage schemas (key-value) on a very early implementation of XFS on Gentoo
  • Wrote algorithms and software to divest and acquire companies while at PwC and Nortel, accessing hundreds of thousands of employee records across Petabyte sized data when the norm was Gigabytes
  • Funny story, my own software ended up divesting me (and laying off 10s of thousands of employees) at Nortel during the major crash
  • Built DW solutions, in-house (proprietary) and using SUN, Teradata, Netezza, SAP while at Telus
  • Ingested millions of records per second while at ATI (Engineering data for chip design)
  • One of the first companies to use Blade Servers as a compute cloud
  • Installed and configured largest data warehouse (at the time) in Canada with largest servers (at the time) SUN M9000s
  • Syncapse, a Marketing/Advertising Agency
  • Shaw purchased Alliance Atlantis and was going digital, hired me to lead them into the cloud
  • Built Windows, Linux, replicate environment that survived the major AWS outages prior to “best practices” being around
  • Transcript

    • 2. SO, WHO AM I? Engage, Discover, Monetize.
    • 3. WHO AM I ANYWAY? Engage, Discover, Monetize.
      • Bio-Informatics Data Scientist
      • Employee Retentions Analytics
      • Data Warehouse Specialist
      • System Operations / DevOps
      • Founder & Lead Technologist
      • Real Estate Investor and Broker
      • Presenter, Speaker, Organizer
      • Data Scientist & Architect
    • 4. Our Linux Distro of Choice THANKS TO @Datastax @RyersonDMZ @Viafoura @VictorFAnjos
    • 5. IPTables / Ports
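The transcript carries only the title of this slide. As a rough sketch of what opening the ports could look like, the snippet below prints (it does not apply) iptables rules for the default Cassandra 1.2 ports and the OpsCenter web UI. The helper name `cassandra_iptables_rules`, the choice to restrict traffic to one source network, and the `10.0.0.0/24` placeholder are all assumptions, not something taken from the talk.

```shell
#!/bin/sh
# Print (do not apply) iptables rules that accept TCP traffic on the
# default Cassandra 1.2 ports from one trusted network.
# Assumed defaults: 7000 inter-node, 7001 inter-node over SSL, 7199 JMX,
# 9160 Thrift clients, 9042 native-protocol clients, 8888 OpsCenter web UI.
# cassandra_iptables_rules is a hypothetical helper, not from the talk.
cassandra_iptables_rules() (
  source_cidr=$1
  for port in 7000 7001 7199 9160 9042 8888; do
    echo "iptables -A INPUT -p tcp -s $source_cidr --dport $port -j ACCEPT"
  done
)

# Example: review the rules for a private subnet (the CIDR is a placeholder).
cassandra_iptables_rules 10.0.0.0/24
```

Once the printed rules look right for your network, they could be piped into `sudo sh` to apply them.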
    • 6. NTP
      Clocks that are in SYNC are deeply important in a cluster. Make sure to install NTP and configure it to a known server:
        # sudo apt-get -y install ntp
        # sudo /etc/init.d/ntp stop
        # sudo ntpdate
        # sudo /etc/init.d/ntp start
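After NTP is running, it can help to confirm that the node has actually selected a time source. The sketch below reads `ntpq -pn` output and prints the offset, in milliseconds, of the peer marked with `*` (the one ntpd is synchronizing to). The helper name `ntp_selected_offset` and the sample peer row are illustrative assumptions.

```shell
#!/bin/sh
# Read `ntpq -pn` output on stdin and print the offset (milliseconds) of the
# peer ntpd has selected for synchronization (the row prefixed with "*").
# Exits non-zero when no peer is selected yet.
# ntp_selected_offset is a hypothetical helper name.
ntp_selected_offset() {
  awk '/^\*/ { print $9; found = 1 } END { exit (found ? 0 : 1) }'
}

# Illustrative input only; on a real node run: ntpq -pn | ntp_selected_offset
ntp_selected_offset <<'EOF'
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*192.0.2.10      .GPS.            1 u   33   64  377    1.234   -0.512   0.108
EOF
```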
    • 7. Linux Volume Manager
      Split up your disks…
        # sudo apt-get -y install lvm2
      Find out which disks you have…
        # lsblk
        NAME  MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
        xvda1 202:1    0    8G  0 disk /
        xvdb  202:16   0  420G  0 disk
        xvdc  202:32   0  420G  0 disk
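To pick out the disks that are free for LVM, the `lsblk` listing can be filtered for rows of type `disk` with an empty mount point. This is a minimal sketch under that assumption; `unmounted_disks` is a hypothetical helper, and a disk whose partitions are mounted would still be listed, so treat the result as candidates to double-check.

```shell
#!/bin/sh
# Read `lsblk` output on stdin and print the device path of every row whose
# TYPE column is "disk" and whose MOUNTPOINT column is empty (6 fields only).
# unmounted_disks is a hypothetical helper name.
unmounted_disks() {
  awk 'NR > 1 && $6 == "disk" && NF == 6 { print "/dev/" $1 }'
}

# Fed the listing from the slide:
unmounted_disks <<'EOF'
NAME  MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
xvda1 202:1    0    8G  0 disk /
xvdb  202:16   0  420G  0 disk
xvdc  202:32   0  420G  0 disk
EOF
```

For the slide's listing this prints `/dev/xvdb` and `/dev/xvdc`, the two disks not holding the root filesystem.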
    • 8. Linux Volume Manager cont
      Create Physical Volume(s)
        # sudo pvcreate /dev/xvdb
        Physical volume "/dev/xvdb" successfully created
      Create Volume Group(s)
        # sudo vgcreate cassandra /dev/xvdb
        Volume group "cassandra" successfully created
      Create Logical Volume(s)
        # sudo lvcreate --size 100G -n data cassandra
        Logical volume "data" created
        # sudo lvcreate --size 10G -n clog cassandra
        Logical volume "clog" created
    • 9. XFS
    • 10. CREATE / MOUNT FS
      Ensure your Ubuntu install has XFS
        # sudo apt-get -y install xfsprogs
      Create the Filesystem on both the Data and Commitlog Volumes
        # sudo mkfs.xfs /dev/cassandra/data
        # sudo mkfs.xfs /dev/cassandra/clog
      Create the containing directories
        # sudo mkdir -p /var/lib/cassandra/data
        # sudo mkdir -p /var/lib/cassandra/commitlog
      Make sure the user Cassandra owns them
        # sudo chown -R cassandra:cassandra /var/lib/cassandra/
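The transcript of this slide shows no mount command, even though its title mentions mounting. One possible way to make the mounts persistent is an `/etc/fstab` entry per volume. The sketch below only prints such entries; `fstab_line` is a hypothetical helper, the `/dev/mapper/<vg>-<lv>` paths follow the naming seen in `df` output for LVM volumes, and the `defaults,noatime` options are an assumption rather than something the talk specifies.

```shell
#!/bin/sh
# Print an /etc/fstab line that mounts an LVM logical volume as XFS.
# Arguments: volume group, logical volume, mount point.
# fstab_line is a hypothetical helper; "defaults,noatime" is an assumed
# choice of mount options, not taken from the talk.
fstab_line() (
  printf '/dev/mapper/%s-%s %s xfs defaults,noatime 0 0\n' "$1" "$2" "$3"
)

fstab_line cassandra data /var/lib/cassandra/data
fstab_line cassandra clog /var/lib/cassandra/commitlog
```

After reviewing the two lines, they could be appended with `| sudo tee -a /etc/fstab` and activated with `sudo mount -a`. A freshly created filesystem has its own root-owned top directory, so the `chown` step probably needs to be repeated after mounting.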
    • 11. CREATE / MOUNT FS cont
        # df -lh
        Filesystem                  Size  Used Avail Use% Mounted on
        /dev/xvda1                  7.9G  1.3G  6.2G  18% /
        udev                        3.7G  8.0K  3.7G   1% /dev
        tmpfs                       1.5G  200K  1.5G   1% /run
        none                        5.0M     0  5.0M   0% /run/lock
        none                        3.7G     0  3.7G   0% /run/shm
        /dev/mapper/cassandra-data  504G   33M  504G   1% /var/lib/cassandra/data
        /dev/mapper/cassandra-clog  168G   33M  168G   1% /var/lib/cassandra/commitlog
      Your directory listing should end up like this:
        # ls -l /var/lib/cassandra/
        total 4
        drwxr-xr-x 2 cassandra cassandra   78 Aug 7 13:04 commitlog
        drwxr-xr-x 6 cassandra cassandra   73 Aug 7 13:05 data
        drwx------ 2 cassandra cassandra 4096 Aug 7 13:04 saved_caches
    • 12. DATASTAX INSTALL
      Enable the Datastax Repository
        # echo "deb stable main" | sudo tee -a /etc/apt/sources.list
        # curl -L | sudo apt-key add -
      Install DSC12 (Cassandra 1.2) --- Datastax most recent release
        # sudo apt-get -y install dsc12
    • 13. DATASTAX INSTALL cont
      Many Cassandra Options Available, but for Bare Minimum…
        # sudo sed -i -e "/^rpc_address/crpc_address:" -e "/^initial_token/c# initial_token:" /etc/cassandra/cassandra.yaml
      Start it all up!!!
        # sudo service cassandra start
      Check how it’s doing
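The `sed` call above relies on the `c` (change line) command: any line matching the address pattern is replaced wholesale with the text that follows `c`. To see the effect without touching the real config, the sketch below applies the same two edits to a scratch file. It assumes GNU sed (the one-line `c` form is a GNU extension, which is what Ubuntu ships). The function name `patch_cassandra_yaml`, the sample file contents, and the `0.0.0.0` address are placeholders; the address used on the slide is not preserved in the transcript.

```shell
#!/bin/sh
# Apply the slide's two "change line" edits to a given YAML file.
# $1 = file to edit in place, $2 = value to write for rpc_address.
# Assumes GNU sed. patch_cassandra_yaml is a hypothetical helper name.
patch_cassandra_yaml() {
  sed -i -e "/^rpc_address/crpc_address: $2" \
         -e "/^initial_token/c# initial_token:" "$1"
}

# Try it on a scratch file with made-up sample contents.
scratch=$(mktemp)
cat > "$scratch" <<'EOF'
cluster_name: 'Test Cluster'
initial_token:
rpc_address: localhost
rpc_port: 9160
EOF
patch_cassandra_yaml "$scratch" 0.0.0.0   # 0.0.0.0 is a placeholder value
cat "$scratch"
```

Because the pattern is anchored to `^rpc_address`, the `rpc_port` line is left alone; only the `rpc_address` and `initial_token` lines are rewritten.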
    • 14. NICE TO HAVE
      DATASTAX OPSCENTER-FREE
        # sudo apt-get install -y libssl0.9.8 opscenter-free
        # sudo sed -i -e 's/interface = =' /etc/opscenter/opscenterd.conf
        # sudo service opscenterd start
    • 15. ALL TOGETHER NOW
      A Gist to do EVERYTHING we did in this Meetup in less than 90 seconds, completely automated!
    • 16. HOW ABOUT A FULL RING?
      CCM (for Cassandra Cluster Manager ... or something)
      My Wrapper to auto-build it and give it some extra disk space…
        # df -lh
        Filesystem                   Size  Used Avail Use% Mounted on
        /dev/xvda1                   7.9G  1.5G  6.1G  19% /
        udev                         3.7G  8.0K  3.7G   1% /dev
        tmpfs                        1.5G  208K  1.5G   1% /run
        none                         5.0M     0  5.0M   0% /run/lock
        none                         3.7G     0  3.7G   0% /run/shm
        /dev/mapper/cassandra-data1  168G   33M  168G   1% /home/ubuntu/.ccm/My_DEV_Cluster/node1/data
        /dev/mapper/cassandra-clog1   84G   33M   84G   1% /home/ubuntu/.ccm/My_DEV_Cluster/node1/commitlogs
        /dev/mapper/cassandra-data2  168G   33M  168G   1% /home/ubuntu/.ccm/My_DEV_Cluster/node2/data
        /dev/mapper/cassandra-clog2   84G   33M   84G   1% /home/ubuntu/.ccm/My_DEV_Cluster/node2/commitlogs
        /dev/mapper/cassandra-data3  168G   33M  168G   1% /home/ubuntu/.ccm/My_DEV_Cluster/node3/data
        /dev/mapper/cassandra-clog3   84G   33M   84G   1% /home/ubuntu/.ccm/My_DEV_Cluster/node3/commitlogs
        # ccm node1 status
        Datacenter: datacenter1
        =======================
        Status=Up/Down
        |/ State=Normal/Leaving/Joining/Moving
        --  Address  Load      Owns   Host ID                               Token                 Rack
        UN           66.95 KB  33.3%  6282dab3-a276-49de-b42a-18f552f51cc9  -9223372036854775808  rack1
        UN           79.02 KB  33.3%  015a5bd4-8b58-4770-9c55-8979faba6ee9  -3074457345618258603  rack1
        UN           60.58 KB  33.3%  35f82028-1aad-4bab-beee-19f0e50c82a1  3074457345618258602   rack1
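The three tokens in the ring output appear to be evenly spaced across the Murmur3Partitioner range (-2^63 to 2^63 - 1), which is why each node owns 33.3%. A minimal sketch of that arithmetic, token_i = -2^63 + i * floor(2^64 / n), is below. The helper name `murmur3_tokens` is hypothetical and not part of CCM or Cassandra, and the code assumes a shell with 64-bit integer arithmetic.

```shell
#!/bin/sh
# Print evenly spaced Murmur3Partitioner tokens for an n-node ring:
#   token_i = -2^63 + i * floor(2^64 / n)
# murmur3_tokens is a hypothetical helper, not part of CCM or Cassandra.
# Assumes 64-bit shell arithmetic; n must be at least 3 so that the step
# itself fits in a signed 64-bit integer.
murmur3_tokens() (
  n=$1
  if [ "$n" -lt 3 ]; then
    echo "murmur3_tokens: need at least 3 nodes" >&2
    exit 1
  fi
  max=9223372036854775807            # 2^63 - 1
  token=$(( -max - 1 ))              # -2^63, the first token
  # floor(2^64 / n) without overflow: 2^64 = 2*max + 2 and max = q*n + r.
  q=$(( max / n ))
  r=$(( max % n ))
  step=$(( 2 * q + (2 * r + 2) / n ))
  i=0
  while [ "$i" -lt "$n" ]; do
    printf '%s\n' "$token"
    i=$(( i + 1 ))
    # Skip the addition after the last token, which could overflow.
    if [ "$i" -lt "$n" ]; then
      token=$(( token + step ))
    fi
  done
)

murmur3_tokens 3
```

For n = 3 this reproduces the three token values listed in the `ccm node1 status` output.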
    • 17. SHAMELESS PLUG
      Data for Good, in affiliation with, brings together leading data scientists with high impact social organizations through a comprehensive, collaborative approach that leads to shared insights, greater understanding, and positive action through "data in the service of humanity".
      leads a community of pioneering data scientists with the talent, commitment, and energy to open doors & inspire a new way of using the skills and tools of corporations & governments, to meet the needs of the NFP/NGO and social innovation sector.
    • 18. QUESTIONS?