INSTALLING C*
HOW TO PROPERLY DO IT AND KEEP IT
ALL ROLLING ALONG (FAST!!!)
WHO AM I ANYWAY?
Engage, Discover, Monetize.
• Bio-Informatics Data Scientist
• Employee Retentions Analytics
• Data Wareh...
Our Linux Distro of Choice
THANKS
TO
@Datastax
@RyersonDMZ
@Viafoura
@VictorFAnjos
IPTables / Ports
NTP
Clocks that are in SYNC are deeply important in a cluster.
Mak...
Linux Volume Manager
Split up your disks…
# sudo apt-get -y instal...
Linux Volume Manager cont
Create Physical Volume(s)
# sudo pvcreat...
XFS
CREATE / MOUNT FS
# sudo apt-get -y install xfsprogs
# sudo mkfs.x...
CREATE / MOUNT FS cont
# df -lh
Filesystem Size Used Avail Use% Mo...
DATASTAX INSTALL
Enable the Datastax Repository
# echo "deb http:/...
DATASTAX INSTALL cont
Many Cassandra Options Available, but for Ba...
NICE TO HAVE
DATASTAX OPSCENTER-FREE
# sudo apt-get install -y lib...
ALL TOGETHER NOW
https://gist.github.com/vanjos/5481606
A Gist to ...
HOW ABOUT A FULL RING?
https://github.com/pcmanus/ccm
CCM (for Cas...
SHAMELESS PLUG
Data for Good, in affiliation with DataKind.org, br...
QUESTIONS?

Cassandra on Ubuntu AUTOMATIC Install



This will walk you through the installation of a Cassandra (1.2) node on an Ubuntu (12.04) server.

It will teach you to configure LVM and XFS, and to set up the /var/lib/cassandra/data and /var/lib/cassandra/commitlog directories.

It will also teach you how to install OpsCenter (FREE) from Datastax to help manage it all.

This is based on a talk I, Victor Anjos, gave at the Toronto Cassandra Meetup on August 7th, 2013.

Published in: Technology
  • U of T: Engineering, changed to Computer Science & Math with a Physics minor.
    Concordia and McGill: helped with research analysis on diabetes, cystic fibrosis and pulmonary disease.
    Wrote bio-informatics analysis software with a custom datastore in C, with key-value storage schemas, on a very early implementation of XFS on Gentoo.
    Wrote algorithms and software to divest and acquire companies while at PwC and Nortel, accessing hundreds of thousands of employee records across petabyte-sized data when the norm was gigabytes.
    Funny story: my own software ended up divesting me (and laying off tens of thousands of employees) at Nortel during the major crash.
    Built DW solutions, both in-house (proprietary) and using SUN, Teradata, Netezza and SAP, while at Telus.
    Ingested millions of records per second while at ATI (engineering data for chip design).
    One of the first companies to use blade servers as a compute cloud.
    Installed and configured the largest data warehouse in Canada (at the time) on the largest servers (at the time), SUN M9000s.
    Syncapse, a marketing/advertising agency.
    Shaw purchased Alliance Atlantis and was going digital; they hired me to lead them into the cloud.
    Built a replicated Windows and Linux environment that survived the major AWS outages before "best practices" were around.
  • Cassandra on Ubuntu AUTOMATIC Install

    1. INSTALLING C*: HOW TO PROPERLY DO IT AND KEEP IT ALL ROLLING ALONG (FAST!!!)
    2. SO, WHO AM I? Engage, Discover, Monetize.
    3. WHO AM I ANYWAY? Engage, Discover, Monetize.
       • Bio-Informatics Data Scientist
       • Employee Retentions Analytics
       • Data Warehouse Specialist
       • System Operations / DevOps
       • Founder & Lead Technologist
       • Real Estate Investor and Broker
       • Presenter, Speaker, Organizer
       • Data Scientist & Architect
    4. Our Linux Distro of Choice
       THANKS TO @Datastax @RyersonDMZ @Viafoura @VictorFAnjos
    5. IPTables / Ports
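The slide names the topic without listing rules, so the following is only a sketch of what opening the usual Cassandra 1.2 ports might look like. The port numbers are the stock defaults (7000 inter-node, 7001 inter-node over SSL, 7199 JMX, 9160 Thrift, 9042 native CQL when enabled); the function name and the subnet are hypothetical. It prints the rules rather than applying them.

```shell
#!/bin/sh
# Print (not apply) iptables rules for the default Cassandra 1.2 ports.
# Assumption: all cluster and client traffic comes from one subnet ($1).
cassandra_port_rules() {
    subnet="$1"
    # 7000 inter-node, 7001 inter-node SSL, 7199 JMX,
    # 9160 Thrift clients, 9042 native CQL clients
    for port in 7000 7001 7199 9160 9042; do
        echo "iptables -A INPUT -p tcp -s $subnet --dport $port -j ACCEPT"
    done
}

# Hypothetical subnet; review the output, then run each line with sudo.
cassandra_port_rules 10.0.0.0/24
```

Printing first keeps the firewall untouched until the rules have been read. If OpsCenter is installed, its web UI port would need the same treatment.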
    6. NTP
       Clocks that are in SYNC are deeply important in a cluster.
       Make sure to install NTP and configure it to a known server:
       # sudo apt-get -y install ntp
       # sudo /etc/init.d/ntp stop
       # sudo ntpdate pool.ntp.org
       # sudo /etc/init.d/ntp start
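Once ntpd is running again, it can help to confirm that the clock is actually tracking a server. A minimal sketch, assuming the standard `ntpq -p` column layout (offset in the ninth column, selected peer marked with `*`); the helper name is hypothetical.

```shell
#!/bin/sh
# Read `ntpq -p` output on stdin and print the offset (in milliseconds)
# of the peer ntpd has currently selected (the line starting with '*').
selected_offset() {
    awk '/^\*/ { print $9 }'
}

# On a live node: ntpq -p | selected_offset
```

An empty result would mean ntpd has not selected a peer yet, which is common in the first minutes after a restart.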
    7. Linux Volume Manager
       Split up your disks…
       # sudo apt-get -y install lvm2
       Find out which disks you have…
       # lsblk
       NAME  MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
       xvda1 202:1    0    8G  0 disk /
       xvdb  202:16   0  420G  0 disk
       xvdc  202:32   0  420G  0 disk
    8. Linux Volume Manager cont
       Create Physical Volume(s)
       # sudo pvcreate /dev/xvdb
       Physical volume "/dev/xvdb" successfully created
       Create Volume Group(s)
       # sudo vgcreate cassandra /dev/xvdb
       Volume group "cassandra" successfully created
       Create Logical Volume(s)
       # sudo lvcreate --size 100G -n data cassandra
       Logical volume "data" created
       # sudo lvcreate --size 10G -n clog cassandra
       Logical volume "clog" created
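The slide uses fixed sizes (`--size 100G`, `--size 10G`). As an alternative, `lvcreate -l` accepts percentages, which avoids recomputing sizes per disk. The sketch below prints such a plan without running it; the 75% share and the function name are illustrative assumptions, not values from the talk.

```shell
#!/bin/sh
# Print (not run) an LVM plan that splits one disk between data and commitlog.
# Assumptions: the volume group is named "cassandra" as on the slide;
# the percentage split is illustrative.
lvm_plan() {
    disk="$1"      # e.g. /dev/xvdb
    data_pct="$2"  # share of the volume group given to data
    echo "pvcreate $disk"
    echo "vgcreate cassandra $disk"
    echo "lvcreate -l ${data_pct}%VG -n data cassandra"
    # the commitlog volume takes whatever is left
    echo "lvcreate -l 100%FREE -n clog cassandra"
}

lvm_plan /dev/xvdb 75
```

Each printed line would be run with sudo, in order. `sudo lvs` afterwards lists the resulting logical volumes and their sizes.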
    9. XFS
    10. CREATE / MOUNT FS
        Ensure your Ubuntu install has XFS
        # sudo apt-get -y install xfsprogs
        Create the Filesystem on both the Data and Commitlog Volumes
        # sudo mkfs.xfs /dev/cassandra/data
        # sudo mkfs.xfs /dev/cassandra/clog
        Create the containing directories
        # sudo mkdir -p /var/lib/cassandra/data
        # sudo mkdir -p /var/lib/cassandra/commitlog
        Make sure the cassandra user owns them
        # sudo chown -R cassandra:cassandra /var/lib/cassandra/
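The slide title mentions mounting, but the transcript shows no mount command. One way to fill that step is sketched below, with the usual caveats: the `noatime` option and the helper name are assumptions, not something the slides specify. The function only prints `/etc/fstab` lines; nothing is mounted or written.

```shell
#!/bin/sh
# Print an /etc/fstab line for one XFS logical volume.
# Assumption: "noatime" is a common choice, not taken from the slides.
fstab_line() {
    device="$1"
    mountpoint="$2"
    printf '%s %s xfs defaults,noatime 0 0\n' "$device" "$mountpoint"
}

fstab_line /dev/cassandra/data /var/lib/cassandra/data
fstab_line /dev/cassandra/clog /var/lib/cassandra/commitlog

# To mount by hand instead (shown as comments, run them yourself):
#   sudo mount /dev/cassandra/data /var/lib/cassandra/data
#   sudo mount /dev/cassandra/clog /var/lib/cassandra/commitlog
#   sudo chown -R cassandra:cassandra /var/lib/cassandra/
```

Two ordering details are worth checking on a real node. A freshly mounted filesystem brings its own root-directory ownership, so the `chown` probably needs to run after the mount. And the `cassandra` user most likely does not exist until the Cassandra package is installed.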
    11. CREATE / MOUNT FS cont
        # df -lh
        Filesystem                  Size  Used Avail Use% Mounted on
        /dev/xvda1                  7.9G  1.3G  6.2G  18% /
        udev                        3.7G  8.0K  3.7G   1% /dev
        tmpfs                       1.5G  200K  1.5G   1% /run
        none                        5.0M     0  5.0M   0% /run/lock
        none                        3.7G     0  3.7G   0% /run/shm
        /dev/mapper/cassandra-data  504G   33M  504G   1% /var/lib/cassandra/data
        /dev/mapper/cassandra-clog  168G   33M  168G   1% /var/lib/cassandra/commitlog
        Your directory listing should end up like this:
        # ls -l /var/lib/cassandra/
        total 4
        drwxr-xr-x 2 cassandra cassandra   78 Aug  7 13:04 commitlog
        drwxr-xr-x 6 cassandra cassandra   73 Aug  7 13:05 data
        drwx------ 2 cassandra cassandra 4096 Aug  7 13:04 saved_caches
    12. DATASTAX INSTALL
        Enable the Datastax Repository
        # echo "deb http://debian.datastax.com/community stable main" | sudo tee -a /etc/apt/sources.list
        # curl -L http://debian.datastax.com/debian/repo_key | sudo apt-key add -
        Install DSC12 (Cassandra 1.2), the most recent Datastax release
        # sudo apt-get -y install dsc12
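Because `tee -a` appends unconditionally, running the repository step twice leaves a duplicate `deb` line in the sources file. A small guard avoids that. This is a sketch with a hypothetical helper name, demonstrated on a scratch file rather than on `/etc/apt/sources.list`.

```shell
#!/bin/sh
# Append a line to a file only if that exact line is not already present.
append_once() {
    line="$1"
    file="$2"
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demonstrated on a scratch file; on a real node the target would be an apt
# sources file and the call would need root.
scratch=$(mktemp)
append_once "deb http://debian.datastax.com/community stable main" "$scratch"
append_once "deb http://debian.datastax.com/community stable main" "$scratch"
wc -l < "$scratch"   # one line, even though the helper ran twice
rm -f "$scratch"
```

The same idea makes the whole install script safe to re-run after a partial failure.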
    13. DATASTAX INSTALL cont
        Many Cassandra Options Available, but for Bare Minimum…
        # sudo sed -i -e "/^rpc_address/crpc_address: 0.0.0.0" -e "/^initial_token/c# initial_token:" /etc/cassandra/cassandra.yaml
        Start it all up!!!
        # sudo service cassandra start
        Check how it’s doing
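The `sed -i` one-liner rewrites `cassandra.yaml` in place, so it can be worth previewing the result first. The sketch below applies equivalent edits to standard input and writes to standard output. It uses `s///` substitutions as a portable stand-in for the slide's GNU-style `c` commands, and the function name is hypothetical.

```shell
#!/bin/sh
# Apply the two edits to a config read from stdin and write the result to
# stdout, so the change can be inspected before touching cassandra.yaml.
# rpc_address is set to 0.0.0.0 and initial_token is commented out.
preview_yaml_edits() {
    sed -e 's/^rpc_address:.*/rpc_address: 0.0.0.0/' \
        -e 's/^initial_token:.*/# initial_token:/'
}

# Example on a live node:
#   preview_yaml_edits < /etc/cassandra/cassandra.yaml | grep -E 'rpc_address|initial_token'
```

After starting the service, one common way to check on the node is `nodetool status`, which reports each node as Up or Down together with its load and token.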
    14. NICE TO HAVE
        DATASTAX OPSCENTER-FREE
        # sudo apt-get install -y libssl0.9.8 opscenter-free
        # sudo sed -i -e 's/interface = 127.0.0.1/interface = 0.0.0.0/' /etc/opscenter/opscenterd.conf
        # sudo service opscenterd start
    15. ALL TOGETHER NOW
        https://gist.github.com/vanjos/5481606
        A Gist to do EVERYTHING we did in this Meetup in less than 90 seconds, completely automated!
    16. HOW ABOUT A FULL RING?
        https://github.com/pcmanus/ccm
        CCM (for Cassandra Cluster Manager ... or something)
        My Wrapper to auto-build it and give it some extra disk space…
        https://gist.github.com/vanjos/6169734
        # df -lh
        Filesystem                   Size  Used Avail Use% Mounted on
        /dev/xvda1                   7.9G  1.5G  6.1G  19% /
        udev                         3.7G  8.0K  3.7G   1% /dev
        tmpfs                        1.5G  208K  1.5G   1% /run
        none                         5.0M     0  5.0M   0% /run/lock
        none                         3.7G     0  3.7G   0% /run/shm
        /dev/mapper/cassandra-data1  168G   33M  168G   1% /home/ubuntu/.ccm/My_DEV_Cluster/node1/data
        /dev/mapper/cassandra-clog1   84G   33M   84G   1% /home/ubuntu/.ccm/My_DEV_Cluster/node1/commitlogs
        /dev/mapper/cassandra-data2  168G   33M  168G   1% /home/ubuntu/.ccm/My_DEV_Cluster/node2/data
        /dev/mapper/cassandra-clog2   84G   33M   84G   1% /home/ubuntu/.ccm/My_DEV_Cluster/node2/commitlogs
        /dev/mapper/cassandra-data3  168G   33M  168G   1% /home/ubuntu/.ccm/My_DEV_Cluster/node3/data
        /dev/mapper/cassandra-clog3   84G   33M   84G   1% /home/ubuntu/.ccm/My_DEV_Cluster/node3/commitlogs
        # ccm node1 status
        Datacenter: datacenter1
        =======================
        Status=Up/Down
        |/ State=Normal/Leaving/Joining/Moving
        --  Address    Load      Owns   Host ID                               Token                 Rack
        UN  127.0.0.1  66.95 KB  33.3%  6282dab3-a276-49de-b42a-18f552f51cc9  -9223372036854775808  rack1
        UN  127.0.0.2  79.02 KB  33.3%  015a5bd4-8b58-4770-9c55-8979faba6ee9  -3074457345618258603  rack1
        UN  127.0.0.3  60.58 KB  33.3%  35f82028-1aad-4bab-beee-19f0e50c82a1  3074457345618258602   rack1
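The three tokens in the `ccm node1 status` output are evenly spaced across the signed 64-bit range, which is why each node owns 33.3%. The arithmetic can be reproduced with the sketch below, assuming the usual formula for Murmur3Partitioner, token_i = floor(2^64 / N) * i - 2^63. The function name is hypothetical, and the step is split into parts so that shell integer arithmetic never overflows.

```shell
#!/bin/sh
# Print evenly spaced initial tokens for an N-node ring over the signed
# 64-bit range (-2^63 .. 2^63-1).
ring_tokens() {
    nodes="$1"
    max=9223372036854775807     # 2^63 - 1
    token=$(( -max - 1 ))       # -2^63, the first token
    # floor(2^64 / nodes) = 2*q + extra, computed without ever forming 2^64
    q=$(( max / nodes ))
    r=$(( max % nodes ))
    extra=$(( (2 * (r + 1)) / nodes ))
    i=0
    while [ "$i" -lt "$nodes" ]; do
        echo "$token"
        i=$(( i + 1 ))
        if [ "$i" -lt "$nodes" ]; then
            # add the step piecewise so every intermediate sum stays in range
            token=$(( token + q + q + extra ))
        fi
    done
}

ring_tokens 3
```

For three nodes this prints -9223372036854775808, -3074457345618258603 and 3074457345618258602, matching the Token column above.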
    17. SHAMELESS PLUG
        Data for Good, in affiliation with DataKind.org, brings together leading data scientists with high impact social organizations through a comprehensive, collaborative approach that leads to shared insights, greater understanding, and positive action through "data in the service of humanity".
        DataKind.org leads a community of pioneering data scientists with the talent, commitment, and energy to open doors & inspire a new way of using the skills and tools of corporations & governments, to meet the needs of the NFP/NGO and social innovation sector.
    18. QUESTIONS?
