Apache Cassandra, part 3 – machinery, work with Cassandra


The aim of this presentation is to give an enterprise architect enough information to decide whether Cassandra should be the project's data store. The presentation describes the Cassandra architecture in detail, along with ways to design data and work with it.

Published in: Technology

  1. 1. Apache Cassandra, part 3 – machinery, work with Cassandra<br />
  2. 2. V. Architecture (part 2)<br />
  3. 3. SEDA Architecture<br />SEDA – Staged event-driven architecture<br />Every unit of work is split into several stages that are executed in parallel threads.<br />Each stage consists of an input and an output event queue, an event handler, and a stage controller.<br />
  4. 4. SEDA Architecture advantages<br />Well-conditioned system load<br />Prevents resources from being overcommitted<br />
  5. 5. SEDA in Cassandra - Usages<br />Read<br />Mutation<br />Gossip<br />Anti-Entropy<br />…<br />
  6. 6. SEDA in Cassandra - Design<br />The stage manager holds a map between stage names and Java 5 thread pool executors.<br />Each controller with its queue is represented by a ThreadPoolExecutor that can be configured through JMX.<br />
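The stage layout described on the SEDA slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Cassandra's implementation: the `Stage` and `StageManager` classes, the stage names and the handlers are hypothetical, and Python's `ThreadPoolExecutor` stands in for the Java one.

```python
from concurrent.futures import ThreadPoolExecutor
from queue import Queue


class Stage:
    """One SEDA-style stage: an event handler run by its own thread pool.

    The executor's internal work queue plays the role of the input queue;
    results go to the next stage, or to this stage's output queue.
    """

    def __init__(self, name, handler, threads=2):
        self.name = name
        self.handler = handler
        self.executor = ThreadPoolExecutor(max_workers=threads,
                                           thread_name_prefix=name)
        self.output = Queue()
        self.next_stage = None

    def submit(self, event):
        return self.executor.submit(self._run, event)

    def _run(self, event):
        result = self.handler(event)
        if self.next_stage is not None:
            self.next_stage.submit(result)
        else:
            self.output.put(result)


class StageManager:
    """Hypothetical analogue of the map between stage names and executors."""

    def __init__(self):
        self.stages = {}

    def register(self, stage):
        self.stages[stage.name] = stage
        return stage

    def shutdown(self):
        # Registration order: upstream stages drain before downstream ones.
        for stage in self.stages.values():
            stage.executor.shutdown(wait=True)


def run_demo():
    manager = StageManager()
    parse = manager.register(Stage("PARSE", lambda raw: raw.strip().lower()))
    mutate = manager.register(Stage("MUTATION", lambda key: (key, len(key))))
    parse.next_stage = mutate
    for raw in ["  Ivan ", "MAKAR"]:
        manager.stages["PARSE"].submit(raw)
    manager.shutdown()
    results = []
    while not mutate.output.empty():
        results.append(mutate.output.get())
    return sorted(results)
```

Because each stage owns its pool, the number of threads per stage can be tuned independently, which is the property the slides attribute to the JMX-configurable executors.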
  7. 7. VI. Working with Cassandra<br />
  8. 8. Installing and launching Cassandra<br />Download from<br />
  9. 9. Installing and launching Cassandra<br />Launching the server:<br />bin/cassandra.bat<br />Use the “-f” flag to run the server in the foreground, so that all server logs are printed to standard output<br />The server starts as a single-node cluster called “Test Cluster”, listening on port 9160<br />
  10. 10. Installing and launching Cassandra<br />Starting the command-line client interface:<br />bin/cassandra-cli.bat<br />The prompt [username@keyspace] appears at the beginning of every line<br />
  11. 11. Creating a cluster<br />In the configuration file cassandra.yaml, specify:<br />seeds – the list of seeds for the cluster<br />rpc_address and listen_address – network addresses<br />
  12. 12. Creating a cluster<br />initial_token – defines the node’s token range<br />auto_bootstrap – enables automatic migration of data to the new node<br />
  13. 13. nodetool ring<br />Use nodetool to view the ring configuration<br />~$ nodetool -h localhost -p 8080 ring<br /> Address Status State Load Owns Range Ring<br /> 850705…<br /> Up Normal 2.53 KB 50.00 0|<--|<br /> Up Normal 1.33 KB 50.00 850705…|-->|<br />
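The `850705…` boundary in the ring output is consistent with two nodes splitting a token space of 0 to 2^127 evenly. A minimal sketch of computing evenly spaced `initial_token` values follows. It assumes the RandomPartitioner token space, which the slides do not state, and the function name `balanced_tokens` is hypothetical.

```python
TOKEN_SPACE = 2 ** 127  # assumption: RandomPartitioner token range


def balanced_tokens(node_count):
    """Return one evenly spaced initial_token value per node."""
    if node_count < 1:
        raise ValueError("node_count must be positive")
    return [i * TOKEN_SPACE // node_count for i in range(node_count)]


if __name__ == "__main__":
    # Print a value for each node's cassandra.yaml in a two-node ring.
    for index, token in enumerate(balanced_tokens(2)):
        print(f"node {index}: initial_token: {token}")
```

With two nodes, each owns half of the ring, matching the 50.00 in the Owns column.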
  14. 14. Connecting to server<br />Connect from the command line:<br />connect <HOSTNAME>/<PORT> [<USERNAME> '<PASSWORD>'];<br />Examples:<br />connect localhost/9160;<br /> connect user 'password';<br />Connect when starting the command-line client:<br />cassandra-cli<br /> -h, --host <HOSTNAME><br /> -p, --port <PORT><br /> -k, --keyspace <KEYSPACE><br /> -u, --username <USERNAME><br /> -pw, --password <PASSWORD><br />
  15. 15. Describing environment<br />show cluster name;<br />show keyspaces;<br />show api version;<br />describe cluster;<br />describe keyspace [<KEYSPACE>];<br />
  16. 16. Create keyspace<br />create keyspace <KEYSPACE>;<br />create keyspace <KEYSPACE> with<br /> <ATTR1> = <VAL1> and<br /> <ATTR2> = <VAL2> ...;<br />Attributes:<br />placement_strategy<br />replication_factor<br />…<br />
  17. 17. Create keyspace<br />Example:<br />create keyspace Keyspace1 with placement_strategy = ‘org.apache.cassandra.locator.RackUnawareStrategy’ and replication_factor = 4;<br />
  18. 18. Update keyspace<br />Update the attributes of an existing keyspace:<br /> update keyspace <KEYSPACE> with<br /> <ATTR1> = <VAL1> and <br /> <ATTR2> = <VAL2> ...;<br />
  19. 19. Switch to keyspace<br />use <KEYSPACE>;<br />use <KEYSPACE> [<USERNAME> '<PASSWORD>'];<br />If you don’t specify a username and password, the credentials supplied to the ‘connect’ statement are used<br />If the server doesn’t support authentication, it ignores the credentials<br />
  20. 20. Switch to keyspace<br />Example:<br />use Keyspace1 user1 'qwerty123';<br />After you switch to the keyspace, the prompt [user1@Keyspace1] appears at the beginning of every line<br />
  21. 21. Create column family<br />create column family <COL_FAMILY>;<br />create column family <COL_FAMILY> with<br /> <ATTR1> = <VAL1> and<br /> <ATTR2> = <VAL2> ...;<br />Example:<br />create column family Users with column_type = Super and<br /> comparator = UTF8Type and<br /> rows_cached = 1000;<br />
  22. 22. Update column family<br />Once a column family is created, you can update its attributes:<br /> update column family <COL_FAMILY> with<br /> <ATTR1> = <VAL1> and<br /> <ATTR2> = <VAL2> ...;<br />
  23. 23. Comparators and validators<br />Comparators – compare column names<br />Validators – validate column values<br />
  24. 24. Comparators and validators<br />You can specify a comparator for the column family and for all subcolumns in the column family (one for all)<br />You can specify validators for each known column of the column family<br />You can specify a default validator for the column family, used for columns that have no validator of their own<br />You can specify a key validator, which validates row keys<br />
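Since the comparator decides how column names are ordered, the same names can sort differently depending on the type chosen. The sketch below is a simplified illustration, not Cassandra's comparison code: the `sort_columns` function is hypothetical, and column names are given as Python strings for readability.

```python
def sort_columns(names, comparator):
    """Order column names roughly the way the named comparator would."""
    key_functions = {
        "UTF8Type": lambda name: name.encode("utf-8"),  # text order
        "LongType": lambda name: int(name),             # numeric order
    }
    return sorted(names, key=key_functions[comparator])


if __name__ == "__main__":
    names = ["9", "10", "100"]
    print(sort_columns(names, "UTF8Type"))
    print(sort_columns(names, "LongType"))
```

Text ordering places "10" and "100" before "9", while numeric ordering gives 9, 10, 100, so the comparator should match how the column names will be read in slices.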
  25. 25. Attributes of column family<br />column_type: can be Standard or Super (default – Standard)<br />comparator: specifies how column names will be compared for sort order<br />column_metadata: defines the validation and indexes for known columns<br />default_validation_class: validator to use for values in columns which are not listed in the column_metadata (default – BytesType)<br />key_validation_class: validator for keys<br />
  26. 26. Column metadata<br />You can define validators for each known column in the family<br /> create column family User with<br />column_metadata = [<br /> {column_name: name, validation_class: UTF8Type},<br /> {column_name: age, validation_class: IntegerType}, <br /> {column_name: birth, validation_class: UTF8Type}<br /> ];<br />Columns not listed in this section are validated with default_validation_class<br />
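The lookup rule on this slide (a per-column validator if one is listed, otherwise the default) can be sketched as follows. This is a toy model under stated assumptions: the validator functions check Python types rather than serialized bytes, and the names `VALIDATORS`, `COLUMN_METADATA` and `validate` are hypothetical.

```python
# Simplified stand-ins for the validation classes named on the slides.
VALIDATORS = {
    "UTF8Type": lambda value: isinstance(value, str),
    "IntegerType": lambda value: isinstance(value, int)
    and not isinstance(value, bool),
    "BytesType": lambda value: isinstance(value, (bytes, bytearray)),
}

# Mirrors the column_metadata of the User column family above.
COLUMN_METADATA = {
    "name": "UTF8Type",
    "age": "IntegerType",
    "birth": "UTF8Type",
}
DEFAULT_VALIDATION_CLASS = "BytesType"


def validate(column_name, value):
    """Pick the column's validator, falling back to the default one."""
    validator_name = COLUMN_METADATA.get(column_name, DEFAULT_VALIDATION_CLASS)
    return VALIDATORS[validator_name](value)
```

For example, `validate("age", 96)` passes, `validate("age", "old")` fails, and an unlisted column such as `nickname` is checked against the default validator.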
  27. 27. Secondary indexes<br />Allow queries by value<br /> get users where name = 'Some user';<br />Can be created in the background<br />
  28. 28. Creating index<br />Define it in the column metadata<br />For example, in cassandra-cli:<br />create column family users with comparator=UTF8Type and column_metadata=[{column_name: birth_date, validation_class: LongType, index_type: KEYS}];<br />
  29. 29. Some restrictions<br />Cassandra uses hash indexes instead of B-tree indexes. Thus, the where condition must contain at least one indexed field with the “=” operator<br />So, you can’t use<br />get users where birth_date > 1970;<br />but you can use<br />get users where birth_date = 1990 and karma > 50;<br />
  30. 30. Index types<br />KEYS<br />BITMAP (will be supported in future releases)<br />
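The equality requirement follows from how a hash index is looked up: an exact value maps straight to a set of row keys, while a range has no single entry to start from. The sketch below is a toy model, not Cassandra's KEYS index. The `KeysIndex` class and the sample rows are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample rows: row key -> {column name: value}.
USERS = {
    "ivan": {"birth_date": 1990, "karma": 80},
    "makar": {"birth_date": 1990, "karma": 10},
    "petr": {"birth_date": 1970, "karma": 99},
}


class KeysIndex:
    """Toy hash index: indexed value -> set of row keys."""

    def __init__(self, rows, indexed_column):
        self.rows = rows
        self.index = defaultdict(set)
        for key, columns in rows.items():
            if indexed_column in columns:
                self.index[columns[indexed_column]].add(key)

    def query(self, equals, extra_filter=lambda columns: True):
        # The equality lookup narrows the candidates; any other
        # predicate can only filter the rows found that way.
        candidates = self.index.get(equals, set())
        return sorted(key for key in candidates
                      if extra_filter(self.rows[key]))


birth_index = KeysIndex(USERS, "birth_date")
matches = birth_index.query(1990, lambda columns: columns["karma"] > 50)
```

Here `matches` corresponds to `birth_date = 1990 and karma > 50`: the index supplies the rows born in 1990, and the `karma` condition is applied to those candidates only.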
  31. 31. Writing data<br />To write data, use the set command:<br />set Customers['ivan']['name'] = 'Ivan';<br />set Customers['makar']['info']['age'] = 96;<br />
  32. 32. Reading data<br />To read data, use the get command:<br />get Customers['ivan']['name'];<br />- this will display ‘Ivan’<br />get Customers['makar'];<br />- this will display all columns for key ‘makar’<br />
  33. 33. Reading data<br />To list a range of rows, use the list command:<br />list Customers;<br />list Customers[a:];<br />list Customers[a:c] limit 40;<br />- you can limit the number of rows displayed (default – 100)<br />
  34. 34. Reading data<br />To get the number of columns, use the count command:<br />count Customers['ivan'];<br />- this will display the number of columns for key ‘ivan’<br />
  35. 35. Deleting data<br />To delete a row, a column or a subcolumn, use the del command:<br />del Customers['ivan'];<br />- this will delete all columns for key ‘ivan’<br />del Customers['ivan']['name'];<br />- this will delete the column ‘name’ for key ‘ivan’<br />del Customers['ivan']['accounts']['2312784829312343'];<br />- this will delete a subcolumn with an account number from the ‘accounts’ column for key ‘ivan’<br />
  36. 36. Deleting data<br />To delete all data in a column family, use the truncate command:<br />truncate Customers;<br />
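The set, get, count, del and truncate commands can be read as operations on nested maps: row key, then column, then optionally subcolumn. The sketch below models that with plain dictionaries. It is a hypothetical in-memory illustration (the `ColumnFamily` class is not a Cassandra API) and it ignores timestamps, tombstones, replication and the Standard/Super distinction.

```python
class ColumnFamily:
    """Toy column family: row key -> column -> value or subcolumn map."""

    def __init__(self):
        self.rows = {}

    def set(self, path, value):
        """path is (key, column) or (key, super_column, subcolumn)."""
        *parents, last = path
        node = self.rows
        for part in parents:
            node = node.setdefault(part, {})
        node[last] = value

    def get(self, path):
        """Return a value, or a whole row or super column as a dict."""
        node = self.rows
        for part in path:
            node = node[part]
        return node

    def count(self, key):
        """Number of columns stored under a row key."""
        return len(self.rows.get(key, {}))

    def delete(self, path):
        """Delete a row, a column or a subcolumn, depending on path length."""
        *parents, last = path
        node = self.rows
        for part in parents:
            node = node[part]
        del node[last]

    def truncate(self):
        self.rows.clear()


customers = ColumnFamily()
customers.set(("ivan", "name"), "Ivan")
customers.set(("makar", "info", "age"), 96)
```

With this model, `customers.get(("ivan", "name"))` plays the role of `get Customers['ivan']['name'];`, and `customers.delete(("ivan",))` plays the role of `del Customers['ivan'];`.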
  37. 37. Drop column family or keyspace<br /> drop column family Customers;<br /> drop keyspace Keyspace1;<br />
  38. 38. Q&A<br />
  39. 39. Resources<br />Home of Apache Cassandra Project<br />Apache Cassandra Wiki<br />Documentation provided by DataStax<br />Good explanation of creation secondary indexes<br />Eben Hewitt “Cassandra: The Definitive Guide”, O’REILLY, 2010, ISBN: 978-1-449-39041-9<br />
  40. 40. Authors<br />Lev Sivashov<br />Andrey Lomakin, twitter: @Andrey_Lomakin<br />Artem Orobets – enisher@gmail.com, twitter: @Dr_EniSh<br />Anton Veretennik<br />