Cassandra Hands On
Niall Milton, CTO, DigBigData
Examples courtesy of Patrick Callaghan, DataStax
Sponsored By
Introduction
—  We will be walking through Cassandra use cases
from Patrick Callaghan on GitHub.
—  https://github.com/PatrickCallaghan/
—  Patrick sends his apologies; due to an Aer Lingus
strike on Friday he couldn’t get a flight back to the
UK
—  This presentation will cover the important points
from each sample application
Agenda
—  Transactions Example
—  Paging Example
—  Analytics Example
—  Risk Sensitivity Example
Transactions Example
Scenario
—  We want to add products, each with a quantity to
an order
—  Orders come in concurrently from random buyers
—  Products that have sold out will return “OUT OF
STOCK”
—  We want to use lightweight transactions to
guarantee that we do not allow orders to complete
when no stock is available
Lightweight Transactions
—  Guarantee a serial isolation level, ACID
—  Uses the Paxos consensus algorithm to achieve this in a
distributed system. See:
—  http://research.microsoft.com/en-us/um/people/lamport/pubs/paxos-simple.pdf
—  Every node is still equal, no master or locks
—  Allows for conditional inserts & updates
—  The cost of linearizable consistency is higher latency,
so LWTs are not suitable for high-volume writes where low
latency is required
Retrieve & Run the Code
1.  git clone https://github.com/PatrickCallaghan/datastax-transaction-demo.git
2.  mvn clean compile exec:java -Dexec.mainClass="com.datastax.demo.SchemaSetup"
3.  mvn clean compile exec:java -Dexec.mainClass="com.datastax.transactions.Main" -Dload=true -DcontactPoints=127.0.0.1 -DnoOfThreads=10
Schema
1.  create keyspace if not exists
datastax_transactions_demo WITH replication =
{'class': 'SimpleStrategy',
'replication_factor': '1' };
2.  create table if not exists products(productId
text, capacityleft int, orderIds set<text>,
PRIMARY KEY (productId));
3.  create table if not exists
buyers_orders(buyerId text, orderId text,
productId text, PRIMARY KEY(buyerId, orderId));
Model
public class Order {	
	
	private String orderId;	
	private String productId;	
	private String buyerId;	
		
	…	
}
Method
—  Find current product quantity at CL.SERIAL
—  This allows us to execute a Paxos read without
proposing an update, i.e. read the current value
SELECT capacityLeft FROM products WHERE
productId = '1234';
e.g. capacityLeft = 5
Method Contd.
—  Do a conditional update using IF operator to make
sure product quantity has not changed since last
quantity check
—  Note the use of the set collection type here.
—  This statement will only succeed if the IF condition is
met
UPDATE products SET orderIds = orderIds +
{'3'}, capacityleft = 4 WHERE productId =
'1234' IF capacityleft = 5;
Method Contd.
—  If the last query succeeds, simply insert the order.
INSERT INTO buyers_orders (buyerId, orderId,
productId) VALUES ('1', '3', '1234');
—  This guarantees that no order will be placed where
there is insufficient quantity to fulfill it.
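The read, conditional update and retry steps above can be sketched without a cluster. This is a minimal in-memory analogy, not the demo's code: `ConcurrentHashMap.replace(key, expected, updated)` stands in for `UPDATE ... IF capacityleft = ?`, and the class and method names (`StockSketch`, `reserveOne`, `remaining`) are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// In-memory analogy for the LWT flow; all names here are hypothetical.
class StockSketch {
    private final ConcurrentMap<String, Integer> capacityLeft = new ConcurrentHashMap<>();

    StockSketch(String productId, int initialCapacity) {
        capacityLeft.put(productId, initialCapacity);
    }

    // Returns true if one unit was reserved, false if the product is out of stock.
    boolean reserveOne(String productId) {
        while (true) {
            Integer current = capacityLeft.get(productId); // plays the role of the SERIAL read
            if (current == null || current <= 0) {
                return false; // "OUT OF STOCK"
            }
            // Plays the role of UPDATE ... IF capacityleft = current:
            // the write is applied only if the value is still unchanged.
            if (capacityLeft.replace(productId, current, current - 1)) {
                return true; // the order can now be inserted
            }
            // Another buyer got there first; re-read and try again.
        }
    }

    int remaining(String productId) {
        return capacityLeft.getOrDefault(productId, 0);
    }

    public static void main(String[] args) {
        StockSketch stock = new StockSketch("1234", 2);
        System.out.println(stock.reserveOne("1234")); // true
        System.out.println(stock.reserveOne("1234")); // true
        System.out.println(stock.reserveOne("1234")); // false, sold out
    }
}
```

In Cassandra the failed condition is reported back to the client rather than retried automatically, so the retry loop lives in application code, as it does here.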
Comments
—  Using LWT incurs a cost of higher latency because
a quorum of replicas must be consulted, over several
round trips, before a value is committed / returned.
—  CL.SERIAL does not propose a new value but is
used to read the possibly uncommitted Paxos
state
—  The IF operator can also be used as IF NOT EXISTS
which is useful for user creation for example
Paging Example
Scenario
—  We have 1000s of products in our product
catalogue
—  We want to browse these using a simple select
—  We don’t want to retrieve all at once!
Cursors
—  We are often dealing with wide rows in Cassandra
—  Reading entire rows or multiple rows at once could
lead to OOM errors
—  Traditionally this meant using range queries to
retrieve content
—  Cassandra 2.0 (and Java driver) introduces cursors
—  Makes row based queries more efficient (no need to
use the token() function)
—  This will simplify client code
Retrieve & Run the Code
1.  git clone
https://github.com/PatrickCallaghan/datastax-
paging-demo.git
2.  mvn clean compile exec:java -
Dexec.mainClass="com.datastax.demo.SchemaSetup"
3.  mvn clean compile exec:java -
Dexec.mainClass="com.datastax.paging.Main"
Schema
create table if not exists
products(productId text, capacityleft int,
orderIds set<text>, PRIMARY KEY
(productId));
—  N.B. With the default partitioner, products will be
ordered by Murmur3 hash value. Previously we
would need to use the token() function to page
through them in that order
Model
public class Product {	
	
	private String productId;	
	private int capacityLeft;	
	private Set<String> orderIds;	
	
	…	
}
Method
1.  Create a simple select query for the products
table.
2.  Set the fetch size parameter
3.  Execute the statement
Statement stmt = new SimpleStatement("SELECT * FROM products");
stmt.setFetchSize(100);
ResultSet resultSet = this.session.execute(stmt);
Method Contd.
1.  Get an iterator for the result set
2.  Use a while loop to iterate over the result set
Iterator<Row> iterator = resultSet.iterator();	
while (iterator.hasNext()){	
	Row row = iterator.next();	
// do stuff with the row	
}
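To show what the driver is doing behind `hasNext()`, here is a minimal sketch of fetch-size paging over an in-memory list. It is an illustration only: `PagingSketch`, its fields and the list standing in for the products table are assumptions, and the real driver keeps a server-side paging state rather than an offset.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Sketch of fetch-size paging over an in-memory "table"; names are hypothetical.
class PagingSketch implements Iterator<String> {
    private final List<String> table;  // stands in for the products table
    private final int fetchSize;       // plays the role of stmt.setFetchSize(n)
    private int nextOffset = 0;        // stands in for the driver's paging state
    private Iterator<String> page = Collections.emptyIterator();
    int pagesFetched = 0;

    PagingSketch(List<String> table, int fetchSize) {
        if (fetchSize <= 0) {
            throw new IllegalArgumentException("fetchSize must be positive");
        }
        this.table = table;
        this.fetchSize = fetchSize;
    }

    @Override
    public boolean hasNext() {
        // Fetch the next page only when the current one is exhausted.
        if (!page.hasNext() && nextOffset < table.size()) {
            int end = Math.min(nextOffset + fetchSize, table.size());
            page = new ArrayList<>(table.subList(nextOffset, end)).iterator();
            nextOffset = end;
            pagesFetched++;
        }
        return page.hasNext();
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return page.next();
    }

    public static void main(String[] args) {
        List<String> products = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            products.add("product-" + i);
        }
        PagingSketch rows = new PagingSketch(products, 100);
        int count = 0;
        while (rows.hasNext()) {
            rows.next();
            count++;
        }
        System.out.println(count + " rows in " + rows.pagesFetched + " pages"); // 250 rows in 3 pages
    }
}
```

At most one page of rows is held at a time, which is why iterating a large result set stays memory efficient.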
Comments
—  Very easy to transparently iterate in a memory
efficient way over a large result set
—  Cursor state is maintained by driver.
—  Allows for failover between different page
responses, i.e. the state is not lost if a page fails to
load from a node in the replica set, the page will be
requested from another node
—  See: http://www.datastax.com/dev/blog/client-
side-improvements-in-cassandra-2-0
Analytics Example
Scenario
—  Don’t have Hadoop but want to run some HIVE type
analytics on our large dataset
—  Example: Get the top 10 financial transactions
ordered by monetary value for each user
—  May want to add more complex filtering later
(where value > 1000) or even do mathematical
groupings, percentiles, means, min, max
Cassandra for Analytics
—  Useful for many scenarios when no other analytics
solution is available
—  Using cursors, queries are bounded & memory efficient
depending on the operation
—  Can be applied anywhere we can do iterative or recursive
processing, SUM, AVG, MIN, MAX etc.
—  NB: The example code also includes a
CQLSSTableWriter which is fast & convenient if we want
to manually create SSTables of large datasets rather
than send millions of insert queries to Cassandra
Retrieve & Run the Code
1.  git clone https://github.com/PatrickCallaghan/datastax-analytics-example.git
2.  export MAVEN_OPTS=-Xmx512M (up the memory)
3.  mvn clean compile exec:java -Dexec.mainClass="com.datastax.bulkloader.Main"
4.  mvn clean compile exec:java -Dexec.mainClass="com.datastax.analytics.TopTransactionsByAmountForUserRunner"
Schema
create table IF NOT EXISTS transactions (	
	accid text,	
	txtnid uuid,	
	txtntime timestamp,	
	amount double,	
	type text,	
	reason text,	
	PRIMARY KEY(accid, txtntime)	
);
Model
public class Transaction {	
	private String txtnId;
	private String acountId;	
	private double amount;	
	private Date txtnDate;	
	private String reason;	
	private String type;	
	…	
}
Method
—  Pass a blocking queue into the DAO method that pages through the
data; this lets us take items off the queue as they are added
—  NB: Could also use a callback here to update the queue
public void getAllProducts(BlockingQueue<Transaction> processorQueue) {
	Statement stmt = new SimpleStatement("SELECT * FROM transactions");
	stmt.setFetchSize(2500);
	ResultSet resultSet = this.session.execute(stmt);
	…
}
Method Contd.
1.  Get an iterator for the result set
2.  Use a while loop to iterate over the result set, add each row
into the queue
while (iterator.hasNext()) {
	Row row = iterator.next();
	Transaction transaction = createTransactionFromRow(row); // convenience method
	processorQueue.offer(transaction);
}
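The hand-off between the paging loop and the processor can be sketched with the standard library alone. This is a minimal sketch, not the demo's code: `QueueSketch`, the plain `double` amounts and the `NaN` end marker are assumptions made for illustration. It uses `put`, which blocks when a bounded queue is full, whereas `offer` returns `false` and the item is lost unless the caller checks the result.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Producer/consumer sketch; class name, amounts and sentinel are illustrative assumptions.
class QueueSketch {
    private static final double END = Double.NaN; // sentinel marking the end of the data

    static double sumViaQueue(double[] amounts) {
        BlockingQueue<Double> queue = new LinkedBlockingQueue<>(2); // small bound on purpose
        Thread producer = new Thread(() -> {
            try {
                for (double amount : amounts) {
                    queue.put(amount); // blocks while the queue is full
                }
                queue.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        double total = 0;
        try {
            while (true) {
                double next = queue.take(); // blocks until an item is available
                if (Double.isNaN(next)) {
                    break;
                }
                total += next;
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while consuming", e);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumViaQueue(new double[] {1.5, 2.5, 3.0})); // prints 7.0
    }
}
```

Bounding the queue means a slow consumer slows the reader down instead of letting rows pile up in memory.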
Method Contd.
1.  Use Java Collections & Transaction comparator to
track Top results
private Set<Transaction> orderedSet = new
BoundedTreeSet<Transaction>(10, new
TransactionAmountComparator());
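`BoundedTreeSet` comes from the example repository, so its internals are not shown here. A plausible stand-in, assuming it simply evicts the smallest element once the limit is exceeded, could look like the following; `TopNSketch` and its methods are hypothetical names.

```java
import java.util.Comparator;
import java.util.TreeSet;

// Hypothetical stand-in for a bounded sorted set: keeps only the N largest elements.
class TopNSketch<T> {
    private final int limit;
    private final TreeSet<T> set;

    // The comparator must order the largest element first.
    TopNSketch(int limit, Comparator<T> largestFirst) {
        this.limit = limit;
        this.set = new TreeSet<>(largestFirst);
    }

    void add(T item) {
        set.add(item);
        if (set.size() > limit) {
            set.pollLast(); // drop the smallest under a largest-first ordering
        }
    }

    TreeSet<T> values() {
        return set;
    }

    public static void main(String[] args) {
        TopNSketch<Double> top = new TopNSketch<Double>(3, Comparator.<Double>reverseOrder());
        for (double amount : new double[] {10, 500, 42, 7, 1200, 42}) {
            top.add(amount);
        }
        System.out.println(top.values()); // [1200.0, 500.0, 42.0]
    }
}
```

A `TreeSet` treats elements that compare as equal as duplicates, so a comparator on amount alone would collapse two transactions with the same amount; breaking ties on the transaction id avoids that.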
Comments
—  Entirely possible, but probably not to be thought of as a
complete replacement for dedicated analytics solutions
—  Issues are token distribution across replicas and mixed write
and read patterns
—  Running analytics or MR operations can be a read heavy
operation (as well as memory and i/o intensive)
—  Transaction logging tends to be write heavy
—  Cassandra can handle it, but in practice it is better to split
workloads except for smaller cases, where latency doesn’t
matter or where the cluster is not generally under significant
load
—  Consider DSE Hadoop, Spark, Storm as alternatives
Risk Sensitivity Example
Scenario
—  In financial risk systems, positions have sensitivity to
certain variables
—  Positions are hierarchical: each is associated with a trader
at a desk, which is part of an asset type in a certain
location.
—  E.g. Frankfurt/FX/desk10/trader7/position23
—  Sensitivity values are inserted for each position. We
need to aggregate them for each level in the hierarchy
—  Because sensitivities are recorded as deltas, the sum of all
sensitivities over time is the new sensitivity.
Scenario
—  E.g. Aggregations for:
—  Frankfurt/FX/desk10/trader7
—  Frankfurt/FX/desk10
—  Frankfurt/FX
—  As new positions are entered the risk sensitivities will
change and will need to be aggregated for each level
for the new value to be available
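The levels that need re-aggregating when a position changes can be derived from the path itself. A minimal sketch, assuming paths are `/`-separated as in the examples above; `HierarchySketch` and `ancestors` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: list every ancestor level of a '/'-separated position path.
class HierarchySketch {
    // Returns the ancestor paths, nearest level first.
    static List<String> ancestors(String path) {
        List<String> levels = new ArrayList<>();
        int slash = path.lastIndexOf('/');
        while (slash > 0) {
            path = path.substring(0, slash); // strip the last segment
            levels.add(path);
            slash = path.lastIndexOf('/');
        }
        return levels;
    }

    public static void main(String[] args) {
        for (String level : ancestors("Frankfurt/FX/desk10/trader7/position23")) {
            System.out.println(level);
        }
        // Frankfurt/FX/desk10/trader7
        // Frankfurt/FX/desk10
        // Frankfurt/FX
        // Frankfurt
    }
}
```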
Queries
select * from risk_sensitivities_hierarchy
where hier_path = 'Paris/FX';
select * from risk_sensitivities_hierarchy
where hier_path = 'Paris/FX/desk4' and
sub_hier_path='trader3';
select * from risk_sensitivities_hierarchy
where hier_path = 'Paris/FX/desk4' and
sub_hier_path='trader3' and
risk_sens_name='irDelta';
Retrieve & Run the Code
1.  git clone https://github.com/PatrickCallaghan/datastax-analytics-example.git
2.  export MAVEN_OPTS=-Xmx512M (up the memory)
3.  mvn clean compile exec:java -Dexec.mainClass="com.datastax.bulkloader.Main"
4.  mvn clean compile exec:java -Dexec.mainClass="com.heb.finance.analytics.Main" -DstopSize=1000000
Schema
create table if not exists risk_sensitivities_hierarchy ( 	
	hier_path text,	
	sub_hier_path text, 	
	risk_sens_name text, 	
	value double, 	
	PRIMARY KEY (hier_path, sub_hier_path,
risk_sens_name)	
) WITH compaction={'class': 'LeveledCompactionStrategy'};	
NB: Notice the use of LCS as we want the table to be efficient for
reads also
Model
public class RiskSensitivity {
	public final String name;	
	public final String path;	
	public final String position;	
	public final BigDecimal value;	
	…	
}
Method
—  Write a service to write new sensitivities to
Cassandra periodically.
insert into risk_sensitivities_hierarchy
(hier_path, sub_hier_path, risk_sens_name,
value) VALUES (?, ?, ?, ?)
Method Contd.
—  In our aggregator do the following periodically
—  Select data for hierarchies we wish to aggregate
select * from risk_sensitivities_hierarchy where
hier_path = 'Frankfurt/FX/desk10/trader4';
—  Will get all positions related to this hierarchy
—  Add the values (represented as deltas) to each other to get
the new sensitivity
—  E.g. S1 = -3, S2 = 2, S3 = -1 gives a new sensitivity of -2
—  Write it back for ‘Frankfurt/FX/desk10/trader4’
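The aggregation step itself is a plain sum of deltas. A minimal sketch using `BigDecimal`, as in the model class; `DeltaSumSketch` and `aggregate` are hypothetical names, and reading the rows and writing the result back are left out.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.List;

// Sketch: add up sensitivity deltas for one hierarchy level.
class DeltaSumSketch {
    static BigDecimal aggregate(List<BigDecimal> deltas) {
        BigDecimal total = BigDecimal.ZERO;
        for (BigDecimal delta : deltas) {
            total = total.add(delta); // BigDecimal is immutable, so reassign
        }
        return total;
    }

    public static void main(String[] args) {
        List<BigDecimal> deltas = Arrays.asList(
                new BigDecimal("-3"), new BigDecimal("2"), new BigDecimal("-1"));
        System.out.println(aggregate(deltas)); // -2
    }
}
```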
Comments
—  Simple way to maintain up to date risk sensitivity
on an ongoing basis based on previous data
—  Will mean (N Hierarchies) * (N variables) queries
are executed periodically (keep an eye on this)
—  Cursors, blocking queue and bounded collections
help us achieve the same result without reading
entire rows
—  Has other applications such as roll ups for stream
data provided you have a reasonably low cardinality
in terms of number of (time resolution) * variables.
—  Thanks Patrick Callaghan for the hard work coding
the examples!
— Questions?

  • 3. Agenda —  Transactions Example —  Paging Example —  Analytics Example —  Risk Sensitivity Example
  • 5. Scenario —  We want to add products, each with a quantity to an order —  Orders come in concurrently from random buyers —  Products that have sold out will return “OUT OF STOCK” —  We want to use lightweight transactions to guarantee that we do not allow orders to complete when no stock is available
  • 6. Lightweight Transactions —  Guarantee a serial isolation level, ACID —  Uses PAXOS consensus algorithm to achieve this in a distributed system. See: —  http://research.microsoft.com/en-us/um/people/lamport/pubs/paxos-simple.pdf —  Every node is still equal, no master or locks —  Allows for conditional inserts & updates —  The cost of linearizable consistency is higher latency, not suitable for high volume writes where low latency is required
  • 7. Retrieve & Run the Code
      1.  git clone https://github.com/PatrickCallaghan/datastax-transaction-demo.git
      2.  mvn clean compile exec:java -Dexec.mainClass="com.datastax.demo.SchemaSetup"
      3.  mvn clean compile exec:java -Dexec.mainClass="com.datastax.transactions.Main" -Dload=true -DcontactPoints=127.0.0.1 -DnoOfThreads=10
  • 8. Schema
      1.  create keyspace if not exists datastax_transactions_demo WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1' };
      2.  create table if not exists products(productId text, capacityleft int, orderIds set<text>, PRIMARY KEY (productId));
      3.  create table if not exists buyers_orders(buyerId text, orderId text, productId text, PRIMARY KEY(buyerId, orderId));
  • 9. Model
      public class Order {
          private String orderId;
          private String productId;
          private String buyerId;
          …
      }
  • 10. Method
      —  Find current product quantity at CL.SERIAL
      —  This allows us to execute a PAXOS query without proposing an update, i.e. read the current value
      SELECT capacityLeft FROM products WHERE productId = '1234';
      e.g. capacityLeft = 5
  • 11. Method Contd.
      —  Do a conditional update using the IF operator to make sure product quantity has not changed since the last quantity check
      —  Note the use of the set collection type here.
      —  This statement will only succeed if the IF condition is met
      UPDATE products SET orderIds = orderIds + {'3'}, capacityleft = 4 WHERE productId = '1234' IF capacityleft = 5;
  • 12. Method Contd.
      —  If the last query succeeds, simply insert the order into the buyers_orders table from the schema.
      INSERT INTO buyers_orders (buyerId, orderId, productId) VALUES ('1', '3', '1234');
      —  This guarantees that no order will be placed where there is insufficient quantity to fulfill it.
  • 13. Comments —  Using LWT incurs a cost of higher latency because a quorum of replicas must be consulted, over several round trips, before a value is committed / returned. —  CL.SERIAL does not propose a new value but is used to read the possibly uncommitted PAXOS state —  The IF operator can also be used as IF NOT EXISTS, which is useful for user creation, for example
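The read-then-conditional-update flow from the Method slides can be tried without a cluster. What follows is a minimal in-memory analogy, not the demo's code and not the DataStax driver: `ConcurrentHashMap.replace(key, expected, updated)` stands in for `UPDATE ... IF capacityleft = <value read>`, and the names `StockSketch`, `tryOrder` and `simulate` are invented for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// In-memory stand-in for the products table (assumption: no Cassandra involved).
class StockSketch {
    private final ConcurrentHashMap<String, Integer> stockByProduct = new ConcurrentHashMap<>();

    void addProduct(String productId, int capacity) {
        stockByProduct.put(productId, capacity);
    }

    // Read the current capacity, then update only if it is unchanged since the read.
    // Returns false when the product is out of stock.
    boolean tryOrder(String productId) {
        while (true) {
            Integer seen = stockByProduct.get(productId);            // the read step
            if (seen == null || seen <= 0) {
                return false;                                         // "OUT OF STOCK"
            }
            // Analogue of "IF capacityleft = seen": succeeds only if nobody got in first.
            if (stockByProduct.replace(productId, seen, seen - 1)) {
                return true;
            }
            // Condition not met, so another buyer changed the value: re-read and retry.
        }
    }

    // Runs concurrent buyers against one product and returns how many orders were placed.
    static int simulate(int capacity, int threads, int attemptsPerThread) {
        StockSketch stock = new StockSketch();
        stock.addProduct("1234", capacity);
        AtomicInteger placed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < attemptsPerThread; i++) {
                    if (stock.tryOrder("1234")) {
                        placed.incrementAndGet();
                    }
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("buyers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return placed.get();
    }

    public static void main(String[] args) {
        // 10 buyers, 3 attempts each, only 5 items in stock.
        System.out.println("orders placed: " + simulate(5, 10, 3));
    }
}
```

However many buyers compete, the number of placed orders should never exceed the starting capacity, which is the property the conditional update is there to protect. A real LWT pays for this with extra round trips between replicas, which this local sketch does not model.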
  • 15. Scenario —  We have 1000s of products in our product catalogue —  We want to browse these using a simple select —  We don’t want to retrieve all at once!
  • 16. Cursors —  We are often dealing with wide rows in Cassandra —  Reading entire rows or multiple rows at once could lead to OOM errors —  Traditionally this meant using range queries to retrieve content —  Cassandra 2.0 (and Java driver) introduces cursors —  Makes row based queries more efficient (no need to use the token() function) —  This will simplify client code
  • 17. Retrieve & Run the Code
      1.  git clone https://github.com/PatrickCallaghan/datastax-paging-demo.git
      2.  mvn clean compile exec:java -Dexec.mainClass="com.datastax.demo.SchemaSetup"
      3.  mvn clean compile exec:java -Dexec.mainClass="com.datastax.paging.Main"
  • 18. Schema
      create table if not exists products(productId text, capacityleft int, orderIds set<text>, PRIMARY KEY (productId));
      —  N.B. With the default partitioner, products will be ordered by the Murmur3 hash of the partition key. Previously we would need to use the token() function to retrieve them in that order
  • 19. Model
      public class Product {
          private String productId;
          private int capacityLeft;
          private Set<String> orderIds;
          …
      }
  • 20. Method
      1.  Create a simple select query for the products table.
      2.  Set the fetch size parameter
      3.  Execute the statement
      Statement stmt = new SimpleStatement("Select * from products");
      stmt.setFetchSize(100);
      ResultSet resultSet = this.session.execute(stmt);
  • 21. Method Contd.
      1.  Get an iterator for the result set
      2.  Use a while loop to iterate over the result set
      Iterator<Row> iterator = resultSet.iterator();
      while (iterator.hasNext()) {
          Row row = iterator.next();
          // do stuff with the row
      }
  • 22. Comments —  Very easy to transparently iterate in a memory efficient way over a large result set —  Cursor state is maintained by driver. —  Allows for failover between different page responses, i.e. the state is not lost if a page fails to load from a node in the replica set, the page will be requested from another node —  See: http://www.datastax.com/dev/blog/client-side-improvements-in-cassandra-2-0
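To make the "cursor state is maintained by the driver" point concrete, here is a rough sketch of an iterator that pulls one page at a time and only fetches the next page when the current one is used up. It is an illustration under stated assumptions: `PageSource`, `PagingIterator` and `PagingDemo` are hypothetical names, the backing data is an in-memory list, and a plain offset is used where the real driver keeps an opaque paging state.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical page source: returns up to `size` items starting at `offset`.
interface PageSource<T> {
    List<T> fetch(int offset, int size);
}

// Iterates over all items while holding only one page in memory at a time.
class PagingIterator<T> implements Iterator<T> {
    private final PageSource<T> source;
    private final int fetchSize;
    private List<T> page = new ArrayList<>();
    private int indexInPage = 0;
    private int offset = 0;
    private boolean exhausted = false;
    int pagesFetched = 0;   // exposed only so the demo can report it

    PagingIterator(PageSource<T> source, int fetchSize) {
        this.source = source;
        this.fetchSize = fetchSize;
    }

    @Override
    public boolean hasNext() {
        if (indexInPage < page.size()) {
            return true;                       // still rows left in the current page
        }
        if (exhausted) {
            return false;
        }
        page = source.fetch(offset, fetchSize); // fetch the next page lazily
        pagesFetched++;
        indexInPage = 0;
        offset += page.size();
        if (page.size() < fetchSize) {
            exhausted = true;                   // a short page means there is no more data
        }
        return !page.isEmpty();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return page.get(indexInPage++);
    }
}

class PagingDemo {
    // Returns {rows seen, pages fetched} for an in-memory catalogue.
    static int[] run(int totalProducts, int fetchSize) {
        List<String> all = new ArrayList<>();
        for (int i = 0; i < totalProducts; i++) {
            all.add("product-" + i);
        }
        PagingIterator<String> it = new PagingIterator<String>(
                (offset, size) -> all.subList(offset, Math.min(all.size(), offset + size)),
                fetchSize);
        int rows = 0;
        while (it.hasNext()) {
            it.next();
            rows++;
        }
        return new int[] {rows, it.pagesFetched};
    }

    public static void main(String[] args) {
        int[] result = run(250, 100);
        System.out.println("rows=" + result[0] + ", pages=" + result[1]);
    }
}
```

The calling code looks the same as the `while (iterator.hasNext())` loop on the Method slide. The fetch size only changes how many round trips happen behind it.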
  • 24. Scenario —  Don’t have Hadoop but want to run some HIVE type analytics on our large dataset —  Example: Get the Top10 financial transactions ordered by monetary value for each user —  May want to add more complex filtering later (where value > 1000) or even do mathematical groupings, percentiles, means, min, max
  • 25. Cassandra for Analytics —  Useful for many scenarios when no other analytics solution is available —  Using cursors, queries are bounded & memory efficient depending on the operation —  Can be applied anywhere we can do iterative or recursive processing, SUM, AVG, MIN, MAX etc. —  NB: The example code also includes a CQLSSTableWriter, which is fast & convenient if we want to manually create SSTables of large datasets rather than send millions of insert queries to Cassandra
  • 26. Retrieve & Run the Code
      1.  git clone https://github.com/PatrickCallaghan/datastax-analytics-example.git
      2.  export MAVEN_OPTS=-Xmx512M (up the memory)
      3.  mvn clean compile exec:java -Dexec.mainClass="com.datastax.bulkloader.Main"
      4.  mvn clean compile exec:java -Dexec.mainClass="com.datastax.analytics.TopTransactionsByAmountForUserRunner"
  • 27. Schema
      create table IF NOT EXISTS transactions (
          accid text,
          txtnid uuid,
          txtntime timestamp,
          amount double,
          type text,
          reason text,
          PRIMARY KEY(accid, txtntime)
      );
  • 28. Model
      public class Transaction {
          private String txtnId;
          private String acountId;
          private double amount;
          private Date txtnDate;
          private String reason;
          private String type;
          …
      }
  • 29. Method
      —  Pass a blocking queue into the DAO method which cursors the data; this allows us to pop items off as they are added
      —  NB: Could also use a callback here to update the queue
      public void getAllProducts(BlockingQueue<Transaction> processorQueue) {
          Statement stmt = new SimpleStatement("SELECT * FROM transactions");
          stmt.setFetchSize(2500);
          ResultSet resultSet = this.session.execute(stmt);
          // method body continues on the next slide
  • 30. Method Contd.
      1.  Get an iterator for the result set
      2.  Use a while loop to iterate over the result set, add each row into the queue
      Iterator<Row> iterator = resultSet.iterator();
      while (iterator.hasNext()) {
          Row row = iterator.next();
          Transaction transaction = createTransactionFromRow(row); // convenience method
          processorQueue.offer(transaction);
      }
  • 31. Method Contd.
      1.  Use Java Collections & Transaction comparator to track Top results
      private Set<Transaction> orderedSet = new BoundedTreeSet<Transaction>(10, new TransactionAmountComparator());
  • 32. Comments —  Entirely possible, but probably not to be thought of as a complete replacement for dedicated analytics solutions —  Issues are token distribution across replicas and mixed write and read patterns —  Running analytics or MR operations can be a read heavy operation (as well as memory and i/o intensive) —  Transaction logging tends to be write heavy —  Cassandra can handle it, but in practice it is better to split workloads except for smaller cases, where latency doesn’t matter or where the cluster is not generally under significant load —  Consider DSE Hadoop, Spark, Storm as alternatives
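`BoundedTreeSet` on the earlier Method slide is not a JDK class, so the sketch below shows one way such a bounded, ordered collection could work. It is a guess at the idea, not the demo's implementation: `BoundedTopSet`, `Txn` and `TopNDemo` are hypothetical names, and the comparator breaks ties on an id because a `TreeSet` treats elements that compare as equal as duplicates.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// Minimal transaction stand-in (assumption: only id and amount matter here).
class Txn {
    final String id;
    final double amount;

    Txn(String id, double amount) {
        this.id = id;
        this.amount = amount;
    }
}

// Keeps at most `limit` elements, ordered by the comparator; the element that sorts last is evicted.
class BoundedTopSet<E> {
    private final int limit;
    private final TreeSet<E> items;

    BoundedTopSet(int limit, Comparator<? super E> order) {
        this.limit = limit;
        this.items = new TreeSet<>(order);
    }

    void offer(E element) {
        items.add(element);
        if (items.size() > limit) {
            items.pollLast();   // memory stays bounded no matter how many rows stream past
        }
    }

    List<E> snapshot() {
        return new ArrayList<>(items);
    }
}

class TopNDemo {
    // Largest amount first; ties broken on id so equal amounts are not collapsed.
    static final Comparator<Txn> BY_AMOUNT_DESC =
            Comparator.comparingDouble((Txn t) -> t.amount)
                    .reversed()
                    .thenComparing((Txn t) -> t.id);

    // Streams the amounts through the bounded set and returns the top n, largest first.
    static List<Double> topAmounts(double[] amounts, int n) {
        BoundedTopSet<Txn> top = new BoundedTopSet<>(n, BY_AMOUNT_DESC);
        for (int i = 0; i < amounts.length; i++) {
            top.offer(new Txn("txn-" + i, amounts[i]));
        }
        List<Double> result = new ArrayList<>();
        for (Txn t : top.snapshot()) {
            result.add(t.amount);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(topAmounts(new double[] {5, 100, 20, 75, 1}, 3));
    }
}
```

Because each row is offered once and at most `limit` elements are kept, the consumer side of the blocking queue can process an arbitrarily large result set in constant memory.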
  • 34. Scenario —  In financial risk systems, positions have sensitivity to certain variables —  Positions are hierarchical: each is associated with a trader at a desk, which is part of an asset type in a certain location. —  E.g. Frankfurt/FX/desk10/trader7/position23 —  Sensitivity values are inserted for each position. We need to aggregate them for each level in the hierarchy —  The sum of all sensitivities over time is the new sensitivity, as they are represented by deltas.
  • 35. Scenario —  E.g. Aggregations for: —  Frankfurt/FX/desk10/trader7 —  Frankfurt/FX/desk10 —  Frankfurt/FX —  As new positions are entered the risk sensitivities will change and will need to be aggregated for each level for the new value to be available
  • 36. Queries
      select * from risk_sensitivities_hierarchy where hier_path = 'Paris/FX';
      select * from risk_sensitivities_hierarchy where hier_path = 'Paris/FX/desk4' and sub_hier_path='trader3';
      select * from risk_sensitivities_hierarchy where hier_path = 'Paris/FX/desk4' and sub_hier_path='trader3' and risk_sens_name='irDelta';
  • 37. Retrieve & Run the Code
      1.  git clone https://github.com/PatrickCallaghan/datastax-analytics-example.git
      2.  export MAVEN_OPTS=-Xmx512M (up the memory)
      3.  mvn clean compile exec:java -Dexec.mainClass="com.datastax.bulkloader.Main"
      4.  mvn clean compile exec:java -Dexec.mainClass="com.heb.finance.analytics.Main" -DstopSize=1000000
  • 38. Schema
      create table if not exists risk_sensitivities_hierarchy (
          hier_path text,
          sub_hier_path text,
          risk_sens_name text,
          value double,
          PRIMARY KEY (hier_path, sub_hier_path, risk_sens_name)
      ) WITH compaction={'class': 'LeveledCompactionStrategy'};
      NB: Notice the use of LCS as we want the table to be efficient for reads also
  • 39. Model
      public class RiskSensitivity {
          public final String name;
          public final String path;
          public final String position;
          public final BigDecimal value;
          …
      }
  • 40. Method
      —  Write a service to write new sensitivities to Cassandra periodically.
      insert into risk_sensitivities_hierarchy (hier_path, sub_hier_path, risk_sens_name, value) VALUES (?, ?, ?, ?);
  • 41. Method Contd.
      —  In our aggregator do the following periodically
      —  Select data for hierarchies we wish to aggregate
      select * from risk_sensitivities_hierarchy where hier_path = 'Frankfurt/FX/desk10/trader4';
      —  Will get all positions related to this hierarchy
      —  Add the values (represented as deltas) to each other to get the new sensitivity
      —  E.g. S1 = -3, S2 = 2, S3 = -1, so the new sensitivity is -2
      —  Write it back for 'Frankfurt/FX/desk10/trader4'
  • 42. Comments —  Simple way to maintain up-to-date risk sensitivity on an ongoing basis based on previous data —  Will mean (N hierarchies) * (N variables) queries are executed periodically (keep an eye on this) —  Cursors, blocking queue and bounded collections help us achieve the same result without reading entire rows —  Has other applications such as roll ups for stream data, provided you have a reasonably low cardinality in terms of number of (time resolution) * variables.
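The roll-up itself is plain arithmetic: every position delta is added to each ancestor level of its path. The sketch below illustrates that step in memory only. `RollUpSketch` and `rollUp` are hypothetical names, a single sensitivity name is assumed, and full slash-separated paths are used for simplicity instead of the `hier_path` / `sub_hier_path` split in the schema.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

class RollUpSketch {
    // Input: full position path -> delta. Output: every ancestor level -> summed delta.
    static Map<String, Double> rollUp(Map<String, Double> positionDeltas) {
        Map<String, Double> totals = new TreeMap<>();
        for (Map.Entry<String, Double> entry : positionDeltas.entrySet()) {
            String path = entry.getKey();
            int slash = path.lastIndexOf('/');
            // Walk up the hierarchy: trader, desk, asset type, location.
            while (slash > 0) {
                path = path.substring(0, slash);
                totals.merge(path, entry.getValue(), Double::sum);
                slash = path.lastIndexOf('/');
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        Map<String, Double> deltas = new LinkedHashMap<>();
        deltas.put("Frankfurt/FX/desk10/trader4/position1", -3.0);
        deltas.put("Frankfurt/FX/desk10/trader4/position2", 2.0);
        deltas.put("Frankfurt/FX/desk10/trader4/position3", -1.0);
        // Print each level with its aggregated sensitivity.
        rollUp(deltas).forEach((level, sum) -> System.out.println(level + " -> " + sum));
    }
}
```

In the Cassandra version each level's total would be written back with the insert statement from the Method slide, so the number of writes grows with the number of levels and sensitivity names being tracked.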
  • 43. —  Thanks Patrick Callaghan for the hard work coding the examples! — Questions?