SlideShare a Scribd company logo
1 of 39
Confidential
Apache Cassandra & Java
Caroline George
DataStax Solutions Engineer
June 30, 2014
1
Agenda
2
• Cassandra Overview
• Cassandra Architecture
• Cassandra Query Language
• Interacting with Cassandra using Java
• About DataStax
CASSANDRA OVERVIEW
3
Who is using DataStax?
4
Collections /
Playlists
Recommendation /
Personalization
Fraud detection
Messaging
Internet of Things /
Sensor data
What is Apache Cassandra?
Apache Cassandra™ is a massively scalable NoSQL database.
• Continuous availability
• High performing writes and reads
• Linear scalability
• Multi-data center support
6
The NoSQL Performance Leader
Source: Netflix Tech Blog
Netflix Cloud Benchmark…
“In terms of scalability, there is a clear winner throughout our experiments.
Cassandra achieves the highest throughput for the maximum number of nodes in
all experiments with a linear increasing throughput.”
Source: Solving Big Data Challenges for Enterprise Application Performance Management benchmark paper presented at the Very Large Database Conference,
2013.
End Point Independent NoSQL Benchmark
Highest in throughput…
Lowest in latency…
10
50
3070
80
40
20
60
Client
Client
Replication Factor = 3
We could still
retrieve the data
from the other 2
nodes
Token Order_id Qty Sale
70 1001 10 100
44 1002 5 50
15 1003 30 200
Node failure or it goes
down temporarily
Cassandra is Fault Tolerant
Client
10
50
3070
80
40
20
60
Client
15
55
3575
85
45
25
65
West Data CenterEast Data Center
10
50
3070
80
40
20
60
Data Center
Outage Occurs
No interruption to the business
Multi Data Center Support
9
Writes in Cassandra
Data is organized into Partitions
1. Data is written to a Commit Log for a node (durability)
2. Data is written to MemTable (in memory)
3. MemTables are flushed to disk in an SSTable based on
size.
SSTables are immutable
Client
Memory
SSTables
Commit Log
Flush to Disk
10
Tunable Data Consistency
11
Built for Modern Online Applications
• Architected for today’s needs
• Linear scalability at lowest cost
• 100% uptime
• Operationally simple
Cassandra Query Language
12
CQL - DevCenter
13
A SQL-like query language for communicating with Cassandra
DataStax DevCenter – a free, visual query tool for creating and running CQL
statements against Cassandra and DataStax Enterprise.
CQL - Create Keyspace
14
CREATE KEYSPACE demo WITH REPLICATION =
{'class' : 'NetworkTopologyStrategy', 'EastCoast': 3,
'WestCoast': 2);
CQL - Basics
15
CREATE TABLE users (
username text,
password text,
create_date timestamp,
PRIMARY KEY (username, create_date desc);
INSERT INTO users (username, password, create_date)
VALUES ('caroline', 'password1234', '2014-06-01 07:01:00');
SELECT * FROM users WHERE username = ‘caroline’ AND create_date = ‘2014-06-
01 07:01:00’;
Predicates
On the partition key: = and IN
On the cluster columns: <, <=, =, >=, >, IN
Collection Data Types
16
CQL supports having columns that contain collections of data.
The collection types include:
Set, List and Map.
Favor sets over list – better performance
CREATE TABLE users (
username text,
set_example set<text>,
list_example list<text>,
map_example map<int,text>,
PRIMARY KEY (username)
);
Plus much more…
17
Light Weight Transactions
INSERT INTO customer_account (customerID, customer_email) VALUES
(‘LauraS’, ‘lauras@gmail.com’) IF NOT EXISTS;
UPDATE customer_account SET customer_email=’laurass@gmail.com’
IF customer_email=’lauras@gmail.com’;
Counters
UPDATE UserActions SET total = total + 2
WHERE user = 123 AND action = ’xyz';
Time to live (TTL)
INSERT INTO users (id, first, last) VALUES (‘abc123’, ‘abe’,
‘lincoln’) USING TTL 3600;
Batch Statements
BEGIN BATCH
INSERT INTO users (userID, password, name) VALUES ('user2',
'ch@ngem3b', 'second user')
UPDATE users SET password = 'ps22dhds' WHERE userID = 'user2'
INSERT INTO users (userID, password) VALUES ('user3',
'ch@ngem3c')
DELETE name FROM users WHERE userID = 'user2’
APPLY BATCH;
JAVA CODE EXAMPLES
18
DataStax Java Driver
19
• Written for CQL 3.0
• Uses the binary protocol introduced in
Cassandra 1.2
• Uses Netty to provide an asynchronous architecture
• Can do asynchronous or synchronous queries
• Has connection pooling
• Has node discovery and load balancing
http://www.datastax.com/download
Add .JAR Files to Project
20
Easiest way is to do this with Maven, which is a software project
management tool
Add .JAR Files to Project
21
In the pom.xml file, select the Dependencies tab
Click the Add… button in the left column
Enter the DataStax Java driver info
Connect & Write
22
Cluster cluster = Cluster.builder()
.addContactPoints("10.158.02.40", "10.158.02.44")
.build();
Session session = cluster.connect("demo");
session.execute(
"INSERT INTO users (username, password) ”
+ "VALUES(‘caroline’, ‘password1234’)"
);
Note: Cluster and Session objects should be long-lived and re-used
Read from Table
23
ResultSet rs = session.execute("SELECT * FROM users");
List<Row> rows = rs.all();
for (Row row : rows) {
String userName = row.getString("username");
String password = row.getString("password");
}
Asynchronous Read
24
ResultSetFuture future = session.executeAsync(
"SELECT * FROM users");
for (Row row : future.get()) {
String userName = row.getString("username");
String password = row.getString("password");
}
Note: The future returned implements Guava's ListenableFuture interface. This
means you can use all Guava's Futures1 methods!
1http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/util/concurrent/Futures.html
Read with Callbacks
25
final ResultSetFuture future =
session.executeAsync("SELECT * FROM users");
future.addListener(new Runnable() {
public void run() {
for (Row row : future.get()) {
String userName = row.getString("username");
String password = row.getString("password");
}
}
}, executor);
Parallelize Calls
26
int queryCount = 99;
List<ResultSetFuture> futures = new
ArrayList<ResultSetFuture>();
for (int i=0; i<queryCount; i++) {
futures.add(
session.executeAsync("SELECT * FROM users "
+"WHERE username = '"+i+"'"));
}
for(ResultSetFuture future : futures) {
for (Row row : future.getUninterruptibly()) {
//do something
}
}
Prepared Statements
27
PreparedStatement statement = session.prepare(
"INSERT INTO users (username, password) "
+ "VALUES (?, ?)");
BoundStatement bs = statement.bind();
bs.setString("username", "caroline");
bs.setString("password", "password1234");
session.execute(bs);
Query Builder
28
Query query = QueryBuilder
.select()
.all()
.from("demo", "users")
.where(eq("username", "caroline"));
ResultSet rs = session.execute(query);
Load Balancing
29
Determine which node will next be contacted once a
connection to a cluster has been established
Cluster cluster = Cluster.builder()
.addContactPoints("10.158.02.40","10.158.02.44")
.withLoadBalancingPolicy(
new DCAwareRoundRobinPolicy("DC1"))
.build();
Policies are:
• RoundRobinPolicy
• DCAwareRoundRobinPolicy (default)
• TokenAwarePolicy
RoundRobinPolicy
30
• Not data-center aware
• Each subsequent request after initial connection to the
cluster goes to the next node in the cluster
• If the node that is serving as the coordinator fails during a
request, the next node is used
DCAwareRoundRobinPolicy
31
• Is data center aware
• Does a round robin within the local data center
• Only goes to another
data center if there is
not a node available
to be coordinator in
the local data center
TokenAwarePolicy
32
• Is aware of where the replicas for a given token live
• Instead of round robin, the client chooses the node that
contains the primary replica to be the chosen coordinator
• Avoids unnecessary time taken to go to any node to have it
serve as coordinator to then contact the nodes with the
replicas
Additional Information & Support
33
• Community Site
(http://planetcassandra.org)
• Documentation
(http://www.datastax.com/docs)
• Downloads
(http://www.datastax.com/download)
• Getting Started
(http://www.datastax.com/documentation/gettingstarted/index.html)
• DataStax
(http://www.datastax.com)
ABOUT DATASTAX
34
About DataStax
35
Founded in April 2010
30
Percent
500+
Customers
Santa Clara, Austin, New York, London
300+
Employees
Confidential
DataStax delivers
Apache Cassandra to the Enterprise
36
Certified /
Enterprise-ready Cassandra
Visual Management &
Monitoring Tools
24x7 Support & Training
37
DSE 4.5
38
Thank You!
caroline.george@datastax.com
https://www.linkedin.com/in/carolinerg
@carolinerg
Follow for more updates all the time: @PatrickMcFadin

More Related Content

What's hot

Cassandra 2.0 better, faster, stronger
Cassandra 2.0   better, faster, strongerCassandra 2.0   better, faster, stronger
Cassandra 2.0 better, faster, strongerPatrick McFadin
 
Robert Pankowecki - Czy sprzedawcy SQLowych baz nas oszukali?
Robert Pankowecki - Czy sprzedawcy SQLowych baz nas oszukali?Robert Pankowecki - Czy sprzedawcy SQLowych baz nas oszukali?
Robert Pankowecki - Czy sprzedawcy SQLowych baz nas oszukali?SegFaultConf
 
Cassandra Summit 2013 Keynote
Cassandra Summit 2013 KeynoteCassandra Summit 2013 Keynote
Cassandra Summit 2013 Keynotejbellis
 
Cassandra summit 2013 - DataStax Java Driver Unleashed!
Cassandra summit 2013 - DataStax Java Driver Unleashed!Cassandra summit 2013 - DataStax Java Driver Unleashed!
Cassandra summit 2013 - DataStax Java Driver Unleashed!Michaël Figuière
 
The world's next top data model
The world's next top data modelThe world's next top data model
The world's next top data modelPatrick McFadin
 
了解Oracle rac brain split resolution
了解Oracle rac brain split resolution了解Oracle rac brain split resolution
了解Oracle rac brain split resolutionmaclean liu
 
Storing time series data with Apache Cassandra
Storing time series data with Apache CassandraStoring time series data with Apache Cassandra
Storing time series data with Apache CassandraPatrick McFadin
 
Paris Cassandra Meetup - Cassandra for Developers
Paris Cassandra Meetup - Cassandra for DevelopersParis Cassandra Meetup - Cassandra for Developers
Paris Cassandra Meetup - Cassandra for DevelopersMichaël Figuière
 
Cassandra 3.0 Awesomeness
Cassandra 3.0 AwesomenessCassandra 3.0 Awesomeness
Cassandra 3.0 AwesomenessJon Haddad
 
Geneva JUG - Cassandra for Java Developers
Geneva JUG - Cassandra for Java DevelopersGeneva JUG - Cassandra for Java Developers
Geneva JUG - Cassandra for Java DevelopersMichaël Figuière
 
Apache Cassandra Lesson: Data Modelling and CQL3
Apache Cassandra Lesson: Data Modelling and CQL3Apache Cassandra Lesson: Data Modelling and CQL3
Apache Cassandra Lesson: Data Modelling and CQL3Markus Klems
 
YaJug - Cassandra for Java Developers
YaJug - Cassandra for Java DevelopersYaJug - Cassandra for Java Developers
YaJug - Cassandra for Java DevelopersMichaël Figuière
 
Advanced data modeling with apache cassandra
Advanced data modeling with apache cassandraAdvanced data modeling with apache cassandra
Advanced data modeling with apache cassandraPatrick McFadin
 
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetch
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetchDataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetch
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetchDataStax Academy
 
Cassandra 2.1
Cassandra 2.1Cassandra 2.1
Cassandra 2.1jbellis
 
C*ollege Credit: Data Modeling for Apache Cassandra
C*ollege Credit: Data Modeling for Apache CassandraC*ollege Credit: Data Modeling for Apache Cassandra
C*ollege Credit: Data Modeling for Apache CassandraDataStax
 
Fatkulin presentation
Fatkulin presentationFatkulin presentation
Fatkulin presentationEnkitec
 
The data model is dead, long live the data model
The data model is dead, long live the data modelThe data model is dead, long live the data model
The data model is dead, long live the data modelPatrick McFadin
 
Cassandra 2.0 and timeseries
Cassandra 2.0 and timeseriesCassandra 2.0 and timeseries
Cassandra 2.0 and timeseriesPatrick McFadin
 

What's hot (20)

Cassandra 2.0 better, faster, stronger
Cassandra 2.0   better, faster, strongerCassandra 2.0   better, faster, stronger
Cassandra 2.0 better, faster, stronger
 
Robert Pankowecki - Czy sprzedawcy SQLowych baz nas oszukali?
Robert Pankowecki - Czy sprzedawcy SQLowych baz nas oszukali?Robert Pankowecki - Czy sprzedawcy SQLowych baz nas oszukali?
Robert Pankowecki - Czy sprzedawcy SQLowych baz nas oszukali?
 
Cassandra Summit 2013 Keynote
Cassandra Summit 2013 KeynoteCassandra Summit 2013 Keynote
Cassandra Summit 2013 Keynote
 
Cassandra summit 2013 - DataStax Java Driver Unleashed!
Cassandra summit 2013 - DataStax Java Driver Unleashed!Cassandra summit 2013 - DataStax Java Driver Unleashed!
Cassandra summit 2013 - DataStax Java Driver Unleashed!
 
The world's next top data model
The world's next top data modelThe world's next top data model
The world's next top data model
 
了解Oracle rac brain split resolution
了解Oracle rac brain split resolution了解Oracle rac brain split resolution
了解Oracle rac brain split resolution
 
Storing time series data with Apache Cassandra
Storing time series data with Apache CassandraStoring time series data with Apache Cassandra
Storing time series data with Apache Cassandra
 
Paris Cassandra Meetup - Cassandra for Developers
Paris Cassandra Meetup - Cassandra for DevelopersParis Cassandra Meetup - Cassandra for Developers
Paris Cassandra Meetup - Cassandra for Developers
 
Cassandra 3.0 Awesomeness
Cassandra 3.0 AwesomenessCassandra 3.0 Awesomeness
Cassandra 3.0 Awesomeness
 
Geneva JUG - Cassandra for Java Developers
Geneva JUG - Cassandra for Java DevelopersGeneva JUG - Cassandra for Java Developers
Geneva JUG - Cassandra for Java Developers
 
Apache Cassandra Lesson: Data Modelling and CQL3
Apache Cassandra Lesson: Data Modelling and CQL3Apache Cassandra Lesson: Data Modelling and CQL3
Apache Cassandra Lesson: Data Modelling and CQL3
 
ChtiJUG - Cassandra 2.0
ChtiJUG - Cassandra 2.0ChtiJUG - Cassandra 2.0
ChtiJUG - Cassandra 2.0
 
YaJug - Cassandra for Java Developers
YaJug - Cassandra for Java DevelopersYaJug - Cassandra for Java Developers
YaJug - Cassandra for Java Developers
 
Advanced data modeling with apache cassandra
Advanced data modeling with apache cassandraAdvanced data modeling with apache cassandra
Advanced data modeling with apache cassandra
 
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetch
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetchDataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetch
DataStax: Old Dogs, New Tricks. Teaching your Relational DBA to fetch
 
Cassandra 2.1
Cassandra 2.1Cassandra 2.1
Cassandra 2.1
 
C*ollege Credit: Data Modeling for Apache Cassandra
C*ollege Credit: Data Modeling for Apache CassandraC*ollege Credit: Data Modeling for Apache Cassandra
C*ollege Credit: Data Modeling for Apache Cassandra
 
Fatkulin presentation
Fatkulin presentationFatkulin presentation
Fatkulin presentation
 
The data model is dead, long live the data model
The data model is dead, long live the data modelThe data model is dead, long live the data model
The data model is dead, long live the data model
 
Cassandra 2.0 and timeseries
Cassandra 2.0 and timeseriesCassandra 2.0 and timeseries
Cassandra 2.0 and timeseries
 

Viewers also liked

NoSQL – Data Center Centric Application Enablement
NoSQL – Data Center Centric Application EnablementNoSQL – Data Center Centric Application Enablement
NoSQL – Data Center Centric Application EnablementDATAVERSITY
 
Solr & Cassandra: Searching Cassandra with DataStax Enterprise
Solr & Cassandra: Searching Cassandra with DataStax EnterpriseSolr & Cassandra: Searching Cassandra with DataStax Enterprise
Solr & Cassandra: Searching Cassandra with DataStax EnterpriseDataStax Academy
 
User Inspired Management of Scientific Jobs in Grids and Clouds
User Inspired Management of Scientific Jobs in Grids and CloudsUser Inspired Management of Scientific Jobs in Grids and Clouds
User Inspired Management of Scientific Jobs in Grids and CloudsEran Chinthaka Withana
 
Cassandra data structures and algorithms
Cassandra data structures and algorithmsCassandra data structures and algorithms
Cassandra data structures and algorithmsDuyhai Doan
 
Multi Data Center Strategies
Multi Data Center StrategiesMulti Data Center Strategies
Multi Data Center StrategiesSteven Francia
 
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...DataStax Academy
 

Viewers also liked (6)

NoSQL – Data Center Centric Application Enablement
NoSQL – Data Center Centric Application EnablementNoSQL – Data Center Centric Application Enablement
NoSQL – Data Center Centric Application Enablement
 
Solr & Cassandra: Searching Cassandra with DataStax Enterprise
Solr & Cassandra: Searching Cassandra with DataStax EnterpriseSolr & Cassandra: Searching Cassandra with DataStax Enterprise
Solr & Cassandra: Searching Cassandra with DataStax Enterprise
 
User Inspired Management of Scientific Jobs in Grids and Clouds
User Inspired Management of Scientific Jobs in Grids and CloudsUser Inspired Management of Scientific Jobs in Grids and Clouds
User Inspired Management of Scientific Jobs in Grids and Clouds
 
Cassandra data structures and algorithms
Cassandra data structures and algorithmsCassandra data structures and algorithms
Cassandra data structures and algorithms
 
Multi Data Center Strategies
Multi Data Center StrategiesMulti Data Center Strategies
Multi Data Center Strategies
 
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
 

Similar to DataStax NYC Java Meetup: Cassandra with Java

Introduction to Apache Cassandra
Introduction to Apache CassandraIntroduction to Apache Cassandra
Introduction to Apache CassandraRobert Stupp
 
Spark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational DataSpark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational DataVictor Coustenoble
 
A Microservices approach with Cassandra and Quarkus | DevNation Tech Talk
A Microservices approach with Cassandra and Quarkus | DevNation Tech TalkA Microservices approach with Cassandra and Quarkus | DevNation Tech Talk
A Microservices approach with Cassandra and Quarkus | DevNation Tech TalkRed Hat Developers
 
Cassandra Summit 2014: Highly Scalable Web Application in the Cloud with Cass...
Cassandra Summit 2014: Highly Scalable Web Application in the Cloud with Cass...Cassandra Summit 2014: Highly Scalable Web Application in the Cloud with Cass...
Cassandra Summit 2014: Highly Scalable Web Application in the Cloud with Cass...DataStax Academy
 
Escalabilidad horizontal y arquitecturas elásticas en Microsoft azure
Escalabilidad horizontal y arquitecturas elásticas en Microsoft azureEscalabilidad horizontal y arquitecturas elásticas en Microsoft azure
Escalabilidad horizontal y arquitecturas elásticas en Microsoft azureEnrique Catala Bañuls
 
Introduction to .Net Driver
Introduction to .Net DriverIntroduction to .Net Driver
Introduction to .Net DriverDataStax Academy
 
Terraform introduction
Terraform introductionTerraform introduction
Terraform introductionJason Vance
 
Store and Process Big Data with Hadoop and Cassandra
Store and Process Big Data with Hadoop and CassandraStore and Process Big Data with Hadoop and Cassandra
Store and Process Big Data with Hadoop and CassandraDeependra Ariyadewa
 
Cassandra - A decentralized storage system
Cassandra - A decentralized storage systemCassandra - A decentralized storage system
Cassandra - A decentralized storage systemArunit Gupta
 
Take Control of your Integration Testing with TestContainers
Take Control of your Integration Testing with TestContainersTake Control of your Integration Testing with TestContainers
Take Control of your Integration Testing with TestContainersNaresha K
 
Webinar: What's New in Solr 6
Webinar: What's New in Solr 6Webinar: What's New in Solr 6
Webinar: What's New in Solr 6Lucidworks
 
Qtp connect to an oracle database database - database skill
Qtp connect to an oracle database   database - database skillQtp connect to an oracle database   database - database skill
Qtp connect to an oracle database database - database skillsiva1991
 
DAC4B 2015 - Polybase
DAC4B 2015 - PolybaseDAC4B 2015 - Polybase
DAC4B 2015 - PolybaseŁukasz Grala
 
In Defense Of Core Data
In Defense Of Core DataIn Defense Of Core Data
In Defense Of Core DataDonny Wals
 
MySQL as a Document Store
MySQL as a Document StoreMySQL as a Document Store
MySQL as a Document StoreDave Stokes
 

Similar to DataStax NYC Java Meetup: Cassandra with Java (20)

Introduction to Apache Cassandra
Introduction to Apache CassandraIntroduction to Apache Cassandra
Introduction to Apache Cassandra
 
Multi-cluster k8ssandra
Multi-cluster k8ssandraMulti-cluster k8ssandra
Multi-cluster k8ssandra
 
Coursera Cassandra Driver
Coursera Cassandra DriverCoursera Cassandra Driver
Coursera Cassandra Driver
 
Spark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational DataSpark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational Data
 
A Microservices approach with Cassandra and Quarkus | DevNation Tech Talk
A Microservices approach with Cassandra and Quarkus | DevNation Tech TalkA Microservices approach with Cassandra and Quarkus | DevNation Tech Talk
A Microservices approach with Cassandra and Quarkus | DevNation Tech Talk
 
Cassandra Summit 2014: Highly Scalable Web Application in the Cloud with Cass...
Cassandra Summit 2014: Highly Scalable Web Application in the Cloud with Cass...Cassandra Summit 2014: Highly Scalable Web Application in the Cloud with Cass...
Cassandra Summit 2014: Highly Scalable Web Application in the Cloud with Cass...
 
Escalabilidad horizontal y arquitecturas elásticas en Microsoft azure
Escalabilidad horizontal y arquitecturas elásticas en Microsoft azureEscalabilidad horizontal y arquitecturas elásticas en Microsoft azure
Escalabilidad horizontal y arquitecturas elásticas en Microsoft azure
 
Introduction to .Net Driver
Introduction to .Net DriverIntroduction to .Net Driver
Introduction to .Net Driver
 
Terraform introduction
Terraform introductionTerraform introduction
Terraform introduction
 
Store and Process Big Data with Hadoop and Cassandra
Store and Process Big Data with Hadoop and CassandraStore and Process Big Data with Hadoop and Cassandra
Store and Process Big Data with Hadoop and Cassandra
 
Cassandra - A decentralized storage system
Cassandra - A decentralized storage systemCassandra - A decentralized storage system
Cassandra - A decentralized storage system
 
Take Control of your Integration Testing with TestContainers
Take Control of your Integration Testing with TestContainersTake Control of your Integration Testing with TestContainers
Take Control of your Integration Testing with TestContainers
 
Cassandra Database
Cassandra DatabaseCassandra Database
Cassandra Database
 
Webinar: What's New in Solr 6
Webinar: What's New in Solr 6Webinar: What's New in Solr 6
Webinar: What's New in Solr 6
 
Couchbas for dummies
Couchbas for dummiesCouchbas for dummies
Couchbas for dummies
 
Qtp connect to an oracle database database - database skill
Qtp connect to an oracle database   database - database skillQtp connect to an oracle database   database - database skill
Qtp connect to an oracle database database - database skill
 
DAC4B 2015 - Polybase
DAC4B 2015 - PolybaseDAC4B 2015 - Polybase
DAC4B 2015 - Polybase
 
Presentation
PresentationPresentation
Presentation
 
In Defense Of Core Data
In Defense Of Core DataIn Defense Of Core Data
In Defense Of Core Data
 
MySQL as a Document Store
MySQL as a Document StoreMySQL as a Document Store
MySQL as a Document Store
 

Recently uploaded

Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 

Recently uploaded (20)

Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 

DataStax NYC Java Meetup: Cassandra with Java

  • 1. Confidential Apache Cassandra & Java Caroline George DataStax Solutions Engineer June 30, 2014 1
  • 2. Agenda 2 • Cassandra Overview • Cassandra Architecture • Cassandra Query Language • Interacting with Cassandra using Java • About DataStax
  • 4. Who is using DataStax? 4 Collections / Playlists Recommendation / Personalization Fraud detection Messaging Internet of Things / Sensor data
  • 5. What is Apache Cassandra? Apache Cassandra™ is a massively scalable NoSQL database. • Continuous availability • High performing writes and reads • Linear scalability • Multi-data center support
  • 6. 6 The NoSQL Performance Leader Source: Netflix Tech Blog Netflix Cloud Benchmark… “In terms of scalability, there is a clear winner throughout our experiments. Cassandra achieves the highest throughput for the maximum number of nodes in all experiments with a linear increasing throughput.” Source: Solving Big Data Challenges for Enterprise Application Performance Management benchmark paper presented at the Very Large Database Conference, 2013. End Point Independent NoSQL Benchmark Highest in throughput… Lowest in latency…
  • 7. 10 50 3070 80 40 20 60 Client Client Replication Factor = 3 We could still retrieve the data from the other 2 nodes Token Order_id Qty Sale 70 1001 10 100 44 1002 5 50 15 1003 30 200 Node failure or it goes down temporarily Cassandra is Fault Tolerant
  • 8. Client 10 50 3070 80 40 20 60 Client 15 55 3575 85 45 25 65 West Data CenterEast Data Center 10 50 3070 80 40 20 60 Data Center Outage Occurs No interruption to the business Multi Data Center Support
  • 9. 9 Writes in Cassandra Data is organized into Partitions 1. Data is written to a Commit Log for a node (durability) 2. Data is written to MemTable (in memory) 3. MemTables are flushed to disk in an SSTable based on size. SSTables are immutable Client Memory SSTables Commit Log Flush to Disk
  • 11. 11 Built for Modern Online Applications • Architected for today’s needs • Linear scalability at lowest cost • 100% uptime • Operationally simple
  • 13. CQL - DevCenter 13 A SQL-like query language for communicating with Cassandra DataStax DevCenter – a free, visual query tool for creating and running CQL statements against Cassandra and DataStax Enterprise.
  • 14. CQL - Create Keyspace 14 CREATE KEYSPACE demo WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'EastCoast': 3, 'WestCoast': 2);
  • 15. CQL - Basics 15 CREATE TABLE users ( username text, password text, create_date timestamp, PRIMARY KEY (username, create_date desc); INSERT INTO users (username, password, create_date) VALUES ('caroline', 'password1234', '2014-06-01 07:01:00'); SELECT * FROM users WHERE username = ‘caroline’ AND create_date = ‘2014-06- 01 07:01:00’; Predicates On the partition key: = and IN On the cluster columns: <, <=, =, >=, >, IN
  • 16. Collection Data Types 16 CQL supports having columns that contain collections of data. The collection types include: Set, List and Map. Favor sets over list – better performance CREATE TABLE users ( username text, set_example set<text>, list_example list<text>, map_example map<int,text>, PRIMARY KEY (username) );
  • 17. Plus much more… 17 Light Weight Transactions INSERT INTO customer_account (customerID, customer_email) VALUES (‘LauraS’, ‘lauras@gmail.com’) IF NOT EXISTS; UPDATE customer_account SET customer_email=’laurass@gmail.com’ IF customer_email=’lauras@gmail.com’; Counters UPDATE UserActions SET total = total + 2 WHERE user = 123 AND action = ’xyz'; Time to live (TTL) INSERT INTO users (id, first, last) VALUES (‘abc123’, ‘abe’, ‘lincoln’) USING TTL 3600; Batch Statements BEGIN BATCH INSERT INTO users (userID, password, name) VALUES ('user2', 'ch@ngem3b', 'second user') UPDATE users SET password = 'ps22dhds' WHERE userID = 'user2' INSERT INTO users (userID, password) VALUES ('user3', 'ch@ngem3c') DELETE name FROM users WHERE userID = 'user2’ APPLY BATCH;
  • 19. DataStax Java Driver 19 • Written for CQL 3.0 • Uses the binary protocol introduced in Cassandra 1.2 • Uses Netty to provide an asynchronous architecture • Can do asynchronous or synchronous queries • Has connection pooling • Has node discovery and load balancing http://www.datastax.com/download
  • 20. Add .JAR Files to Project 20 Easiest way is to do this with Maven, which is a software project management tool
  • 21. Add .JAR Files to Project 21 In the pom.xml file, select the Dependencies tab Click the Add… button in the left column Enter the DataStax Java driver info
  • 22. Connect & Write 22 Cluster cluster = Cluster.builder() .addContactPoints("10.158.02.40", "10.158.02.44") .build(); Session session = cluster.connect("demo"); session.execute( "INSERT INTO users (username, password) ” + "VALUES(‘caroline’, ‘password1234’)" ); Note: Cluster and Session objects should be long-lived and re-used
  • 23. Read from Table 23 ResultSet rs = session.execute("SELECT * FROM users"); List<Row> rows = rs.all(); for (Row row : rows) { String userName = row.getString("username"); String password = row.getString("password"); }
  • 24. Asynchronous Read 24 ResultSetFuture future = session.executeAsync( "SELECT * FROM users"); for (Row row : future.get()) { String userName = row.getString("username"); String password = row.getString("password"); } Note: The future returned implements Guava's ListenableFuture interface. This means you can use all Guava's Futures1 methods! 1http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/util/concurrent/Futures.html
  • 25. Read with Callbacks 25 final ResultSetFuture future = session.executeAsync("SELECT * FROM users"); future.addListener(new Runnable() { public void run() { for (Row row : future.get()) { String userName = row.getString("username"); String password = row.getString("password"); } } }, executor);
  • 26. Parallelize Calls 26 int queryCount = 99; List<ResultSetFuture> futures = new ArrayList<ResultSetFuture>(); for (int i=0; i<queryCount; i++) { futures.add( session.executeAsync("SELECT * FROM users " +"WHERE username = '"+i+"'")); } for(ResultSetFuture future : futures) { for (Row row : future.getUninterruptibly()) { //do something } }
  • 27. Prepared Statements 27 PreparedStatement statement = session.prepare( "INSERT INTO users (username, password) " + "VALUES (?, ?)"); BoundStatement bs = statement.bind(); bs.setString("username", "caroline"); bs.setString("password", "password1234"); session.execute(bs);
  • 28. Query Builder 28 Query query = QueryBuilder .select() .all() .from("demo", "users") .where(eq("username", "caroline")); ResultSet rs = session.execute(query);
  • 29. Load Balancing 29 Determine which node will next be contacted once a connection to a cluster has been established Cluster cluster = Cluster.builder() .addContactPoints("10.158.02.40","10.158.02.44") .withLoadBalancingPolicy( new DCAwareRoundRobinPolicy("DC1")) .build(); Policies are: • RoundRobinPolicy • DCAwareRoundRobinPolicy (default) • TokenAwarePolicy
  • 30. RoundRobinPolicy 30 • Not data-center aware • Each subsequent request after initial connection to the cluster goes to the next node in the cluster • If the node that is serving as the coordinator fails during a request, the next node is used
  • 31. DCAwareRoundRobinPolicy 31 • Is data center aware • Does a round robin within the local data center • Only goes to another data center if there is not a node available to be coordinator in the local data center
  • 32. TokenAwarePolicy 32 • Is aware of where the replicas for a given token live • Instead of round robin, the client chooses the node that contains the primary replica to be the chosen coordinator • Avoids unnecessary time taken to go to any node to have it serve as coordinator to then contact the nodes with the replicas
  • 33. Additional Information & Support 33 • Community Site (http://planetcassandra.org) • Documentation (http://www.datastax.com/docs) • Downloads (http://www.datastax.com/download) • Getting Started (http://www.datastax.com/documentation/gettingstarted/index.html) • DataStax (http://www.datastax.com)
  • 35. About DataStax 35 Founded in April 2010 30 Percent 500+ Customers Santa Clara, Austin, New York, London 300+ Employees
  • 36. Confidential DataStax delivers Apache Cassandra to the Enterprise 36 Certified / Enterprise-ready Cassandra Visual Management & Monitoring Tools 24x7 Support & Training
  • 37. 37

Editor's Notes

  1. DSE C* Architecture Integration w Java Q&A
  2. collections / playlist: Spotify Fraud: financial services Recommendations / personalization: EBay IoT - smart devices: nest Messaging: Instagram
  3. Massively scalable NoSQL database/Netflix example: 10 million/sec; 1 trillion/day; 3000 nodes Q: kilo, mega, giga, tera, peta, exabyte, zetta, yotta
  4. Always on: Peer to peer architecture – all nodes are equal; each node is responsible for an assigned range (or ranges) of data Clients can write (or read0 data to any node in the ring – native drivers can round robin across a DC and distribute load to a coordinator node Coordinate node writes (or reads) copies of data to nodes which own each copy In the case of a failure (such as a drive going down), 2 out of the 3 nodes are still on, so the ability to write and read data still works for the majority of nodes and therefore C* is always on
  5. Multi-DC is very, very easy to configure with Cassandra Datacenters are active – active: write to either DC and the other one will get a copy In the case of a datacenter outage, applications can carry on a retry policy which flips over to the other datacenter which also has a copy of the data; Outbrain story – Hurricane Sandy Rack-aware What is Quorum? i.e. 10 nodes, rf = 6 (q=4 -> can tolerate 2 down nodes)
  6. similar to sql but does not have all sql options DevCenter / cqlsh
  7. keyspace = db instance simple strategy vs NetworkTopologyStrategy replication factor
  8. data types primary key / partition key
  9. set: ordered alphabetically. ex: preferences: email subscriptions list: specific order. ex: search engine: google, yahoo, bing map: phone & phone type
  10. user defined types batch -> all or nothing - atomic
  11. Asynchronous: Netty is an asynchronous event-driven network application framework. Not blocking threads -> can scale It greatly simplifies and streamlines network programming such as TCP and UDP socket server / support request pipeline Other drivers: C#, Python, ODBC + community drivers high frequency / multi threaded get push requests back from C* i.e., topology changes, node status changes, schema changes
  12. v2.0.2
  13. contact points = discover cluster. not connect to these nodes Retrieves metadata from the cluster, including: Name of the cluster , IP address, rack, and data center for each node Session instances are thread-safe and normally a single session instance is all you need per application per keyspace schemaless -> no create date
  14. exectureAsync => returns future implementing Guavas futures => large scale > execute long query > callbacks Execute async to get a future on your resultset - which means from a single thread you can push a bunch of requests to be execute in paralell - look at futures in Java
  15. example: read w callback break into small queries, execute as many in C* -> send as much as C* can take
  16. fire ton of select, get futures back & iterate & do … breaking into small queries -> better latency, easier to retry small query
  17. prepared, compiled, execute for CQL statements executed multiple times for performance: precompiled & cached on server. C* parses once & exec multiple times use ? as placeholder for literal values gets passed as BoundStatement object BS gets executed by Session object
  18. programmatically build query When you need to generate queries dynamically, quite easy Or if your just familiar with it Note: the default consistency level of ConsistencyLevel.ONE
  19. Round Robin: not DC aware; request goes to next node in cluster DCAwareRoundRobin: goes to next node in next DC
  20. client connects to node where data resides. performance advantage
  21. HIRING HQ: SF offices in Austin, London, Paris delivering Apache C* to the enterprise
  22. DataStax is the company that delivers Cassandra to the enterprise. First, we take the open source software and put it through rigorous quality assurance tests including a 1000 node scalability test. We certify it and provide the worlds most comprehensive support, training and consulting for Cassandra so that you can get up and running quickly. But that isn’t all DataStax does. We also build additional software features on top of DataStax including security, search, analytics as well as provide in memory capabilities that don’t come with the open source Cassandra product. We also provide management services to help visualize your nodes, plan your capacity and repair issues automatically. Finally, we also provide developer tools and drivers as well as monitoring tools. DataStax is the commercial company behind Apache Cassandra plus a whole host of additional software and services.
  23. Spark integration - 100x faster analytics - integrated real-time analytics performance tuning - supply diagnostic info on how cluster is doing byoh