The Matrix and DataStax
By: Hayato Shimizu

Published in: Technology

Transcript of "The Matrix and DataStax"

  1. What is Cassandra?
     •  Apache Cassandra™ is a massively scalable NoSQL database.
     •  Cassandra is designed to handle big data workloads across multiple data centers with no single point of failure, providing enterprises with continuous availability without compromising performance.
  2. Why Cassandra
     •  Fast / Linear scalability
     •  Elastic
     •  No single point of failure
     •  Very few moving parts
     •  Enterprise / multi-data center / cloud data distribution
     •  Location independence – read and write anywhere
     •  Dynamic / Flexible data structure
     •  Tunable data consistency (per operation)
     •  Data compression
     •  Cloud ready
     •  Familiar SQL-like language: CQL
     •  Easy setup
     •  No special hardware needed
     •  No special caching layer needed
  3. [Diagram: nodes labeled 1, 2, 3 and 4; the labels data1 and data2 each appear twice]
  4. Network Topology
  5. Data Consistency
     Writes
     •  Any
     •  One
     •  Quorum
     •  Local_Quorum
     •  Each_Quorum
     •  All
     Reads
     •  One
     •  Quorum
     •  Local_Quorum
     •  Each_Quorum
     •  All
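The consistency levels on the Data Consistency slide trade latency against how many replicas must acknowledge an operation. The sketch below is not from the deck: it assumes the usual majority definition of Quorum (more than half of the replicas) and uses the replication factor of 3 shown on the Data Structure slide. The helper names `quorum` and `overlaps` are hypothetical.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond at Quorum: a strict majority."""
    return replication_factor // 2 + 1


def overlaps(write_replicas: int, read_replicas: int, replication_factor: int) -> bool:
    """True when every read set must share a replica with every write set (R + W > RF)."""
    return write_replicas + read_replicas > replication_factor


rf = 3  # replication_factor of the Matrix keyspace in the deck
q = quorum(rf)

print(q)                   # replicas needed for Quorum at RF = 3
print(overlaps(q, q, rf))  # Quorum writes + Quorum reads
print(overlaps(1, 1, rf))  # One writes + One reads
```

With Quorum on both sides the read and write sets always intersect, which is why that pairing is commonly chosen when a read has to observe the latest acknowledged write; One on both sides is faster but gives no such overlap.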
  6. Durable Writes
     [Diagram: INSERT INTO… is written to the commit log and the memtable, then to an SSTable]
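The Durable Writes slide names three stages: commit log, memtable and SSTable. The toy model below is only an illustration of that ordering, not Cassandra's implementation. The class `ToyStore`, its `flush_threshold` parameter and the JSON file layout are all assumptions made for the sketch.

```python
import json
import os
import tempfile


class ToyStore:
    """Toy write path: append to a commit log, update a memtable, flush to a sorted file."""

    def __init__(self, directory: str, flush_threshold: int = 2):
        self.directory = directory
        self.flush_threshold = flush_threshold  # hypothetical: flush after this many keys
        self.commit_log = os.path.join(directory, "commit.log")
        self.memtable = {}
        self.sstables = []  # paths of flushed files, never modified after writing

    def insert(self, key: str, value: str) -> None:
        # 1. Durability first: append the mutation to the commit log and sync it.
        with open(self.commit_log, "a", encoding="utf-8") as log:
            log.write(json.dumps({"key": key, "value": value}) + "\n")
            log.flush()
            os.fsync(log.fileno())
        # 2. Apply the mutation to the in-memory memtable.
        self.memtable[key] = value
        # 3. Flush to an SSTable-like file once the memtable is full.
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self) -> None:
        path = os.path.join(self.directory, f"sstable-{len(self.sstables)}.json")
        with open(path, "w", encoding="utf-8") as out:
            json.dump(sorted(self.memtable.items()), out)  # rows sorted by key
        self.sstables.append(path)
        self.memtable = {}
        # Flushed data no longer needs the log, so truncate it.
        open(self.commit_log, "w").close()


with tempfile.TemporaryDirectory() as tmp:
    store = ToyStore(tmp)
    store.insert("neo", "coordinates-1")
    store.insert("morphius", "coordinates-2")  # second insert triggers a flush
    flushed = len(store.sstables)
    remaining = len(store.memtable)
```

The point of the ordering is that a crash after step 1 loses nothing: the memtable can be rebuilt by replaying the log.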
  7. Data Structure
     Keyspace: Matrix      replication_factor: 3
     Column Family: character_locations
       day1
         morphius:<timeuuid>: coordinates
         neo:<timeuuid>: coordinates
       day1:neo
         <timeuuid>: coordinates
       day1:morph
         <timeuuid>: coordinates
         <timeuuid>: coordinates
         <timeuuid>: coordinates
     Column Family: character_information
       neo
         DOB: 2600-06-27
         Actor: Keanu Reeves
         email1: Neo@matrix
         email2: mr.anderson@vr.net
  8. Overview of DataStax
     •  Founded in April 2010
     •  Commercial leader in Apache Cassandra™
     •  300+ customers (including 20 of the Fortune 100)
     •  100+ employees
     •  Home to Apache Cassandra Chair & most committers
     •  Headquartered in San Mateo
     •  Funded by prominent venture firms
  9. DataStax Enterprise Architecture
  10. DataStax Cassandra
     •  Kerberos authentication
     •  Encrypted data at rest
     •  Auditing
     •  iSEC Partners validated
  11. <schema name="wikipedia" version="1.1">
       <types>
        <fieldType name="string" class="solr.StrField"/>
        <fieldType name="text" class="solr.TextField">
          <analyzer><tokenizer class="solr.WikipediaTokenizerFactory"/></analyzer>
        </fieldType>
       </types>
       <fields>
          <field name="id"  type="string" indexed="true"  stored="true"/>
          <field name="name"  type="text" indexed="true"  stored="true"/>
          <field name="body"  type="text" indexed="true"  stored="true"/>
          <field name="title"  type="text" indexed="true"  stored="true"/>
          <field name="date"  type="string" indexed="true"  stored="true"/>
       </fields>
       <defaultSearchField>body</defaultSearchField>
       <uniqueKey>id</uniqueKey>
      </schema>
  12. Searching Data
     HTTP
       curl "http://localhost:8983/solr/wiki.solr/select?q=title%3Anatio%2A%20AND%20title%3A%5B2000%20TO%202010%5D"
     CQL3
       use wiki;
       select title from solr where solr_query='title:natio* AND title:[2000 TO 2010]';
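The HTTP and CQL3 forms on the Searching Data slide carry the same Solr query; the HTTP one is just percent-encoded. The short sketch below shows the relationship using Python's standard `urllib.parse`. The host, port and `wiki.solr` path are copied from the slide, and whether that endpoint exists on a given installation is an assumption.

```python
from urllib.parse import quote, unquote

# The query string as it appears in the CQL3 solr_query clause.
solr_query = "title:natio* AND title:[2000 TO 2010]"

# safe="" percent-encodes every reserved character, including ":", "*", "[" and "]".
encoded = quote(solr_query, safe="")

# Endpoint taken from the slide.
url = "http://localhost:8983/solr/wiki.solr/select?q=" + encoded
print(url)

# Decoding the parameter gives back the CQL3 form.
round_trip = unquote(encoded)
```

Building the parameter this way avoids hand-encoding mistakes when the query contains ranges or wildcards.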
  13. Workload Isolation
     [Diagram: cluster of four C* nodes and four Solr nodes, serving Solr Queries and Cassandra Queries]
  14. Hive
     •  {LEFT|RIGHT|FULL} [OUTER] JOIN
     •  GROUP BY
     •  {SORT|DISTRIBUTE|CLUSTER|ORDER} BY
     •  UNION
     •  Sub Queries
  15. Hive -> Cassandra Example
     DROP TABLE IF EXISTS StockHist;
     CREATE EXTERNAL TABLE
     StockHist(row_key string, column_name string, value double)
     STORED BY 'org.apache.hadoop.hive.cassandra.CassandraStorageHandler'
     WITH SERDEPROPERTIES ("cassandra.ks.name" = "PortfolioDemo",
     "cassandra.cf.validatorType" = "UTF8Type,UTF8Type,DoubleType" );
  16. Pig
     cassandra_data = LOAD 'cassandra://<keyspace>/<CF>'
     USING CassandraStorage() AS (name, columns: bag {T: tuple(score, value)});

     total_scores = FOREACH cassandra_data GENERATE name,
     COUNT(columns.score), LongSum(columns.score) as total PARALLEL 3;

     ordered_scores = ORDER total_scores BY total DESC PARALLEL 3;

     STORE ordered_scores INTO 'cfs:///final_scores.txt' USING PigStorage();
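To make the data flow of the Pig script easier to follow, here is a small in-memory Python analogue of its middle two statements: count and sum the scores per row, then order by the total, descending. It does not talk to Cassandra or CFS, and the sample rows are invented purely for illustration.

```python
# Hypothetical sample rows shaped like the Pig schema: (name, bag of (score, value) tuples).
cassandra_data = [
    ("neo", [(10, "a"), (20, "b")]),
    ("trinity", [(50, "c")]),
    ("morphius", [(5, "d"), (5, "e"), (5, "f")]),
]

# FOREACH ... GENERATE name, COUNT(columns.score), LongSum(columns.score) AS total
total_scores = [
    (name, len(columns), sum(score for score, _ in columns))
    for name, columns in cassandra_data
]

# ORDER total_scores BY total DESC
ordered_scores = sorted(total_scores, key=lambda row: row[2], reverse=True)

for row in ordered_scores:
    print(row)
```

In the Pig version the same steps run as distributed jobs, and the result is written to a file on CFS instead of being printed.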
  17. Workload Isolation
     [Diagram: cluster of H*, C* and Solr nodes, serving Solr Queries, Cassandra Queries and Hadoop Analytics]
  18. Cassandra Roadmap