The Matrix and DataStax

Transcript

  • 1. What is Cassandra?
      • Apache Cassandra™ is a massively scalable NoSQL database.
      • Cassandra is designed to handle big data workloads across multiple data centers with no single point of failure, providing enterprises with continuous availability without compromising performance.
  • 2. Why Cassandra
      • Fast / Linear scalability
      • Elastic
      • No single point of failure
      • Very few moving parts
      • Enterprise / multi-data center / cloud data distribution
      • Location independence: read and write anywhere
      • Dynamic / Flexible data structure
      • Tunable data consistency (per operation)
      • Data compression
      • Cloud ready
      • Familiar SQL-like language: CQL
      • Easy setup
      • No special hardware needed
      • No special caching layer needed
  • 3. [Diagram: ring of four nodes (1, 2, 3, 4); data1 and data2 each appear twice]
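Slide 3 appears to show a four-node ring with each piece of data held on two nodes. A minimal sketch of that idea follows, assuming a replication factor of 2 and a "hash the key, then walk clockwise" placement. The function name `replicas_for` and the MD5-based token are illustrative assumptions, not Cassandra's actual partitioner or replication strategy.

```python
import hashlib

NODES = [1, 2, 3, 4]      # the four nodes drawn on the ring
REPLICATION_FACTOR = 2    # assumed for illustration only


def replicas_for(key, nodes=NODES, rf=REPLICATION_FACTOR):
    """Return the nodes that would hold `key` in this toy ring."""
    # Hash the key to a stable integer token (illustrative, not Cassandra's partitioner).
    token = int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)
    start = token % len(nodes)
    # Walk clockwise from the starting position, taking rf consecutive nodes.
    return [nodes[(start + i) % len(nodes)] for i in range(rf)]


if __name__ == "__main__":
    for key in ("data1", "data2"):
        print(key, "->", replicas_for(key))
```

Because every key lands on more than one node, losing a single node in this toy model never removes the only copy of a key.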
  • 4. Network Topology
  • 5. Data Consistency
      • Writes: Any, One, Quorum, Local_Quorum, Each_Quorum, All
      • Reads: One, Quorum, Local_Quorum, Each_Quorum, All
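The consistency levels on slide 5 trade latency against how many replicas must respond. The sketch below works through the usual arithmetic for a single data center: a quorum is a majority of replicas, and a read is guaranteed to overlap the latest write when read replicas plus write replicas exceed the replication factor. It covers only ONE, QUORUM, and ALL; the helper names are hypothetical, and the multi-data-center levels and ANY are deliberately left out.

```python
def quorum(replication_factor):
    """Number of replicas that form a majority."""
    return replication_factor // 2 + 1


def replicas_required(level, replication_factor):
    """Replica responses needed for a level (single data center, simplified)."""
    level = level.upper()
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return quorum(replication_factor)
    if level == "ALL":
        return replication_factor
    raise ValueError(f"level not modeled in this sketch: {level}")


def reads_see_latest_write(read_level, write_level, replication_factor):
    """True when the read set and write set must share at least one replica."""
    r = replicas_required(read_level, replication_factor)
    w = replicas_required(write_level, replication_factor)
    return r + w > replication_factor


if __name__ == "__main__":
    rf = 3  # matches the replication_factor shown on slide 7
    for read, write in [("QUORUM", "QUORUM"), ("ONE", "ONE"), ("ONE", "ALL")]:
        print(read, write, reads_see_latest_write(read, write, rf))
```

With a replication factor of 3, QUORUM reads paired with QUORUM writes overlap (2 + 2 > 3), while ONE paired with ONE does not (1 + 1 is not greater than 3).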
  • 6. Durable Writes
      [Diagram labels: INSERT INTO…, Commit log, memtable, SSTable]
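Slide 6 names the three stops on the write path: commit log, memtable, and SSTable. The toy model below sketches how they relate: each write is appended to a log first, then applied in memory, and the in-memory table is flushed to an immutable sorted snapshot once it grows past a threshold. The class `ToyNode`, the `flush_threshold` parameter, and clearing the whole log on flush are simplifying assumptions, not Cassandra's implementation.

```python
class ToyNode:
    """Toy write path: commit log -> memtable -> SSTable."""

    def __init__(self, flush_threshold=3):
        self.commit_log = []    # append-only record of writes not yet flushed
        self.memtable = {}      # in-memory latest value per key
        self.sstables = []      # immutable, sorted snapshots of flushed memtables
        self.flush_threshold = flush_threshold

    def write(self, key, value):
        self.commit_log.append((key, value))   # 1. record the write for durability
        self.memtable[key] = value             # 2. apply it in memory
        if len(self.memtable) >= self.flush_threshold:
            self.flush()                       # 3. persist once the memtable is full

    def flush(self):
        # Freeze the memtable into a sorted, immutable snapshot.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        # Flushed writes no longer need replaying, so the log can be dropped.
        self.commit_log.clear()

    def read(self, key):
        # Newest data wins: check the memtable, then SSTables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):
            for k, v in sstable:
                if k == key:
                    return v
        return None
```

If the process died before a flush, replaying `commit_log` into a fresh memtable would restore the unflushed writes, which is the point of logging before acknowledging.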
  • 7. Data Structure
      Keyspace: Matrix    replication_factor: 3
      Column Family: character_locations
        day1
          morphius:<timeuuid>: coordinates
          neo:<timeuuid>: coordinates
        day1:neo
          <timeuuid>: coordinates
        day1:morph
          <timeuuid>: coordinates
          <timeuuid>: coordinates
          <timeuuid>: coordinates
      Column Family: character_information
        neo
          DOB: 2600-06-27
          Actor: Keanu Reeves
          email1: Neo@matrix
          email2: mr.anderson@vr.net
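Slide 7 shows rows keyed by day and character, with time-based UUIDs as column names and coordinates as values. A rough in-memory sketch of that layout is below, using nested dictionaries and Python's version 1 (time-based) UUIDs to stand in for timeuuid column names. The helper names and the coordinate tuples are hypothetical; nothing here talks to a real cluster.

```python
import uuid


def record_location(column_family, day, character, coordinates):
    """Add one <timeuuid>: coordinates column to the row '<day>:<character>'."""
    row_key = f"{day}:{character}"
    column_name = uuid.uuid1()  # version 1 UUIDs embed a timestamp
    column_family.setdefault(row_key, {})[column_name] = coordinates
    return column_name


def locations(column_family, day, character):
    """Return the coordinates stored in a row, ordered by embedded timestamp."""
    row = column_family.get(f"{day}:{character}", {})
    return [row[name] for name in sorted(row, key=lambda u: u.time)]


# Hypothetical sample data: each row grows wider as sightings are recorded.
character_locations = {}
record_location(character_locations, "day1", "neo", (10, 20))
record_location(character_locations, "day1", "morph", (1, 1))
record_location(character_locations, "day1", "morph", (2, 3))
```

Putting the character in the row key keeps each character's day in its own row, so reading one character's path touches a single row rather than filtering a shared one.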
  • 8. Overview of DataStax
      • Founded in April 2010
      • Commercial leader in Apache Cassandra™
      • 300+ customers (including 20 of the Fortune 100)
      • 100+ employees
      • Home to Apache Cassandra Chair & most committers
      • Headquartered in San Mateo
      • Funded by prominent venture firms
  • 9. DataStax Enterprise Architecture
  • 10. DataStax Cassandra
      • Kerberos authentication
      • Encrypted data at rest
      • Auditing
      • iSECpartners validated
  • 11.
      <schema name="wikipedia" version="1.1">
        <types>
          <fieldType name="string" class="solr.StrField"/>
          <fieldType name="text" class="solr.TextField">
            <analyzer><tokenizer class="solr.WikipediaTokenizerFactory"/></analyzer>
          </fieldType>
        </types>
        <fields>
          <field name="id" type="string" indexed="true" stored="true"/>
          <field name="name" type="text" indexed="true" stored="true"/>
          <field name="body" type="text" indexed="true" stored="true"/>
          <field name="title" type="text" indexed="true" stored="true"/>
          <field name="date" type="string" indexed="true" stored="true"/>
        </fields>
        <defaultSearchField>body</defaultSearchField>
        <uniqueKey>id</uniqueKey>
      </schema>
  • 12. Searching Data
      HTTP
        curl "http://localhost:8983/solr/wiki.solr/select?q=title%3Anatio%2A%20AND%20title%3A%5B2000%20TO%202010%5D"
      CQL3
        use wiki;
        select title from solr where solr_query='title:natio* AND title:[2000 TO 2010]';
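The HTTP and CQL3 forms on slide 12 carry the same Solr query; the HTTP form is simply percent-encoded. The sketch below shows that relationship using Python's standard `urllib.parse.quote`. The helper name `solr_select_url` is hypothetical, and the code only builds the URL string, it does not contact a server.

```python
from urllib.parse import quote


def solr_select_url(core_url, query):
    """Build a Solr select URL with the query percent-encoded."""
    # safe="" forces ':', '*', '[' and ']' to be encoded as well as spaces.
    return f"{core_url}/select?q={quote(query, safe='')}"


url = solr_select_url(
    "http://localhost:8983/solr/wiki.solr",
    "title:natio* AND title:[2000 TO 2010]",
)
print(url)
```

Encoding the readable query this way reproduces the URL passed to curl on the slide, which makes it easier to check that the two interfaces are being asked the same question.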
  • 13. Workload Isolation
      [Diagram: four Solr nodes and four Cassandra (C*) nodes; labels: Solr Queries, Cassandra Queries]
  • 14. Hive
      • {LEFT|RIGHT|FULL} [OUTER] JOIN
      • GROUP BY
      • {SORT|DISTRIBUTE|CLUSTER|ORDER} BY
      • UNION
      • Sub Queries
  • 15. Hive -> Cassandra Example
      DROP TABLE IF EXISTS StockHist;
      CREATE EXTERNAL TABLE
        StockHist(row_key string, column_name string, value double)
        STORED BY 'org.apache.hadoop.hive.cassandra.CassandraStorageHandler'
        WITH SERDEPROPERTIES ("cassandra.ks.name" = "PortfolioDemo",
          "cassandra.cf.validatorType" = "UTF8Type,UTF8Type,DoubleType" );
  • 16. Pig
      cassandra_data = LOAD 'cassandra://<keyspace>/<CF>'
        USING CassandraStorage() AS (name, columns: bag {T: tuple(score, value)});

      total_scores = FOREACH cassandra_data GENERATE name,
        COUNT(columns.score), LongSum(columns.score) as total PARALLEL 3;

      ordered_scores = ORDER total_scores BY total DESC PARALLEL 3;

      STORE ordered_scores INTO 'cfs:///final_scores.txt' USING PigStorage();
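For readers less familiar with Pig, the script on slide 16 does three things per row: counts the score entries, sums them, and orders the rows by that total, highest first. A plain Python sketch of the same logic follows, assuming each row is a `(name, columns)` pair where `columns` is a list of `(score, value)` tuples. The sample rows are made up, and the `PARALLEL 3` hint (which sets reducer parallelism in Pig) has no equivalent here.

```python
def total_scores(rows):
    """Mirror the Pig FOREACH ... GENERATE and ORDER BY steps in memory."""
    totals = [
        # (name, COUNT(columns.score), sum of columns.score)
        (name, len(columns), sum(score for score, _value in columns))
        for name, columns in rows
    ]
    # ORDER total_scores BY total DESC
    return sorted(totals, key=lambda row: row[2], reverse=True)


# Hypothetical sample data standing in for the Cassandra column family.
sample_rows = [
    ("neo", [(10, "a"), (5, "b")]),
    ("trinity", [(20, "c")]),
]
ordered_scores = total_scores(sample_rows)
```

The Pig version distributes this work across the cluster and writes the result to CFS; the Python version is only meant to make the data flow easy to follow.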
  • 17. Workload Isolation
      [Diagram: two Hadoop (H*) nodes, four Cassandra (C*) nodes, and two Solr nodes; labels: Solr Queries, Cassandra Queries, Hadoop Analytics]
  • 18. Cassandra Roadmap