NoSQL Matters 2012 - Real Time Big Data in practice with Cassandra

Transcript of "NoSQL Matters 2012 - Real Time Big Data in practice with Cassandra"

  1. Real-Time Big Data in practice with Cassandra. Michaël Figuière, @mfiguiere
  2. Speaker: Michaël Figuière, @mfiguiere. ©2012 DataStax
  3. Ring Architecture. Diagram: six nodes arranged in a ring labeled Cassandra.
  4. Ring Architecture. Diagram: the ring again, with three positions labeled Node and three labeled Replica.
  5. Linear Scalability. Chart: client writes/s by node count, replication factor = 3.
  6. Client / Server Communication. Diagram: four clients, three nodes, and three replicas, with a question mark next to the first client.
  7. Client / Server Communication. Same diagram. Coordinator node: forwards all R/W requests to the corresponding replicas.
  8. Tunable Consistency. Timeline: 3 replicas, all holding A (A A A).
  9. Tunable Consistency. Timeline: A A A, then B A A. Write 'B': write and wait for an acknowledgement from one node.
  10. Tunable Consistency, R + W < N. Timeline: A A A, then B A A, then B A A. Write and wait for an acknowledgement from one node; read waiting for one node to answer.
  11. Tunable Consistency, R + W = N. Timeline: A A A, then B B A, then B B A. Write and wait for acknowledgements from two nodes; read waiting for one node to answer.
  12. Tunable Consistency, R + W > N. Timeline: A A A, then B B A, then B B A. Write and wait for acknowledgements from two nodes; read waiting for two nodes to answer.
  13. Tunable Consistency, R = W = QUORUM. Timeline: A A A, then B B A, then B B A. QUORUM = (N / 2) + 1, with N / 2 rounded down.
  14. Request Path. Diagram: clients, nodes, and replicas around a coordinator node, with steps numbered 1 to 4 (steps 2 and 3 each appear three times).
  15. Column Family Data Model. Row key, then columns:
      jbellis:   name = Jonathan, email = jb@ds.com, address = 123 main, state = TX
      dhutch:    name = Daria, email = dh@ds.com, address = 45 2nd st, state = CA
      egilmore:  name = Eric, email = eg@ds.com
  16. Column Family Data Model. Row key, then columns:
      jbellis:   dhutch, egilmore, datastax, mzcassie
      dhutch:    egilmore
      egilmore:  datastax, mzcassie
  17. CQL3 Data Model. Timeline table (partition key: user_id; remaining key: tweet_id):
      user_id    tweet_id  author       body
      gmason     1765      phenry       Give me liberty or give me death
      gmason     1742      gwashington  I chopped down the cherry tree
      ahamilton  1797      jadams       A government of laws, not men
      ahamilton  1742      gwashington  I chopped down the cherry tree
  18. CQL3 Data Model. Same Timeline table. CQL:
      CREATE TABLE timeline (
          user_id varchar,
          tweet_id uuid,
          author varchar,
          body varchar,
          PRIMARY KEY (user_id, tweet_id));
  19. CQL3 Data Model. Same Timeline table. Timeline physical layout (row key, then cells):
      gmason:     [1742, author] = gwashington, [1742, body] = I chopped down the..., [1765, author] = phenry, [1765, body] = Give me liberty or give...
      ahamilton:  [1742, author] = gwashington, [1742, body] = I chopped down the..., [1797, author] = jadams, [1797, body] = A government of laws...
  20. Real-Time Analytics. Google Analytics gives you immediate statistics about your website traffic.
  21. Web Analytics Data Model. Analytics table:
      url             time   views  from_search  direct  from_referrer
      /index.html     12:00  354    300          20      34
      /index.html     12:01  402    333          25      44
      /contacts.html  12:00  23     3            0       20
      /contacts.html  12:01  20     4            1       15
      CQL:
      CREATE TABLE analytics (
          url varchar,
          time timestamp,
          views counter,
          from_search counter,
          direct counter,
          from_referrer counter,
          PRIMARY KEY (url, time));
  22. Web Analytics Data Model. Same Analytics table. CQL:
      UPDATE analytics
          SET views = views + 1, from_search = from_search + 1
          WHERE url = '/index.html' AND time = '2012-10-06 12:00';
  23. Web Analytics Data Model. Same Analytics table. CQL:
      SELECT * FROM analytics WHERE url = '/index.html';
  24. Connect and Write:
      Cluster cluster = Cluster.builder()
          .addContactPoints("127.0.0.1", "127.0.0.2")
          .build();
      Session session = cluster.connect();
      // String values in CQL are single-quoted.
      session.execute(
          "INSERT INTO user (user_id, name, email) " +
          "VALUES (12345, 'johndoe', 'john@doe.com')");
  25. Read:
      ResultSet rs = session.execute("SELECT * FROM user");
      List<CQLRow> rows = rs.fetchAll();
      for (CQLRow row : rows) {
          String userId = row.getString("user_id");
          String name = row.getString("name");
          String email = row.getString("email");
      }
  26. Object Mapping:
      @Table("user_and_messages")
      public class User {
          @Column("user_id")
          private String userId;
          private String name;
          private String email;
          private Gender gender;
      }

      public enum Gender {
          @EnumValue("m")
          MALE,
          @EnumValue("f")
          FEMALE;
      }
  27. Aggregation:
      @Table("user_and_messages")
      public class User {
          @Column("user_id")
          private String userId;
          private String name;
          private String email;
          @GroupBy("user_id")
          private List<Message> messages;
      }

      public class Message {
          private String title;
          private String body;
      }
  28. Inheritance:
      @Table("catalog")
      @Inheritance({Phone.class, TV.class})
      @InheritanceColumn("product_type")
      public abstract class Product {
          @Column("product_id")
          private String productId;
          private float price;
          private String vendor;
          private String model;
      }

      @InheritanceValue("tv")
      public class TV extends Product {
          private float size;
      }
  29. Online Business Intelligence. Diagram: Application, Cassandra, Hadoop. Labels: storage for application in production; distributed batch processing; storage for results; using results in production.
  30. Stay Tuned! blog.datastax.com, @mfiguiere
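
Slides 10 to 13 compare R + W against N and define QUORUM = (N / 2) + 1. The arithmetic can be sketched in plain Java as below. This is only an illustration of the rule on the slides, not driver or server code; the class and method names (`ConsistencyCheck`, `quorum`, `readsOverlapWrites`) are hypothetical.

```java
// A minimal sketch of the R + W > N rule from slides 10-13.
// Names are illustrative assumptions, not part of any Cassandra API.
class ConsistencyCheck {

    /** QUORUM as written on slide 13: (N / 2) + 1, using integer division. */
    public static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /**
     * True when any set of r replicas read must share at least one replica
     * with any set of w replicas written, which holds exactly when r + w > n.
     */
    public static boolean readsOverlapWrites(int r, int w, int n) {
        return r + w > n;
    }

    public static void main(String[] args) {
        int n = 3; // replication factor used throughout the slides
        int q = quorum(n);
        System.out.println("QUORUM for N=" + n + " is " + q);
        System.out.println("R=1, W=1 overlap: " + readsOverlapWrites(1, 1, n));
        System.out.println("R=1, W=2 overlap: " + readsOverlapWrites(1, 2, n));
        System.out.println("R=2, W=2 overlap: " + readsOverlapWrites(2, 2, n));
        System.out.println("R=W=QUORUM overlap: " + readsOverlapWrites(q, q, n));
    }
}
```

With N = 3, QUORUM is 2, so reading and writing at QUORUM is the R + W > N case from slide 12.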
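
Slide 19 shows how rows of the CQL3 timeline table end up as cells named [tweet_id, column] inside one physical row per user_id. The sketch below models that mapping with sorted in-memory maps. It is an assumption-laden illustration, not Cassandra's storage engine: the class `TimelineLayout` and its methods are hypothetical, and tweet_id is an int here to match the sample values on the slide rather than the uuid in the table definition.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of the slide-19 layout: one sorted physical row per partition key.
class TimelineLayout {
    // user_id -> (tweet_id -> (column name -> value)), every level kept sorted.
    private final Map<String, TreeMap<Integer, TreeMap<String, String>>> partitions =
            new TreeMap<>();

    /** Stores one CQL row as two cells in the partition for userId. */
    public void insert(String userId, int tweetId, String author, String body) {
        TreeMap<String, String> cells = partitions
                .computeIfAbsent(userId, k -> new TreeMap<>())
                .computeIfAbsent(tweetId, k -> new TreeMap<>());
        cells.put("author", author);
        cells.put("body", body);
    }

    /** Flattens one partition into "[tweet_id, column] = value" cells, in sorted order. */
    public List<String> physicalRow(String userId) {
        List<String> out = new ArrayList<>();
        TreeMap<Integer, TreeMap<String, String>> row =
                partitions.getOrDefault(userId, new TreeMap<>());
        for (Map.Entry<Integer, TreeMap<String, String>> tweet : row.entrySet()) {
            for (Map.Entry<String, String> cell : tweet.getValue().entrySet()) {
                out.add("[" + tweet.getKey() + ", " + cell.getKey() + "] = " + cell.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        TimelineLayout layout = new TimelineLayout();
        layout.insert("gmason", 1765, "phenry", "Give me liberty or give me death");
        layout.insert("gmason", 1742, "gwashington", "I chopped down the cherry tree");
        for (String cell : layout.physicalRow("gmason")) {
            System.out.println(cell);
        }
    }
}
```

Because the inner maps are sorted, the cells for tweet 1742 come before those for 1765 regardless of insertion order, which mirrors the ordering drawn on the slide.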
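
Slides 21 to 23 model web analytics as counters keyed by (url, time), incremented on every page hit and read back per URL. The following sketch mimics that access pattern in memory so the bucketing is easy to follow. It is not how Cassandra implements counters; `PageViewCounters`, `recordHit`, and the string minute format are hypothetical choices made for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// In-memory stand-in for the analytics counter table on slides 21-23.
class PageViewCounters {
    // One counter per (url, minute, column) triple.
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    private static String key(String url, String minute, String column) {
        return url + "|" + minute + "|" + column;
    }

    /** Equivalent in spirit to: UPDATE analytics SET column = column + 1 WHERE url = ... AND time = ... */
    public void increment(String url, String minute, String column) {
        counters.computeIfAbsent(key(url, minute, column), k -> new LongAdder()).increment();
    }

    /** Records one page hit: bumps views plus the counter for the traffic source. */
    public void recordHit(String url, String minute, String source) {
        increment(url, minute, "views");
        increment(url, minute, source);
    }

    /** Returns the current count, or 0 if the counter was never incremented. */
    public long get(String url, String minute, String column) {
        LongAdder adder = counters.get(key(url, minute, column));
        return adder == null ? 0L : adder.sum();
    }

    public static void main(String[] args) {
        PageViewCounters stats = new PageViewCounters();
        stats.recordHit("/index.html", "2012-10-06 12:00", "from_search");
        stats.recordHit("/index.html", "2012-10-06 12:00", "direct");
        System.out.println("views = " + stats.get("/index.html", "2012-10-06 12:00", "views"));
    }
}
```

Each hit touches only the counters for its own minute bucket, so reading one URL's history is a scan over that URL's buckets, as in the SELECT on slide 23.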