Your SlideShare is downloading. ×
ApacheCon Europe 2012 - Real Time Big Data in practice with Cassandra
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×

Introducing the official SlideShare app

Stunning, full-screen experience for iPhone and Android

Text the download link to your phone

Standard text messaging rates apply

ApacheCon Europe 2012 - Real Time Big Data in practice with Cassandra

5,451
views

Published on


0 Comments
8 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
5,451
On Slideshare
0
From Embeds
0
Number of Embeds
1
Actions
Shares
0
Downloads
66
Comments
0
Likes
8
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  • 1. Real-Time Big Data inpractice with CassandraMichaël Figuière@mfiguiere
  • 2. Speaker Michaël Figuière @mfiguiere©2012 DataStax 2
  • 3. Ring Architecture Node Node Node Cassandra Node Node Node©2012 DataStax 3
  • 4. Ring Architecture Node Replica Node Replica Node Replica©2012 DataStax 4
  • 5. Linear Scalability Client Writes/s by Node Count - Replication Factor = 3©2012 DataStax 5
  • 6. Client / Server Communication Client ? Node Replica Client Node Replica Client Node Client Replica©2012 DataStax 6
  • 7. Client / Server Communication Client Node Replica Client Node Replica Client Node Client Replica Coordinator node: Forwards all R/W requests to corresponding replicas©2012 DataStax 7
  • 8. Tunable Consistency Time A A A 3 replicas©2012 DataStax 8
  • 9. Tunable Consistency Time A A A B A A Write ‘B’ Write and wait for acknowledge from one node©2012 DataStax 9
  • 10. Tunable Consistency Time R +W < N A A A B A A B A A Read waiting for one node Write and wait for to answer acknowledge from one node©2012 DataStax 10
  • 11. Tunable Consistency Time R +W = N A A A B B A B B A Read waiting for one node Write and wait for to answer acknowledges from two nodes©2012 DataStax 11
  • 12. Tunable Consistency Time R +W > N A A A B B A B B A Read waiting for two nodes Write and wait for to answer acknowledges from two nodes©2012 DataStax 12
  • 13. Tunable Consistency Time R = W = QUORUM A A A B B A B B A QUORUM = (N / 2) + 1©2012 DataStax 13
  • 14. Request Path 1 Client Node Replica 2 3 Client 4 2 Node Replica 3 Client 2 3 Node Client Replica Coordinator node©2012 DataStax 14
  • 15. Column Family Data Model name email address state jbellis Jonathan jb@ds.com 123 main TX name email address state dhutch Daria dh@ds.com 45 2nd st CA name email egilmore Eric eg@ds.com Row Key Columns©2012 DataStax 15
  • 16. Column Family Data Model dhutch egilmore datastax mzcassie jbellis egilmore dhutch datastax mzcassie egilmore Row Key Columns©2012 DataStax 16
  • 17. CQL3 Data Model Timeline Table user_id tweet_id author body gmason 1765 phenry Give me liberty or give me death gmason 1742 gwashington I chopped down the cherry tree ahamilton 1797 jadams A government of laws, not men ahamilton 1742 gwashington I chopped down the cherry tree Partition Remaining Key Key©2012 DataStax 17
  • 18. CQL3 Data Model Timeline Table user_id tweet_id author body gmason 1765 phenry Give me liberty or give me death gmason 1742 gwashington I chopped down the cherry tree ahamilton 1797 jadams A government of laws, not men ahamilton 1742 gwashington I chopped down the cherry tree CQL CREATE TABLE timeline ( user_id varchar, tweet_id uuid, author varchar, body varchar, PRIMARY KEY (user_id, tweet_id));©2012 DataStax 18
  • 19. CQL3 Data Model Timeline Table user_id tweet_id author body gmason 1765 phenry Give me liberty or give me death gmason 1742 gwashington I chopped down the cherry tree ahamilton 1797 jadams A government of laws, not men ahamilton 1742 gwashington I chopped down the cherry treeTimeline Physical Layout [1742, author] [1742, body] [1765, author] [1765, body] gmason gwashington I chopped down the... phenry Give me liberty or give... [1742, author] [1742, body] [1797, author] [1797, body] ahamilton gwashington I chopped down the... jadams A government of laws...©2012 DataStax 19
  • 20. Denormalized Data Model Data duplicated over several tables©2012 DataStax 20
  • 21. Real-Time Analytics Google Analytics gives you immediate statistics about your website traffic©2012 DataStax 21
  • 22. Web Analytics Data Model Analytics Table url time views from_search direct from_referrer /index.html 12:00 354 300 20 34 /index.html 12:01 402 333 25 44 /contacts.html 12:00 23 3 0 20 /contacts.html 12:01 20 4 1 15 CQL CREATE TABLE analytics ( url varchar, time timestamp, views counter, from_search counter, direct counter, from_referrer counter, PRIMARY KEY (url, time));©2012 DataStax 22
  • 23. Web Analytics Data Model Analytics Table url time views from_search direct from_referrer /index.html 12:00 354 300 20 34 /index.html 12:01 402 333 25 44 /contacts.html 12:00 23 3 0 20 /contacts.html 12:01 20 4 1 15 CQL UPDATE analytics SET views = views + 1, from_search = from_search + 1 WHERE url = /index.html AND time = 2012-10-06 12:00;©2012 DataStax 23
  • 24. Web Analytics Data Model Analytics Table url time views from_search direct from_referrer /index.html 12:00 354 300 20 34 /index.html 12:01 402 333 25 44 /contacts.html 12:00 23 3 0 20 /contacts.html 12:01 20 4 1 15 CQL SELECT * FROM analytics WHERE url = /index.html©2012 DataStax 24
  • 25. Online Business Intelligence Storage for application Distributed batch in production processing Application Cassandra Hadoop Using results in Storage for production results©2012 DataStax 25
  • 26. Stay Tuned! blog.datastax.com @mfiguiere