ApacheCon Europe 2012 - Real Time Big Data in practice with Cassandra
- 2. Speaker
Michaël Figuière
@mfiguiere
©2012 DataStax 2
- 5. Linear Scalability
Client Writes/s by Node Count - Replication Factor = 3
- 6. Client / Server Communication
[Diagram: clients, nodes, and replicas; the link between a client and the cluster is marked "?"]
- 7. Client / Server Communication
[Diagram: clients, nodes, and replicas]
Coordinator node: forwards all R/W requests to the corresponding replicas
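The coordinator role can be sketched in a few lines of Python. This is a toy model, not Cassandra's implementation: the `Ring` class, the MD5-based token, and the "next nodes clockwise" replica placement are all assumptions made for illustration.

```python
import hashlib
from bisect import bisect_right


class Ring:
    """Toy token ring: each node owns one token (hypothetical placement)."""

    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = replication_factor
        # Sort nodes by token so the ring can be walked clockwise.
        self.ring = sorted((self._token(n), n) for n in nodes)

    @staticmethod
    def _token(value):
        # Hash a string to an integer token (illustrative choice of hash).
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def replicas_for(self, row_key):
        """Nodes holding the row: the first node whose token follows the
        key's token, plus its successors on the ring."""
        tokens = [t for t, _ in self.ring]
        start = bisect_right(tokens, self._token(row_key)) % len(self.ring)
        count = min(self.replication_factor, len(self.ring))
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(count)]

    def coordinate(self, contacted_node, row_key):
        """Any node can be contacted; it forwards the request to the replicas."""
        return {
            "coordinator": contacted_node,
            "forwarded_to": self.replicas_for(row_key),
        }


ring = Ring(["node1", "node2", "node3", "node4", "node5"])
via_node1 = ring.coordinate("node1", "jbellis")
via_node4 = ring.coordinate("node4", "jbellis")
```

Whichever node the client contacts, the request ends up on the same replicas; only the coordinator differs.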
- 9. Tunable Consistency
Replica values over time:
A A A
B A A
Write 'B': write and wait for an acknowledgement from one node
- 10. Tunable Consistency
R + W < N
Replica values over time:
A A A
B A A
B A A
Write: wait for an acknowledgement from one node
Read: wait for one node to answer
- 11. Tunable Consistency
R + W = N
Replica values over time:
A A A
B B A
B B A
Write: wait for acknowledgements from two nodes
Read: wait for one node to answer
- 12. Tunable Consistency
R + W > N
Replica values over time:
A A A
B B A
B B A
Write: wait for acknowledgements from two nodes
Read: wait for two nodes to answer
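Why the three cases differ can be checked by brute force. The sketch below is an illustration, not Cassandra code; `always_overlap` is a hypothetical helper that enumerates every possible set of W written replicas and R read replicas among N nodes and tests whether they always share a node.

```python
from itertools import combinations


def always_overlap(n, r, w):
    """True if every set of r read replicas shares at least one node
    with every set of w written replicas, among n replicas."""
    nodes = range(n)
    return all(
        set(read_set) & set(write_set)
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )


print(always_overlap(3, 1, 1))  # R + W < N: False
print(always_overlap(3, 1, 2))  # R + W = N: False
print(always_overlap(3, 2, 2))  # R + W > N: True
```

Only when R + W > N is a read guaranteed to touch at least one replica that acknowledged the latest write.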
- 13. Tunable Consistency
R = W = QUORUM
Replica values over time:
A A A
B B A
B B A
QUORUM = floor(N / 2) + 1
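The quorum formula is easy to get wrong for even replica counts, so here is a minimal sketch, assuming the division is integer division (rounded down). The function names are hypothetical.

```python
def quorum(replication_factor):
    """Quorum size: floor(N / 2) + 1."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor):
    """Replicas that may be down while a quorum can still answer."""
    return replication_factor - quorum(replication_factor)


for n in (3, 4, 5):
    print(n, quorum(n), tolerated_failures(n))
```

With N = 3 a quorum is 2 nodes and one replica may be down; moving to N = 4 raises the quorum to 3 without tolerating any additional failure. Because 2 × QUORUM is always greater than N, reading and writing at QUORUM satisfies R + W > N.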
- 14. Request Path
[Diagram: request path in four numbered steps between a client, the coordinator node, and the replicas]
- 15. Column Family Data Model
Row Key     Columns
jbellis     name: Jonathan | email: jb@ds.com | address: 123 main | state: TX
dhutch      name: Daria | email: dh@ds.com | address: 45 2nd st | state: CA
egilmore    name: Eric | email: eg@ds.com
- 16. Column Family Data Model
Row Key     Columns (names only, no values)
jbellis     dhutch | egilmore | datastax | mzcassie
dhutch      egilmore
egilmore    datastax | mzcassie
- 17. CQL3 Data Model
Timeline Table
user_id     tweet_id   author        body
gmason      1765       phenry        Give me liberty or give me death
gmason      1742       gwashington   I chopped down the cherry tree
ahamilton   1797       jadams        A government of laws, not men
ahamilton   1742       gwashington   I chopped down the cherry tree
Partition key: user_id
Remaining key: tweet_id
- 18. CQL3 Data Model
CQL
CREATE TABLE timeline (
    user_id  varchar,
    tweet_id uuid,
    author   varchar,
    body     varchar,
    PRIMARY KEY (user_id, tweet_id));
- 19. CQL3 Data Model
Timeline Physical Layout
Row Key     Cells
gmason      [1742, author] = gwashington | [1742, body] = I chopped down the... | [1765, author] = phenry | [1765, body] = Give me liberty or give...
ahamilton   [1742, author] = gwashington | [1742, body] = I chopped down the... | [1797, author] = jadams | [1797, body] = A government of laws...
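The mapping from the CQL3 table to the physical layout can be sketched as a small transformation. This is an illustration of the idea on the slide, not the storage engine's code; `to_physical_layout` and the in-memory row dictionaries are hypothetical.

```python
def to_physical_layout(rows, partition_key, clustering_key):
    """Group CQL-style rows by partition key. Every remaining column becomes
    a cell named (clustering value, column name), sorted by cell name."""
    partitions = {}
    for row in rows:
        cells = partitions.setdefault(row[partition_key], {})
        for column, value in row.items():
            if column not in (partition_key, clustering_key):
                cells[(row[clustering_key], column)] = value
    return {key: dict(sorted(cells.items())) for key, cells in partitions.items()}


timeline_rows = [
    {"user_id": "gmason", "tweet_id": 1765, "author": "phenry",
     "body": "Give me liberty or give me death"},
    {"user_id": "gmason", "tweet_id": 1742, "author": "gwashington",
     "body": "I chopped down the cherry tree"},
    {"user_id": "ahamilton", "tweet_id": 1797, "author": "jadams",
     "body": "A government of laws, not men"},
    {"user_id": "ahamilton", "tweet_id": 1742, "author": "gwashington",
     "body": "I chopped down the cherry tree"},
]

layout = to_physical_layout(timeline_rows, "user_id", "tweet_id")
for row_key, cells in layout.items():
    print(row_key, list(cells))
```

Each partition key maps to one wide row, and the cells inside it are ordered by the remaining key first, which is why tweet 1742 precedes tweet 1765 for gmason even though it was listed second in the table.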
- 21. Real-Time Analytics
Google Analytics gives you immediate statistics about your website traffic
- 22. Web Analytics Data Model
Analytics Table
url              time    views   from_search   direct   from_referrer
/index.html      12:00   354     300           20       34
/index.html      12:01   402     333           25       44
/contacts.html   12:00   23      3             0        20
/contacts.html   12:01   20      4             1        15
CQL
CREATE TABLE analytics (
    url           varchar,
    time          timestamp,
    views         counter,
    from_search   counter,
    direct        counter,
    from_referrer counter,
    PRIMARY KEY (url, time));
- 23. Web Analytics Data Model
CQL
UPDATE analytics
    SET views = views + 1,
        from_search = from_search + 1
    WHERE url = '/index.html'
      AND time = '2012-10-06 12:00';
- 24. Web Analytics Data Model
CQL
SELECT * FROM analytics
    WHERE url = '/index.html';
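The counter update and the per-URL read of the analytics table can be mimicked in memory to show the access pattern. This is a sketch under simplifying assumptions: `AnalyticsTable` is a hypothetical stand-in, time buckets are plain strings, and counters that were never incremented read as 0.

```python
from collections import defaultdict


class AnalyticsTable:
    """In-memory stand-in for the analytics table: one partition per URL,
    one row of counters per time bucket."""

    COUNTERS = ("views", "from_search", "direct", "from_referrer")

    def __init__(self):
        self._partitions = defaultdict(dict)

    def increment(self, url, time, **deltas):
        """Mimic UPDATE ... SET counter = counter + delta for one (url, time) row."""
        unknown = set(deltas) - set(self.COUNTERS)
        if unknown:
            raise KeyError(f"unknown counters: {sorted(unknown)}")
        row = self._partitions[url].setdefault(time, dict.fromkeys(self.COUNTERS, 0))
        for name, delta in deltas.items():
            row[name] += delta

    def select(self, url):
        """Mimic SELECT * ... WHERE url = ...: rows of one partition, ordered by time."""
        return [
            {"url": url, "time": time, **counters}
            for time, counters in sorted(self._partitions.get(url, {}).items())
        ]


table = AnalyticsTable()
table.increment("/index.html", "2012-10-06 12:01", views=1, direct=1)
table.increment("/index.html", "2012-10-06 12:00", views=1, from_search=1)
table.increment("/index.html", "2012-10-06 12:00", views=1, from_referrer=1)
rows = table.select("/index.html")
```

Every page view is a blind increment on the row for its URL and minute, and reading one URL returns its time series already ordered by time, which is what makes the table usable for live dashboards.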
- 25. Online Business Intelligence
[Diagram: Application, Cassandra, Hadoop]
Cassandra: storage for application in production
Hadoop: distributed batch processing
Cassandra: storage for results
Application: using results in production