Hello Cronies,
Here are the slides from our recent meetup.
Title: It's about Time: Deep dive into event store using Apache Cassandra
Big Data At-a-Glance
· What is Big Data?
· What have we seen so far in the AJM Big Data series?
· Refresher/overview of basic terminology
· Where is it? Am I using it?
Introduction to Apache Cassandra
· What, When and Why of Apache Cassandra
· Protocol, Queries, Architecture and everything else
· Who is using Apache Cassandra?
· Interesting use cases of Apache Cassandra (Twitter, Disqus, etc.)
· Demo application walk-through
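As a taste of the "event store" idea the deck covers, here is a minimal, illustrative sketch of the classic time-series schema pattern in Cassandra. The table name, columns, and daily bucketing below are my assumptions for illustration, not taken from the slides:

```python
# Illustrative sketch (not from the slides) of a time-series event store
# schema in Cassandra: events are grouped into one partition per day so
# that no single partition grows without bound, and rows within a
# partition are clustered by event time for efficient range scans.
from datetime import datetime, timezone

CREATE_EVENTS_TABLE = """
CREATE TABLE IF NOT EXISTS events (
    day        text,       -- partition key: one partition per day
    event_time timestamp,  -- clustering column: events ordered by time
    event_id   uuid,       -- tie-breaker for events at the same instant
    payload    text,
    PRIMARY KEY ((day), event_time, event_id)
) WITH CLUSTERING ORDER BY (event_time DESC);
"""

def day_bucket(ts: datetime) -> str:
    """Partition-key value (YYYY-MM-DD) for a given event timestamp."""
    return ts.strftime("%Y-%m-%d")

# Example: an event from 5 Sep 2013 lands in the "2013-09-05" partition.
print(day_bucket(datetime(2013, 9, 5, 12, 30, tzinfo=timezone.utc)))
```

The design choice here is the usual Cassandra trade-off: reads for "events on day X" hit exactly one partition, at the cost of having to query multiple partitions for ranges that span days.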
1. It's about Time: Deep dive into event store using Apache Cassandra
by Nikunj Thakkar
2. Agenda
Big Data at-a-glance
● What is Big Data?
● So far in the AJM Big Data Series
● Where is it? Am I using it?
Introduction to Apache Cassandra
● What, When and Why of Cassandra
● Protocol, Architecture, Queries and Everything else
● Interesting Use-cases
● Demo
17. Big Data: Am I using it?
● Targeted marketing
● Public sector
● Health care
● Social media and web data
● Global personal location tracking
● Automated device-generated data
48. Apache Cassandra @ Disqus
➔ Disqus is a discussion platform for the web. It connects publishers with users and allows them to have public discourse in a medium that enables communication across the web.
49. Apache Cassandra @ Disqus
➔ Disqus uses Cassandra in a number of different places, mainly in the product: it is used for content recommendation and also a little bit of advertising. Say you are reading an article about the war in Syria and you notice another interesting article about what the British PMs have released as a public statement on whether or not it is legal to go to war, and maybe you are interested in reading that response. Cassandra powers the analytics and content engine behind how Disqus recommends content.
50. Apache Cassandra @ Disqus
➔ Main cluster: 24 nodes
➔ CPU: 6-core 3 GHz Xeons, the biggest spec, because CPU was turning out to be a small bottleneck at times
➔ 24 GB RAM per node, with an 8 GB heap size
➔ 32 or 48 GB of RAM wasn't helping much
➔ It handles a load of about 30,000 reads a second
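For a sense of scale, a quick back-of-the-envelope check on those cluster figures (assuming reads are spread evenly across the nodes, which the slides do not state):

```python
# Rough per-node read throughput for the Disqus cluster figures above,
# assuming the 30,000 reads/s are spread evenly over all 24 nodes.
nodes = 24
cluster_reads_per_second = 30_000

per_node_reads = cluster_reads_per_second / nodes
print(per_node_reads)  # 1250.0 reads/s per node
```

In practice the distribution depends on the replication factor, token assignment, and client load balancing, so treat this only as an order-of-magnitude estimate.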