Apache Cassandra: NoSQL in the Enterprise, today. Jonathan Ellis, CTO, @spyced
Published in: Technology

Apache Cassandra: NoSQL in the enterprise

  1. Apache Cassandra: NoSQL in the Enterprise, today. Jonathan Ellis, CTO, @spyced
  2. Cassandra Job Trends (indeed.com)
  3. “Big Data” trend
  4. Why Big Data Matters: Research done by McKinsey & Company shows the eye-opening, 10-year category growth rate differences between businesses that smartly use their big data and those that do not.
  5. Big data [diagram: Analytics (Hadoop) | ? | Realtime (“NoSQL”)]
  6. Some users
     ✤ Financial
     ✤ Social Media
     ✤ Advertising
     ✤ Entertainment
     ✤ Energy
     ✤ E-tail
     ✤ Health care
     ✤ Government
  7. Common use cases
     ✤ Time series data
     ✤ Messaging
     ✤ Ad tracking
     ✤ Data mining
     ✤ User activity streams
     ✤ User sessions
     ✤ Anything requiring: Scalable + performant + highly available
  8. Why Cassandra?
     ✤ Fully distributed, no SPOF
     ✤ Multi-master, multi-DC
     ✤ Linearly scalable
     ✤ Larger-than-memory datasets
     ✤ Best-in-class performance (not just writes!)
     ✤ Fully durable
     ✤ Integrated caching
     ✤ Tuneable consistency
  9. Classic partitioning with SPOF [diagram labels: partition 1, partition 2, partition 3, partition 4, slave, slave, master, request router]
  10. Fully distributed, no SPOF [diagram labels: client, p3, p6, p1 (shown three times)]
  11. Performance summary
  12. “With Cassandra, we get better business agility, and we don’t have to plan capacity in advance, we don’t need to ask permission of other people to build things for us, and we don’t worry about running out of space or power.” Adrian Cockcroft, Cloud Architect
  13. Netflix on Cassandra
     ✤ Could not build datacenters fast enough
     ✤ Made decision to go to cloud (AWS)
     ✤ Applications include Netflix’s subscriber system, A/B testing, and viewing history service
     ✤ Over a year in, Netflix finds Cassandra to be
        ✤ Fast
        ✤ Cost-effective
        ✤ Scalable
        ✤ Flexible
        ✤ Reliable: no SPOF
  14. “Without Cassandra, our engineers would’ve had to create something that could scale to our needs, that would’ve prevented us from focusing on building product and solving problems for Backupify’s users, which are far more important tasks.” Matt Conway, VP Engineering
  15. Backupify on Cassandra
     ✤ Cloud-based utility that enables businesses and consumers to back up, search and restore the content of popular online applications such as Google Apps, Gmail, Facebook, Twitter, and Blogger
     ✤ Cassandra findings:
        ✤ Solved scaling, allowing engineers to focus on their business
        ✤ DataStax OpsCenter made it easy to monitor the health and performance of their cluster
        ✤ Reliable, redundant and scalable data storage helped eliminate downtime
        ✤ Ability to offer not only backup and storage, but also analysis
  16. “You can seamlessly add new nodes and expand your total capacity without deteriorating the performance of the data store. Cassandra has allowed us to scale very effectively.” Harry Robertson, Tech Lead
  17. Ooyala on Cassandra
     ✤ Ooyala provides a suite of technologies and services that support content owners in managing, analyzing and monetizing the digital video they publish online
     ✤ Cassandra findings:
        ✤ Classic “Big Data” problem did not require re-architecting
        ✤ Delivered ability to respond to increasingly sophisticated analytic needs of customers
        ✤ Developers spend time building application features, not figuring out how to scale
  18. “Cassandra has allowed us to build bigger features faster and more reliably, while using less money and without needing to expand our staff.” Kyle Ambroff, Sr. Engineer
  19. Formspring on Cassandra
     ✤ Users of Formspring engage with and learn more about each other by asking and responding to questions. Close to 4B responses in the system and 30M unique users
     ✤ Cassandra experience
        ✤ No sharding needed: just add nodes to scale
        ✤ Performance: the popular users with many followers saw no speed reduction. No more memcached!
        ✤ Flexibility of a schema-optional architecture is very developer friendly
  20. Big data [diagram: Analytics (Hadoop) | ? | Realtime (“NoSQL”)]
  21. The evolution of Analytics [diagram: Analytics + Realtime]
  22. The evolution of Analytics [diagram: Analytics and Realtime, linked by replication]
  23. The evolution of Analytics [diagram: ETL]
  24. Big data [diagram: Analytics (Hadoop) | DataStax Enterprise | Realtime (“NoSQL”)]
  25. DataStax Enterprise re-unifies realtime and analytics
  26. Portfolio Demo dataflow [diagram labels: Portfolios, Historical Prices, Live Prices for today, Intermediate Results, Largest loss]
  27. Operations
     ✤ “Vanilla” Hadoop
        ✤ 8+ services to set up, monitor, back up, and recover (NameNode, SecondaryNameNode, DataNode, JobTracker, TaskTracker, ZooKeeper, Region Server,...)
        ✤ Single points of failure
        ✤ Can’t separate online and offline processing
     ✤ DataStax Enterprise
        ✤ Single, simplified component
        ✤ Self-organizes based on workload
        ✤ Peer to peer
        ✤ JobTracker failover
  28. Managing & Monitoring Big Data
     ✤ DataStax OpsCenter manages and monitors all Cassandra and Hadoop operations
  29. Questions?
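
Slides 8 and 10 name two properties without showing how they fit together: peer-to-peer data placement with no single point of failure, and tuneable consistency. The following is a minimal Python sketch of the general idea, not Cassandra's actual implementation. The `Ring` class, the `token`, `quorum` and `overlaps` helpers, the node names, and the use of MD5 as the hash are all assumptions made for illustration. Each key is hashed onto a ring of nodes and stored on the next few nodes clockwise, so losing any one node still leaves live replicas; a read is guaranteed to see the latest write when the read and write replica counts together exceed the replication factor.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Hash a key onto the ring (MD5 is an illustrative choice)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy token ring: every node is a peer; there is no master or router."""

    def __init__(self, nodes, replication_factor=3):
        if replication_factor > len(nodes):
            raise ValueError("replication factor exceeds node count")
        self.rf = replication_factor
        # Each node owns one position on the ring, derived from its name.
        self.entries = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.entries]

    def replicas(self, key, down=()):
        """Return the live nodes among the RF owners clockwise from the key."""
        start = bisect.bisect_right(self.tokens, token(key)) % len(self.entries)
        owners = [
            self.entries[(start + i) % len(self.entries)][1]
            for i in range(self.rf)
        ]
        return [n for n in owners if n not in down]


def quorum(rf: int) -> int:
    """Smallest majority of rf replicas."""
    return rf // 2 + 1


def overlaps(read_replicas: int, write_replicas: int, rf: int) -> bool:
    """True when every read set must intersect every write set."""
    return read_replicas + write_replicas > rf


# Hypothetical six-node cluster with three replicas per key.
ring = Ring([f"node{i}" for i in range(1, 7)], replication_factor=3)
owners = ring.replicas("user:42")
live = ring.replicas("user:42", down={owners[0]})

# With one owner down, a quorum (2 of 3) is still reachable.
print(owners, live, len(live) >= quorum(ring.rf))
```

Reading and writing at quorum gives `overlaps(2, 2, 3)`, which is true; reading and writing a single replica gives `overlaps(1, 1, 3)`, which is false and trades consistency for lower latency.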