
Webinar: ROI on Big Data - RDBMS, NoSQL or Both? A Simple Guide for Knowing How to Choose



Big data doesn't mean big money. In fact, choosing a NoSQL solution will almost certainly save your business money, in terms of hardware, licensing, and total cost of ownership. What's more, choosing the correct technology for your use case will almost certainly increase your top line as well.
Big words, right? We'll back them up with customer case studies and lots of details.
This webinar will give you the basics for growing your business profitably. What's the use of growing your top line if you outspend the gains on cumbersome, ineffective, outdated IT? We'll take you through the specific use cases and business models that are the best fit for NoSQL solutions.
By the way, no prior knowledge is required. If you don't even know what RDBMS or NoSQL stand for, you are in the right place. Get your questions answered, and get your business on the right track to meeting your customers' needs in today's data environment.

Published in: Technology


  1. 1. ROI on Big Data: RDBMS, NoSQL or Both? Robin Schumacher VP Products, DataStax
  2. 2. 3 Big Ideas for today’s conversation •Big data != big money •Big words require big back-up •All questions are big (and never foolish) •Exciting News: Launch of DataStax Enterprise 3.1
  3. 3. Agenda: What Will We Cover? • Introduction to DataStax and NoSQL • Overview of legacy vs. modern, big data applications • Comparing RDBMSs and NoSQL • Customer examples of RDBMS-to-NoSQL swap-outs and co-existence strategies • Conclusions
  4. 4. Avoid Big Data FUD • Cost compared to what? • Value compared to what? • How to plan for success? money-waste/240157956
  5. 5. DataStax: An Overview • Founded in April 2010 • We drive Apache Cassandra™, the popular open-source NoSQL database • We provide DataStax Enterprise for enterprise NoSQL implementations • 300+ customers • 100+ employees • Home to Apache Cassandra Chair & most committers • Headquartered in San Francisco Bay area • Funded by prominent venture firms
  6. 6. What is Apache Cassandra? • Massively scalable NoSQL database • And easy data distribution (datacenter, cloud) • That offers uptime, all the time (continuous availability)
  7. 7. What is DataStax Enterprise? DataStax Enterprise -- powered by Apache Cassandra™, certified for production 1. DataStax Enterprise Server 2. OpsCenter Enterprise 3. Expert Support & Services • Massive scalability • Continuous availability, and • Operational simplicity for real-time, analytic, and enterprise search data.
  8. 8. Details of DataStax Enterprise Server • Production-certified version of Cassandra for online applications. • Integrated Hadoop for batch analytics. • Built-in Solr for enterprise search. • Comprehensive security for sensitive data. • Active-everywhere architecture. • Gold standard for multi-data center and cloud deployments. • Built-in data replication; removes need for ETL. • Complete isolation between different workloads. • Methods for data migration from legacy RDBMSs.
  9. 9. Details of DataStax OpsCenter • A new, 10-node Cassandra (or Hadoop) cluster with OpsCenter running in 3 minutes • A new, 10-node DSE cluster with OpsCenter running on AWS in 3 minutes • Steps 1, 2, 3: Done
  10. 10. Launch Today: DataStax Enterprise 3.1 • Lower Total Cost of Ownership • Better ROI • Simpler & faster development • Greater insight • More flexibility and functionality
  11. 11. What’s New: Cassandra 1.2 Integration • Manage up to 10x more Cassandra data per node than prior versions for many use cases • Use vnodes and parallel operations to increase capacity and perform maintenance operations much faster • Get much greater functionality with new CQL binary protocol via Java and .NET drivers • Store arrays and lists of data much more easily with collections • Get deeper visibility into the response times of your queries and other database operations with tracing
  12. 12. What’s New: Solr 4.3 Integration • 60+ new features • Even faster performance • Stability Improvements • New memory caches and memory monitoring • Easier customization with new pluggable document handling
  13. 13. Cassandra/DataStax Users: A Sample
  14. 14. on.html Netflix Cloud Benchmark… “In terms of scalability, there is a clear winner throughout our experiments. Cassandra achieves the highest throughput for the maximum number of nodes in all experiments with a linear increasing throughput.” Solving Big Data Challenges for Enterprise Application Performance Management, Tilmann Rabl, et al., August 2012, p. 10. Benchmark paper presented at the Very Large Data Bases (VLDB) Conference, 2012. End Point Independent NoSQL Benchmark Highest in throughput… Lowest in latency… Cassandra: NoSQL Performance Leader
  15. 15. Use Cases Handled By DataStax Enterprise
      Managed by Cassandra: • Time series data • Device/Sensor/Data “exhaust” systems • Distributed applications • Media streaming • Online Web retail (transactional, shopping carts, etc.) • Real-time data analytics • Social media capture and analysis • Web click-stream analysis • Write-intensive transactional systems
      Managed by Hadoop: • Buyer behavior analytics • Compliance/regulatory analysis • Customer recommendation output • Fraud detection • Risk analysis • Sales program campaign analysis • Supply chain analytics • Batch Web clickstream analysis
      Managed by Solr: • General Web search • Web retail faceted (categorization) search • Search/hit prioritization and highlighting • Application log search and analysis • Document (PDF, MS Word, etc.) search and analysis • Geospatial search • Real estate location and property search • Social media match ups
  16. 16. NoSQL Momentum “The economics don’t look great for Oracle. According to analysis by Wikibon’s David Floyer (and highlighted in the Wall Street Journal), the NoSQL database market is expected to grow at a compound annual growth rate of nearly 60% between 2011 and 2017. The SQL slice of the Big Data market, in contrast, will grow at just a 26% CAGR during that same time period.”
  17. 17. NoSQL Momentum “NoSQL is the stuff of the Internet Age.” - Andrew Oliver, InfoWorld
  18. 18. Examples of Oracle RDBMS Replacements
  19. 19. But does this mean the RDBMS is on the way out…? The truth is the vast majority of modern application architectures use both an RDBMS and NoSQL. The question is when and where should each be used?
  20. 20. Legacy vs. Today’s Data Applications [Diagram]
      Legacy Line-of-Business Apps: LOB apps on RDBMS (Oracle, MySQL, SQL Server); data warehouse on RDBMS (Teradata/column DBs)
      Today’s Line-of-Business Apps: LOB apps on NoSQL (clusters of C* nodes); data warehouse on Hadoop
  21. 21. Components of Legacy vs. Today’s Data Applications [Diagram]
      Legacy LOB apps on RDBMS (Oracle, MySQL, SQL Server): Transactions: • LOB style • Full consistency; Analytics: • ROLAP • Rank • Windowing • Partition by, etc.; Search: • Full text
      Today’s LOB apps on NoSQL (clusters of C* nodes): Transactions: • LOB style • Tunable consistency; Analytics: • MapReduce • Hive • Pig • Mahout; Search: • Solr
      Legacy data warehouse on RDBMS (Teradata/column DBs): Transactions: • DW style; Analytics: • ROLAP • Rank • Windowing • Partition by, etc.; Search: • Full text
      Today’s data warehouse on Hadoop: Transactions: • None; Analytics: • MapReduce • Hive • Pig • Mahout; Search: • Solr
  22. 22. Previous Generation vs. Modern Applications
      Legacy Applications | Today’s Applications
      Slow/medium velocity data | High velocity data
      Data coming in from one/few locations | Data coming in from many locations
      Rigid, static structured data | Flexible, fluid, multi-type data
      Low/medium data volumes; purge often | High data volumes; retain forever
      Deploy app central location/one server | Deploy app everywhere/many servers
      Write data in one location | Write data everywhere/anywhere
      Primary concern: scale reads | Scale writes and reads
      Scale up for more users/data | Scale out for more users/data
      Downtime tolerated | Downtime not tolerated
  23. 23. DataStax / Cassandra vs. Legacy RDBMS
      DataStax Enterprise/Cassandra | Legacy RDBMS
      Fluid and flexible data model | Rigid data model
      Easily supports modern data types | Difficulty in supporting all datatypes
      Automatic data sharding/distribution | Manual data sharding/distribution
      Multi-data center/cloud support | Single DC with data shipping options
      Continuous availability | Medium to high availability
      Read from anywhere | Read from primary, possibly slaves
      Write data anywhere | Write data to primary or specified shards
      AID transactions; tunable consistency | ACID transactions
      Unlimited scale out for more capacity | Limited scale up for capacity (out-reads)
      CQL for primary interface | SQL for primary interface
  24. 24. Business Catalysts For NoSQL - Do You Need To… …keep business always online and serving customers? …serve customers everywhere (i.e. in multiple locations)? …deliver information fast both internally and externally? …handle increasing customer demand? …protect information that runs the business? …make business decisions based on right information? …easily find needed information? …receive strong payback for IT investments?
  25. 25. Keep Business Online Netflix systems are run in the cloud across multiple availability zones with Cassandra and sport constant uptime. Over 95% of Netflix’s data is stored in Cassandra (much of it previously on Oracle).
  26. 26. Keep Business Online Commenting on Amazon outage in Oct 2012: “We configure all our clusters to use a replication factor of three, with each replica located in a different Availability Zone. This allowed Cassandra to handle the outage remarkably well. When a single zone became unavailable, we didn't need to do anything. Cassandra routed requests around the unavailable zone and when it recovered, the ring was repaired.” - Netflix Tech Blog
  27. 27. Serve Customers Everywhere RightScale keeps its customers in contact with each other all over the world via DataStax clusters in 5+ global data centers.
  28. 28. Deliver Information Fast Everywhere Adobe meets very stringent response time requirements (12 ms or less for 95% of requests) for its marketing cloud with DataStax clusters in two data centers.
  29. 29. Handle Increasing Customer Demand Gnip delivers social media data to 95% of the Fortune 500 by using DataStax Enterprise. Data velocity rates for Twitter alone can reach 20,000 tweets per second.
  30. 30. Handle Increasing Customer Demand Ooyala distributes and analyzes media/video content for companies like ESPN, Rolling Stone and others. They track about one quarter of all online video viewers each day and generate 1-2 billion events that are streaming in real-time through their DataStax cluster.
  31. 31. Handle Increasing Customer Demand
  32. 32. Make Right Business Decisions “DataStax made it all work together” • Cassandra, Hadoop, Solr, Security Manage costs & improve performance • 400% ROI over five years • $750K five-year savings in support costs • 90% better response and upload time Analyzing Information • Doctors’ notes • Analyze notes to bill back Medicare / Medicaid
  33. 33. Find Information Instantly Datafiniti, which is a search engine for data, needs to consume lots of data in real time and provide fast search on top of the same data.
  34. 34. Get Strong Payback on IT Investment Constant Contact found that scaling out with NoSQL vs. IBM DB2 saved them 90% in software costs, and was implemented in 1/3 the time... “To do what we need to do today without Cassandra would cost a couple million dollars more and would be significantly harder to manage operationally.”
  35. 35. Conclusions
  36. 36. When Legacy RDBMS over NoSQL/Cassandra • No need for a flexible data model; data is all structured and fits well within an RDBMS schema. • Data does not come in at high rates and the speed at which data is written is not important. • You need detailed/complex/nested ACID transactions. • All your data can fit into memory or reside on 1-2 machines and substantial growth is not expected. • You have no need for constant uptime; unexpected downtime has no/little impact. • You don’t need to distribute data to multiple locations, various cloud availability zones, or have multiple copies for disaster recovery purposes. • No need to integrate/seamlessly move data between real-time, analytics, and search systems. • Software costs not a concern.
  37. 37. When DataStax/Cassandra Over Legacy RDBMS • You need a more flexible data model. • You have to store a variety of data types. • You need constant uptime/continuous availability. • You need to distribute data across multiple data centers or cloud availability zones. • You need linear scale-out performance for growing data. • You need very fast write capabilities. • You need to write and read data in multiple locations. • You need transactions but eventual consistency is OK (or strong consistency with performance impact for many data copies). • You need an easy way to integrate real-time, analytics, and search data. • You need cost savings/a better ROI.
  38. 38. How Can I Try DataStax Enterprise? • Go to • Download a copy of DataStax Enterprise. • Installs and configures in minutes. • Completely free for development evaluation (no trial time bombs, etc.); subscription required for production deployments.
  39. 39. For More Information
  40. 40. Thank You – Questions? We power the big data applications that transform business.
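
The slides mention "tunable consistency" (slides 21, 23, and 37) and a replication factor of three spread across availability zones (slide 26). The replica arithmetic behind those two ideas can be sketched as follows. This is a minimal illustration, not DataStax or Cassandra code: the function names are hypothetical, and it assumes the commonly cited rule that a read is guaranteed to see the latest write when the number of replicas read plus the number of replicas written exceeds the replication factor.

```python
"""Illustrative replica arithmetic for tunable consistency (hypothetical helper names)."""


def quorum(replication_factor: int) -> int:
    """Return the smallest number of replicas that forms a majority."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return replication_factor // 2 + 1


def reads_see_latest_write(read_replicas: int, write_replicas: int,
                           replication_factor: int) -> bool:
    """Return True when every read set must overlap every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


def replicas_that_can_fail(required_replicas: int, replication_factor: int) -> int:
    """Return how many replicas may be down while a request can still be served."""
    return replication_factor - required_replicas


if __name__ == "__main__":
    rf = 3  # replication factor from the Netflix quote on slide 26
    q = quorum(rf)
    print(f"quorum for RF={rf}: {q}")
    print("quorum reads + quorum writes overlap:", reads_see_latest_write(q, q, rf))
    print("one-replica reads + one-replica writes overlap:", reads_see_latest_write(1, 1, rf))
    print("replicas that can fail at quorum:", replicas_that_can_fail(q, rf))
```

Under these assumptions, a replication factor of three with one replica per availability zone leaves a majority of replicas reachable when a single zone is lost, which is consistent with the outage behavior described on slide 26. The slides do not say which consistency levels Netflix actually used, so the quorum setting here is an assumption made for illustration. Requiring fewer replicas per request trades the overlap guarantee for lower latency, which is the "eventual consistency is OK" case on slide 37.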