John Glendenning - Real time data driven services in the Cloud



John Glendenning from DataStax's presentation from our Big Data breakfast conference

Published in: Technology, Business
  • Always available, target new regions quickly, need elasticity as traffic ebbs and flows
  • In the old days, all your structured data storage requirements went into a classic RDBMS such as DB2, Oracle, Sybase or Informix. Relational databases were great! You could define data structures, insert into them, join them to your heart’s content, and query them for the bits of data you wanted. After a while your joins become a massive chain and your systems start slowing down. Change the data structure to perform better? Too risky! No, just buy more kit to make it perform!

    Then there is the multi data centre configuration. What did we do in the past? SAN replication? Oracle replication? There were a lot of operational headaches there.

    Anyway, while Larry was busy eating his hot dog, TCP/IP became more popular, ISPs started appearing, Mosaic browsers started appearing on people’s home PCs, and the whole internet revolution happened. So clever guys such as Larry and Sergey, and brave guys like Jeff Bezos, were busy building global-scale services. They hired a whole bunch of really clever engineers and gave them global-scale problems to solve. How many of you here have had a boss say, “I’ve got an e-commerce website, write a database engine that scales globally”? That was the sort of bold move these companies made back then. And what’s good about them is that, rather than keeping all of this as a trade secret, they started publishing papers on their implementations.

    Then this guy came along, stole Amazon Dynamo engineer Avinash Lakshman, and built Cassandra, and eventually had it open sourced.

    Next slide
  • John Glendenning - Real time data driven services in the Cloud

    1. 1. John Glendenning DataStax ‘Real-time data driven services in the Cloud’
    2. 2. Real-time Data Driven Services in the Cloud John Glendenning, DataStax VP & GM EMEA
    3. 3. Line of Business Manager: Adapt With Customers “I have to move as fast as my market. I can’t get slowed down by people telling me this is going to take six months. It’s got to be ready, quickly. No matter what. And I need to adapt quickly with my customers.”
    4. 4. VP of IT: How Can I Scale Without Surprises? “Given the explosion of data in the enterprise, how can I scale my IT investment to meet the demands of my lines of business, without taking on undue risk? (My choices are to spend $10 million to scale what I’ve got versus do something new)”
    5. 5. Nearly All Businesses Must Think Global • Datacenter • Cloud • About 1/2 of all sales will be online by the end of 2013. Source: ( • 24/7 monitoring demands • Global market demands • Localization deployment
    6. 6. Your Data Demands Can Change in an Instant [Chart: fluctuating traffic demands, 2009 to 2012]
    7. 7. Major Changes: The Evolving Data Center
    8. 8. DataStax in the News Big movies, big data: Netflix embraces NoSQL in the cloud With billions of reads and writes daily, Netflix relies on NoSQL database Cassandra to replace a legacy Oracle deployment May 02, 2013 (AP) The company chose Cassandra from DataStax for its flexibility to create and manage data clusters quickly, particularly in the cloud. Christos Kalantzis, Netflix's manager of cloud and platform engineering, explains that "solutions like Oracle don't run very well on virtualized hardware ... the architecture of Cassandra and the availability and consistency tuning and scalability made it
    9. 9. Major Changes: The Evolving Data Center • LOB App (Oracle) • LOB App (MySQL) • LOB App (SQL Server) • “What’s Happening?” Hyper Velocity: Transactional NoSQL • Data Warehouse (Teradata/Exadata): “What Happened?” • Massive Volume: Bit Bucket (Hadoop)
    10. 10. What is a NoSQL Solution?
    11. 11. What is Apache Cassandra? Apache Cassandra™ is a massively scalable distributed open source database. Cassandra is designed to handle big data workloads across multiple data centres with no single point of failure, providing enterprises with continuous availability without compromising performance.
    12. 12. Cassandra Architecture Overview • Fast / Linear performance • Elastic scalability • No single point of failure • Enterprise / multi-data center / cloud data distribution • Location independence – read and write anywhere • Tunable data consistency (per operation) • Familiar SQL-Like language – CQL • Dynamic / Flexible schema • Can store structured, semi-structured and unstructured data • Replication Strategies from Amazon Dynamo paper • Data structure and storage design from Google BigTable paper
    13. 13. Apache Cassandra Leading in Performance “In terms of scalability, there is a clear winner throughout our experiments. Cassandra achieves the highest throughput for the maximum number of nodes in all experiments with a linear increasing throughput.” Solving Big Data Challenges for Enterprise Application Performance Management, Tilmann Rabl, et al., August 2013, p. 10. Benchmark paper presented at the Very Large Database Conference, 2013. cassandra-scalability-on.html Netflix Cloud Benchmark… End Point Independent NoSQL Benchmark Highest in throughput… Lowest in latency…
    14. 14. Who’s using Cassandra?
    15. 15. Why We Exist “I can create a Cassandra cluster in any region of the world in 10 minutes. When marketing guys decide we want to move into a certain part of the world, we’re ready.” Today’s applications must be always available and lightning fast as they scale to previously unimaginable levels. Cassandra delivers both with a beautifully simple and elegant architecture.
    16. 16. What We Do Best Cassandra was designed to do things that are impossible in other databases when it comes to availability and performance. Forget about losing a machine here or there -- Cassandra delivers a world where you can lose an entire datacenter and still perform as your customers expect. “We have to be ready for disaster recovery all the time. It’s really great that Cassandra allows for active-active multiple data centers where we can read and write anywhere.” Jay Patel, Technical Architect at eBay (Describing why they switched from legacy relational architecture)
    17. 17. Without Breaking Your Budget “To do what we need to do today without Cassandra would cost a couple million dollars more and would be significantly harder to manage operationally.”
    18. 18. DataStax: An Overview • Founded in April 2010 • Home to Apache Cassandra Chair & most committers • DataStax Enterprise – ‘Certified for Production’ Big Data platform • 300+ customers • 100+ employees • Headquartered in San Francisco Bay area • European HQ in London, UK • Funded by prominent venture firms
    19. 19. DataStax Enterprise Cassandra users come to DataStax For Confidence and Innovation
    20. 20. What Innovation? • Production-certified Cassandra • Round-the-clock support by the world’s experts • Your big data system is easy to manage • Satisfy your top security officer • Search and analyze your hot data in context
    21. 21. Ask Different Things of Your Hot Data Analyze (Hadoop) Write Read Write Search (Solr) Search (Solr) Write Read DataStax Enterprise Multi-Data Center
    22. 22. With the Security You Need Analyze (Hadoop) Write Read Write Search (Solr) Search (Solr) Write Read
    23. 23. Into the Mainstream “Security is very important to us, so we’re naturally very pleased to see all the new security features in DataStax Enterprise 3. Its scalability and performance are enabling us to develop an exciting financial data analytics platform that will create a better experience for our audience.”
    24. 24. Managed From a Single Pane Provision Monitor Plan Optimize Recover
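
Slide 12 lists "tunable data consistency (per operation)" without showing what is being tuned. The Python sketch below illustrates the usual arithmetic behind it: with N replicas of each row, a read that contacts R replicas is guaranteed to overlap a write acknowledged by W replicas when R + W > N. The level names mirror three of Cassandra's consistency levels, but the `Consistency` enum and both functions are hypothetical helpers written for this illustration, not part of any Cassandra driver, and multi-data-center levels are left out.

```python
from enum import Enum


class Consistency(Enum):
    """Illustrative subset of per-operation consistency levels."""
    ONE = "ONE"
    QUORUM = "QUORUM"
    ALL = "ALL"


def replicas_required(level: Consistency, replication_factor: int) -> int:
    """Return how many replicas must acknowledge an operation at this level."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    if level is Consistency.ONE:
        return 1
    if level is Consistency.QUORUM:
        # A quorum is a strict majority of the replicas.
        return replication_factor // 2 + 1
    return replication_factor  # Consistency.ALL


def reads_see_latest_write(write: Consistency,
                           read: Consistency,
                           replication_factor: int) -> bool:
    """True when the read set and the write set must share a replica (R + W > N)."""
    w = replicas_required(write, replication_factor)
    r = replicas_required(read, replication_factor)
    return r + w > replication_factor


if __name__ == "__main__":
    n = 3  # assumed replication factor for the example
    for write in Consistency:
        for read in Consistency:
            overlap = reads_see_latest_write(write, read, n)
            print(f"write={write.value:<6} read={read.value:<6} overlap={overlap}")
```

The trade-off the slide hints at is visible in the output: writing and reading at ONE favours latency and availability but does not guarantee that a read sees the latest write, while QUORUM on both sides does, at the cost of waiting for a majority of replicas.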