Apache Cassandra:
NoSQL in the Enterprise, today

Jonathan Ellis
CTO
@spyced
Cassandra Job Trends (indeed.com)
“Big Data” trend
Why Big Data Matters




Research done by McKinsey & Company shows the eye-opening, 10-year
category growth rate differences between businesses that smartly use their big
data and those that do not.
Big data




 Analytics (Hadoop)     ?     Realtime (“NoSQL”)
Some users
✤   Financial
✤   Social Media
✤   Advertising
✤   Entertainment
✤   Energy
✤   E-tail
✤   Health care
✤   Government
Common use cases

✤   Time series data
✤   Messaging
✤   Ad tracking
✤   Data mining
✤   User activity streams
✤   User sessions
✤   Anything requiring:
    Scalable + performant + highly available
Why Cassandra?

✤   Fully distributed, no SPOF
✤   Multi-master, multi-DC
✤   Linearly scalable
✤   Larger-than-memory datasets
✤   Best-in-class performance (not just writes!)
✤   Fully durable
✤   Integrated caching
✤   Tuneable consistency
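
The consistency bullet above can be made concrete with a little arithmetic. The sketch below is illustrative and not taken from the slides: it assumes a replication factor of 3, uses the consistency level names ONE, QUORUM, and ALL, and the helper names `quorum` and `reads_see_latest_write` are hypothetical. It checks the usual overlap rule: a read is guaranteed to touch at least one replica that acknowledged the latest write when read replicas + write replicas > replication factor.

```python
def quorum(replication_factor: int) -> int:
    """Size of a majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor: int, write_acks: int, read_acks: int) -> bool:
    """True when any read replica set must overlap any write replica set."""
    return read_acks + write_acks > replication_factor


RF = 3  # assumed replication factor, chosen only for illustration
LEVELS = {"ONE": 1, "QUORUM": quorum(RF), "ALL": RF}

# Print every write/read combination and whether the replica sets must overlap.
for w_name, w in LEVELS.items():
    for r_name, r in LEVELS.items():
        overlap = reads_see_latest_write(RF, w, r)
        print(f"write={w_name:<6} read={r_name:<6} overlap={overlap}")
```

Lower levels trade that guarantee for lower latency and higher availability, which is the tuning knob the bullet refers to.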
Classic partitioning with SPOF

[Diagram: partition 1 through partition 4, with slave, slave, and master replicas, behind a single request router]
Fully distributed, no SPOF

[Diagram: a client and the cluster nodes, with partition labels p3, p6, p1, p1, p1]
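
A minimal sketch of how a partition such as p1 can end up on several peer nodes with no router in front of them. It assumes six nodes, a replication factor of 3, and one token per node; the MD5-based `token` function, the `Ring` class, and the node names are hypothetical and only illustrate ring-style placement, not Cassandra's actual implementation.

```python
import hashlib
from bisect import bisect_left


def token(value: str) -> int:
    """Map a string to a ring position (MD5 is used only for illustration)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = min(replication_factor, len(nodes))
        # Sort nodes by token so the list walks the ring in order.
        self.ring = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.ring]

    def replicas(self, key: str):
        """First node at or after the key's token, plus the next nodes on the ring."""
        start = bisect_left(self.tokens, token(key)) % len(self.ring)
        return [self.ring[(start + i) % len(self.ring)][1]
                for i in range(self.replication_factor)]


NODES = [f"node{i}" for i in range(1, 7)]  # hypothetical node names
ring = Ring(NODES)
owners = ring.replicas("p1")
print("p1 is stored on:", owners)

# Losing the first replica leaves the other two copies in place.
survivors = Ring([n for n in NODES if n != owners[0]])
print("after losing", owners[0], "->", survivors.replicas("p1"))
```

Because every node can compute the same placement, a client can send a request to any node, which is what removes the single request router from the picture.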
Performance summary
“With Cassandra, we get better business agility, and we
don’t have to plan capacity in advance, we don’t need to
ask permission of other people to build things for us,
and we don’t worry about running out of space or
power.”


Adrian Cockcroft, Cloud Architect
Netflix on Cassandra

✤   Could not build datacenters fast enough
✤   Made decision to go to cloud (AWS)
✤   Applications include Netflix’s subscriber system, A/B
    testing, and viewing history service

✤   Over a year in, Netflix finds Cassandra to be
    ✤   Fast
    ✤   Cost-effective
    ✤   Scalable
    ✤   Flexible
    ✤   Reliable: no SPOF
“Without Cassandra, our engineers would’ve had to
create something that could scale to our needs, that
would’ve prevented us from focusing on building
product and solving problems for Backupify’s users,
which are far more important tasks.”


Matt Conway, VP Engineering
Backupify on Cassandra

✤   Cloud-based utility that enables businesses and
    consumers to back up, search and restore the content of
    popular online applications such as Google Apps,
    Gmail, Facebook, Twitter, and Blogger

✤   Cassandra findings:
    ✤   Solved scaling, allowing engineers to focus on their business
    ✤   DataStax OpsCenter made it easy to monitor the health and
        performance of their cluster
    ✤   Reliable, redundant and scalable data storage helped
        eliminate down-time
    ✤   Ability to offer not only backup and storage, but also analysis
“You can seamlessly add new nodes and expand your
total capacity without deteriorating the performance of
the data store. Cassandra has allowed us to scale very
effectively.”


Harry Robertson, Tech Lead
Ooyala on Cassandra

✤   Ooyala provides a suite of technologies and services that
    support content owners in managing, analyzing and
    monetizing the digital video they publish online

✤   Cassandra findings:
    ✤   Classic “Big Data” problem did not require re-architecting
    ✤   Delivered ability to respond to increasingly sophisticated
        analytic needs of customers
    ✤   Developers spend time building application features, not
        figuring out how to scale
“Cassandra has allowed us to build bigger features
faster and more reliably, while using less money and
without needing to expand our staff.”


Kyle Ambroff, Sr. Engineer
Formspring on Cassandra

✤   Users of Formspring engage with and learn more about
    each other by asking and responding to questions. Close
    to 4B responses in the system and 30M unique users

✤   Cassandra experience
    ✤   No sharding needed – just add nodes to scale
    ✤   Performance – popular users with many followers saw no
        speed reduction. No more memcached!
    ✤   Flexibility of a schema-optional architecture is very
        developer-friendly
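
The "just add nodes to scale" point can be illustrated with a small consistent-hashing experiment. This is a sketch under stated assumptions, not Formspring's or Cassandra's code: one MD5-derived token per node, hypothetical node names, and 10,000 made-up keys. It shows that when a fifth node joins, the only keys that change owner are the ones the new node takes over, so no application-level resharding is involved.

```python
import hashlib
from bisect import bisect_left


def token(value: str) -> int:
    """Map a string to a ring position (MD5 is used only for illustration)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


def owner(nodes, key):
    """Primary owner: the first node whose token is at or after the key's token."""
    ring = sorted((token(n), n) for n in nodes)
    tokens = [t for t, _ in ring]
    index = bisect_left(tokens, token(key)) % len(ring)
    return ring[index][1]


before = ["node1", "node2", "node3", "node4"]  # hypothetical cluster
after = before + ["node5"]                      # one node added
keys = [f"user{i}" for i in range(10_000)]      # made-up keys

# Keys whose primary owner differs once node5 has joined.
moved = [k for k in keys if owner(before, k) != owner(after, k)]
print(f"{len(moved)} of {len(keys)} keys changed owner")
print("new owners of moved keys:", {owner(after, k) for k in moved})
```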
Big data




 Analytics (Hadoop)     ?     Realtime (“NoSQL”)
The evolution of Analytics




            Analytics + Realtime
The evolution of Analytics




                   replication




       Analytics                 Realtime
The evolution of Analytics




                  ETL
Big data




 Analytics (Hadoop)    DataStax Enterprise    Realtime (“NoSQL”)
DataStax Enterprise re-unifies
realtime and analytics
Portfolio Demo dataflow


Portfolios                Portfolios
Historical Prices         Live Prices for today
Intermediate Results
Largest loss              Largest loss
Operations

✤   “Vanilla” Hadoop
    ✤   8+ services to set up, monitor, back up, and recover
        (NameNode, SecondaryNameNode, DataNode, JobTracker, TaskTracker,
        ZooKeeper, Region Server,...)
    ✤   Single points of failure
    ✤   Can't separate online and offline processing

✤   DataStax Enterprise
    ✤   Single, simplified component
    ✤   Self-organizes based on workload
    ✤   Peer to peer
    ✤   JobTracker failover
Managing & Monitoring Big Data
✤   DataStax OpsCenter manages and monitors all
    Cassandra and Hadoop operations
Questions?
