CDE
• Cloud Database Engineering
• Responsible for providing data stores as services @ Netflix
CDE Services
Agenda
• Cassandra @ Netflix
• Challenges
• Certification and benchmarking
• CDE Architecture
Cassandra @ Netflix
• 98% of streaming data is stored in Cassandra
• Data ranges from customer details to viewing history / streaming bookmarks to billing and payment
Cassandra Footprint
• Hundreds of clusters
• Tens of thousands of nodes
• PBs of data
• Millions of transactions / sec
Challenges
• Monitoring
• Maintenance
• Open source product
• Production readiness
Monitoring
• What do we monitor?
– Latencies
• Coordinator read latency at the 99th and 95th percentiles, based on cluster configuration
• Coordinator write latency at the 99th and 95th percentiles, based on cluster configuration
– Health
• Health check (Powered by Mantis)
• Gossip issues
• Thrift / binary service status
• Heap
• dmesg: hardware and network issues
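As a rough illustration of the latency checks listed above, the sketch below computes nearest-rank percentiles over a batch of coordinator latencies and flags any that exceed a per-cluster limit. The sample values, the limits, and the function names are assumptions made for this example; the slides do not say how the percentiles are computed or which thresholds are used.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def check_latency(samples_ms, thresholds_ms):
    """Return {percentile: observed value} for every percentile whose
    observed value exceeds its configured limit."""
    breaches = {}
    for pct, limit in thresholds_ms.items():
        value = percentile(samples_ms, pct)
        if value > limit:
            breaches[pct] = value
    return breaches


# Hypothetical coordinator read latencies in milliseconds.
read_latencies_ms = [2, 3, 3, 4, 4, 5, 5, 6, 7, 8,
                     9, 10, 12, 15, 18, 20, 25, 40, 80, 250]

# Hypothetical per-cluster limits: 95th under 100 ms, 99th under 200 ms.
cluster_thresholds_ms = {95: 100, 99: 200}

print(check_latency(read_latencies_ms, cluster_thresholds_ms))
```

Keeping the limits in a per-cluster mapping mirrors the "based on cluster configuration" wording: a latency-sensitive cluster and a batch cluster can share the same check with different numbers.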
Monitoring
• Recent maintenance
– Jenkins
– User-initiated maintenance
• Wide row metrics
• Log file warnings / errors / exceptions
Common Approach
[Diagram: a CRON system dispatching to four Job Runners]
Common Architecture
Problems inherent in polling
● Point-in-time snapshot, no state
● Establishing a connection to a cluster when it’s under heavy load is problematic
● Not resilient to network hiccups, especially for large clusters
A different approach
What if we had a continuous stream of fine-grained snapshots?
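To make the contrast with polling concrete, here is a minimal sketch. A single poll sees one snapshot and cannot tell a momentary blip from a sustained failure, while a consumer of a snapshot stream can keep a short window of history. The class name, window size, and threshold are hypothetical and are not taken from the talk.

```python
from collections import deque


def poll_once(snapshot):
    """Polling view: one point-in-time snapshot, no history."""
    return "UP" if snapshot["up"] else "DOWN"


class WindowedView:
    """Streaming view: remembers the last `size` snapshots of one node and
    reports DOWN only when the window is full and mostly failing."""

    def __init__(self, size=5, down_fraction=0.6):
        self.window = deque(maxlen=size)
        self.down_fraction = down_fraction

    def observe(self, snapshot):
        self.window.append(snapshot["up"])
        full = len(self.window) == self.window.maxlen
        down_ratio = self.window.count(False) / len(self.window)
        return "DOWN" if full and down_ratio >= self.down_fraction else "UP"


# A one-sample network hiccup versus a sustained outage.
blip = [{"up": u} for u in (True, True, False, True, True)]
outage = [{"up": u} for u in (True, False, False, False, False)]

print("poll during blip:", poll_once(blip[2]))

view = WindowedView()
print("stream during blip:", [view.observe(s) for s in blip])

view = WindowedView()
print("stream during outage:", [view.observe(s) for s in outage])
```

The poll that happens to land on the hiccup reports a failure, which is one source of false positives; the windowed view only reports the outage.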
Mantis Streaming System
Stream processing system built on Apache Mesos
– Provides a flexible programming model
– Models computation as a distributed DAG
– Designed for high throughput, low latency
Health Check using Mantis
[Diagram: in each region (us-east-1, us-west-2, eu-west-1) a Source Job feeds a Local Ring Agg; the Local Ring Aggs feed a Global Ring Agg, which emits Scores (S). A Health Evaluator (FSM) consumes the Scores and produces a Health Status. Other labels in the figure: MM.]
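The pipeline in this figure can be sketched roughly as follows. The slide only names the stages (source jobs, local and global ring aggregation, scores, and an FSM-based health evaluator), so the scoring formula, the use of the minimum across regions, the state names, and the threshold below are all assumptions chosen for illustration. This is plain Python, not Mantis code.

```python
from statistics import mean


def local_ring_agg(node_views):
    """Hypothetical regional score: for each node, the fraction of peers it
    sees as up, averaged over all nodes in the region."""
    return mean(sum(view.values()) / len(view) for view in node_views)


def global_ring_agg(regional_scores):
    """Hypothetical global score: the worst regional score, so that one
    unhealthy region is not averaged away."""
    return min(regional_scores.values())


class HealthEvaluator:
    """Small finite state machine that consumes scores. Two consecutive low
    scores are needed to reach UNHEALTHY; recovery also takes two steps."""

    TRANSITIONS = {
        ("HEALTHY", True): "HEALTHY",
        ("HEALTHY", False): "SUSPECT",
        ("SUSPECT", True): "HEALTHY",
        ("SUSPECT", False): "UNHEALTHY",
        ("UNHEALTHY", True): "SUSPECT",
        ("UNHEALTHY", False): "UNHEALTHY",
    }

    def __init__(self, threshold=0.9):
        self.threshold = threshold
        self.state = "HEALTHY"

    def consume(self, score):
        ok = score >= self.threshold
        self.state = self.TRANSITIONS[(self.state, ok)]
        return self.state


# Each dict is one node's gossip view: peer name -> is the peer up?
us_east_1 = [{"n1": True, "n2": True}, {"n1": True, "n2": True}]
eu_west_1 = [{"n1": True, "n2": False}, {"n1": True, "n2": True}]

score = global_ring_agg({
    "us-east-1": local_ring_agg(us_east_1),
    "eu-west-1": local_ring_agg(eu_west_1),
})

evaluator = HealthEvaluator()
for s in (score, score, 1.0, 1.0):
    print(s, evaluator.consume(s))
```

Because the evaluator holds state between scores, a single low score moves the cluster to SUSPECT rather than straight to an alert, which is the kind of behaviour a stateless poll cannot express.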
That’s great, but...
Now the health of the fleet is encapsulated in a single data stream, so how do we make sense of that?
Real Time Dash (Macro View)
Macro View of the fleet
Real Time Dash (Cluster View)
Real Time Dash (Perspective)
Benefits
● Faster detection of issues
● Greater accuracy
● Massive reduction in false positives
● Separation of concerns (decouples detection from remediation)
Known problems
• Distributed persistent stores (Not stateless)
• Unresponsive nodes
• Cloud
• Configuration setup and tuning
• Hot nodes / token distribution
• Resiliency
Building C* in cloud with Priam
• Bootstrapping and automated token assignment
• Backup and recovery/restore
• Centralized configuration management
• REST API for most nodetool commands
• C* JMX metrics collection
• Monitor C* health
Putting it all together
Constructing a cluster in AWS
(1) Alternate availability zones (a, b, c) around the ring to ensure data is written to multiple data centers.
(2) Survive the loss of a data center by ensuring that we only lose one node from each replication set.
[Ring diagram: twelve nodes labelled A, B, C, A, B, C, ... around the ring]
Priam runs on each node and will:
* Assign tokens to each node, alternating (1) the AZs around the ring (2).
* Perform nightly snapshot backup to S3
* Perform incremental SSTable backups to S3
* Bootstrap replacement nodes to use vacated tokens
* Collect JMX metrics for our monitoring systems
* REST API calls to most nodetool functions
[Instance diagram: Cassandra, Priam, and Tomcat on each instance, with S3 alongside. The AMI contains the OS, base Netflix packages, Cassandra, and Priam.]
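The token layout described above can be sketched as below. This is not Priam's actual algorithm. It assumes the Murmur3 partitioner's token range, evenly spaced tokens, and a simplified replica walk that takes the next `rf` nodes clockwise; the function names are made up for the example.

```python
MIN_TOKEN = -2**63   # assumption: Murmur3Partitioner range starts at -2^63
RING_SIZE = 2**64    # assumption: 2^64 possible tokens


def assign_tokens(nodes_per_az, azs=("a", "b", "c")):
    """Space tokens evenly around the ring and alternate AZs slot by slot,
    giving a, b, c, a, b, c, ... in token order."""
    total = nodes_per_az * len(azs)
    ring = []
    for slot in range(total):
        token = MIN_TOKEN + slot * RING_SIZE // total
        ring.append((token, azs[slot % len(azs)]))
    return ring


def replica_azs(ring, slot, rf=3):
    """Simplified placement: the AZs of the rf consecutive nodes starting
    at the given slot, walking clockwise and wrapping around."""
    return [ring[(slot + i) % len(ring)][1] for i in range(rf)]


ring = assign_tokens(nodes_per_az=4)
for token, az in ring:
    print(f"{token:>21d}  az={az}")

# With RF equal to the number of AZs, every replica set spans all three AZs,
# so losing one AZ removes exactly one node from each replica set.
assert all(sorted(replica_azs(ring, s)) == ["a", "b", "c"]
           for s in range(len(ring)))
```

The final assertion is the property stated in point (2): with alternating zones and RF=3, no replica set has two members in the same zone.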
Constructing a cluster in AWS
AWS Terminology

Address         DC       Rack  Status  State   Load       Owns    Token
                                                                  …
###.##.##.###   eu-west  1a    Up      Normal  108.97 GB  16.67%  …
###.##.#.##     us-east  1e    Up      Normal  103.72 GB  0.00%   …
##.###.###.###  eu-west  1b    Up      Normal  104.82 GB  16.67%  …
##.##.##.###    us-east  1c    Up      Normal  111.87 GB  0.00%   …
###.##.##.###   us-east  1e    Up      Normal  102.71 GB  0.00%   …
##.###.###.###  eu-west  1b    Up      Normal  101.87 GB  16.67%  …
##.##.###.##    us-east  1c    Up      Normal  102.83 GB  0.00%   …
###.##.###.##   eu-west  1c    Up      Normal  96.66 GB   16.67%  …
##.##.##.###    us-east  1d    Up      Normal  99.68 GB   0.00%   …
##.###.##.###   eu-west  1c    Up      Normal  95.51 GB   16.67%  …
##.##.##.##     us-east  1d    Up      Normal  105.85 GB  0.00%   …
##.###.##.###   eu-west  1a    Up      Normal  91.25 GB   16.67%  …

Callouts on the ring output: Instance, Region, Availability Zone (AZ)
Autoscaling Groups: ASGs do not map directly to nodetool ring output, but are used to define the cluster (# of instances, AZs, etc.).
Amazon Machine Image: Image loaded onto an AWS instance; all packages needed to run an application.
Security Group: Defines access control between ASGs.
Resiliency
• Instance
• AZ
• Multiple AZ
• Region
Resiliency - Instance
• RF=AZ=3
• Cassandra bootstrapping works really well
• Replace nodes immediately
• Repair at regular intervals
Resiliency - One AZ
• RF=AZ=3
• Alternating AZs ensures that each AZ has a full replica of data
• Provision cluster to run at 2/3 capacity
• Ride out a zone outage; do not move to another zone
• Bootstrap one node at a time
• Repair after recovery
Resiliency - Multiple AZ
• Outage; can no longer satisfy quorum
• Restore from backup and repair
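The one-AZ and multiple-AZ cases above follow from simple arithmetic, sketched here. Cassandra's quorum is floor(RF / 2) + 1; the helper names and the assumption of exactly one replica per AZ (RF=AZ=3, as on the slides) are choices made for this example.

```python
from fractions import Fraction


def quorum(rf):
    """Replicas that must respond for a QUORUM read or write."""
    return rf // 2 + 1


def can_satisfy_quorum(rf, azs_lost, replicas_per_az=1):
    """True if enough replicas survive the loss of `azs_lost` zones."""
    surviving = rf - azs_lost * replicas_per_az
    return surviving >= quorum(rf)


def utilization_after_az_loss(utilization, azs=3, azs_lost=1):
    """Load on the surviving nodes if the same traffic is spread over
    fewer zones. Fractions keep the arithmetic exact."""
    return utilization * azs / (azs - azs_lost)


print("quorum for RF=3:", quorum(3))
print("one AZ lost, quorum possible:", can_satisfy_quorum(3, azs_lost=1))
print("two AZs lost, quorum possible:", can_satisfy_quorum(3, azs_lost=2))
print("load after losing one AZ at 2/3 utilization:",
      utilization_after_az_loss(Fraction(2, 3)))
```

With RF=3 the quorum is 2, so one lost zone still leaves two replicas, while two lost zones leave one and quorum fails. Running at 2/3 capacity means the two surviving zones end up at full utilization rather than overloaded.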
Resiliency - Region
• Connectivity loss between regions – operate as island clusters until service restored
• Repair data between regions
NDBench - Netflix Data Benchmark
Stitching it together
C* as a Service - Architecture
[Architecture diagram. Components shown: Jenkins, Winston, Eunomia, Alert, Atlas, Mantis, C* nodes with Priam, Bolt, Cluster Metadata, Cluster Metadata/Advisor, Maintenance, Remediation, Page CDE, Alert if needed, Capacity Prediction, Outlier detection, C* Forklifter, NDBench, C* Explorer, Client Drivers, Log analysis]
Cassandra serving netflix @ scale
