For this upcoming meetup, we welcome Patrick Eaton, PhD, Systems Architect at Stackdriver, and Joey Imbasciano, Cloud Platform Engineer at Stackdriver.
What You'll Learn At This Meetup:
• Why Stackdriver chose Cassandra over other DB offerings
• Stackdriver's data pipeline that runs into Cassandra
• Operating Cassandra on AWS
• Stackdriver's approach to disaster recovery
Patrick and Joey will present their use of Apache Cassandra at Stackdriver, share lessons learned and technical tips, and end the evening with a Q&A.
1. Running Cassandra in AWS
Patrick Eaton, PhD
patrick@stackdriver.com
@PatrickREaton
Joey Imbasciano
joey@stackdriver.com
@_joeyi
2. Stackdriver at a Glance
Stackdriver's hosted intelligent monitoring service helps
SaaS companies innovate more by reducing the burden of
day-to-day operations
● Cloud-native and cloud-aware
● Designed for complex distributed applications
● Founded by cloud/infrastructure industry veterans
(Microsoft, VMware, EMC, Endeca, Red Hat) with deep
systems and DevOps expertise
● Team of ~25, based in Downtown Boston
3. Intelligent Monitoring
Discover customer’s cloud-hosted applications
● Infrastructure inventory
● Logical units, like groups/clusters
● Services, hosted and self-managed
● Elastic resources
Monitor
● Various data sources
○ Provider metrics
○ Host metrics
○ Custom metrics
○ Endpoints
○ Events
○ Health
● Rich visualizations
Analyze
● Integrate data sources
● Aggregate metrics
● Report utilization, cost, etc.
● Detect policy violations
● Recommend actions
4. Lambda Architecture
● Typical of modern architectures for on-line applications
● Formalized by Nathan Marz
● Composed of "batch", "speed", and "serving" layers
● Batch layer
○ Store of record
○ Compute arbitrary views
● Speed layer
○ Low latency updates
○ Streaming algorithms
● Serving layer
○ Combine data from batch and speed layers to answer queries
[Diagram labels: Data, Batch, Speed, Serving]
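As a rough illustration of how a serving layer can combine the two paths, the sketch below merges a batch view recomputed from the store of record with a speed view of points the batch run has not yet seen. The function names and the per-metric sum are hypothetical and chosen only for illustration; they are not taken from the talk.

```python
from collections import defaultdict


def build_batch_view(archived_points):
    """Batch layer: recompute a view (here, per-metric sums) from the full store of record."""
    view = defaultdict(float)
    for metric, value in archived_points:
        view[metric] += value
    return dict(view)


def update_speed_view(speed_view, metric, value):
    """Speed layer: incrementally fold in a point the batch view has not yet seen."""
    speed_view[metric] = speed_view.get(metric, 0.0) + value


def serve_query(metric, batch_view, speed_view):
    """Serving layer: combine both views to answer a query."""
    return batch_view.get(metric, 0.0) + speed_view.get(metric, 0.0)


batch_view = build_batch_view([("cpu", 1.0), ("cpu", 2.0), ("disk", 5.0)])
speed_view = {}
update_speed_view(speed_view, "cpu", 0.5)
total_cpu = serve_query("cpu", batch_view, speed_view)
```

When the next batch run absorbs the recent points, the speed view for that window can be discarded, which keeps the low-latency path small.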
5. Stackdriver Architecture
● Shares characteristics of lambda architecture
● Indexing (speed) path
○ Make "live" data available "pre-analysis"
● Analysis (batch) path
○ Compute aggregations
○ Create recommendations
● Query (serving) layer
○ Combine "live" and analyzed data to answer queries
○ May require on-the-fly analysis
● Alerting (speed) path (not discussed here)
○ Stream processing to detect policy-based anomalies
[Diagram labels: Data, Indexing (Speed), Analysis (Batch), Alerting (Speed), Database, Query (Serving), Notification (Serving)]
6. Database Options
● We chose Cassandra!
○ True P2P architecture
○ Good support for write-heavy workloads
○ Compatible data model for time series data
■ Column per metric type, timestamps as columns
● Why not MySQL?
○ Experience with operating large, sharded deployments
○ Relational data model not a good match
● Why not HBase?
○ Operational complexity - zk, hadoop, hdfs, ...
○ Special "Master" role
● Why not Dynamo?
○ Avoid vendor lock-in and high cost
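To make the "timestamps as columns" idea concrete, here is a minimal in-memory sketch of a wide-row layout: one row per series, with columns kept sorted by timestamp so a time range is a contiguous slice. The row key shape (resource, metric type) and the class name are assumptions for illustration; the slides do not specify the actual schema.

```python
import bisect


class WideRowStore:
    """In-memory stand-in for a wide-row layout: one row per series, one column per timestamp."""

    def __init__(self):
        self._rows = {}  # row key -> (sorted list of timestamps, parallel list of values)

    def append(self, row_key, timestamp, value):
        """Insert a point, keeping the row's columns ordered by timestamp."""
        timestamps, values = self._rows.setdefault(row_key, ([], []))
        index = bisect.bisect_left(timestamps, timestamp)
        timestamps.insert(index, timestamp)
        values.insert(index, value)

    def slice(self, row_key, start, end):
        """Return (timestamp, value) pairs with start <= timestamp < end."""
        timestamps, values = self._rows.get(row_key, ([], []))
        lo = bisect.bisect_left(timestamps, start)
        hi = bisect.bisect_left(timestamps, end)
        return list(zip(timestamps[lo:hi], values[lo:hi]))


store = WideRowStore()
row_key = ("i-hypothetical", "cpu_usage")  # hypothetical (resource, metric type) key
for ts, value in [(120, 0.7), (60, 0.4), (180, 0.9)]:
    store.append(row_key, ts, value)
recent = store.slice(row_key, 60, 180)
```

The point of the layout is that a query for one series over a time window touches a single row and reads adjacent columns, which suits append-heavy time series workloads.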
7. Stackdriver Architecture ++
● Archival pipeline stores all data
○ Very small surface area, battle-tested
○ Critical for disaster recovery
○ S3 considered durable enough
○ Replicated for availability
● Archive means Cassandra is "soft state"
● C* consolidates analysis and indexing results
● Properties of data in C*
○ Immutable data
○ Append-only
○ Read-1, write-1 consistency
● Scales out easily
○ Indexers, archivers, analyzers, query servers
[Diagram labels: Data, Archive, S3, Index, Analyze, Cassandra (Inventory, Data Series, Roll-ups, Analysis, Recs), Query]
8. Cassandra at Stackdriver Cluster Configuration
● Version: DataStax Community Edition 1.2.10
● Replication Factor: 3
● Vnodes
● Murmur3Partitioner
● Ec2Snitch
○ Aids in request efficiency
○ Enables Cassandra to ensure replicas are in different Availability Zones
● phi_convict_threshold: 8 -> 12
○ Used to determine when nodes are down
○ AWS network can be spotty
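To give a feel for what raising phi_convict_threshold from 8 to 12 does, the sketch below uses a simplified exponential model of a phi accrual failure detector. This is an approximation for intuition only; Cassandra's actual implementation and heartbeat timing may differ, and the 1-second mean heartbeat interval is an assumption.

```python
import math


def phi(seconds_since_last_heartbeat, mean_heartbeat_interval):
    """Suspicion level under an exponential inter-arrival model.

    phi = -log10(P(no heartbeat for this long)) = t / (mean * ln 10)
    """
    return seconds_since_last_heartbeat / (mean_heartbeat_interval * math.log(10))


def silence_tolerated(threshold, mean_heartbeat_interval):
    """Seconds of silence before phi reaches the given conviction threshold."""
    return threshold * mean_heartbeat_interval * math.log(10)


# Assumed mean heartbeat interval of 1 second (illustrative, not from the slides).
before = silence_tolerated(8, 1.0)
after = silence_tolerated(12, 1.0)
```

Under these assumptions the tolerated silence grows by half, from roughly 18.4 to roughly 27.6 seconds, so short network hiccups are less likely to mark a healthy node as down, at the cost of slower detection of real failures.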
9. Cassandra Topology in AWS
Where we started...
Where we are...
[Diagrams: cluster nodes placed across availability zones us-east-1a, us-east-1b, and us-east-1c]
Keep it balanced!
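One simple way to keep a cluster balanced across availability zones is to place each new node in the zone that currently has the fewest nodes. The sketch below shows that rule; the function names and the tie-break by zone name are hypothetical choices, not something described in the slides.

```python
from collections import Counter


def pick_zone_for_new_node(existing_node_zones, candidate_zones):
    """Choose the least-populated zone; break ties by zone name for determinism."""
    counts = Counter(existing_node_zones)
    return min(candidate_zones, key=lambda zone: (counts[zone], zone))


def is_balanced(node_zones, candidate_zones):
    """True when no zone holds more than one node above any other zone."""
    counts = Counter(node_zones)
    sizes = [counts[zone] for zone in candidate_zones]
    return max(sizes) - min(sizes) <= 1


zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
nodes = ["us-east-1a", "us-east-1a", "us-east-1b", "us-east-1c"]
next_zone = pick_zone_for_new_node(nodes, zones)
```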
10. Cassandra EC2 Node Configuration
● m1.xlarge
○ 4 cores
○ 15 GB RAM
○ 4 ephemeral disks available
● 4 disks RAID-0 for Data Volume and CommitLog
○ ext4 - defaults,noatime
○ mdadm RAID-0
○ Compactions
○ Heavy Read/Write IO
11. Cassandra Automation and Operations
● Combination of Boto, Fabric, & Puppet
○ Boto for AWS API
○ Fabric + Puppet for Bootstrapping
○ Fabric for Operations
● One command to:
○ Launch a new cluster
○ Upsize a cluster
○ Replace a dead node
○ Remove existing nodes
○ List nodes in a cluster
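As a sketch of what a "list nodes in a cluster" helper might look like, the function below queries EC2 for running instances carrying a cluster tag. It is written against the boto3 `describe_instances` call and response shape, which is an assumption (the talk mentions Boto but not a version), and the `cluster` tag key, cluster name, and instance IDs are hypothetical. A small stand-in client is included so the example runs without AWS access; pagination is ignored for brevity.

```python
def list_cluster_nodes(ec2_client, cluster_name, tag_key="cluster"):
    """Return sorted (instance id, availability zone) pairs for running tagged instances."""
    response = ec2_client.describe_instances(
        Filters=[
            {"Name": "tag:" + tag_key, "Values": [cluster_name]},
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )
    nodes = []
    for reservation in response["Reservations"]:
        for instance in reservation["Instances"]:
            nodes.append(
                (instance["InstanceId"], instance["Placement"]["AvailabilityZone"])
            )
    return sorted(nodes)


class FakeEc2Client:
    """Stand-in for an EC2 client so the example runs offline; returns a canned response."""

    def describe_instances(self, Filters):
        return {
            "Reservations": [
                {"Instances": [{"InstanceId": "i-0bbb",
                                "Placement": {"AvailabilityZone": "us-east-1b"}}]},
                {"Instances": [{"InstanceId": "i-0aaa",
                                "Placement": {"AvailabilityZone": "us-east-1a"}}]},
            ]
        }


nodes = list_cluster_nodes(FakeEc2Client(), "cassandra-demo")
```

With boto3 installed and credentials configured, the same function could be called with `boto3.client("ec2", region_name="us-east-1")` in place of the stand-in.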
13. Cassandra Backups using S3
● No Cassandra Powered Backups
● Restore from S3
● Useful for major version upgrades
[Diagram labels: Data, S3, Bulk Loader, Map Reduce, Cassandra]
1. Data is archived when it is received
2. Bulk loader reads from S3
3. M/R re-analyzes data
4. Cassandra is repopulated
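The restore steps above can be sketched in miniature: because archived points are immutable and append-only, replaying the archive and re-running analysis rebuilds the same database contents, even if some points are loaded twice. The in-memory structures and the per-minute average are hypothetical stand-ins for S3, the bulk loader, and the map/reduce job.

```python
def analyze(points):
    """Stand-in for the map/reduce step: average value per (metric, minute)."""
    sums = {}
    for metric, timestamp, value in points:
        key = (metric, timestamp // 60)
        total, count = sums.get(key, (0.0, 0))
        sums[key] = (total + value, count + 1)
    return {key: total / count for key, (total, count) in sums.items()}


def restore(archive):
    """Rebuild database contents purely from the archive."""
    database = {"raw": set(), "rollups": {}}
    for point in archive:                 # bulk loader reads archived points
        database["raw"].add(point)        # immutable points: re-adding is harmless
    database["rollups"] = analyze(database["raw"])  # re-analysis of the loaded data
    return database


archive = [("cpu", 0, 1.0), ("cpu", 30, 3.0), ("cpu", 60, 5.0)]
restored = restore(archive)
restored_twice = restore(archive + archive)  # duplicate loads give the same result
```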
14. Disaster Recovery in the Wild
● October 23, Stackdriver suffered a total loss of our C* cluster
○ Exhausted memory due to number of open file descriptors (see graph)
● We did not notice the problem until it was too late
○ Nodes began crashing, resulted in inconsistent view of the ring
● Attempted to restart the cluster unsuccessfully for ~2 hours
● Provisioned new 36 node cluster in ~2 hours
● Directed “live” data to new cluster
● Started bulk restore operation from archive
○ Full-fidelity data and aggregations
● No data loss due to archival pipeline
● See http://www.stackdriver.com/post-mortem-october-23-stackdriver-outage/
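Since the failure above went unnoticed until it was too late, one plausible safeguard is an alert on file descriptor usage relative to the process limit. The sketch below is a generic, Linux-only illustration and not the monitoring Stackdriver put in place; the 80% threshold and function names are assumptions, and checking a Cassandra process would mean reading that process's `/proc/<pid>/fd` rather than the current process.

```python
import os
import resource


def fd_usage_ratio(open_count, soft_limit):
    """Fraction of the descriptor limit in use; 0.0 when the limit is unlimited."""
    if soft_limit <= 0:  # RLIM_INFINITY is reported as a non-positive value on Linux
        return 0.0
    return open_count / soft_limit


def should_alert(open_count, soft_limit, threshold=0.8):
    """True when descriptor usage has reached the (assumed) alert threshold."""
    return fd_usage_ratio(open_count, soft_limit) >= threshold


def current_process_fd_usage():
    """Linux-only: count entries in /proc/self/fd and read the soft RLIMIT_NOFILE."""
    open_count = len(os.listdir("/proc/self/fd"))
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    return open_count, soft_limit
```

A periodic job could call `should_alert(*current_process_fd_usage())` and page an operator well before the limit is reached.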