High Throughput Analytics with Cassandra & Azure
Charles Lamanna
Principal Dev Lead
@clamanna

  1. High Throughput Analytics with Cassandra & Azure
     Charles Lamanna, Principal Dev Lead, @clamanna
  2. MetricsHub: keep cloud services up and running for the lowest possible cost
  3. Live Status; Cost Awareness; Alerts and Notifications; Actions and Scaling
  4. 2000+ customers in 6 months
     [Chart: customer count over time; y-axis 0 to 2500, x-axis 10/18/2012 to 6/25/2013]
  5. Storing data
     • 200M data points per hour
     • 80,000 data points per second (peak)
  6. Planning for huge data ingestion rates
     Requires high scale, real-time data
     • 1,000 data points per minute per VM
     • 12 data points per endpoint per minute
     • Aggregate, analyze and take actions based on this data stream (in near real-time)
     • Must be cheap, scalable and reliable
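To put the ingestion figures from slides 5 and 6 side by side, here is a small arithmetic sketch in Python. The three constants are the numbers quoted on the slides; the "VM equivalent" figure is only an illustration that assumes every data point came from a VM, which the slides do not claim.

```python
# Figures quoted on the slides.
POINTS_PER_HOUR = 200_000_000
PEAK_POINTS_PER_SECOND = 80_000
POINTS_PER_VM_PER_MINUTE = 1_000

# Average rate implied by the hourly volume.
avg_per_second = POINTS_PER_HOUR / 3600

# How far the quoted peak sits above that average.
peak_to_avg = PEAK_POINTS_PER_SECOND / avg_per_second

# Illustration only (assumption): VMs needed to produce the hourly volume
# if VMs were the sole source of data points.
vm_equivalent = POINTS_PER_HOUR / 60 / POINTS_PER_VM_PER_MINUTE

print(f"average rate: {avg_per_second:,.0f} points/sec")
print(f"peak / average: {peak_to_avg:.2f}")
print(f"VM equivalent: {vm_equivalent:,.0f}")
```

The hourly volume works out to roughly 55,600 points per second on average, so the quoted peak is a little under one and a half times the average rate.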
  7. Scales fluidly
     • Grows horizontally: double the nodes, double capacity
     • Add / remove capacity / nodes with no downtime
     Highly available
     • No single point of failure
     • Replication factor (i.e. hot copies) is just a config switch
  8. … and by the way
     Little-to-no operations cost
     • New nodes take minutes to set up
     • Nodes just keep running for months on end
     “Aggregate on write”: no jobs required!
     • Distributed counters make it easy to do aggregates on write
     …and a nice kicker: has *great* perf / COGS in Azure
  9. Architecture: 68 virtual machines (PAAS and IAAS)
  10. [Architecture diagram] Labels: End User Web Browsers; Monitored Customer Resources (e.g. websites; SQL databases); Monitored Virtual Machines; Endpoints; Portal Web Role (3 instances); Web API Web Role (8 instances); Jobs Worker Role (24 instances); Cassandra VM Cluster (32 XL instances); Table Storage; SQL Database; Replicated data in multiple datacenters. Groupings shown: Clients, PaaS, IaaS, Services.
  11. Avoiding state
     • Application logic / code all lives on stateless machines
     • Keeps it simple: decreases human operations cost
     • Use Azure PAAS offerings (Web and Worker roles)
     [Architecture diagram, PaaS highlighted] Labels: Table Storage; Jobs Worker Role (24 instances); SQL Database; Blob storage; Portal Web Role (3 instances); Cassandra VM Cluster (32 XL instances); Web API Web Role (8 instances); Endpoints; Replicated data in multiple datacenters.
  12. Azure Cloud Services (PAAS)
     • Scale horizontally (grew from 1 to 30+ instances)
     • Managed by the platform (patched; coordinated recycling; failover; etc.)
     • 1 click deployment from Visual Studio (with automatic load balancer swaps)
  13. Cassandra Cluster
     • Maintains all state for metrics / time series data
     • 32 XL Linux Virtual Machines
     [Architecture diagram, IaaS highlighted] Labels: Table Storage; Jobs Worker Role (24 instances); SQL Database; Blob storage; Web API Web Role (8 instances); Endpoints; Replicated data in multiple datacenters; Portal Web Role (3 instances); Cassandra VM Cluster (32 XL instances).
  14. 32 nodes, 8 “pods” of 4 nodes
  15. Exposing the pods
     • Each pod of 4 nodes has a single load balanced endpoint
     • Clients (on our stateless roles) treat the endpoints as a pool
     • Blacklists and skips an endpoint if it starts producing a lot of errors
     [Diagram: two pods, each labeled “Exposed via a single endpoint”]
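The client-side pooling described on slide 15 could look roughly like the Python sketch below. This is not the MetricsHub client; the class name, the `max_errors` threshold, and the reset-on-success rule are all assumptions made for illustration. A production version would probably also expire blacklist entries after some time.

```python
class EndpointPool:
    """Round-robin over pod endpoints, skipping any with too many errors."""

    def __init__(self, endpoints, max_errors=3):
        self.endpoints = list(endpoints)
        self.max_errors = max_errors  # hypothetical blacklist threshold
        self.errors = {endpoint: 0 for endpoint in self.endpoints}
        self._cursor = 0

    def pick(self):
        """Return the next endpoint that is not blacklisted."""
        # Look at each endpoint at most once, starting from the cursor.
        for _ in range(len(self.endpoints)):
            endpoint = self.endpoints[self._cursor]
            self._cursor = (self._cursor + 1) % len(self.endpoints)
            if self.errors[endpoint] < self.max_errors:
                return endpoint
        raise RuntimeError("all endpoints are blacklisted")

    def report_error(self, endpoint):
        """Count a failed request against an endpoint."""
        self.errors[endpoint] += 1

    def report_success(self, endpoint):
        """A successful request clears the endpoint's error count."""
        self.errors[endpoint] = 0
```

With eight pods, a caller would build the pool from the eight load-balanced endpoint addresses, call `pick()` before each request, and report the outcome afterwards.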
  16. Where does the data go?
     • Data files are on 16 mounted network backed disks (*not* ephemeral disks)
     • Data disks are geo-replicated (3 copies local; 1 remote) for “free” DR
     • Azure data disks offer great throughput (VMs end up CPU bound)
  17. Our Column Families (CQL 3)
         CREATE TABLE oneminute (
             rk text,
             ck text,
             cnt counter,
             sum counter,
             PRIMARY KEY (rk, ck)
         );
  18. Updating values…
     Realtime “average” values at any granularity, for any time window
         update oneminute/tenminute/oneday
         set sum = sum + {sample_value}, cnt = cnt + 1
         where rk = '{customer+metric}' and ck = '{tags_and_timestamp}'
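The "aggregate on write" idea from slides 17 and 18 can be simulated in memory. The Python sketch below is only a stand-in for the three counter updates: the dictionary replaces the Cassandra tables, and the key layout (`customer:metric` for `rk`, `tags:zero-padded-bucket-start` for `ck`) is an assumption, since the slides show only placeholders.

```python
from collections import defaultdict

# Bucket width in seconds for each table named on the slide.
BUCKET_SECONDS = {"oneminute": 60, "tenminute": 600, "oneday": 86400}

# table -> (rk, ck) -> [sum, cnt]; stands in for the two counter columns.
store = {table: defaultdict(lambda: [0, 0]) for table in BUCKET_SECONDS}


def record(customer, metric, tags, timestamp, value):
    """Apply one sample to every granularity, like the three updates."""
    rk = f"{customer}:{metric}"  # hypothetical key layout
    for table, width in BUCKET_SECONDS.items():
        bucket_start = timestamp - timestamp % width
        ck = f"{tags}:{bucket_start:010d}"  # hypothetical key layout
        cell = store[table][(rk, ck)]
        cell[0] += int(value)  # Cassandra counter columns hold integers
        cell[1] += 1


def average(table, rk, ck):
    """Average for one bucket, or None if nothing was written to it."""
    cell = store[table].get((rk, ck))
    if cell is None:
        return None
    total, count = cell
    return total / count
```

Because each write increments `sum` and `cnt` for every granularity, the average for any bucket is a single division at read time and no batch job has to recompute it.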
  19. Reading values…
     *ONE* round trip to fetch a metric over time (e.g. CPU over past week)
         select * from oneminute
         where rk = '{customer_name}'
           and ck < '{metric_path_start}'
           and ck >= '{metric_path_end}'
         order by ck desc;
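The single-round-trip read on slide 19 is a range slice over one partition's clustering keys. The Python sketch below mimics that with a sorted list. It is an illustration, not driver code: the row tuples and the zero-padded timestamp suffix are assumptions. The padding matters because `ck` is a `text` column, so keys compare as strings rather than as numbers.

```python
from bisect import bisect_left


def range_slice(rows, low, high):
    """Rows with low <= ck < high, newest first.

    `rows` is a list of (ck, sum, cnt) tuples sorted by ck, standing in
    for one partition of the oneminute table.
    """
    keys = [row[0] for row in rows]
    start = bisect_left(keys, low)
    end = bisect_left(keys, high)
    return rows[start:end][::-1]


def averages(rows):
    """Turn (ck, sum, cnt) rows into (ck, average) pairs, skipping empty buckets."""
    return [(ck, total / count) for ck, total, count in rows if count]
```

Fetching a week of one-minute CPU values would then be one slice over the partition followed by one division per returned row.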
  20. Some hard lessons…
     • Static private IPs are a must; otherwise, reboots / outages can confuse the cluster when nodes come back up
     • Monitor performance carefully; once you tip over, it is hard to rebalance the cluster and add new nodes
     • Fit the cluster to the platform: in Azure, match the Upgrade Domains / Fault Domains to preserve uptime during service maintenance / hardware failure
  21. Single node tests

     4 disks, RAID 0, no read cache
     Workload (% write)   Ops / sec   Latency median   Latency 95th   Latency 99th
     100%                 20018       1.5              3.7            7.9
     75%                  8361        85.9             376.6          584.8
     25%                  5412        459.9            759.1          940.1

     4 disks, RAID 0, read cache (three result rows per workload)
     Workload (% write)   Ops / sec   Latency median   Latency 95th   Latency 99th
     100%                 19208       1.5              3.8            7.9
     100%                 18543       1.5              3.6            7.9
     100%                 18563       1.4              3.6            8.2
     75%                  7112        195.9            595.8          1099.6
     75%                  7581        168.9            589.5          985.2
     75%                  5149        256.5            774.0          1402.9
     25%                  15358       23.0             110.2          309.1
     25%                  3742        279.2            563.0          789.7
     25%                  15376       22.1             98.8           293.3

     [Chart: JBOD vs RAID 0 for read-heavy workload; axis 0 to 7000]
  22. Multi-node load tests
     • 4 nodes; RF = 3 (Quorum)
     • 8 disks, RAID 0

     Workload (% write)   Ops / sec   Latency median   Latency 95th   Latency 99th
     100%                 13638       1.9              4.9            24.0
     75%                  3239        11.2             687.0          1099.3
     25%                  1825        243.6            687.0          808.7
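For the multi-node setup on slide 22 (RF = 3 at quorum consistency), the number of replicas that must acknowledge each operation follows from Cassandra's quorum rule: a majority of the replication factor. The helper names in this Python sketch are illustrative.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a quorum operation: a strict majority."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that can be down while quorum operations still succeed."""
    return replication_factor - quorum(replication_factor)


print(quorum(3), tolerated_failures(3))
```

With RF = 3, each quorum read or write waits for 2 replicas, and one replica can be unavailable without failing requests.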
  23. Charles Lamanna, Charles.Lamanna@Microsoft.com, @clamanna