Cassandra Community Webinar: CMB - An Open Message Bus for the Cloud
 


At Comcast Silicon Valley we have developed a general purpose message bus for the cloud. The service is API compatible with Amazon’s SQS/SNS and is built on Cassandra and Redis with the goal of linear horizontal scalability. In this Webinar we will explore the architecture of the system and how we employ Cassandra as a central component to meet key requirements. We will also take a look at the latest performance numbers.

    Presentation Transcript

    • CMB: A Message Bus for the Cloud
      • CQS – Queuing Service
      • CNS – Topic based Pub Sub Service
    • Why did we build our own?
      • General purpose message bus to replace project-driven one-off solutions
      • Smooth data center failover, maybe even “active-active” queues
      • Must scale to millions of queues and 1000s of messages/sec (for example 1 queue per STB)
      • Tight latency requirements (“10ms response time 95th pct”)
      • Evaluated other options to arrive at AWS SQS/SNS
    • AWS SQS Primer: “Simple Queuing Service”
      • Focus on guaranteed delivery
      • Best effort on orderly delivery, duplicates
      • Few simple core APIs:
        – CreateQueue() / DeleteQueue()
        – SendMessage()
        – ReceiveMessage()
        – DeleteMessage()
      • Do not trust message recipients
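The "do not trust message recipients" rule is easiest to see in a toy model. The following is a minimal in-memory sketch, not CQS or AWS code: the class name `ToyQueue`, the injectable `clock`, and the use of the message ID in place of a receipt handle are all assumptions made for illustration. It shows that a received message is only hidden for the visibility timeout and comes back unless the receiver deletes it explicitly. Unlike SQS, which is best effort on ordering, this toy is strictly FIFO.

```python
import itertools
import time
from collections import OrderedDict


class ToyQueue:
    """In-memory stand-in for the send/receive/delete contract (illustrative only)."""

    def __init__(self, visibility_timeout=30.0, clock=time.monotonic):
        self.visibility_timeout = visibility_timeout
        self.clock = clock
        self._ids = itertools.count(1)
        self._messages = OrderedDict()  # message id -> body, in send order
        self._hidden_until = {}         # message id -> time it becomes visible again

    def send_message(self, body):
        msg_id = next(self._ids)
        self._messages[msg_id] = body
        return msg_id

    def receive_message(self):
        now = self.clock()
        for msg_id, body in self._messages.items():
            if self._hidden_until.get(msg_id, float("-inf")) <= now:
                # Hide the message rather than remove it: the receiver may crash.
                self._hidden_until[msg_id] = now + self.visibility_timeout
                return msg_id, body
        return None  # nothing visible right now

    def delete_message(self, msg_id):
        self._messages.pop(msg_id, None)
        self._hidden_until.pop(msg_id, None)


now = [0.0]                      # fake clock so the example is deterministic
queue = ToyQueue(visibility_timeout=30.0, clock=lambda: now[0])
first_id = queue.send_message("MSG1")
queue.send_message("MSG2")
got = queue.receive_message()    # MSG1 is handed out and hidden
now[0] = 31.0                    # the receiver never deleted it; the timeout passes
again = queue.receive_message()  # MSG1 is visible again and redelivered
queue.delete_message(first_id)   # only an explicit delete removes it
after_delete = queue.receive_message()
```

Redelivery after the timeout is what makes delivery at-least-once, and it is also where the duplicates mentioned on the slide come from.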
    • Why did we build our own? AWS SQS scorecard:
      • Guaranteed Delivery: +
      • Simple, Standard API: +
      • Horizontally Scalable: +
      • Active-Active: ?
      • DC Failover: ?
      • Latency: ?
      • Limitations (Msg Size, # Artifacts, …): ?
    • “Build a horizontally scalable queuing service on top of Cassandra (and Redis) which is API compatible with AWS SQS API”
    • CQS over Cassandra & Redis
      • Cassandra
        – Cross-DC persistence and replication
        – Proven horizontal scalability
      • Redis
        – Meet latency requirements
        – Help with best effort ordering
        – Handle Visibility Timeout (VTO)
    • Cassandra Data Modeling
      • How to represent queued messages in Cassandra?
        – Single Column Queue
        – Single Row Queue
        – Multi-Row Queue
    • Cassandra Data Modeling: Single Column Queue
    • Cassandra Data Modeling: Single Row Queue
    • Cassandra Data Modeling: Multi-Row Queue
    • CQS Data Flow Example
      1. SendMessage(MSG1)
      2. SendMessage(MSG2)
      3. SendMessage(MSG3)
      4. MSG1=ReceiveMessage()
      5. DeleteMessage(MSG1)
    • CQS Architecture Recap
      • Cassandra Persistence Layer
        – Messages sharded across 100 rows per queue
          • Avoid wide rows (> 500K)
          • Minimize churn (Tombstones)
          • Distribute queue among Cassandra nodes
      • Redis Caching Layer
        – To meet latency requirements
          • Payload cache (kicks in after first miss, pre-load next 10k)
        – Improve FIFOness by storing Msg IDs in Redis List
        – Handle message visibility entirely in Redis (Hashtable)
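To make the "100 rows per queue" idea concrete, here is a small sketch of how messages might be spread over row shards. The row-key format (`<queue>_<shard>`) and the uniformly random shard choice are assumptions for illustration; the slides do not say how CQS actually names rows or picks a shard.

```python
import random
from collections import Counter

NUM_SHARDS = 100  # "100 rows per queue" from the slide


def row_key(queue_id, shard):
    """Hypothetical row-key format: one Cassandra row per (queue, shard)."""
    return f"{queue_id}_{shard}"


def pick_row(queue_id, rng):
    """Choose a shard uniformly at random so writes spread over all rows."""
    return row_key(queue_id, rng.randrange(NUM_SHARDS))


# Simulate 100,000 sends to one queue and count how many land in each row.
rng = random.Random(42)
rows = Counter(pick_row("stb-12345", rng) for _ in range(100_000))
largest_row = max(rows.values())
```

With the load split this way each row holds roughly one hundredth of the queue, which keeps individual rows narrow, and because the row key determines placement, the rows of a single queue end up on different Cassandra nodes.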
    • Cassandra Data Modeling: Key Cassandra Features
      • Persistence and failover
        – Cross-DC replication in combination with Local Quorum Reads/Writes (tunable consistency)
      • Millions of queues, spiky traffic patterns
        – Massive horizontal scalability
      • Message order (FIFOness) / future dated messages
        – Wide rows, composite column keys / TimeUUID and column sort order
      • Message retention period (expiration)
        – TTL
      • Fast lookup of static metadata (Queues, Users etc.)
        – Row Cache, Secondary Indexes
    • Cassandra Data Modeling: Lessons Learned
      • Coming from RDBMS background…
        – Forget the table analogy, rather:
          • CF = HashMap<RowKey, TreeMap<ColKey, ColValue>>
        – No need to specify column names in advance
          • Wide rows, value-less columns, composite keys
        – No unique constraints, no foreign keys, no joins:
          • Design schema around your queries
          • Use de-normalization where needed
        – No inserts (everything is an update!)
          • Design for idempotent operations
          • Use globally unique identifiers
        – But, there are indexes
          • Use secondary indexes
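The "CF = HashMap<RowKey, TreeMap<ColKey, ColValue>>" analogy can be tried out directly. The sketch below is a toy in-memory model, not a Cassandra client: the class `ColumnFamily`, its `put` and `slice` methods, and the tuple-shaped composite column keys are all made up for illustration. It shows two of the lessons at once: columns within a row stay sorted by key, and every write is an upsert, so replaying the same write is harmless.

```python
import bisect
from collections import defaultdict


class ColumnFamily:
    """Toy model of CF = HashMap<RowKey, TreeMap<ColKey, ColValue>>."""

    def __init__(self):
        self._values = defaultdict(dict)  # row key -> {column key: value}
        self._order = defaultdict(list)   # row key -> column keys, kept sorted

    def put(self, row_key, col_key, value):
        """Upsert: writing the same (row, column) twice leaves a single column."""
        row = self._values[row_key]
        if col_key not in row:
            bisect.insort(self._order[row_key], col_key)
        row[col_key] = value

    def slice(self, row_key, start=None, limit=None):
        """Return (column, value) pairs in column order, optionally from `start`."""
        keys = self._order.get(row_key, [])
        begin = 0 if start is None else bisect.bisect_left(keys, start)
        end = len(keys) if limit is None else begin + limit
        return [(k, self._values[row_key][k]) for k in keys[begin:end]]


cf = ColumnFamily()
# Composite column key: (timestamp, unique suffix). Written out of order on purpose.
cf.put("queue1_7", (1002, "b"), "MSG2")
cf.put("queue1_7", (1001, "a"), "MSG1")
cf.put("queue1_7", (1001, "a"), "MSG1")  # replayed write, still one column
columns = cf.slice("queue1_7")
oldest = cf.slice("queue1_7", limit=1)
```

Reading a slice from the start of a row returns the oldest columns first, which is the property a queue needs from time-ordered column keys.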
    • CQS Scalability and Availability
      • Scalability
        – Send(), Receive(), Delete()
          • Scale with Cassandra Ring, API Servers (stateless) and Redis Shards
          • Are constant time operations
        – Queues not sharded across Redis servers!
      • Availability
        – Depends on availability of Cassandra
        – Service functions without Redis!
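"Queues not sharded across Redis servers" suggests that each queue lives on exactly one Redis shard, while different queues are spread across shards. One way this could look is sketched below; the function name `redis_shard_for` and the MD5-modulo scheme are assumptions, since the slides do not describe how CQS assigns queues to shards.

```python
import hashlib


def redis_shard_for(queue_id, num_shards):
    """Map a queue to exactly one shard using a hash that is stable across processes."""
    digest = hashlib.md5(queue_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Every stateless API server computes the same placement without coordination.
queues = [f"queue-{i}" for i in range(1000)]
placement = {q: redis_shard_for(q, 4) for q in queues}
```

A cryptographic digest is used instead of Python's built-in `hash()` because string hashes are randomized per process, which would give each API server a different mapping.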
    • CQS DC Failover
    • AWS SNS API: “Simple Notification Service”
      • Topic based Publish/Subscribe Service
      • Supported protocols: HTTP/CQS/SQS
      • Few simple core APIs
        – CreateTopic() / DeleteTopic()
        – Subscribe() / Unsubscribe()
        – ConfirmSubscription()
        – Publish()
      • Do not trust message recipients (redelivery policy)
    • CNS Data Flow Example
      1. Publish(MSG1)
      Publish message MSG1 to a topic T with four subscribers:
      • S1 (HTTP)
      • S2 (HTTP)
      • S5 (CQS)
      • S6 (CQS)
    • CNS Architecture
      • CQS Queue preserves messages when Publish Workers are down or overloaded
      • CQS Visibility Timeout takes care of guaranteed delivery
      • Retry policy and guaranteed delivery
        – http://docs.aws.amazon.com/sns/latest/gsg/DeliveryPolicies.html
      • Publish Workers hardened for rogue endpoints
        – Fail endpoints, slow endpoints, …
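The interplay of fan-out, an internal queue and retries can be sketched in a few lines. This is a simplified illustration rather than the CNS implementation: `publish`, `run_publish_worker`, the `max_attempts` limit and the in-process `deque` standing in for the CQS queue are all hypothetical. The point is that a failed delivery puts the job back on the queue instead of losing it, which mirrors what the visibility timeout achieves.

```python
from collections import deque


def publish(message, subscribers):
    """Fan out: one delivery job (message, subscriber, attempt) per subscriber."""
    return deque((message, s, 0) for s in subscribers)


def run_publish_worker(pending, endpoints, max_attempts=3):
    """Drain delivery jobs; requeue failures until max_attempts is reached.

    `endpoints` maps a subscriber name to a callable that raises on failure.
    Returns (delivered, gave_up) lists of (message, subscriber) pairs.
    """
    delivered, gave_up = [], []
    while pending:
        message, subscriber, attempt = pending.popleft()
        try:
            endpoints[subscriber](message)
            delivered.append((message, subscriber))
        except Exception:
            if attempt + 1 < max_attempts:
                # The job goes back on the queue, as an undeleted message would.
                pending.append((message, subscriber, attempt + 1))
            else:
                gave_up.append((message, subscriber))
    return delivered, gave_up


inbox = {"S1": [], "S2": [], "S5": [], "S6": []}
flaky_calls = {"count": 0}


def flaky_s2(message):
    """Endpoint that fails on its first call and succeeds afterwards."""
    flaky_calls["count"] += 1
    if flaky_calls["count"] < 2:
        raise ConnectionError("endpoint down")
    inbox["S2"].append(message)


endpoints = {
    "S1": inbox["S1"].append,
    "S2": flaky_s2,
    "S5": inbox["S5"].append,
    "S6": inbox["S6"].append,
}
delivered, gave_up = run_publish_worker(publish("MSG1", list(endpoints)), endpoints)
```

A real worker would also need timeouts for slow endpoints and a backoff between attempts, which this sketch leaves out.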
    • Differences SQS/SNS and CQS/CNS
      • Goal: Full API compatibility
      • Current state:
        – All APIs implemented, most parameters supported
        – Can use AWS Java SDK and others
      • Limitations:
        – AWS4 signatures not supported (V1 and V2 ok)
        – SMS endpoints not supported, limited email support
      • Enhancements:
        – Additional APIs for monitoring and management: PeekMessage(), HealthCheck(), GetWorkerState(), ManageWorker(), ManageService(), GetAPIStats()
        – Unlimited number of queues, topics and subscriptions
        – Adjustable message size and other parameters (SNS <= 64KB, SQS <= 64KB, LP <= 20 sec, DS <= 900 sec, RP, …)
    • Use Case: X1 Sports App
    • Use Case: CNS with CQS Endpoints
    • Moving Forward
      • Open Sourced (Apache 2.0)
      • Hardening
        – CNS Chaos Monkey, …
      • Follow SNS / SQS
        – SNS Throttle Policy, AWS4 Sig…
      • Load and stress testing
      • Simplify deployment & scale up
        – Embedded Jetty, RPM package, Puppet scripts…
      • Production deployments (isolated by application)
      • CQS as a Service
      • OpenStack integration
    • Thank You!
      • http://github.com/Comcast/cmb
      • http://groups.google.com/forum/#!forum/cmb-user-forum
      • bwolf@sv.comcast.com