Eric Lubow
@elubow
elubow@simplereach.com
Message Architectures in Distributed Systems
Cassandra Day NY 2014: Message Architectures in Distributed Systems at SimpleReach

Eric will be presenting on SimpleReach's use of message architectures and why they are an important part of a distributed system stack. They are often overlooked because the prevailing sentiment is that the storage and processing engines are the most important aspects of the system. Without the highways, the data won’t be able to get to its destination.

Published in: Technology
Cassandra Day NY 2014: Message Architectures in Distributed Systems at SimpleReach

  1. Eric Lubow, @elubow, elubow@simplereach.com
     Message Architectures in Distributed Systems
  2. Overview
     • SimpleReach
     • Why is messaging important
     • Goals
     • Explanations
     • Questions
  3. Personal Vanity
     • CTO of SimpleReach
     • Co-author of Practical Cassandra
     • Skydiver, Mixed Martial Artist, Motorcyclist, Dog dad, NY Giants fan
     • IronMatt Foundation for Pediatric Brain Tumors (ironmatt.org)
  4. [no text on slide]
  5. [no text on slide]
  6. SimpleReach
     • Millions of URLs per day
     • Over 3.75 billion page views per month
     • 7 billion events per day (~80k events/second)
     • Auto-scale 175-190 machines depending on traffic
     • Built a predictive measurement algorithm for the social web
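As a quick sanity check on the rate quoted on slide 6, the arithmetic can be reproduced in a few lines. This sketch assumes the 7 billion events are spread evenly over a 24-hour day, which the slide does not state; real traffic would have peaks above this average.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def events_per_second(events_per_day: float) -> float:
    """Average rate, assuming events arrive evenly over the whole day."""
    return events_per_day / SECONDS_PER_DAY


rate = events_per_second(7e9)
print(f"{rate:,.0f} events/second")  # roughly 81,000, in line with "~80k"
```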
  7. Why is Messaging Important?
     • Most large-scale systems discussions only talk about storage
     • Direct high volumes of data around your infrastructure
     • Control flow of data through your infrastructure
     • Decouple important systems
     • Scalability, Elasticity, Deliverability, and Redundancy
     • Buffering and Asynchronous communication
  8. The database is NOT a transport layer
  9. Data Flow
     [Diagram: an App with four numbered steps (❶ to ❹), labeled: incoming request, sync persist data, send response, async queue message]
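The data flow on slide 9 can be sketched as a request handler that persists synchronously, hands a message to a queue without waiting for it to be delivered, and then responds. This is only an illustration: the dict standing in for the database, the in-process `queue.Queue`, the `handle_request` name, and the ordering of the steps are all assumptions, since the transcript does not say which number belongs to which label.

```python
import json
import queue
import threading

database = {}           # stand-in for the real datastore (assumption)
outbox = queue.Queue()  # stand-in for a local message queue (assumption)
published = []          # records what the background publisher has sent


def handle_request(request_id: str, payload: dict) -> dict:
    database[request_id] = payload                          # sync persist data
    outbox.put(json.dumps({"id": request_id, **payload}))   # async queue message
    return {"status": "ok", "id": request_id}               # send response


def publisher() -> None:
    """Background worker that drains the outbox; None is the stop signal."""
    while True:
        message = outbox.get()
        if message is None:
            break
        published.append(message)  # a real system would publish to the queue here
        outbox.task_done()


worker = threading.Thread(target=publisher, daemon=True)
worker.start()
response = handle_request("req-1", {"url": "http://example.com/a"})
outbox.join()     # wait until the queued message has been handled
outbox.put(None)  # ask the worker to stop
worker.join()
```

The point of the split is that the caller never waits on downstream consumers: only the database write sits on the request path.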
  10. Goals
     • Consistent interfaces between systems
     • Allow access to many toolsets
     • Minimize downtime/Minimize cost of downtime
     • High availability
     • Clients should have minimal architecture knowledge
     • Horizontal Scaling
     • Controlled Data Flow Patterns
     • Enrichment/In-stream Modification Schemes
     • Monitoring and Instrumentation
  11. Messaging Systems
     • RabbitMQ
     • ZeroMQ
     • Kafka
     • Amazon SQS
     • NSQ
     • ActiveMQ
     • Resque
     • Custom
  12. What Did SimpleReach Choose?
  13. NSQ (github.com/bitly/nsq)
     • Distributed and decentralized topology
     • At least once delivery guaranteed
     • Multicast style message routing
     • Simple to configure and deploy
     • Allow for maintenance windows with no downtime
     • Ephemeral channels for testing
     • Channel sampling
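At-least-once delivery means a consumer may see the same message more than once, so handlers are usually written to be idempotent. The sketch below shows one common way to do that by remembering message IDs. The `IdempotentHandler` class, the in-memory set, and the message shape are hypothetical and are not part of NSQ; a production consumer would need a bounded or persistent record of seen IDs.

```python
class IdempotentHandler:
    """Counts events per URL while ignoring redelivered messages."""

    def __init__(self) -> None:
        self.seen_ids = set()  # IDs already processed (unbounded here, for brevity)
        self.totals = {}       # url -> running count

    def handle(self, message_id: str, url: str, count: int) -> bool:
        """Apply the message once; return False if it was a duplicate."""
        if message_id in self.seen_ids:
            return False
        self.seen_ids.add(message_id)
        self.totals[url] = self.totals.get(url, 0) + count
        return True


handler = IdempotentHandler()
# "m1" arrives twice, as can happen after a timeout and requeue.
deliveries = [("m1", "/a", 1), ("m2", "/a", 2), ("m1", "/a", 1)]
results = [handler.handle(*delivery) for delivery in deliveries]
```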
  14. Topics and Channels
     • a topic is a distinct stream of messages (a single nsqd instance can have multiple topics)
     • a channel is an independent queue for a topic (a topic can have multiple channels)
     • consumers discover producers by querying nsqlookupd (a discovery service for topics)
     • topics and channels are created at runtime (just start publishing/subscribing)
     [Diagram labels: nsqd, Topics, Channels, Consumers (A A A B B B), “event”, “metrics”, “enrichment”, “writer”, separate hosts]
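The topic and channel semantics described on slide 14 can be modeled in memory: every channel on a topic receives its own copy of each message, and the consumers of a single channel share that channel's messages. This is a toy model and not NSQ code; the `Topic` class and `drain` helper are hypothetical, and strict round-robin is a simplification of how messages are actually handed to ready consumers.

```python
from collections import defaultdict, deque
from itertools import cycle


class Topic:
    """A named stream of messages with independent per-channel queues."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.channels = defaultdict(deque)  # channels appear on first use

    def subscribe(self, channel: str) -> deque:
        return self.channels[channel]

    def publish(self, message: str) -> None:
        # Each channel gets its own copy of the message.
        for channel_queue in self.channels.values():
            channel_queue.append(message)


def drain(channel_queue: deque, consumers: list) -> dict:
    """Share one channel's queued messages across its consumers, round-robin."""
    received = {consumer: [] for consumer in consumers}
    turn = cycle(consumers)
    while channel_queue:
        received[next(turn)].append(channel_queue.popleft())
    return received


events = Topic("event")
metrics = events.subscribe("metrics")
writer = events.subscribe("writer")
for i in range(4):
    events.publish(f"msg-{i}")

metrics_received = drain(metrics, ["A1", "A2"])  # two consumers split the work
writer_received = drain(writer, ["B1"])          # one consumer sees everything
```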
  15. Everyone Speaks The Same Language
     http:// + {“content-type”: “application/json”}
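Because the common language is HTTP plus JSON, any service that can make an HTTP request can publish. The sketch below builds (but does not send) a publish request for nsqd's HTTP `/pub` endpoint. The address `127.0.0.1:4151`, the topic name, and the event fields are assumptions for illustration, and nsqd itself treats the body as opaque bytes, so JSON is a convention between services and not something the queue enforces.

```python
import json
import urllib.parse
import urllib.request


def build_publish_request(nsqd_http_address: str, topic: str, event: dict) -> urllib.request.Request:
    """Build an HTTP POST that publishes one JSON-encoded event to a topic."""
    query = urllib.parse.urlencode({"topic": topic})
    url = f"http://{nsqd_http_address}/pub?{query}"
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


request = build_publish_request(
    "127.0.0.1:4151",  # assumed local nsqd HTTP address
    "event",           # hypothetical topic name
    {"url": "http://example.com/a", "type": "pageview"},
)
# urllib.request.urlopen(request) would send it to a running nsqd.
```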
  16. Goals
     • Consistent interfaces between systems
  17. NSQ Tools
     • nsqadmin provides a web interface to administrate and introspect an NSQ cluster at runtime (and empty, pause, or delete topics/channels)
     • nsq_to_http - utility that helps transport an aggregate stream over HTTP
     • nsq_to_file - utility that safely persists an aggregated stream to disk
     • nsq_stat - iostat-like utility for a topic/channel
     • nsq_tail - tail-like utility for a topic/channel
  18. Right Tool For The Job
  19. Goals
     • Consistent interfaces between systems
     • Allow access to many toolsets
  20. How Does It Work?
     [Diagram labels: three NSQ nodes (each with nsqd and an API), two consumers, two nsqlookupd instances; steps: PUBLISH, REGISTER, DISCOVER, SUBSCRIBE]
  21. The Schrute of the Problem
  22. Goals
     • Consistent interfaces between systems
     • Allow access to many toolsets
     • Minimize downtime/Minimize cost of downtime
     • High availability
  23. Simple Deployment & Automation
     • Chef cookbook - github.com/simplereach/chef-nsq
     • Written in Go
     • Easily distributable binaries
     • Deploy lookup nodes
     • nsqd instances installed locally
  24. Goals
     • Consistent interfaces between systems
     • Allow access to many toolsets
     • Minimize downtime/Minimize cost of downtime
     • High availability
     • Clients should have minimal architecture knowledge
  25. Runtime Discovery
     ➊ regularly poll for topic producers
     ➋ connect to all producers
     [Diagram labels: two nsqlookupd instances, consumer, HTTP requests]
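The two discovery steps on slide 25 can be sketched as: ask every nsqlookupd which producers hold a topic, merge the answers, then connect to each address. The parsing below assumes the JSON returned by nsqlookupd's `/lookup?topic=...` endpoint carries a `producers` list with `broadcast_address` and `tcp_port` fields, optionally wrapped in a `data` object as in older releases. The sample responses and host names are made up, and no network calls are made.

```python
import json


def producers_from_lookup(response_text: str) -> set:
    """Extract (host, tcp_port) pairs from one nsqlookupd lookup response."""
    payload = json.loads(response_text)
    data = payload.get("data", payload)  # tolerate the older wrapped format
    return {
        (producer["broadcast_address"], producer["tcp_port"])
        for producer in data.get("producers", [])
    }


def discover(responses: list) -> list:
    """Union the producers reported by every nsqlookupd instance."""
    found = set()
    for text in responses:
        found |= producers_from_lookup(text)
    return sorted(found)


# Hypothetical responses from two lookup nodes with overlapping knowledge.
lookupd_1 = json.dumps({"producers": [
    {"broadcast_address": "nsqd-a", "tcp_port": 4150},
    {"broadcast_address": "nsqd-b", "tcp_port": 4150},
]})
lookupd_2 = json.dumps({"data": {"producers": [
    {"broadcast_address": "nsqd-b", "tcp_port": 4150},
    {"broadcast_address": "nsqd-c", "tcp_port": 4150},
]}})

addresses = discover([lookupd_1, lookupd_2])
# A consumer would repeat this on a timer and open a connection per address.
```

Merging across lookup nodes is what lets a consumer keep working when one nsqlookupd is down or has a partial view.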
  26. Goals
     • Consistent interfaces between systems
     • Allow access to many toolsets
     • Minimize downtime/Minimize cost of downtime
     • High availability
     • Clients should have minimal architecture knowledge
     • Horizontal Scaling
  27. Path of a Packet
     [Diagram labels: Internet, EC, InternalAPI, Solr, C*, Mongo, Redis, Vertica, API, Fire Hose, SC, Consumers, Queue]
  28. [no text on slide]
  29. Controlled Data Flow
     [Diagram labels: Social Event Collector, Social Data, Batch & Write Processed Data, Batch & Write Raw Data, Calculate Score, Write, NSQ, Broadcast, NSQ]
  30. Controlled Data Flow
     [Diagram labels: Social Event Collector, Social Data, Batch & Write Processed Data, Batch & Write Raw Data, Calculate Score, Write, NSQ, Broadcast, NSQ]
  31. Broadcast Importance for Polyglottany
     [Diagram labels: Aggregator, Broadcast, NSQ, Calculator, Mongo Writer, Redis Writer, Cassandra Writer, Solr Writer, Vertica Writer]
  32. [no text on slide]
  33. Controlled Data Flow
     [Diagram labels: Social Event Collector, Social Data, Batch & Write Processed Data, Batch & Write Raw Data, Calculate Score, Write, NSQ, Broadcast, NSQ]
  34. Goals
     • Consistent interfaces between systems
     • Allow access to many toolsets
     • Minimize downtime/Minimize cost of downtime
     • High availability
     • Clients should have minimal architecture knowledge
     • Horizontal Scaling
     • Controlled Data Flow
  35. What Is Enrichment?
     A mechanism to add value to a message to enhance processing in your system
  36. How Do We Enrich
     [Diagram labels: Raw Event, Enriched Event, NSQ, Broadcast, Consumer A, Consumer B, Consumer C]
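One way to picture the enrichment step is a consumer that reads a raw event, adds derived fields, and republishes the result for downstream consumers. The slides do not say which fields SimpleReach adds, so the `domain` and `path` fields, the event shape, and the `enrich` function below are purely hypothetical.

```python
import json
from urllib.parse import urlparse


def enrich(raw_message: str) -> str:
    """Return a copy of the raw JSON event with derived fields added."""
    event = json.loads(raw_message)
    parsed = urlparse(event["url"])
    event["domain"] = parsed.netloc    # hypothetical derived field
    event["path"] = parsed.path or "/"  # hypothetical derived field
    return json.dumps(event, sort_keys=True)


raw = json.dumps({"url": "http://example.com/articles/42", "type": "pageview"})
enriched = json.loads(enrich(raw))
# The enriched message would then be published to a second topic, where
# each consumer (A, B, C) reads it from its own channel.
```

Doing the derivation once, in-stream, means each downstream consumer does not have to repeat it.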
  37. Goals
     • Consistent interfaces between systems
     • Allow access to many toolsets
     • Minimize downtime/Minimize cost of downtime
     • High availability
     • Clients should have minimal architecture knowledge
     • Horizontal Scaling
     • Controlled Data Flow
     • Enrichment
  38. Monitoring / Instrumentation
     • Comes with statsd support built-in
     • Statsd talks to both Graphite and nsqadmin
     • nsqadmin comes with graphs for message processing stats
     • Nagios plugins available for monitoring topic/channel depth
     • Average end-to-end latency calculations are done on a per-channel basis
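A channel-depth check of the kind mentioned on slide 38 can be sketched as a function that reads nsqd's JSON stats and maps the depth onto the conventional Nagios exit codes. This is not one of the existing Nagios plugins: the `check_channel_depth` function, the thresholds, and the sample stats are assumptions, and the field names (`topics`, `topic_name`, `channels`, `channel_name`, `depth`) reflect the shape of nsqd's `/stats?format=json` output as I understand it.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # conventional Nagios exit codes


def check_channel_depth(stats: dict, topic: str, channel: str, warn: int, crit: int):
    """Return (exit_code, message) for one topic/channel's queue depth."""
    data = stats.get("data", stats)  # tolerate the older wrapped format
    for topic_stats in data.get("topics", []):
        if topic_stats.get("topic_name") != topic:
            continue
        for channel_stats in topic_stats.get("channels", []):
            if channel_stats.get("channel_name") != channel:
                continue
            depth = channel_stats.get("depth", 0)
            if depth >= crit:
                return CRITICAL, f"CRITICAL: {topic}/{channel} depth={depth}"
            if depth >= warn:
                return WARNING, f"WARNING: {topic}/{channel} depth={depth}"
            return OK, f"OK: {topic}/{channel} depth={depth}"
    return UNKNOWN, f"UNKNOWN: {topic}/{channel} not found"


# Hypothetical stats for one topic with two channels.
sample_stats = {"topics": [{"topic_name": "event", "channels": [
    {"channel_name": "writer", "depth": 12},
    {"channel_name": "metrics", "depth": 5400},
]}]}

writer_status = check_channel_depth(sample_stats, "event", "writer", warn=1000, crit=5000)
metrics_status = check_channel_depth(sample_stats, "event", "metrics", warn=1000, crit=5000)
```

A growing depth on one channel points at a slow or stalled consumer group without implicating the others, since each channel queues independently.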
  39. Goals
     • Consistent interfaces between systems
     • Allow access to many toolsets
     • Minimize downtime/Minimize cost of downtime
     • High availability
     • Clients should have minimal architecture knowledge
     • Horizontal Scaling
     • Controlled Data Flow
     • Enrichment
     • Monitoring and Instrumentation
  40. Summary
     • Large Systems are more than just storage
     • Abstraction
     • Highly Available
     • Controlled Data Flow Patterns
     • Monitoring & Automation
  41. We’re Hiring
  42. Questions are guaranteed in life. Answers aren’t.
     Eric Lubow, @elubow, elubow@simplereach.com
     Cassandra Day, New York
     Thank you.