Cassandra Day NY 2014: Message Architectures in Distributed Systems at SimpleReach

Eric will be presenting on SimpleReach's use of message architectures and why they are an important part of a distributed system stack. They are often overlooked because the prevailing sentiment is that the storage and processing engines are the most important aspects of the system. Without the highways, the data won’t be able to get to its destination.

Published in: Technology


Transcript

  • 1. Eric Lubow @elubow elubow@simplereach.com Message Architectures in Distributed Systems
  • 2. Overview • SimpleReach • Why is messaging important • Goals • Explanations • Questions
  • 3. Personal Vanity • CTO of SimpleReach • Co-author of Practical Cassandra • Skydiver, Mixed Martial Artist, Motorcyclist, Dog dad, NY Giants fan • IronMatt Foundation for Pediatric Brain Tumors (ironmatt.org)
  • 6. SimpleReach • Millions of URLs per day • Over 3.75 billion page views per month • 7 billion events per day (~80k events/second) • Auto-scale 175-190 machines depending on traffic • Built a predictive measurement algorithm for the social web
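The per-second figure on slide 6 follows from the daily total. A quick check, assuming events are spread evenly over the day (real traffic will peak above this average):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_events_per_second(events_per_day: int) -> float:
    """Average rate if the daily volume were spread evenly across the day."""
    return events_per_day / SECONDS_PER_DAY


rate = average_events_per_second(7_000_000_000)
print(f"{rate:,.0f} events/second")  # roughly 80k, matching the slide
```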
  • 7. Why is Messaging Important? • Most large scale systems discussions only talk about storage • Direct high volumes of data around your infrastructure • Control flow of data through your infrastructure • Decouple important systems • Scalability, Elasticity, Deliverability, and Redundancy • Buffering and Asynchronous communication
  • 8. The database is NOT a transport layer
  • 9. Data Flow • Diagram labels (steps ❶ to ❹): App, incoming request, sync persist data, send response, async queue message #ddtx14
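The request path on slide 9 can be sketched roughly as follows. This is a minimal illustration, not SimpleReach's code: `store`, `outbound`, `handle_request`, and `worker` are hypothetical names, an in-memory dict and `queue.Queue` stand in for the database and the message queue, and the relative order of "send response" and "async queue message" is an assumption because the transcript does not preserve the diagram's numbering.

```python
import queue
import threading

store = {}                # hypothetical stand-in for the database
outbound = queue.Queue()  # hypothetical stand-in for the message queue
processed = []            # records what the background consumer handled


def handle_request(event_id: str, payload: dict) -> dict:
    """Persist synchronously, hand the message off, then respond."""
    store[event_id] = payload          # sync persist data
    outbound.put((event_id, payload))  # async queue message: does not wait for consumers
    return {"status": "ok", "id": event_id}  # send response


def worker() -> None:
    """Consume queued messages independently of the request path."""
    while True:
        item = outbound.get()
        if item is None:  # sentinel: stop the worker
            outbound.task_done()
            break
        processed.append(item[0])
        outbound.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()
response = handle_request("evt-1", {"url": "http://example.com/a"})
outbound.put(None)
outbound.join()
thread.join()
```

The point of the split is that the caller only waits for the persist step; everything downstream happens off the request path.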
  • 10. Goals • Consistent interfaces between systems • Allow access to many toolsets • Minimize downtime/Minimize cost of downtime • High availability • Clients should have minimal architecture knowledge • Horizontal Scaling • Controlled Data Flow Patterns • Enrichment/In-stream Modification Schemes • Monitoring and Instrumentation
  • 11. Messaging Systems • RabbitMQ • ZeroMQ • Kafka • Amazon SQS • NSQ • ActiveMQ • Resque • Custom
  • 12. What Did SimpleReach Choose?
  • 13. NSQ • Distributed and de-centralized topology • At least once delivery guaranteed • Multicast style message routing • Simple to configure and deploy • Allow for maintenance windows with no downtime • Ephemeral channels for testing • Channel sampling • github.com/bitly/nsq
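"At least once" delivery on slide 13 means a message can arrive more than once, so consumers generally need to tolerate duplicates. A minimal sketch of one common approach, deduplicating by message ID, is below. `IdempotentHandler` and its fields are hypothetical names and not part of NSQ; a real consumer would also need to bound or expire the set of seen IDs.

```python
class IdempotentHandler:
    """Applies each message at most once, even if it is delivered repeatedly."""

    def __init__(self) -> None:
        self.seen = set()  # IDs of messages already applied
        self.total = 0     # example side effect: a running sum

    def handle(self, message_id: str, amount: int) -> bool:
        """Return True if the message was applied, False if it was a duplicate."""
        if message_id in self.seen:
            return False
        self.seen.add(message_id)
        self.total += amount
        return True


handler = IdempotentHandler()
results = [
    handler.handle("m1", 5),
    handler.handle("m2", 3),
    handler.handle("m1", 5),  # redelivery of m1 is ignored
]
```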
  • 14. Topics and Channels • a topic is a distinct stream of messages (a single nsqd instance can have multiple topics) • a channel is an independent queue for a topic (a topic can have multiple channels) • consumers discover producers by querying nsqlookupd (a discovery service for topics) • topics and channels are created at runtime (just start publishing/subscribing) • Diagram labels: separate hosts, nsqd, “metrics”, Channels, “event”, Topics, “enrichment”, “writer”, Consumers
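The topic and channel semantics on slide 14 can be modeled in a few lines. This is a toy in-memory model, not NSQ's implementation: the `Topic` class is hypothetical, and treating “event” as the topic with “metrics” and “writer” as its channels is an assumption drawn from the diagram labels. Every channel receives its own copy of each message, while consumers sharing one channel split that channel's messages between them.

```python
from collections import deque


class Topic:
    """Toy model: a topic copies every published message to each channel."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.channels = {}  # channel name -> that channel's own queue

    def channel(self, name: str) -> deque:
        """Create the channel on first use, mirroring runtime creation."""
        return self.channels.setdefault(name, deque())

    def publish(self, message: str) -> None:
        for channel_queue in self.channels.values():
            channel_queue.append(message)  # each channel gets its own copy


events = Topic("event")
metrics = events.channel("metrics")
writer = events.channel("writer")

for message in ["A", "B"]:
    events.publish(message)

# Two consumers on the same "writer" channel share the work:
writer_consumer_1 = writer.popleft()
writer_consumer_2 = writer.popleft()
```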
  • 15. Everyone Speaks The Same Language • http:// + {“content-type”: “application/json”}
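As an illustration of "HTTP plus JSON" as the shared interface on slide 15, the sketch below builds a publish request using only the Python standard library. It assumes nsqd's HTTP publish endpoint (`/pub?topic=...`) on its default HTTP port 4151; the address, the topic name, the event fields, and the helper `build_publish_request` are all hypothetical. The request is constructed but not sent, so the example runs without a broker.

```python
import json
import urllib.parse
import urllib.request


def build_publish_request(nsqd_http_address: str, topic: str, event: dict) -> urllib.request.Request:
    """Build an HTTP POST carrying one JSON-encoded message for a topic."""
    query = urllib.parse.urlencode({"topic": topic})
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{nsqd_http_address}/pub?{query}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_publish_request(
    "127.0.0.1:4151",  # assumed local nsqd
    "event",           # hypothetical topic name
    {"url": "http://example.com/a", "type": "pageview"},
)
# urllib.request.urlopen(request) would send it to a running nsqd.
```

Because any language with an HTTP client and a JSON encoder can do the same, producers need no NSQ-specific library.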
  • 16. Goals • Consistent interfaces between systems
  • 17. NSQ Tools • nsqadmin provides a web interface to administrate and introspect an NSQ cluster at runtime (and empty, pause, or delete topics/channels) • nsq_to_http - utility that helps transport an aggregate stream over HTTP • nsq_to_file - utility that safely persists an aggregated stream to disk • nsq_stat - iostat like utility for a topic/channel • nsq_tail - tail like utility for a topic/channel
  • 18. Right Tool For The Job
  • 19. Goals • Consistent interfaces between systems • Allow access to many toolsets
  • 20. How Does It Work? • Diagram labels: NSQ, NSQD, API, consumer, nsqlookupd, PUBLISH, REGISTER, DISCOVER, SUBSCRIBE
  • 21. The Schrute of the Problem
  • 22. Goals • Consistent interfaces between systems • Allow access to many toolsets • Minimize downtime/Minimize cost of downtime • High availability
  • 23. Simple Deployment & Automation • Chef cookbook - github.com/simplereach/chef-nsq • Written in Go • Easily distributable binaries • Deploy lookup nodes • Nsqd’s installed locally
  • 24. Goals • Consistent interfaces between systems • Allow access to many toolsets • Minimize downtime/Minimize cost of downtime • High availability • Clients should have minimal architecture knowledge
  • 25. Runtime Discovery • ➊ regularly poll for topic producers • ➋ connect to all producers • Diagram labels: nsqlookupd, nsqlookupd, consumer, HTTP requests
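The two discovery steps on slide 25 amount to merging the producer lists returned by every lookup node and connecting to the union. The sketch below works on already-decoded JSON responses, so it runs without a cluster. The response shape is an assumption modeled on nsqlookupd's `/lookup?topic=...` output: the field names `producers`, `broadcast_address`, and `tcp_port`, the optional `data` wrapper, and the sample hosts are not taken from the slides.

```python
def producer_addresses(lookup_response: dict) -> set:
    """Extract (host, tcp_port) pairs from one lookup response."""
    # Assumed: some versions wrap the payload in a "data" object.
    data = lookup_response.get("data", lookup_response)
    return {
        (producer["broadcast_address"], producer["tcp_port"])
        for producer in data.get("producers", [])
    }


def discover(lookup_responses: list) -> set:
    """Union the producers reported by every lookup node."""
    found = set()
    for response in lookup_responses:
        found |= producer_addresses(response)
    return found


# Hypothetical responses from two lookup nodes with overlapping views.
lookup_a = {"producers": [
    {"broadcast_address": "host-1", "tcp_port": 4150},
    {"broadcast_address": "host-2", "tcp_port": 4150},
]}
lookup_b = {"data": {"producers": [
    {"broadcast_address": "host-2", "tcp_port": 4150},
    {"broadcast_address": "host-3", "tcp_port": 4150},
]}}

to_connect = discover([lookup_a, lookup_b])
```

Polling every lookup node and taking the union means a consumer still finds producers when one lookup node is down or has a stale view.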
  • 26. Goals • Consistent interfaces between systems • Allow access to many toolsets • Minimize downtime/Minimize cost of downtime • High availability • Clients should have minimal architecture knowledge • Horizontal Scaling
  • 27. Path of a Packet • Diagram labels: Internet EC InternalAPI Solr C* Mongo Redis Vertica API Fire Hose SC Consumers Queue
  • 29. Controlled Data Flow • Diagram labels: Social Event Collector Social Data Batch & Write Processed Data Batch & Write Raw Data Calculate Score Write NSQ Broadcast NSQ
  • 30. Controlled Data Flow • Diagram labels: Social Event Collector Social Data Batch & Write Processed Data Batch & Write Raw Data Calculate Score Write NSQ Broadcast NSQ
  • 31. Broadcast Importance for Polyglottany • Diagram labels: Aggregator Mongo Writer Broadcast Redis Writer Cassandra Writer Solr Writer Calculator NSQ Vertica Writer
  • 33. Controlled Data Flow • Diagram labels: Social Event Collector Social Data Batch & Write Processed Data Batch & Write Raw Data Calculate Score Write NSQ Broadcast NSQ
  • 34. Goals • Consistent interfaces between systems • Allow access to many toolsets • Minimize downtime/Minimize cost of downtime • High availability • Clients should have minimal architecture knowledge • Horizontal Scaling • Controlled Data Flow
  • 35. What Is Enrichment? A mechanism to add value to a message to enhance processing in your system
  • 36. How Do We Enrich • Diagram labels: Raw Event Enriched Event Consumer A Consumer B Consumer C NSQ Broadcast
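Slides 35 and 36 describe enrichment as adding value to a raw event before it is broadcast to downstream consumers. A minimal sketch of such a step is below. The event fields (`url`, `domain`, `enriched`) and the `enrich` function are hypothetical and chosen only for illustration; in a pipeline like the one on the slide, the output would be republished to a topic that Consumers A, B, and C subscribe to.

```python
import json
from urllib.parse import urlparse


def enrich(raw_message: bytes) -> bytes:
    """Return a copy of the event with derived fields added."""
    event = json.loads(raw_message)
    enriched = dict(event)  # leave the raw event untouched
    enriched["domain"] = urlparse(event["url"]).netloc  # derived once, reused downstream
    enriched["enriched"] = True
    return json.dumps(enriched, sort_keys=True).encode("utf-8")


raw = json.dumps({"url": "http://example.com/a", "type": "pageview"}).encode("utf-8")
enriched_event = json.loads(enrich(raw))
```

Doing the derivation once in the stream saves every consumer from repeating it.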
  • 37. Goals • Consistent interfaces between systems • Allow access to many toolsets • Minimize downtime/Minimize cost of downtime • High availability • Clients should have minimal architecture knowledge • Horizontal Scaling • Controlled Data Flow • Enrichment
  • 38. Monitoring / Instrumentation • Comes with statsd support built-in • Statsd talks to both Graphite and nsqadmin • Nsqadmin comes with graphs for message processing stats • Nagios plugins available for monitoring topic/channel depth • Average end to end latency calculations are done on a per-channel basis
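To make the depth monitoring on slide 38 concrete, here is a minimal sketch of a threshold check in the style of a Nagios plugin, plus a statsd gauge line for the same value. It is not one of the plugins the slide refers to: the thresholds, the metric name, and both function names are hypothetical. The exit codes 0, 1, and 2 follow the usual Nagios plugin convention, and `name:value|g` is the statsd gauge format.

```python
OK, WARNING, CRITICAL = 0, 1, 2  # conventional Nagios plugin exit codes


def check_depth(depth: int, warn_at: int, crit_at: int) -> int:
    """Map a channel's queue depth to a Nagios-style status code."""
    if depth >= crit_at:
        return CRITICAL
    if depth >= warn_at:
        return WARNING
    return OK


def statsd_gauge(name: str, value: int) -> str:
    """Format a gauge in the statsd line protocol."""
    return f"{name}:{value}|g"


status = check_depth(depth=1500, warn_at=1000, crit_at=10000)
line = statsd_gauge("nsq.event.writer.depth", 1500)  # hypothetical metric name
```

A growing depth on one channel indicates that channel's consumers are falling behind, which is why depth is a natural alerting signal.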
  • 39. Goals • Consistent interfaces between systems • Allow access to many toolsets • Minimize downtime/Minimize cost of downtime • High availability • Clients should have minimal architecture knowledge • Horizontal Scaling • Controlled Data Flow • Enrichment • Monitoring and Instrumentation
  • 40. Summary • Large Systems are more than just storage • Abstraction • Highly Available • Controlled Data Flow Patterns • Monitoring & Automation
  • 41. We’re Hiring
  • 42. Questions are guaranteed in life. Answers aren’t. Eric Lubow @elubow elubow@simplereach.com Cassandra Day, New York Thank you.
