Your SlideShare is downloading. ×
0
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Scalable Event Analytics with MongoDB & Ruby on Rails
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×
Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

Scalable Event Analytics with MongoDB & Ruby on Rails

31,062

Published on

Slides from my talk at RubyConfChina 2010.

Slides from my talk at RubyConfChina 2010.

Published in: Technology
7 Comments
112 Likes
Statistics
Notes
No Downloads
Views
Total Views
31,062
On Slideshare
0
From Embeds
0
Number of Embeds
11
Actions
Shares
0
Downloads
239
Comments
7
Likes
112
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  • 1. Scalable Event Analytics with Ruby on Rails & MongoDB<br />Ruby Conf China 2010<br />Jared Rosoff (@forjared) jrosoff@yottaa.com<br />
  • 2. Yottaa!!!! (www.yottaa.com)<br />
  • 3. Overview<br />Ruby at Scale<br />What is Event Analytics? <br />What are the different ways you could do it? <br />How we did it<br />
  • 4. Ruby At Scale?<br />http://www.flickr.com/photos/laughingsquid<br />
  • 5. Event Analytics<br />Data Source<br />Event<br />User<br />Event<br />Event<br />Query<br />Event Analytics<br />Data Source<br />Report<br />Event<br />Event<br />Event<br />Event<br />User<br />Query<br />Data Source<br />Event<br />Event<br />Report<br />Event<br />Event<br />Event<br />Data Source<br />
  • 6.
  • 7. Oh and we are a startup<br />
  • 8. Our requirements:<br />
  • 9. Rails default architecture<br />Collection Server<br />Data Source<br />MySQL<br />User<br />Reporting Server<br />
  • 10. Rails default architecture<br />Collection Server<br />Data Source<br />MySQL<br />User<br />Reporting Server<br />“Just” a Rails App<br />
  • 11. Rails default architecture<br /> Performance Bottleneck: Too much load<br />Collection Server<br />Data Source<br />MySQL<br />User<br />Reporting Server<br />“Just” a Rails App<br />
  • 12. Let’s add replication!<br />MySQL<br />Master<br />Collection Server<br />Data Source<br />Replication<br />MySQL<br />Master<br />User<br />Reporting Server<br />MySQL<br />Master<br />MySQL<br />Master<br />
  • 13. Let’s add replication!<br />MySQL<br />Master<br />Collection Server<br />Data Source<br />Replication<br />MySQL<br />Master<br />User<br />Reporting Server<br />MySQL<br />Master<br />MySQL<br />Master<br />Off the shelf!<br />Scalable Reads!<br />
  • 14. Let’s add replication!<br /> Performance Bottleneck: Still can’t scale writes<br />MySQL<br />Master<br />Collection Server<br />Data Source<br />Replication<br />MySQL<br />Master<br />User<br />Reporting Server<br />MySQL<br />Master<br />MySQL<br />Master<br />Off the shelf!<br />Scalable Reads!<br />
  • 15. What about sharding?<br />Collection Server<br />Data Source<br />Sharding<br />MySQL<br />Master<br />MySQL<br />Master<br />MySQL<br />Master<br />User<br />Reporting Server<br />Sharding<br />
  • 16. What about sharding?<br />Collection Server<br />Data Source<br />Sharding<br />MySQL<br />Master<br />MySQL<br />Master<br />MySQL<br />Master<br />User<br />Reporting Server<br />Sharding<br />Scalable Writes!<br />
  • 17. What about sharding?<br /> Development Bottleneck:<br />Need to write custom code<br />Collection Server<br />Data Source<br />Sharding<br />MySQL<br />Master<br />MySQL<br />Master<br />MySQL<br />Master<br />User<br />Reporting Server<br />Sharding<br />Scalable Writes!<br />
  • 18. Key Value stores to the rescue?<br />Collection Server<br />Data Source<br />MySQL<br />Master<br />MySQL<br />Master<br />Cassandra or<br />Voldemort<br />User<br />Reporting Server<br />
  • 19. Key Value stores to the rescue?<br />Collection Server<br />Data Source<br />MySQL<br />Master<br />MySQL<br />Master<br />Cassandra or<br />Voldemort<br />User<br />Reporting Server<br />Scalable Writes!<br />
  • 20. Key Value stores to the rescue?<br /> Development Bottleneck:<br />Reporting is limited / hard<br />Collection Server<br />Data Source<br />MySQL<br />Master<br />MySQL<br />Master<br />Cassandra or<br />Voldemort<br />User<br />Reporting Server<br />Scalable Writes!<br />
  • 21. Can I Hadoop my way out of this?<br />MySQL<br />Master<br />MySQL<br />Master<br />Cassandra or<br />Voldemort<br />Collection Server<br />Data Source<br />Hadoop<br />MySQL<br />Master<br />MySQL<br />Master<br />User<br />Reporting Server<br />MySQL<br />Master<br />MySQL<br />Slave<br />
  • 22. Can I Hadoop my way out of this?<br />MySQL<br />Master<br />MySQL<br />Master<br />Cassandra or<br />Voldemort<br />Collection Server<br />Data Source<br />Hadoop<br />MySQL<br />Master<br />Scalable Writes!<br />MySQL<br />Master<br />User<br />Reporting Server<br />MySQL<br />Master<br />MySQL<br />Slave<br />
  • 23. Can I Hadoop my way out of this?<br />MySQL<br />Master<br />MySQL<br />Master<br />Cassandra or<br />Voldemort<br />Collection Server<br />Data Source<br />Hadoop<br />MySQL<br />Master<br />Scalable Writes!<br />Flexible Reports!<br />MySQL<br />Master<br />User<br />Reporting Server<br />MySQL<br />Master<br />MySQL<br />Slave<br />
  • 24. Can I Hadoop my way out of this?<br />MySQL<br />Master<br />MySQL<br />Master<br />Cassandra or<br />Voldemort<br />Collection Server<br />Data Source<br />Hadoop<br />MySQL<br />Master<br />Scalable Writes!<br />Flexible Reports!<br />“Just” a Rails App<br />MySQL<br />Master<br />User<br />Reporting Server<br />MySQL<br />Master<br />MySQL<br />Slave<br />
  • 25. Can I Hadoop my way out of this?<br /> Development Bottleneck:<br />Too many systems!<br />MySQL<br />Master<br />MySQL<br />Master<br />Cassandra or<br />Voldemort<br />Collection Server<br />Data Source<br />Hadoop<br />MySQL<br />Master<br />Scalable Writes!<br />Flexible Reports!<br />“Just” a Rails App<br />MySQL<br />Master<br />User<br />Reporting Server<br />MySQL<br />Master<br />MySQL<br />Slave<br />
  • 26. MongoDB! <br />Collection Server<br />Data Source<br />MySQL<br />Master<br />MySQL<br />Master<br />MongoDB<br />User<br />Reporting Server<br />
  • 27. MongoDB! <br />Collection Server<br />Data Source<br />MySQL<br />Master<br />MySQL<br />Master<br />MongoDB<br />User<br />Reporting Server<br />Scalable Writes!<br />
  • 28. MongoDB! <br />Collection Server<br />Data Source<br />MySQL<br />Master<br />MySQL<br />Master<br />MongoDB<br />User<br />Reporting Server<br />Scalable Writes!<br />Flexible Reporting!<br />
  • 29. MongoDB! <br />Collection Server<br />Data Source<br />MySQL<br />Master<br />MySQL<br />Master<br />MongoDB<br />User<br />Reporting Server<br />Scalable Writes!<br />“Just” a rails app<br />Flexible Reporting!<br />
  • 30. MongoD<br />App Server<br />Data Source<br />Collection<br />MongoD<br />Load<br />Balancer<br />Passenger<br />Nginx<br />Mongos<br />Reporting<br />User<br />MongoD<br />
  • 31. MongoD<br />App Server<br />Data Source<br />Collection<br />MongoD<br />Load<br />Balancer<br />Passenger<br />Nginx<br />Mongos<br />Reporting<br />User<br />MongoD<br />Sharding!<br />
  • 32. MongoD<br />App Server<br />Data Source<br />Collection<br />MongoD<br />Load<br />Balancer<br />Passenger<br />Nginx<br />Mongos<br />Reporting<br />User<br />MongoD<br />Sharding!<br />High Concurrency<br />
  • 33. MongoD<br />App Server<br />Data Source<br />Collection<br />MongoD<br />Load<br />Balancer<br />Passenger<br />Nginx<br />Mongos<br />Reporting<br />User<br />MongoD<br />Sharding!<br />High Concurrency<br />Scale-Out<br />
  • 34. MongoDBSharding<br />
  • 35. MongoDBSharding<br />Replica Sets let us scale storage & transaction capacity for each shard<br />
  • 36. MongoDBSharding<br />Replica Sets let us scale storage & transaction capacity for each shard<br />Mongos routes transactions to shards based on “shard key”<br />
  • 37. MongoDBSharding<br />Replica Sets let us scale storage & transaction capacity for each shard<br />Mongos routes transactions to shards based on “shard key”<br />Config servers store information about which shards exist<br />
  • 38. Inserting<br />3<br />2<br />Shard key == name<br />bob  Shard 2<br />1<br />insert { ‘name’ : bob }<br />Insert { ‘name’ : bob }<br />
  • 39. Querying<br />3<br />2<br />Shard key == name<br />bob  Shard 2<br />1<br />Query { ‘name’ : bob }<br />Query { ‘name’ : bob }<br />
  • 40. Map Reduce<br />2<br />2<br />2<br />2<br />1<br />Map-reduce( … ) <br />Map-reduce(…)<br />
  • 41. Working with Mongo<br />MongoMapper makes it look like ActiveRecord<br />Documents are more natural than rows in many cases<br />Map-Reduce rocks (but needs better support in rails)<br />http://www.flickr.com/photos/elhamalawy/2526783078/<br />
  • 42.
  • 43. Ruby<br />Mongo<br />
  • 44. Runs over all the objects in the views table, counting how many times a page was viewed<br />Adds up all the counts for a unique url / date combination<br />Run the map reduce job and return a collection containing the results<br />
  • 45. Results<br />Version 1 of our analytics system took 2 weeks with 1 engineer <br />We have since added a lot more complexity, but we did it incrementally<br />We replaced MySQL entirely with MongoDB<br />No need for joins, transactions <br />Every table is now a document collection<br />It’s fast! <br />63ms – Average response time for sending data to server<br />93ms – Average response time for displaying reports<br />

×