Building TweetReach with Sinatra, Tokyo Cabinet and Grackle: Austin on Rails 2009-03-24

Slides from a presentation I gave to Austin on Rails on March 24, 2009. It describes building an application called TweetReach that uses Sinatra, Tokyo Cabinet/Tyrant and Grackle.

Published in: Technology

Off the Reservation with TweetReach
  Hayes Davis, Co-Founder, Appozite
  [email_address] @hayesdavis
So what's this all about?
  - TweetReach is a simple Twitter app that helps you see how many people have seen a particular “message” you've sent
  - A few requirements
    - Show people useful stats
    - Be simple(ish)
    - Be reasonably fast for a reasonable number of users
    - Fit within Twitter API limits
What to use?
  - Web app framework: Sinatra
  - Persistence layer: Tokyo Cabinet + Tokyo Tyrant + memcache-client
  - Twitter API: Grackle
What is Sinatra?
  - Framework for simple web applications
  - Handles just the view and controller part
    - Controller is a simple file with a DSL for defining routes and the blocks that handle them
    - Views can use ERB (and others)
  - Can use ActiveRecord or another model and/or persistence mechanism (as you'll see)
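Sinatra Example
    require 'rubygems'
    require 'sinatra'

    set :port, 3000

    # Render the home page
    get '/' do
      erb :index
    end

    # Measure reach for the query passed as ?q=...
    # (username and pass are assumed to be defined elsewhere)
    get '/reach' do
      @query = params[:q]
      tr = TweetReach.new(username, pass)
      @results = tr.measure_reach(@query)
      erb :reach_results
    end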
What is Tokyo Cabinet?
  - Persistent (and fast) key-value store from Mikio Hirabayashi at mixi (large Japanese social network)
  - Stats: 2.5M inserts/second, 3M queries/second, stores up to 8 exabytes
  - Has a server called Tokyo Tyrant
More Tokyo Cabinet
  - Offers multiple database engines:
    - Hash: simple key-value store
    - B-tree: functionally the same as the hash DB but with ordered keys based on a user-defined function
    - Fixed-length: basically a giant array which you index into by offset keys
    - Table: similar to a relational DB except no predefined schema (à la CouchDB). Can index columns and query them
Tokyo Tyrant
  - Server for Tokyo Cabinet databases
  - Provides replication, failover, etc.
  - Speaks the memcached protocol, so it looks just like memcached with a couple of exceptions
    - Data is persistent
    - Does not provide expiration
Why Tokyo Cabinet/Tyrant?
  - Relevant Twitter information is easily indexed by either a Twitter screen name or id
  - Reach calculations require O(n) looping and retrievals, which are much faster with a dedicated key-value store than with a relational DB
  - Need a read-through cache so I don't have to hit Twitter so often
  - Can store stuff for relatively long periods that I may not want to keep solely in memory
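The read-through cache idea can be sketched roughly as follows. This is a minimal illustration, not the code TweetReach actually uses: `HashStore`, `ReadThroughCache`, the key format and the follower count are all hypothetical, and the in-memory `HashStore` stands in for a memcached-protocol client talking to Tokyo Tyrant.

```ruby
# In-memory stand-in for a key-value store (hypothetical; a real setup
# would pass in a client object that exposes get and set).
class HashStore
  def initialize
    @data = {}
  end

  def get(key)
    @data[key]
  end

  def set(key, value)
    @data[key] = value
  end
end

# Read-through cache: callers only ever ask the cache; on a miss the
# cache runs the supplied block, stores the result, and returns it.
class ReadThroughCache
  def initialize(store)
    @store = store
  end

  def fetch(key)
    cached = @store.get(key)
    return cached unless cached.nil?

    value = yield
    @store.set(key, value)
    value
  end
end

# Hypothetical usage: count how often the "remote" lookup really runs.
remote_calls = 0
cache = ReadThroughCache.new(HashStore.new)
lookup = lambda do |screen_name|
  cache.fetch("followers:#{screen_name}") do
    remote_calls += 1
    1234 # stand-in for a follower count fetched from Twitter
  end
end

first = lookup.call('hayesdavis')
second = lookup.call('hayesdavis') # served from the store, no remote call
```

Because the store is injected, the same wrapper could sit in front of any object that exposes `get` and `set`.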
What is Grackle?
  - Grackle is a simple Twitter REST and Search API library designed not to break when the Twitter API changes (or breaks)
  - I wrote it based on my experience building CheapTweet.com
  - Dynamically makes requests to the APIs via a generic syntax that maps to Twitter URIs
  - Builds OpenStructs of returned JSON or XML dynamically
Grackle Example
    require 'rubygems'
    require 'grackle'

    client = Grackle::Client.new(
      :username => 'some_user',
      :password => 'secret'
    )

    # GET http://twitter.com/users/show.json?id=hayesdavis
    client.users.show? :id => 'hayesdavis'

    # POST http://twitter.com/statuses/update.json
    client.statuses.update! :status => 'howdy world'

    # GET http://search.twitter.com/search.json?q=AoR
    client[:search].search? :q => 'AoR'
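To make the "generic syntax that maps to Twitter URIs" idea concrete, here is a rough sketch of how chained method calls could be turned into request descriptions with `method_missing`. It is an illustration only and is not Grackle's actual implementation: the `RequestBuilder` class and its return format are hypothetical, it performs no HTTP, and the two hosts are simply the ones shown in the example comments above.

```ruby
require 'uri'

# Hypothetical builder: plain method names accumulate path segments,
# a trailing ? means GET and a trailing ! means POST.
class RequestBuilder
  HOSTS = {
    rest: 'http://twitter.com',
    search: 'http://search.twitter.com'
  }.freeze

  def initialize(api = :rest, segments = [])
    @api = api
    @segments = segments
  end

  # builder[:search] starts a fresh path against another API host.
  def [](api)
    RequestBuilder.new(api, [])
  end

  def method_missing(name, params = {})
    method = name.to_s
    return super if method.start_with?('to_') # leave implicit conversions alone

    case method[-1]
    when '?' then request('GET', method.chop, params)
    when '!' then request('POST', method.chop, params)
    else RequestBuilder.new(@api, @segments + [method])
    end
  end

  def respond_to_missing?(name, include_private = false)
    !name.to_s.start_with?('to_') || super
  end

  private

  # Describes the request as a Hash instead of sending it.
  def request(verb, last_segment, params)
    path = (@segments + [last_segment]).join('/')
    url = "#{HOSTS.fetch(@api)}/#{path}.json"
    query = URI.encode_www_form(params)
    if verb == 'GET'
      url += "?#{query}" unless query.empty?
      { verb: verb, url: url }
    else
      { verb: verb, url: url, body: query }
    end
  end
end

builder = RequestBuilder.new
show = builder.users.show?(id: 'hayesdavis')
# show[:url] is "http://twitter.com/users/show.json?id=hayesdavis"
update = builder.statuses.update!(status: 'howdy world')
search = builder[:search].search?(q: 'AoR')
# search[:url] is "http://search.twitter.com/search.json?q=AoR"
```

The appeal of this style is that no endpoint is hard-coded, so a new or renamed API path needs no library change.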
Building TweetReach
Basic Structure
  - TweetReach Calculator
  - Read-Through Cache
  - Sinatra
Using Sinatra
  - Install the sinatra gem
  - Sinatra apps are a single Ruby file (I called it app.rb)
  - Just require rubygems and sinatra
  - Run “ruby app.rb”
Using Tokyo Cabinet/Tyrant
  - Get the source for Tokyo Cabinet and Tokyo Tyrant and build it
  - Run “ttserver -port <port> filename.<ext>”
  - The type of database comes from the filename extension. I used “tch”, which is the Hash engine
  - Get Mike Perham's memcache-client gem
Using Grackle
  - Make sure you've added gems.github.com to your list of gem sources
  - Install the hayesdavis-grackle gem
  - Just require “grackle”
Lessons Learned
Sinatra Lessons
  - Very little built-in anything:
    - No page or action caching
    - No helpers for easy formatting (stole things from Rails)
    - No validation against models
  - Make sure you really only want something extremely simple
  - You'll most likely need to roll one or two of your own things
Tokyo Cabinet Lessons
  - Lack of auto-expiration is annoying when using it mostly as a key-value cache
  - Would definitely use it again for this type of task
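One common way to work around a store with no built-in expiration is to save an expiry timestamp alongside each value and treat stale entries as misses on read. The sketch below assumes that approach; it is not necessarily how TweetReach handled it. `ExpiringStore` and the key and value used are hypothetical, and a plain Hash stands in for the backing store.

```ruby
# Wraps any store that responds to [] and []= (a Hash here) and adds
# time-based expiration on top of it.
class ExpiringStore
  # clock is injectable so expiry can be exercised without sleeping.
  def initialize(store, ttl_seconds, clock = -> { Time.now.to_f })
    @store = store
    @ttl = ttl_seconds
    @clock = clock
  end

  # Saves the value together with the time at which it goes stale.
  def set(key, value)
    @store[key] = [value, @clock.call + @ttl]
    value
  end

  # Returns nil for missing or stale entries. Stale entries are not
  # deleted here; the next set for the same key overwrites them.
  def get(key)
    value, expires_at = @store[key]
    return nil if expires_at.nil? || @clock.call >= expires_at

    value
  end
end

# Hypothetical usage with a fake clock.
now = 0.0
store = ExpiringStore.new({}, 60, -> { now })
store.set('reach:howdy', 42)
fresh = store.get('reach:howdy') # within the 60 second TTL
now = 61.0
stale = store.get('reach:howdy') # past the TTL, treated as a miss
```

The trade-off is that stale data still occupies space until it is overwritten, so a long-running store may also want a periodic cleanup pass.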
Grackle Lessons
  - Twitter is Twitter, so make sure you've got decent error handling in place: things will go wrong or not respond, etc.
  - Still a very new library, so I'm sure there are places that need cleaning up
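As one possible shape for that error handling, the sketch below retries a failing call a few times with an increasing delay before giving up. It assumes a retry-with-backoff strategy, which the slides do not specify. `with_retries`, its parameters and the simulated failure are all hypothetical; in real use the block would wrap a Twitter API call.

```ruby
# Runs the block, retrying on StandardError up to `attempts` times.
# The delay doubles after each failure; `sleeper` is injectable so the
# behaviour can be checked without actually waiting.
def with_retries(attempts: 3, base_delay: 0.5, sleeper: ->(s) { sleep(s) })
  tries = 0
  begin
    tries += 1
    yield
  rescue StandardError
    raise if tries >= attempts # out of attempts, let the error surface

    sleeper.call(base_delay * (2**(tries - 1)))
    retry
  end
end

# Hypothetical usage: the call fails twice, then succeeds.
calls = 0
delays = []
result = with_retries(sleeper: ->(s) { delays << s }) do
  calls += 1
  raise IOError, 'Twitter did not respond' if calls < 3

  :ok
end
```

Once the attempts run out, the last error is re-raised so the caller can still show a useful message.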
Handy Stuff
  - http://www.sinatrarb.com/
  - http://www.igvita.com/2009/02/13/tokyo-cabinet-beyond-key-value-store/
  - http://www.scribd.com/doc/12016121/Tokyo-Cabinet-and-Tokyo-Tyrant-Presentation
  - http://github.com/hayesdavis/grackle