Building TweetReach with Sinatra, Tokyo Cabinet and Grackle: Austin on Rails 2009-03-24

Slides from a presentation I gave to Austin on Rails on March 24, 2009. It describes building an application called TweetReach that uses Sinatra, Tokyo Cabinet/Tyrant and Grackle.

Transcript

  • 1. Off the Reservation with TweetReach. Hayes Davis, Co-Founder, Appozite. [email_address], @hayesdavis
  • 2. So what's this all about?
    • TweetReach is a simple Twitter app that helps you see how many people have seen a particular “message” you've sent
    • 3. A few requirements
      • Show people useful stats
      • 4. Be simple(ish)
      • 5. Be reasonably fast for a reasonable number of users
      • 6. Fit within Twitter API limits
  • 7. What to use?
    • Web app framework: Sinatra
    • 8. Persistence layer: Tokyo Cabinet + Tokyo Tyrant + Memcache-client
    • 9. Twitter API: Grackle
  • 10. What is Sinatra?
    • Framework for simple web applications
    • 11. Handles just the view and controller part
      • Controller is a simple file with a DSL for defining routes and the blocks that handle them
      • 12. Views can use ERB (and others)
    • Can use ActiveRecord or another model and/or persistence mechanism (as you'll see)
  • 13. Sinatra Example
        set :port, 3000

        get '/' do
          erb :index
        end

        get '/reach' do
          @query = params[:q]
          tr = TweetReach.new(username, pass)
          @results = tr.measure_reach(@query)
          erb :reach_results
        end
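To make the idea of "a DSL for defining routes and the blocks that handle them" concrete, here is a toy sketch in plain Ruby of how a `get` call might register a block against a path. The `TinyRouter` class and its `call` method are hypothetical names chosen for illustration; this is not how Sinatra itself is implemented.

```ruby
# Toy illustration of a route-defining DSL (hypothetical; not Sinatra's code).
class TinyRouter
  def initialize
    @routes = {} # maps [verb, path] => handler block
  end

  # Register a block to handle GET requests for the given path.
  def get(path, &handler)
    @routes[['GET', path]] = handler
  end

  # Look up the handler for a request and run it with the request params.
  def call(verb, path, params = {})
    handler = @routes[[verb, path]]
    return [404, 'Not Found'] unless handler

    [200, handler.call(params)]
  end
end

router = TinyRouter.new
router.get('/') { |_params| 'index page' }
router.get('/reach') { |params| "reach results for #{params[:q]}" }

status, body = router.call('GET', '/reach', { q: 'AoR' })
puts "#{status} #{body}"
```

The point of the sketch is only that each route is a (verb, path) key paired with a block, which is why a whole app can fit in one file.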
  • 14. What is Tokyo Cabinet?
    • Persistent (and fast) key-value store from Mikio Hirabayashi at mixi (large Japanese social network)
    • 15. Stats: 2.5M inserts/second, 3M queries/second, stores up to 8 exabytes
    • 16. Has a server called Tokyo Tyrant
  • 17. More Tokyo Cabinet
    • Offers multiple database engines:
      • Hash: simple key-value store
      • 18. B-tree: functionally the same as the hash DB but with ordered keys based on a user-defined function
      • 19. Fixed-length: basically a giant array which you index into by offset keys
      • 20. Table: similar to a relational DB but with no predefined schema (à la CouchDB). Columns can be indexed and queried
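As a rough mental model, the four engines can be compared to everyday Ruby data structures. The sketch below uses only in-memory Ruby objects and made-up sample data; it does not call the Tokyo Cabinet bindings, and the helper names `range_scan` and `where` are hypothetical.

```ruby
# Plain-Ruby analogies for the four engines (not the Tokyo Cabinet API).

# Hash engine: unordered key-value lookups.
hash_db = { 'alice' => '1234', 'bob' => '5678' }

# B-tree engine: the same key-value idea, but keys are kept in order,
# which makes scans over a range of neighbouring keys natural.
def range_scan(db, from, to)
  db.keys.sort.select { |key| key >= from && key <= to }
end

# Fixed-length engine: records addressed by a numeric offset, like an array.
fixed_db = %w[zero one two three]

# Table engine: schemaless rows whose columns can be queried.
table_db = [
  { 'name' => 'alice', 'lang' => 'en' },
  { 'name' => 'bob', 'city' => 'Austin' }
]

def where(rows, column, value)
  rows.select { |row| row[column] == value }
end

puts hash_db['alice']
puts range_scan(hash_db, 'a', 'b').inspect
puts fixed_db[2]
puts where(table_db, 'lang', 'en').inspect
```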
  • 21. Tokyo Tyrant
    • Server for Tokyo Cabinet databases
    • 22. Provides replication, failover, etc
    • 23. Speaks memcached protocol, so it looks just like memcached with a couple of exceptions
      • Data is persistent
      • 24. Does not provide expiration
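Because Tokyo Tyrant speaks the memcached text protocol, a memcached client ultimately sends commands shaped like the ones built below. This is a minimal sketch of the wire format only; the helper names `memcached_set` and `memcached_get` are hypothetical, and the expiration field is always sent as 0 here since the slide notes that expiration is not provided.

```ruby
# Build raw memcached text-protocol commands (helper names are hypothetical).

# Storage command format: "set <key> <flags> <exptime> <bytes>\r\n<data>\r\n"
def memcached_set(key, value)
  data = value.to_s
  "set #{key} 0 0 #{data.bytesize}\r\n#{data}\r\n"
end

# Retrieval command format: "get <key>\r\n"
def memcached_get(key)
  "get #{key}\r\n"
end

print memcached_set('user:hayesdavis', 'cached profile')
print memcached_get('user:hayesdavis')
```

In practice a client library such as memcache-client handles this framing, so application code only deals with keys and values.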
  • 25. Why Tokyo Cabinet/Tyrant?
    • Relevant Twitter information is easily indexed by either a Twitter screen name or id
    • 26. Reach calculations require O(n) looping and retrievals – much faster with a dedicated key-value store vs. a relational DB
    • 27. Need a read-through cache so I don't have to hit Twitter so often
    • 28. Can store stuff for relatively long periods that I may not want to keep solely in memory
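The read-through cache and the O(n) loop over it might look roughly like the sketch below. Several things here are assumptions rather than details from the talk: a Ruby Hash stands in for the Tokyo Tyrant store, a block stands in for the Twitter API call, the follower counts are invented, and reach is assumed to be the sum of follower counts of everyone who tweeted the message.

```ruby
# Read-through cache sketch. A Hash stands in for the Tokyo Tyrant store and
# a block stands in for the Twitter API call (both are assumptions).
class ReadThroughCache
  attr_reader :misses

  def initialize(store = {}, &fetcher)
    @store = store
    @fetcher = fetcher
    @misses = 0
  end

  # Return the cached value, fetching and storing it on a miss.
  def get(key)
    return @store[key] if @store.key?(key)

    @misses += 1
    @store[key] = @fetcher.call(key)
  end
end

# Invented follower counts that would normally come from Twitter.
FAKE_TWITTER = { 'alice' => 120, 'bob' => 45 }.freeze

# Assumed reach formula: sum the follower counts of everyone who tweeted.
def measure_reach(cache, screen_names)
  screen_names.sum { |name| cache.get(name) }
end

cache = ReadThroughCache.new { |name| FAKE_TWITTER.fetch(name, 0) }
puts measure_reach(cache, %w[alice bob alice])
puts cache.misses
```

The second lookup of the same screen name is served from the store, which is what keeps the loop within the API limits.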
  • 29. What is Grackle?
    • Grackle is a simple Twitter REST and Search API library designed not to break when the Twitter API changes (or breaks)
    • 30. I wrote it based on my experience building CheapTweet.com
    • 31. Dynamically makes requests to the APIs via a generic syntax that maps to Twitter URIs
    • 32. Builds OpenStructs of returned JSON or XML dynamically
  • 33. Grackle Example
        client = Grackle::Client.new(
          :username=>'some_user', :password=>'secret'
        )

        # GET http://twitter.com/users/show.json?id=hayesdavis
        client.users.show? :id=>'hayesdavis'

        # POST http://twitter.com/statuses/update.json
        client.statuses.update! :status=>'howdy world'

        # GET http://search.twitter.com/search.json?q=AoR
        client[:search].search? :q=>'AoR'
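One way to picture the "generic syntax that maps to Twitter URIs" and the OpenStruct results is the sketch below. It is an illustration under assumptions, not Grackle's actual implementation: `PathBuilder` and `to_struct` are hypothetical names, no HTTP request is made, and query values are not URL-escaped to keep the example short.

```ruby
require 'json'
require 'ostruct'

# Hypothetical sketch of mapping a method chain to a URL
# (illustrative only; not Grackle's actual implementation).
class PathBuilder
  def initialize(host = 'twitter.com')
    @host = host
    @parts = []
  end

  # A name ending in "?" finishes the chain as a GET and one ending in "!"
  # finishes it as a POST; any other name just adds a path segment.
  def method_missing(name, params = {})
    segment = name.to_s
    unless segment.end_with?('?', '!')
      @parts << segment
      return self
    end

    verb = segment.end_with?('?') ? 'GET' : 'POST'
    @parts << segment[0..-2]
    url = "http://#{@host}/#{@parts.join('/')}.json"
    query = params.map { |key, value| "#{key}=#{value}" }.join('&')
    # GET params go in the query string; POST params would go in the body.
    url += "?#{query}" if verb == 'GET' && !query.empty?
    [verb, url]
  end
end

# Turn a JSON response into nested OpenStructs for dot-style access.
def to_struct(json)
  JSON.parse(json, object_class: OpenStruct)
end

p PathBuilder.new.users.show?(id: 'hayesdavis')
user = to_struct('{"screen_name":"hayesdavis","status":{"text":"howdy world"}}')
puts user.status.text
```

Because nothing about the endpoints is hard-coded, a new or renamed API path needs no library change, which is the design goal the slides describe.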
  • 34. Building TweetReach
  • 35. Basic Structure (diagram; components: TweetReach Calculator, Read-Through Cache, Sinatra)
  • 36. Using Sinatra
    • Install the sinatra gem
    • 37. Sinatra apps are a single Ruby file (I called it app.rb)
    • 38. Just require rubygems and sinatra
    • 39. Run “ruby app.rb”
  • 40. Using Tokyo Cabinet/Tyrant
    • Get the source for Tokyo Cabinet and Tokyo Tyrant and build them
    • 41. Run “ttserver -port <port> filename.<ext>”
    • 42. The type of database comes from the filename extension. I used “tch” which is the Hash engine
    • 43. Get Mike Perham's memcache-client gem
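The link between filename extension and engine can be written down as a small lookup. The extensions below are the ones Tokyo Cabinet uses for its four engines; the helper names, the example filename `tweetreach.tch`, and the port number are hypothetical choices for this sketch.

```ruby
# Map Tokyo Cabinet filename extensions to the engine they select.
ENGINES = {
  '.tch' => 'hash',
  '.tcb' => 'B-tree',
  '.tcf' => 'fixed-length',
  '.tct' => 'table'
}.freeze

# Return the engine name implied by a database filename.
def engine_for(filename)
  ENGINES.fetch(File.extname(filename)) do |ext|
    raise ArgumentError, "unknown Tokyo Cabinet extension: #{ext.inspect}"
  end
end

# Build the ttserver command line shown on the slide.
def ttserver_command(port, filename)
  engine_for(filename) # raises early if the extension is not recognised
  "ttserver -port #{port} #{filename}"
end

puts engine_for('tweetreach.tch')
puts ttserver_command(1978, 'tweetreach.tch')
```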
  • 44. Using Grackle
    • Make sure you've added gems.github.com to your list of gem sources
    • 45. Install the hayesdavis-grackle gem
    • 46. Just require “grackle”
  • 47. Lessons Learned
  • 48. Sinatra Lessons
    • Very little is built in:
      • No page or action caching
      • 49. No helpers for easy formatting (stole things from Rails)
      • 50. No validation against models
    • Make sure you really only want something extremely simple
    • 51. You'll most likely need to roll one or two of your own things
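As an example of the kind of thing you end up rolling yourself, here is a minimal sketch of a number-formatting helper in the spirit of the Rails one. It is a hand-written stand-in, not the Rails source, and it only handles plain integers and simple decimals.

```ruby
# Hand-rolled stand-in for a Rails-style number formatting helper.
def number_with_delimiter(number, delimiter = ',')
  whole, fraction = number.to_s.split('.', 2)
  # Insert the delimiter before each group of three trailing digits.
  whole = whole.gsub(/(\d)(?=(\d{3})+\z)/, "\\1#{delimiter}")
  fraction ? "#{whole}.#{fraction}" : whole
end

puts number_with_delimiter(8318)
```

In a Sinatra app a method like this would typically be defined inside a `helpers do ... end` block so that views can call it.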
  • 52. Tokyo Cabinet Lessons
    • Lack of auto-expiration is annoying when using it mostly as a key-value cache
    • 53. Would definitely use it again for this type of task
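One common workaround for a store without auto-expiration is to save a timestamp next to each value and check it on read. The sketch below shows that idea under assumptions: `ExpiringStore` is a hypothetical name, a Ruby Hash stands in for the Tokyo Tyrant store, and nothing here is taken from the TweetReach code.

```ruby
# Manual expiration for a store that never expires entries on its own.
# A Hash stands in for the Tokyo Tyrant store (an assumption).
class ExpiringStore
  def initialize(store = {}, ttl_seconds = 3600)
    @store = store
    @ttl = ttl_seconds
  end

  # Save the value together with the time it was written.
  def set(key, value, now = Time.now)
    @store[key] = [now.to_i, value]
  end

  # Return the value if it is still fresh; delete it and return nil otherwise.
  def get(key, now = Time.now)
    written_at, value = @store[key]
    return nil if written_at.nil?

    if now.to_i - written_at > @ttl
      @store.delete(key)
      return nil
    end
    value
  end
end

store = ExpiringStore.new({}, 60)
store.set('followers:alice', 120)
puts store.get('followers:alice')
```

With a real key-value server the timestamp and value pair would need to be serialized (for example as JSON) before being written.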
  • 54. Grackle
    • Twitter is Twitter so make sure you've got decent error handling in place – things will go wrong or not respond, etc
    • 55. Still a very new library so I'm sure there are places that need cleaning up
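"Decent error handling" for a flaky remote API often starts with a retry wrapper like the sketch below. The method name `with_retries` and its parameters are hypothetical, and `IOError` is used only as a stand-in; in real code you would list whichever error classes your Twitter client actually raises.

```ruby
# Retry a flaky call a few times before giving up (names are hypothetical).
def with_retries(attempts: 3, base_delay: 0.5, on: [StandardError])
  tries = 0
  begin
    tries += 1
    yield tries
  rescue *on
    raise if tries >= attempts # out of attempts: re-raise the last error

    sleep(base_delay * (2**(tries - 1))) # exponential backoff
    retry
  end
end

calls = 0
result = with_retries(attempts: 3, base_delay: 0) do
  calls += 1
  raise IOError, 'simulated Twitter failure' if calls < 3

  'ok'
end
puts result
```

Backing off between attempts also helps a little with staying inside the API rate limits mentioned earlier.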
  • 56. Handy Stuff
    • http://www.sinatrarb.com/
    • 57. http://www.igvita.com/2009/02/13/tokyo-cabinet-beyond-key-value-store/
    • 58. http://www.scribd.com/doc/12016121/Tokyo-Cabinet-and-Tokyo-Tyrant-Presentation
    • 59. http://github.com/hayesdavis/grackle