How we build Vox
Slides about "How we build Vox" in Six Apart, presented by Benjamin Trott, the CTO and co-founder of the company in

Transcript

  • 1. How we Build Vox (April 4, 2007)
  • 2. Six Apart
    • Movable Type
    • TypePad
    • LiveJournal
    • Vox
  • 3. How we Build Vox: a Web 2.0, Large-scale, Fast, Internationalized website
  • 4. How we Build Vox: a Web 2.0, Large-scale, Fast, Internationalized website
  • 5. Web 2.0
    • Overused…
    • But useful
  • 6. Vox talks to web services
  • 7. APIs: Tools
    • We use our own custom libraries
      • No Net::Amazon, Net::Flickr, etc
      • Why?
      • We don’t want to load 7 XML parsers
      • All of our tools use XML::LibXML
  • 8. APIs: Open Media Profile
    • <entry>
    • <title>Foo bar baz</title>
    • <link href="http://example.com/show/video/123" />
    • <link rel="alternate" type="application/atom+xml"
    • href="http://example.com/atom/123" />
    • <id>tag:example.com,2006:video-123</id>
    • <updated>2003-12-13T18:30:02Z</updated>
    • <content type="text">
    • Vox rocks blah blah blah ...
    • </content>
    • <category term="Vox" scheme="http://example.com/tags/Vox/" label="Vox" />
    • <category term="cat" scheme="http://example.com/tags/Cat/" label="cat" />
    • <link rel="license" type="text/html"
    • href="http://creativecommons.org/licenses/by/2.5/" />
    • <media:content url="http://example.com/data/123.flv" fileSize="123456"
    • type="video/x-flv" />
    • <media:player url="http://example.com/data/123.swf" height="200" width="400" />
    • <media:thumbnail url="http://example.com/thumb/1223.jpg" width="75" height="50" />
    • </entry>
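
To make the entry format concrete, here is a minimal sketch of reading such an entry with a single XML parser. It is written in Python for illustration and is not Vox's Perl code. The slide omits the `xmlns` declarations, so the Atom and Media RSS namespace URIs below are assumptions, and `summarize` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs: the slide does not show the xmlns declarations.
ATOM = "http://www.w3.org/2005/Atom"
MEDIA = "http://search.yahoo.com/mrss/"

# Trimmed copy of the entry on the slide, with the assumed namespaces added.
ENTRY = f"""<entry xmlns="{ATOM}" xmlns:media="{MEDIA}">
  <title>Foo bar baz</title>
  <id>tag:example.com,2006:video-123</id>
  <category term="Vox" label="Vox" />
  <category term="cat" label="cat" />
  <media:content url="http://example.com/data/123.flv"
                 fileSize="123456" type="video/x-flv" />
  <media:thumbnail url="http://example.com/thumb/1223.jpg" width="75" height="50" />
</entry>"""


def summarize(entry_xml: str) -> dict:
    """Pull the title, id, tags and media file out of one entry (hypothetical helper)."""
    ns = {"a": ATOM, "m": MEDIA}
    entry = ET.fromstring(entry_xml)
    content = entry.find("m:content", ns)
    return {
        "title": entry.findtext("a:title", namespaces=ns),
        "id": entry.findtext("a:id", namespaces=ns),
        "tags": [c.get("term") for c in entry.findall("a:category", ns)],
        "media_url": content.get("url"),
        "media_type": content.get("type"),
    }
```

Because every service is mapped onto the same entry shape, one reader like this can serve all of them.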
  • 9. GData, OpenSearch, Media RSS…
  • 10. Open Media Profile (diagram labels: GData, OpenSearch, Media RSS)
  • 11. APIs: Outbound
    • Atom Publishing Protocol
    • Everything is Atom/RSS
    • Cool URIs
      • /library/posts/atom.xml
      • /library/posts/2007/03/atom.xml
      • /library/posts/2007/03/tags/yapc/atom.xml
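
The three feed URIs above follow a regular pattern, which can be sketched as a small path builder. This is a Python illustration only; `feed_path` and its parameters are hypothetical names, not part of Vox.

```python
from urllib.parse import quote


def feed_path(year=None, month=None, tag=None):
    """Build a feed path in the style of the slide's examples (hypothetical helper)."""
    parts = ["library", "posts"]
    if year is not None:
        parts.append(f"{year:04d}")
        if month is not None:
            parts.append(f"{month:02d}")
    if tag is not None:
        # Percent-encode the tag so it is safe as a single path segment.
        parts += ["tags", quote(tag, safe="")]
    parts.append("atom.xml")
    return "/" + "/".join(parts)
```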
  • 12. Ajax
    • JSON serialization
      • Lightweight
      • Normal data types (no need to invent syntax)
    • Catalyst + JSON-RPC
    • Everything is an API
    • Our own core JS libraries
    • http://search.cpan.org/~miyagawa/Catalyst-Plugin-JSONRPC-0.01/
    • http://code.sixapart.com/svn/js/trunk/
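
As a rough picture of "everything is an API", the sketch below dispatches one JSON-RPC 1.0 style request to a named method and serializes the reply as JSON. It is a Python illustration under stated assumptions, not the Catalyst plugin's code: `handle_jsonrpc` and `list_tags` are hypothetical names, and the data returned is invented for the example.

```python
import json


def handle_jsonrpc(body: str, methods: dict) -> str:
    """Dispatch one JSON-RPC 1.0 style request and return the JSON reply."""
    request = json.loads(body)
    req_id = request.get("id")
    try:
        func = methods[request["method"]]
        result, error = func(*request.get("params", [])), None
    except Exception as exc:
        # Unknown method or a failure inside it: report it in the error slot.
        result, error = None, str(exc)
    return json.dumps({"result": result, "error": error, "id": req_id})


def list_tags(asset_id):
    """Hypothetical API method returning plain JSON data types."""
    return {"asset_id": asset_id, "tags": ["vox", "yapc"]}
```

The reply uses only ordinary JSON types (objects, arrays, strings, numbers), which is the point of the "no need to invent syntax" bullet.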
  • 13. How we Build Vox: a Web 2.0, Large-scale, Fast, Internationalized website
  • 14. Large-scale
    • We started with this:
  • 15. We added some stuff.
  • 16. Data::ObjectDriver
    • Movable Type and TypePad: custom ORM
    • We wanted more:
      • Built-in caching
      • Built-in partitioning
  • 17. Data::ObjectDriver: Caching
    • Built-in support for memcached
    • All primary key data maintained for you
    • Completely automatic
  • 18. One line of code (basically):
    • Data::ObjectDriver::Driver::Cache::Memcached->new(
    • cache => Cache::Memcached->new({ servers => [ ... ] }),
    • fallback => Data::ObjectDriver::Driver::DBI->new(
    • dsn => 'dbi:SQLite:dbname=global.db',
    • ),
    • );
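
The cache-with-fallback arrangement can be pictured as a read-through cache keyed by primary key. The following is a minimal Python sketch of that idea, not Data::ObjectDriver's implementation: `DictStore` and `CachingDriver` are hypothetical classes, and a plain dict stands in for memcached.

```python
class DictStore:
    """Stand-in for a database driver; counts lookups so cache hits are visible."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.lookups = 0

    def lookup(self, key):
        self.lookups += 1
        return self.rows.get(key)

    def update(self, key, value):
        self.rows[key] = value


class CachingDriver:
    """Read-through cache in front of a fallback driver (illustrative only)."""

    def __init__(self, cache, fallback):
        self.cache = cache        # any dict-like object; memcached in production
        self.fallback = fallback  # consulted only on a cache miss

    def lookup(self, key):
        if key in self.cache:
            return self.cache[key]
        value = self.fallback.lookup(key)
        if value is not None:
            self.cache[key] = value
        return value

    def update(self, key, value):
        # Write to the database first, then refresh the cached copy.
        self.fallback.update(key, value)
        self.cache[key] = value
```

Callers only ever talk to the outer driver, which is why the caching can be described as automatic.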
  • 19. Data::ObjectDriver: Partitioning
    • Sharded data
    • Based on arbitrary criteria
    • Completely transparent
  • 20. One line of code:
    • Data::ObjectDriver::Driver::Cache::Cache->new(
    • cache => Cache::Memcached->new({ servers => [ ... ] }),
    • fallback => Data::ObjectDriver::Driver::SimplePartition->new(
    • using => 'Recipe',
    • ),
    • );
  • 21. Partitioning: traffic
  • 22. Example: loading user’s posts
    • my $user = ArcheType::M::User->lookup_by_email(
    • 'ben@sixapart.com'
    • );
    • my @assets = $user->assets({ type => 'Post' });
  • 23. Example
    • Loading $user hits the global database
    • Loading @assets then does:
      • Get the partition number of $user
      • Connect to that partition
      • Run a query like:
    • SELECT user_id, asset_id
    • FROM asset
    • WHERE user_id = ?
    • AND type = 6
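
The lookup steps above can be sketched as follows. This is a Python illustration with in-memory lists standing in for the global database and the partitions; `PartitionedStore` is a hypothetical class, and the only detail taken from the slide is that type 6 is the value used in the query for posts.

```python
POST = 6  # the type value used in the slide's query


class PartitionedStore:
    """Routes per-user queries to the partition that holds that user's data."""

    def __init__(self, user_partition, partitions):
        self.user_partition = user_partition  # global: user_id -> partition number
        self.partitions = partitions          # partition number -> list of asset rows

    def assets_for(self, user_id, asset_type):
        # 1. Get the partition number of the user from the global mapping.
        partition_no = self.user_partition[user_id]
        # 2. "Connect" to that partition.
        rows = self.partitions[partition_no]
        # 3. Equivalent of: SELECT asset_id FROM asset WHERE user_id = ? AND type = ?
        return [
            row["asset_id"]
            for row in rows
            if row["user_id"] == user_id and row["type"] == asset_type
        ]
```

The caller never names a partition, which is what makes the sharding transparent to application code.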
  • 24. ID Allocation: Issues
    • Partitioned databases -> no more auto_increment
    • Master/master
    • Were UUIDs the answer? No.
  • 25. ID Allocation: yuidd
    • IDs unique within a datacenter
    • 64-bit integers (fit in a BIGINT column)
    • yuidd is the server
    • Data::YUID::Client is the client
    • asynchronous, non-blocking, simple, fast
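
One common way to hand out unique 64-bit integers without auto_increment is to pack a timestamp, a host number, and a per-second serial into one integer. The Python sketch below shows that general technique only; the bit layout is an assumption chosen for the example and is not claimed to be yuidd's actual format, and `IdAllocator` is a hypothetical class.

```python
import time


class IdAllocator:
    """Packs (seconds, host id, serial) into one integer that fits a signed BIGINT.

    Assumed layout, for illustration only: 32-bit timestamp, 15-bit host, 16-bit serial.
    """

    HOST_BITS = 15
    SERIAL_BITS = 16

    def __init__(self, host_id, clock=time.time):
        if not 0 <= host_id < (1 << self.HOST_BITS):
            raise ValueError("host_id out of range")
        self.host_id = host_id
        self.clock = clock
        self.last_second = -1
        self.serial = 0

    def next_id(self):
        now = int(self.clock())
        if now > self.last_second:
            self.last_second, self.serial = now, 0
        else:
            # Same second (or the clock stepped back): bump the serial instead.
            self.serial += 1
            if self.serial >= (1 << self.SERIAL_BITS):
                # Serial exhausted: borrow the next logical second.
                self.last_second += 1
                self.serial = 0
        return (
            (self.last_second << (self.HOST_BITS + self.SERIAL_BITS))
            | (self.host_id << self.SERIAL_BITS)
            | self.serial
        )
```

Because the host number is part of every ID, two allocators with different host numbers never collide, even in the same second.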
  • 26. Job Queueing
    • Offload processing from Apache
    • It’s big and heavy
  • 27. Job Queueing: TheSchwartz
    • We’ll probably rename it.
    • asynchronous, reliable job queue
    • N databases
    • Pool of workers to handle the jobs
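
A database-backed job queue of this kind can be sketched in a few lines. The Python example below uses an in-memory SQLite table and is only an illustration of the pattern, not TheSchwartz's schema or API: `JobQueue`, its table layout, and the retry ordering are all assumptions.

```python
import json
import sqlite3


class JobQueue:
    """Minimal persistent job queue: the web tier inserts, workers process later."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS job ("
            "id INTEGER PRIMARY KEY, func TEXT, arg TEXT, failures INTEGER DEFAULT 0)"
        )

    def insert(self, func, arg):
        self.conn.execute(
            "INSERT INTO job (func, arg) VALUES (?, ?)", (func, json.dumps(arg))
        )
        self.conn.commit()

    def work_once(self, workers):
        """Run one pending job; return False when the queue is empty."""
        row = self.conn.execute(
            "SELECT id, func, arg FROM job ORDER BY failures, id LIMIT 1"
        ).fetchone()
        if row is None:
            return False
        job_id, func, arg = row
        try:
            workers[func](json.loads(arg))
        except Exception:
            # Keep the job so it can be retried; only record the failure.
            self.conn.execute(
                "UPDATE job SET failures = failures + 1 WHERE id = ?", (job_id,)
            )
        else:
            # Delete only after the work succeeded, so a crash never loses a job.
            self.conn.execute("DELETE FROM job WHERE id = ?", (job_id,))
        self.conn.commit()
        return True

    def pending(self):
        return self.conn.execute("SELECT COUNT(*) FROM job").fetchone()[0]
```

The request handler only pays for one INSERT; the slow work happens in the worker pool.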
  • 28. How we Build Vox: a Web 2.0, Large-scale, Fast, Internationalized website
  • 29. Fast
    • Need both large-scale and fast
  • 30. Catalyst
    • Vox uses Catalyst
    • Does what we want, allows us to do everything else
    • Want to use our own ORM, etc
  • 31. Is Catalyst fast?
    • A common question on the mailing list!
    • It’s fast enough (more on that later).
  • 32. Template Toolkit
    • Pretty fast…
    • But we’re probably overloading it.
  • 33. Template Toolkit: profile
    • [info] Request took 0.244932s (4.083/s)
    • .----------------------------------------------------------------+-----------.
    • | Action | Time |
    • +----------------------------------------------------------------+-----------+
    • | /auto | 0.005569s |
    • | -> /set_locale | 0.000854s |
    • | -> /set_locale | 0.000648s |
    • | /home/root | 0.072194s |
    • | -> /home/home_loggedout | 0.071337s |
    • | -> /home/load_thisisgoods | 0.009077s |
    • | -> /home/load_specials | 0.014897s |
    • | -> /home/load_featured_voxers | 0.044168s |
    • | /end | 0.143675s |
    • | -> Vox::App::V::TT->process | 0.140877s |
    • '----------------------------------------------------------------+-----------'
  • 34. Template Toolkit: profile
    • Wow! Template Toolkit takes 60% of the request time.
    • 4 times as long as 10-15 network requests.
    • Oh well.
  • 35. Template Toolkit: versioned caching
    • On-disk cache
    • Versioned with application version
    • Automatic cache bust
  • 36. Versioned caching
    • Template->new({
    • ...,
    • COMPILE_DIR => '/tmp/tt-cache-' . Vox->VERSION,
    • });
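
The cache-busting effect comes purely from putting the application version in the directory name: a new release compiles into a fresh directory and never reads the old one. A small Python sketch of that naming scheme follows; `compile_dir` and `stale_cache_dirs` are hypothetical helpers, and only the `tt-cache-` prefix comes from the slide.

```python
import os
import tempfile


def compile_dir(app_version, base=None):
    """Cache directory for one application version, e.g. <base>/tt-cache-25.3."""
    base = base or tempfile.gettempdir()
    return os.path.join(base, f"tt-cache-{app_version}")


def stale_cache_dirs(base, app_version):
    """Names of cache directories left behind by other versions (safe to delete)."""
    current = f"tt-cache-{app_version}"
    return sorted(
        name
        for name in os.listdir(base)
        if name.startswith("tt-cache-") and name != current
    )
```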
  • 37. Template Toolkit: syscalls
    • Lots of syscalls for files that don’t exist!
    • But we patched it.
  • 38. Caching
    • Data caching
    • Automatic caching using Data::ObjectDriver
    • Saves millions of lookups per day from reaching the database
  • 39. Caching: lists
    • Lists of things: tags on an asset.
    • Tag objects are automatically cached
    • Cache asset => list of tag IDs
  • 40. Like this:
    • asset<assetid>-tags => [ <tagid1>, <tagid2>, … ]
  • 41. Caching: lists
    • Grab list of tag IDs from memcached
    • Use get_multi to get back the tags
  • 42. Get back the tag objects:
    • get_multi <tagid1> <tagid2> …
  • 43. That’s not all! In bulk:
    • get_multi asset<assetid1>-tags asset<assetid2>-tags …
  • 44. Caching: lists
    • In a database: N one-to-many queries
    • In memcached: 2 queries
    • Use the right caching strategy
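
The two-round-trip pattern can be sketched like this. It is a Python illustration with a fake in-memory cache that counts round trips; `FakeMemcached` and `tags_for_assets` are hypothetical, the `tag<id>` key prefix for tag objects is an assumption, and the database fallback on a cache miss is left out.

```python
class FakeMemcached:
    """In-memory stand-in for a memcached client that counts round trips."""

    def __init__(self):
        self.data = {}
        self.round_trips = 0

    def set(self, key, value):
        self.data[key] = value

    def get_multi(self, keys):
        self.round_trips += 1
        return {k: self.data[k] for k in keys if k in self.data}


def tags_for_assets(cache, asset_ids):
    """Fetch the tag objects for many assets in exactly two cache round trips."""
    list_keys = {f"asset{a}-tags": a for a in asset_ids}
    # Round trip 1: every asset's list of tag IDs.
    tag_lists = cache.get_multi(list(list_keys))
    tag_ids = sorted({t for ids in tag_lists.values() for t in ids})
    # Round trip 2: every distinct tag object referenced by those lists.
    tags = cache.get_multi([f"tag{t}" for t in tag_ids])
    return {
        asset_id: [tags[f"tag{t}"] for t in tag_lists.get(key, []) if f"tag{t}" in tags]
        for key, asset_id in list_keys.items()
    }
```

The number of round trips stays at two no matter how many assets are on the page, whereas the database approach grows with the number of assets.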
  • 45. Perlbal
    • Reverse-proxy setup
    • Like Apache 2/mod_proxy in front of mod_perl…
    • But much better!
  • 46. Perlbal: webserver mode
    • Serves CSS, images, JavaScript
    • Static stuff
    • Really fast
  • 47. Perlbal: Serving JS and CSS
    • We use a lot of JS and CSS!
    • 20 JS files per page, 10 CSS files per page
    • SLOW
  • 48. Perlbal: Serving JS and CSS
    • Added file concatenation support in a plugin
    • (it’s now core)
  • 49. Used to be this:
    • <script type="text/javascript" src="/.shared:v25.3:vox:en/js/Core.jsc"></script>
    • <script type="text/javascript" src="/.shared:v25.3:vox:en/js/DOM.jsc"></script>
    • <script type="text/javascript" src="/.shared:v25.3:vox:en/js/DOM/Proxy.jsc"></script>
    • <script type="text/javascript" src="/.shared:v25.3:vox:en/js/JSON.jsc"></script>
    • <script type="text/javascript" src="/.shared:v25.3:vox:en/js/Timer.jsc"></script>
    • <script type="text/javascript" src="/.shared:v25.3:vox:en/js/Observer.jsc"></script>
    • <script type="text/javascript" src="/.shared:v25.3:vox:en/js/Cache.jsc"></script>
    • <script type="text/javascript" src="/.shared:v25.3:vox:en/js/Client.jsc"></script>
    • <script type="text/javascript" src="/.shared:v25.3:vox:en/js/Template.jsc"></script>
    • ...
  • 50. And now it’s this!
    • <script type="text/javascript" src="/.shared:v25.3:vox:en/js/Core.jsc?/js/DOM.jsc,/js/DOM/Proxy.jsc,/js/JSON.jsc,/js/Timer.jsc,/js/Observer.jsc,/js/Cache.jsc,/js/Client.jsc,/js/Template.jsc,/js/Autolayout.jsc,/js/Component.jsc,/js/Dialog.jsc,/js/App.jsc,/js/List.jsc,/js/ArcheType.jsc,/js/ArcheType/Client.jsc,/js/ArcheType/Controller.jsc,/js/ArcheType/Autocomplete.jsc"></script>
  • 51. Perlbal: Concatenation
    • Much faster
    • Much less latency
    • Perlbal handles Last-Modified/If-Modified-Since
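
The combined URL on the earlier slide has a simple shape: the first file is the path, and the remaining files follow the `?` as a comma-separated list. The Python sketch below builds and splits URLs of that shape. It mirrors the slide's example only and is not Perlbal's code; `concat_url` and `split_concat_url` are hypothetical helpers.

```python
def concat_url(prefix, paths):
    """Collapse many script paths into one URL of the form shown on the slide."""
    if not paths:
        raise ValueError("need at least one path")
    first, rest = paths[0], paths[1:]
    url = prefix + first
    if rest:
        url += "?" + ",".join(rest)
    return url


def split_concat_url(prefix, url):
    """Inverse of concat_url: recover the list of files the server must join."""
    if not url.startswith(prefix):
        raise ValueError("URL does not start with the expected prefix")
    first, _, rest = url[len(prefix):].partition("?")
    return [first] + [p for p in rest.split(",") if p]
```

One request replaces many, so the page pays the network latency once rather than once per file.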
  • 52. Lots of good stuff!
    • Tools to play with:
      • Perlbal
      • MogileFS
      • Memcached
      • TheSchwartz
      • yuidd
      • JavaScript libraries
      • etc.
  • 53. All available at…
    • http://code.sixapart.com/
