How we build Vox
Transcript

  • 1. How we Build Vox (April 4, 2007)
  • 2. Six Apart
    • Movable Type
    • TypePad
    • LiveJournal
    • Vox
  • 3. How we Build Vox: a Web 2.0, Large-scale, Fast, Internationalized website
  • 4. How we Build Vox: a Web 2.0, Large-scale, Fast, Internationalized website (focus: Web 2.0)
  • 5. Web 2.0
    • Overused…
    • But useful
  • 6. Vox talks to web services
  • 7. APIs: Tools
    • We use our own custom libraries
      • No Net::Amazon, Net::Flickr, etc
      • Why?
      • We don’t want to load 7 XML parsers
      • All of our tools use XML::LibXML
  • 8. APIs: Open Media Profile
    • <entry>
    • <title>Foo bar baz</title>
    • <link href="http://example.com/show/video/123" />
    • <link rel="alternate" type="application/atom+xml" href="http://example.com/atom/123" />
    • <id>tag:example.com,2006:video-123</id>
    • <updated>2003-12-13T18:30:02Z</updated>
    • <content type="text">
    • Vox rocks blah blah blah ...
    • </content>
    • <category term="Vox" scheme="http://example.com/tags/Vox/" label="Vox" />
    • <category term="cat" scheme="http://example.com/tags/Cat/" label="cat" />
    • <link rel="license" type="text/html" href="http://creativecommons.org/licenses/by/2.5/" />
    • <media:content url="http://example.com/data/123.flv" fileSize="123456" type="video/x-flv" />
    • <media:player url="http://example.com/data/123.swf" height="200" width="400" />
    • <media:thumbnail url="http://example.com/thumb/1223.jpg" width="75" height="50" />
    • </entry>
  • 9. GData, OpenSearch, Media RSS…
  • 10. Open Media Profile (diagram labels: GData, OpenSearch, Media RSS)
  • 11. APIs: Outbound
    • Atom Publishing Protocol
    • Everything is Atom/RSS
    • Cool URIs
      • /library/posts/atom.xml
      • /library/posts/2007/03/atom.xml
      • /library/posts/2007/03/tags/yapc/atom.xml
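The feed URIs on this slide follow a regular pattern: an optional year and month, then an optional tag, then atom.xml. Here is a minimal sketch of a builder for that pattern. The helper name feed_uri and its argument names are hypothetical and are not taken from Vox's code.

```perl
use strict;
use warnings;

# Hypothetical helper: builds a feed URI in the shape shown on the slide.
# All arguments are optional; year and month must be given together.
sub feed_uri {
    my (%args) = @_;
    my @parts = ('library', 'posts');
    if (defined $args{year} && defined $args{month}) {
        push @parts, sprintf('%04d', $args{year}), sprintf('%02d', $args{month});
    }
    push @parts, 'tags', $args{tag} if defined $args{tag};
    return '/' . join('/', @parts, 'atom.xml');
}

print feed_uri(), "\n";
print feed_uri(year => 2007, month => 3), "\n";
print feed_uri(year => 2007, month => 3, tag => 'yapc'), "\n";
```

Called as above, it prints the three URIs listed on the slide.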
  • 12. Ajax
    • JSON serialization
      • Lightweight
      • Normal data types (no need to invent syntax)
    • Catalyst + JSON-RPC
    • Everything is an API
    • Our own core JS libraries
    • http://search.cpan.org/~miyagawa/Catalyst-Plugin-JSONRPC-0.01/
    • http://code.sixapart.com/svn/js/trunk/
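To make "everything is an API" concrete, here is a rough sketch of a JSON-RPC 1.0 style dispatcher using only the core JSON::PP module. It is not the Catalyst plugin's implementation. The method name asset.tags, its return value, and the handle_jsonrpc function are assumptions made up for illustration.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical method table; the name and return value are illustrative only.
my %methods = (
    'asset.tags' => sub { my ($asset_id) = @_; return ['vox', 'cat'] },
);

# Decode a JSON-RPC 1.0 style request, dispatch it, and encode the reply.
sub handle_jsonrpc {
    my ($body) = @_;
    my $json = JSON::PP->new->canonical;
    my $req  = eval { $json->decode($body) };
    return $json->encode({ result => undef, error => 'parse error', id => undef })
        unless ref($req) eq 'HASH';
    my $code = $methods{ $req->{method} // '' };
    return $json->encode({ result => undef, error => 'no such method', id => $req->{id} })
        unless $code;
    my $result = $code->(@{ $req->{params} || [] });
    return $json->encode({ result => $result, error => undef, id => $req->{id} });
}

print handle_jsonrpc('{"method":"asset.tags","params":[123],"id":1}'), "\n";
```

The same handler can serve the site's own JavaScript and outside clients, because the request and response are plain JSON data types.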
  • 13. How we Build Vox: a Web 2.0, Large-scale , Fast, Internationalized website
  • 14. Large-scale
    • We started with this:
  • 15. We added some stuff.
  • 16. Data::ObjectDriver
    • Movable Type and TypePad: custom ORM
    • We wanted more:
      • Built-in caching
      • Built-in partitioning
  • 17. Data::ObjectDriver: Caching
    • Built-in support for memcached
    • All primary key data maintained for you
    • Completely automatic
  • 18. One line of code (basically):
    • Data::ObjectDriver::Driver::Cache::Memcached->new(
    • cache => Cache::Memcached->new({ servers => [ ... ] }),
    • fallback => Data::ObjectDriver::Driver::DBI->new(
    • dsn => 'dbi:SQLite:dbname=global.db',
    • ),
    • );
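What a caching driver with a fallback does can be sketched as a read-through lookup by primary key. This is not Data::ObjectDriver's code. The hashes %cache and %db are in-memory stand-ins for memcached and the database, and the function names lookup and save are assumptions for illustration.

```perl
use strict;
use warnings;

# In-memory stand-ins (assumptions): %cache plays memcached, %db plays the database.
my %cache;
my %db = (1 => { id => 1, name => 'ben' });
my $db_hits = 0;

# Read-through lookup by primary key: try the cache, fall back to the
# database, then populate the cache for the next caller.
sub lookup {
    my ($id) = @_;
    return $cache{"user:$id"} if exists $cache{"user:$id"};
    $db_hits++;
    my $row = $db{$id} or return;
    $cache{"user:$id"} = $row;
    return $row;
}

# Writes go to the database and refresh the cached copy.
sub save {
    my ($row) = @_;
    $db{ $row->{id} } = $row;
    $cache{ 'user:' . $row->{id} } = $row;
}

lookup(1) for 1 .. 3;
print "database hits: $db_hits\n";
```

Three lookups of the same key reach the stand-in database only once; the other two are served from the cache.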
  • 19. Data::ObjectDriver: Partitioning
    • Sharded data
    • Based on arbitrary criteria
    • Completely transparent
  • 20. One line of code:
    • Data::ObjectDriver::Driver::Cache::Memcached->new(
    • cache => Cache::Memcached->new({ servers => [ ... ] }),
    • fallback => Data::ObjectDriver::Driver::SimplePartition->new(
    • using => 'Recipe',
    • ),
    • );
  • 21. Partitioning: traffic
  • 22. Example: loading user’s posts
    • my $user = ArcheType::M::User->lookup_by_email(
    • 'ben@sixapart.com'
    • );
    • my @assets = $user->assets({ type => 'Post' });
  • 23. Example
    • Loading $user hits the global database
    • Loading @assets then does:
      • Get the partition number of $user
      • Connect to that partition
      • Run a query like:
    • SELECT user_id, asset_id
    • FROM asset
    • WHERE user_id = ?
    • AND type = 6
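The three steps above can be sketched with in-memory data. This is an illustration only, not how Data::ObjectDriver is implemented. The hashes %global and %partitions, the partition numbers, and the function assets_for are all assumptions.

```perl
use strict;
use warnings;

# Assumed in-memory stand-ins: the global database maps each user to a
# partition number; each partition holds that user's asset rows.
my %global = (
    'ben@sixapart.com' => { user_id => 1, partition => 2 },
);
my %partitions = (
    1 => [],
    2 => [
        { user_id => 1, asset_id => 10, type => 'Post' },
        { user_id => 1, asset_id => 11, type => 'Photo' },
        { user_id => 1, asset_id => 12, type => 'Post' },
    ],
);

sub assets_for {
    my ($email, $type) = @_;
    my $user = $global{$email} or return;          # step 1: global lookup
    my $rows = $partitions{ $user->{partition} };  # step 2: pick the partition
    # step 3: the per-partition query (WHERE user_id = ? AND type = ?)
    return grep { $_->{user_id} == $user->{user_id} && $_->{type} eq $type } @$rows;
}

my @assets = assets_for('ben@sixapart.com', 'Post');
print join(',', map { $_->{asset_id} } @assets), "\n";
```

The caller never names a partition; it is derived from the user record, which is what keeps the sharding transparent to application code.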
  • 24. ID Allocation: Issues
    • Partitioned databases -> no more auto_increment
    • Master/master
    • Were UUIDs the answer? No.
  • 25. ID Allocation: yuidd
    • IDs unique within a datacenter
    • 64-bit integers (fit in a BIGINT column)
    • yuidd is the server
    • Data::YUID::Client is the client
    • asynchronous, non-blocking, simple, fast
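One common way to get unique 64-bit integers without auto_increment is to pack a timestamp, a per-second serial, and a host ID into one number. The sketch below illustrates that idea. The bit layout (seconds in the high bits, 12-bit serial, 16-bit host ID) and the function next_id are assumptions for illustration, not something stated on the slides. It needs a perl built with 64-bit integers and does not handle a clock that moves backwards.

```perl
use strict;
use warnings;

# Illustrative bit layout (an assumption for this sketch): seconds since
# the epoch in the high bits, then a 12-bit per-second serial, then a
# 16-bit host ID.
my ($last_time, $serial) = (0, 0);

sub next_id {
    my ($host_id, $now) = @_;
    $now = time unless defined $now;
    if ($now == $last_time) {
        $serial++;
        die "serial exhausted for this second\n" if $serial >= 2**12;
    } else {
        ($last_time, $serial) = ($now, 0);
    }
    return ($now << 28) | ($serial << 16) | ($host_id & 0xFFFF);
}

my @ids = map { next_id(7) } 1 .. 3;
print "$_\n" for @ids;
```

Because the host ID is part of the number, two generators with different host IDs cannot collide, and the result still fits in a BIGINT column.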
  • 26. Job Queueing
    • Offload processing from Apache
    • Apache processes are big and heavy
  • 27. Job Queueing: TheSchwartz
    • We’ll probably rename it.
    • asynchronous, reliable job queue
    • N databases
    • Pool of workers to handle the jobs
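The shape of a job queue with a worker loop can be sketched in a few lines. This is not TheSchwartz's API. The array @queue is an in-memory stand-in for database-backed job tables, and the names insert_job, work_until_done, and SendEmail are hypothetical.

```perl
use strict;
use warnings;

# In-memory stand-in for the job tables (an assumption; a real queue is
# backed by one or more databases so jobs survive a crash).
my @queue;

sub insert_job {
    my ($worker, $arg) = @_;
    push @queue, { worker => $worker, arg => $arg, failures => 0 };
}

# Hypothetical worker table: worker name => code that does the job.
my @sent;
my %workers = (
    'SendEmail' => sub { my ($arg) = @_; push @sent, $arg->{to}; },
);

# One worker loop: take a job, run it, and requeue it on failure
# up to $max_retries times.
sub work_until_done {
    my ($max_retries) = @_;
    while (my $job = shift @queue) {
        my $ok = eval { $workers{ $job->{worker} }->($job->{arg}); 1 };
        next if $ok;
        push @queue, $job if ++$job->{failures} <= $max_retries;
    }
}

insert_job('SendEmail', { to => 'a@example.com' });
insert_job('SendEmail', { to => 'b@example.com' });
work_until_done(3);
print scalar(@sent), " jobs processed\n";
```

The web request only pays for insert_job; the slow work happens later in a separate worker process.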
  • 28. How we Build Vox: a Web 2.0, Large-scale, Fast , Internationalized website
  • 29. Fast
    • Need both large-scale and fast
  • 30. Catalyst
    • Vox uses Catalyst
    • Does what we want, allows us to do everything else
    • Want to use our own ORM, etc
  • 31. Is Catalyst fast?
    • A common question on the mailing list!
    • It’s fast enough (more on that later).
  • 32. Template Toolkit
    • Pretty fast…
    • But we’re probably overloading it.
  • 33. Template Toolkit: profile
    • [info] Request took 0.244932s (4.083/s)
    • .----------------------------------------------------------------+-----------.
    • | Action                                                         | Time      |
    • +----------------------------------------------------------------+-----------+
    • | /auto                                                          | 0.005569s |
    • | -> /set_locale                                                 | 0.000854s |
    • | -> /set_locale                                                 | 0.000648s |
    • | /home/root                                                     | 0.072194s |
    • | -> /home/home_loggedout                                        | 0.071337s |
    • | -> /home/load_thisisgoods                                      | 0.009077s |
    • | -> /home/load_specials                                         | 0.014897s |
    • | -> /home/load_featured_voxers                                  | 0.044168s |
    • | /end                                                           | 0.143675s |
    • | -> Vox::App::V::TT->process                                    | 0.140877s |
    • '----------------------------------------------------------------+-----------'
  • 34. Template Toolkit: profile
    • Wow! Template Toolkit takes almost 60% of the request time (0.141s of 0.245s).
    • 4 times as long as 10-15 network requests.
    • Oh well.
  • 35. Template Toolkit: versioned caching
    • On-disk cache
    • Versioned with application version
    • Automatic cache bust
  • 36. Versioned caching
    • Template->new({
    • ...,
    • COMPILE_DIR => '/tmp/tt-cache-' . Vox->VERSION,
    • });
  • 37. Template Toolkit: syscalls
    • Lots of syscalls for files that don’t exist!
    • But we patched it.
  • 38. Caching
    • Data caching
    • Automatic caching using Data::ObjectDriver
    • Saves millions of lookups per day from reaching the database
  • 39. Caching: lists
    • Lists of things: tags on an asset.
    • Tag objects are automatically cached
    • Cache asset => list of tag IDs
  • 40. Like this:
    • asset<assetid>-tags => [ <tagid1>, <tagid2>, … ]
  • 41. Caching: lists
    • Grab list of tag IDs from memcached
    • Use get_multi to get back the tags
  • 42. Get back the tag objects:
    • get_multi <tagid1> <tagid2> …
  • 43. That’s not all! In bulk:
    • get_multi asset<assetid1>-tags asset<assetid2>-tags …
  • 44. Caching: lists
    • In a database: N one-to-many queries
    • In memcached: 2 queries
    • Use the right caching strategy
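The two-request pattern for lists can be sketched as follows. The hash %memcached is an in-memory stand-in for the cache server, and get_multi here only mimics the shape of a memcached multi-get (a list of keys in, a hashref of found keys out). The key formats and helper names are assumptions based on the slides.

```perl
use strict;
use warnings;

# %memcached is an in-memory stand-in (assumption) for the cache server.
my %memcached = (
    'asset1-tags' => [ 10, 11 ],
    'asset2-tags' => [ 11, 12 ],
    'tag10' => { id => 10, name => 'vox' },
    'tag11' => { id => 11, name => 'cat' },
    'tag12' => { id => 12, name => 'yapc' },
);
my $round_trips = 0;

sub list_key { return 'asset' . $_[0] . '-tags' }
sub tag_key  { return 'tag' . $_[0] }

# Mimics a multi-get: one round trip, returns a hashref of the keys found.
sub get_multi {
    my @keys = @_;
    $round_trips++;
    my %found = map { $_ => $memcached{$_} } grep { exists $memcached{$_} } @keys;
    return \%found;
}

sub tags_for_assets {
    my @asset_ids = @_;
    # Round trip 1: every asset's list of tag IDs.
    my $lists = get_multi(map { list_key($_) } @asset_ids);
    # Round trip 2: every distinct tag object.
    my %seen;
    my @tag_ids = grep { !$seen{$_}++ } map { @$_ } values %$lists;
    my $tags = get_multi(map { tag_key($_) } @tag_ids);
    my %result;
    for my $id (@asset_ids) {
        my $ids = $lists->{ list_key($id) } || [];
        $result{$id} = [ map { $tags->{ tag_key($_) } } @$ids ];
    }
    return \%result;
}

my $by_asset = tags_for_assets(1, 2);
print join(',', map { $_->{name} } @{ $by_asset->{1} }), "\n";
print "round trips: $round_trips\n";
```

However many assets are requested, the sketch makes two round trips. A real system would also fall back to the database for any key missing from the cache, which this sketch leaves out.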
  • 45. Perlbal
    • Reverse-proxy setup
    • Like Apache 2/mod_proxy in front of mod_perl…
    • But much better!
  • 46. Perlbal: webserver mode
    • Serves CSS, images, JavaScript
    • Static stuff
    • Really fast
  • 47. Perlbal: Serving JS and CSS
    • We use a lot of JS and CSS!
    • 20 JS files per page, 10 CSS files per page
    • SLOW
  • 48. Perlbal: Serving JS and CSS
    • Added file concatenation support in a plugin
    • (it’s now core)
  • 49. Used to be this:
    • <script type="text/javascript" src="/.shared:v25.3:vox:en/js/Core.jsc"></script>
    • <script type="text/javascript" src="/.shared:v25.3:vox:en/js/DOM.jsc"></script>
    • <script type="text/javascript" src="/.shared:v25.3:vox:en/js/DOM/Proxy.jsc"></script>
    • <script type="text/javascript" src="/.shared:v25.3:vox:en/js/JSON.jsc"></script>
    • <script type="text/javascript" src="/.shared:v25.3:vox:en/js/Timer.jsc"></script>
    • <script type="text/javascript" src="/.shared:v25.3:vox:en/js/Observer.jsc"></script>
    • <script type="text/javascript" src="/.shared:v25.3:vox:en/js/Cache.jsc"></script>
    • <script type="text/javascript" src="/.shared:v25.3:vox:en/js/Client.jsc"></script>
    • <script type="text/javascript" src="/.shared:v25.3:vox:en/js/Template.jsc"></script>
    • ...
  • 50. And now it’s this!
    • <script type="text/javascript" src="/.shared:v25.3:vox:en/js/Core.jsc?/js/DOM.jsc,/js/DOM/Proxy.jsc,/js/JSON.jsc,/js/Timer.jsc,/js/Observer.jsc,/js/Cache.jsc,/js/Client.jsc,/js/Template.jsc,/js/Autolayout.jsc,/js/Component.jsc,/js/Dialog.jsc,/js/App.jsc,/js/List.jsc,/js/ArcheType.jsc,/js/ArcheType/Client.jsc,/js/ArcheType/Controller.jsc,/js/ArcheType/Autocomplete.jsc"></script>
  • 51. Perlbal: Concatenation
    • Much faster
    • Much less latency
    • Perlbal handles Last-Modified/If-Modified-Since
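To show what the server has to do with a concatenated request, here is a small sketch that splits a URL in the shape from the slide back into individual file paths. It is not Perlbal's code. The function expand_concat and the prefix pattern (taken from the ".shared:v25.3:vox:en" segment in the example URLs) are assumptions.

```perl
use strict;
use warnings;

# Splits a concatenated request in the shape shown on the slide,
# "<prefix>/first.jsc?/second.jsc,/third.jsc", into per-file paths
# under the same prefix. The prefix pattern is an assumption.
sub expand_concat {
    my ($uri) = @_;
    my ($first, $rest) = split /\?/, $uri, 2;
    my ($prefix) = $first =~ m{^(/\.shared:[^/]+)/} or return ($first);
    my @files = ($first);
    push @files, map { $prefix . $_ } split /,/, $rest if defined $rest;
    return @files;
}

print "$_\n" for expand_concat('/.shared:v25.3:vox:en/js/Core.jsc?/js/DOM.jsc,/js/JSON.jsc');
```

The browser makes one request instead of one per file, and the server sends the files back to back in the listed order.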
  • 52. Lots of good stuff!
    • Tools to play with:
      • Perlbal
      • MogileFS
      • Memcached
      • TheSchwartz
      • yuidd
      • JavaScript libraries
      • etc.
  • 53. All available at…
    • http://code.sixapart.com/