Yellowpagescom Behind The Curtain

Loading...

Flash Player 9 (or above) is needed to view presentations.
We have detected that you do not have it on your computer. To install it, go here.

0 comments

Post a comment

    Post a comment
    Embed Video
    Edit your comment Cancel

    Favorites, Groups & Events

    Yellowpagescom Behind The Curtain - Presentation Transcript

    1. YELLOWPAGES.COM: Behind the Curtain John Straw AT&T Interactive
    2. What is YELLOWPAGES.COM? Part of AT&T s A local search website, serving s 23 million unique visitors / month q 2 million searches / day q More than 48 million requests / day q More than 1500 requests / sec q 30 Mbit/sec (200 Mbit/sec from Akamai) q Entirely Ruby on Rails since June 2007 s
    3. How we were Website and API applications s written in Java Website in application server q API in Tomcat q Search code split between s application and search layer
    4. What was bad Problems with architecture s Separate search implementations in each application q Session-heavy website application design hard to scale horizontally q Pointless web accelerator layer q Problems with application server platform s Technologically stagnant q No usable session migration features q Hard to do SEO q
    5. And also ... Lots of code written by consultants 2004-2005 s Fundamental design problems s Code extended largely by copy-and-modify since 11/2005 (to s around 125K lines) No test code s New features hard to implement s
    6. The Big Rewrite Several projects combined to become the big rewrite s Replacement of Java application server q Redesign of site look-and-feel q Many other wish-list projects, some of which were difficult to q accomplish with existing application Project conception to completion: one year s Development took about four months s Project phases s 7/2006 - 12/2006: Thinking, early preparation q 12/2006: Rough architecture determination, kick-off q 1/2007 - 3/1/2007: Technology research and prototypes, business q rules exploration, UI design concepts 3/1/2007 - 6/28/2007: Site development and launch q
    7. Requirements for a new site architecture 1. Absolute control of urls Maximize SEO crawl-ability q 2. No sessions: HTTP is stateless Anything other than cookies is just self-delusion q Staying stateless makes scaling easier q 3. Be agile: write less code Development must be faster q 4. Develop easy-to-leverage core business services Eliminate current duplicated code q Must be able to build other sites or applications with common q search, personalization and business review logic Service-oriented architecture, with web tier utilizing common q service tier
    8. Rewrite team Cross-functional team of about 20 people s Assemble stakeholders, not requirements q Working closely together s Whole team sat together for entire project q Lunch provided several days per week q Team celebrations held for milestones q Core development team deliberately small s Four skilled developers can accomplish a lot q Cost of communication low q Low management overhead q
    9. Picking the platform Web tier: s Rails or Django q Utilizing common services for q search, personalization, and ratings Service tier: s Java application q Probably EJB3 in JBoss q Exposing a REST API and q returning JSON-serialized data Started writing prototypes ... s
    10. One
    11. Two
    12. Three
    13. And finally ...
    14. Why Rails? Considered Java frameworks didn't provide enough control of s url structure Web tier choice became Rails vs. Django s Rails best web tier choice due to s Better automated testing integration q More platform maturity q Clearer path (for us!) to C if necessary for performance q Developer comfort and experience q Team decided to go with Rails top-to-bottom s Evaluation of EJB3 didn't show real advantages over Ruby or q Python for our application Reasons for choosing Rails for web tier applied equally to service q tier Advantage of having uniform implementation environment q
    15. Separate or combined?
    16. Other considerations How many servers? s How many mongrels per server? s How much memory for memcached? s
    17. Production configuration Acquired 25 machines of identical configuration for each data s center Performance testing to size out each tier, and determine how s many mongrels 4 GB of memory on each service-tier machine set aside for s memcached Used 2 machines in each data center for database servers s
    18. Performance goals Sub-second average home page load time s Sub 4-second average search results page load time s Handle traffic without dying s
    19. Performance optimizations mongrel_handlers in service tier application s C library for parsing search cluster responses s Erubis rather than erb in web tier s
    20. Site at launch
    21. Database performance issues Machines inadequate to handle search load for ratings look-up s Additional caching added q Oracle doesn't like lots of connections s Use of named connections made this problem even worse s Additional memory required for database servers q All database look-up code moved to service tier q Changed to a single database connection q
    22. Page performance issues Slow page performance caused more by asset download times s than speed of web framework Worked through the Yahoo! performance guidelines s Minified and combined CSS and Javascript with s asset_packager Moved image serving to Akamai edge cache s Apache slow serving analytics tags -- moved to nginx for web s tier Started using some fragment caching s
    23. Slow requests, etc. Slow requests in the web tier caused mongrel queueing s Developed qrp (http://qrp.rubyforge.org/) q Allows you to establish a backup pool where requests get parked q until a mongrel is available Experimented with different malloc implementations s Started using a custom MRI build -- ypc_ruby s Started using a slightly-customized Mongrel s
    24. Overall performance Performance at launch was generally acceptable s After web server & hosting changes performance better than s previous site Extensive use of caching and elimination of obsolete queries s lowered load on search cluster compared to previous site Over a year later, we need to do more profiling s Traffic has more than doubled since launch q Hardware evolution has invalidated original profiling q We now want sub 2-second search result pages q
    25. What else? New applications on Rails s Server-side component of native iPhone application q Working on moving current Java API application q Other internal applications q Ruby but not Rails s Exploratory port of service tier to merb q Supporting development of Waves q Data-source ETL q Listing match-and-merge q

    + ConSanFrancisco123ConSanFrancisco123, 6 months ago

    custom

    307 views, 0 favs, 0 embeds more stats

    More info about this document

    © All Rights Reserved

    Go to text version

    • Total Views 307
      • 307 on SlideShare
      • 0 from embeds
    • Comments 0
    • Favorites 0
    • Downloads 1
    Most viewed embeds

    more

    All embeds

    less

    Flagged as inappropriate Flag as inappropriate
    Flag as inappropriate

    Select your reason for flagging this presentation as inappropriate. If needed, use the feedback form to let us know more details.

    Cancel
    File a copyright complaint
    Having problems? Go to our helpdesk?

    Categories