Shopzilla - Performance By Design

A look at Shopzilla's path to high performance

Published in: Technology, Design

  1. 1. Performance By Design A look at Shopzilla's path to high performance Tim Morrow, Senior Architect 11/3/2009
  2. 2. Agenda <ul><li>Our Company and what we do </li></ul><ul><li>Design approach </li></ul><ul><li>Detailed site architecture </li></ul><ul><li>Architecture evolution </li></ul><ul><li>Front end performance techniques </li></ul>
  3. 3. Shopzilla, Inc. - Online Shopping Network <ul><li>100M impressions/day </li></ul><ul><li>20-29M unique visitors per month </li></ul><ul><li>8,000+ searches per second </li></ul><ul><li>100M+ products </li></ul>
  4. 4. Background <ul><li>In 2000, the business was simple - bizrate.com in the US </li></ul><ul><li>Co-brands, API and shopzilla.com </li></ul><ul><li>Multivariate testing, Europe and multiple data centers </li></ul>
  5. 5. Growth
  6. 6. Original Architecture
  7. 7. Performance and Scalability Issues <ul><li>High latency due to sequential resource access </li></ul><ul><li>Data fetching sometimes O(n) </li></ul><ul><li>Lengthy Time To First Byte </li></ul><ul><li>Memory constrained </li></ul><ul><li>Lacked visibility into performance issues </li></ul>
  8. 8. Development Issues <ul><li>Serving 9+ web site experiences </li></ul><ul><li>Different look-and-feel; localization </li></ul><ul><li>New features carry high risk </li></ul><ul><li>Shared code-base limited team autonomy </li></ul>
  9. 9. Our Build Approach <ul><li>Start over, stay simple </li></ul><ul><li>Build 2 weeks at a time, deliver every page ASAP </li></ul><ul><li>Manage risk by managing exposure </li></ul>
  10. 10. New Design Principles <ul><li>Simplify layers </li></ul><ul><li>Decompose architecture </li></ul><ul><li>Define SLAs </li></ul><ul><li>Continuous performance testing </li></ul><ul><li>Utilize caching </li></ul><ul><li>Apply best-practice UI performance techniques </li></ul>
  11. 11. New Site Architecture
  12. 12. Site Technologies <ul><li>Java 1.6, Tomcat 6 </li></ul><ul><li>Spring MVC </li></ul><ul><li>TAL-like templating engine </li></ul><ul><li>Services are JAX-RS implemented with Apache CXF </li></ul><ul><li>Hibernate 3 with Ehcache </li></ul><ul><li>Oracle 10g Database </li></ul><ul><li>Oracle Coherence Data Grid </li></ul>
  13. 13. Performance SLAs
  14. 14. Web Application Tier Service Calls
  15. 15. Web Application Tier
  16. 16. www.bizrate.com/digital-cameras
  17. 17. Decomposed Services <ul><li>http://…/service/v3/reviews?pid=419943686&show=30&sort=newestFirst </li></ul><ProdRevResponse> <ProductReviews> <ProductReview> <UserRating>8.0</UserRating> <Title>advanced and begining photographer alike will benefit</Title> ... </ProductReview> </ProductReviews> </ProdRevResponse>
  18. 18. Service Invocation <ul><li>Connection pooling </li></ul><ul><li>Stale connection checking </li></ul><ul><li>Hardware load balancers </li></ul><ul><li>Connection and Socket Timeouts </li></ul><ul><li>O(1) invocations on a single page </li></ul><ul><li>JAXB XML->Java unmarshaling </li></ul>
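The deck contrasts the original design's sequential resource access with a bounded number of service invocations per page, each protected by timeouts. A minimal sketch of that idea follows. It assumes a newer Java than the deck's 1.6, and `PageAssembler` and `fetchAll` are hypothetical names, not Shopzilla's code. Each `Callable` stands in for one pooled HTTP call to a service.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: runs a fixed set of service calls in parallel and
// gives up on any call that has not finished by the deadline.
final class PageAssembler {
    static List<String> fetchAll(List<Callable<String>> calls, long timeoutMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, calls.size()));
        List<String> results = new ArrayList<>();
        try {
            // invokeAll cancels every task still running when the timeout expires.
            List<Future<String>> futures =
                    pool.invokeAll(calls, timeoutMillis, TimeUnit.MILLISECONDS);
            for (Future<String> future : futures) {
                if (future.isCancelled()) {
                    results.add(null); // timed out: the page renders without this part
                    continue;
                }
                try {
                    results.add(future.get());
                } catch (ExecutionException e) {
                    results.add(null); // the service call itself failed
                }
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            throw new IllegalStateException("interrupted while waiting for services", e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

With this shape, page latency is bounded by the slowest call or the timeout, rather than by the sum of all calls. A production version would reuse one shared thread pool instead of creating a pool per page.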
  19. 19. How Do We Performance Test? <ul><li>Highly Concurrent requests </li></ul><ul><li>Dozens of services </li></ul><ul><li>Database access </li></ul><ul><li>Search Engine invocations </li></ul><ul><li>Meta-data lookups </li></ul>
  20. 20. Performance Testing Environment
  21. 21. Analyzing Performance <ul><li>Emit server-side performance information </li></ul><ul><li>10.61.35.25 198.133.178.17 - - [15/Jun/2009:18:48:13 -0700] "GET /digital-cameras/ HTTP/1.1" 200 195063 - - www.bizrate.com unique_id=c52efc5b-740b-44d4-8693-587b6b756564!rst=22 </li></ul><ul><li>start=1245116893247;elapsed=19;requestId=c52efc5b-740b-44d4-8693-587b6b756564;startDate=2009-06-15 18:48:13.247-0700;url=http:///services/content/v7/topsearchService/BR/US/402/0/21; </li></ul><ul><li>start=1245116893248;elapsed=24;requestId=c52efc5b-740b-44d4-8693-587b6b756564;startDate=2009-06-15 18:48:13.248-0700;url=http:///search/v5/US/12/product_search?keyword=digital+cameras </li></ul><ul><li>Build server-side call graphs </li></ul>
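The timing lines above share a `requestId` with the access-log entry, which is what makes a per-request call graph possible. A minimal sketch of the first step, parsing the `key=value;` lines and grouping them by request, might look like this. `TimingLog` and its members are hypothetical names, and the parser assumes the field names shown on the slide (`start`, `elapsed`, `requestId`, `url`) and no `;` inside a URL.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TimingLog {
    // One timed service call, as recorded in a key=value log line.
    static final class Call {
        final String requestId;
        final long startMillis;
        final long elapsedMillis;
        final String url;

        Call(String requestId, long startMillis, long elapsedMillis, String url) {
            this.requestId = requestId;
            this.startMillis = startMillis;
            this.elapsedMillis = elapsedMillis;
            this.url = url;
        }
    }

    // Parses "key=value;key=value;...", tolerating spaces around keys and values.
    // Throws NumberFormatException if "start" or "elapsed" is missing or malformed.
    static Call parse(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String part : line.split(";")) {
            int eq = part.indexOf('='); // first '=' only, so URLs with query strings survive
            if (eq > 0) {
                fields.put(part.substring(0, eq).trim(), part.substring(eq + 1).trim());
            }
        }
        return new Call(fields.get("requestId"),
                Long.parseLong(fields.get("start")),
                Long.parseLong(fields.get("elapsed")),
                fields.get("url"));
    }

    // Groups calls by request ID, keeping the order in which lines were read.
    static Map<String, List<Call>> groupByRequest(List<String> lines) {
        Map<String, List<Call>> byRequest = new LinkedHashMap<>();
        for (String line : lines) {
            Call call = parse(line);
            byRequest.computeIfAbsent(call.requestId, k -> new ArrayList<>()).add(call);
        }
        return byRequest;
    }
}
```

From the grouped calls, start times and elapsed times are enough to see which service calls overlapped and which one dominated the request.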
  22. 22. Profiling <ul><li>YourKit Java Profiler </li></ul><ul><li>Identify Synchronization Bottlenecks </li></ul>
  23. 23. Production Performance Monitoring <ul><li>JMX MBeans </li></ul><ul><li>Graphite graphing </li></ul>
  24. 24. Caching <ul><li>Started with a simple replicated local cache </li></ul><ul><li>Cached data stored in the service process </li></ul><ul><li>Cannot scale to large data sets </li></ul><ul><li>Read-through caching not always suitable </li></ul><ul><li>Moved to Oracle Coherence </li></ul>
  25. 25. Oracle Coherence <ul><li>Distributed data grid </li></ul><ul><li>Dynamic cluster membership </li></ul><ul><li>Automatic data partitioning </li></ul><ul><li>Continuous availability </li></ul><ul><li>Computation in the grid </li></ul>
  26. 26. Case Study #1 – URL Builder <ul><li>Rule set describing optimal URL structures </li></ul><ul><li>High frequency access from site </li></ul><ul><li>Backoffice system to compute rules </li></ul>
  27. 27. Coherence Solution
  28. 28. Results <ul><li>Continue to meet performance SLAs </li></ul><ul><li>In production for over 6 months </li></ul><ul><li>Very successful project </li></ul>
  29. 29. Case Study #2 - Keyword Metadata <ul><li>Map of long IDs to object model describing landing page </li></ul><ul><li>Over 600 million entries </li></ul><ul><li>Entry point for the majority of our traffic </li></ul><ul><li>Home-grown partitioned, distributed cache </li></ul>
  30. 30. Challenges <ul><li>Time-consuming and intricate </li></ul><ul><li>Restarts required </li></ul>
  31. 31. Solution <ul><li>Automatic read-through </li></ul><ul><li>Automatic expiration </li></ul><ul><li>Scalable </li></ul>
  32. 32. Results <ul><li>Faster acquisition of new paid placements </li></ul><ul><li>No restarts </li></ul><ul><li>Less software to maintain </li></ul><ul><li>Great performance </li></ul>
  33. 33. Cache Architecture <ul><li>6 physical instances </li></ul><ul><li>8-way, 32 GB RAM </li></ul><ul><li>16 JVMs with 1.5 GB heap </li></ul><ul><li>Distributed Cache </li></ul><ul><li>Database read-through </li></ul><ul><li>LRU expiry based on object count </li></ul>
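The combination of read-through loading and LRU expiry by object count can be illustrated in a single JVM. The sketch below is not Oracle Coherence and uses none of its API. It is a plain `LinkedHashMap` in access order, with a hypothetical class name `ReadThroughLruCache` and a loader function standing in for the database lookup.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Single-process illustration of read-through + LRU eviction by entry count.
final class ReadThroughLruCache<K, V> {
    private final Function<K, V> loader; // stands in for the database read
    private final Map<K, V> entries;

    ReadThroughLruCache(final int maxEntries, Function<K, V> loader) {
        this.loader = loader;
        // accessOrder=true keeps the least recently used entry first.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries; // evict once the object count is exceeded
            }
        };
    }

    // Returns the cached value, loading and caching it on a miss.
    synchronized V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            value = loader.apply(key);
            if (value != null) {
                entries.put(key, value);
            }
        }
        return value;
    }

    synchronized int size() {
        return entries.size();
    }
}
```

Callers only ever ask the cache, never the database, which is what removes the explicit population and restart steps listed under Challenges. A distributed grid adds partitioning and replication on top of this basic contract.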
  34. 34. Performance Testing Coherence <ul><li>Performance test application against the grid </li></ul><ul><li>Test scenarios such as server loss and data population </li></ul>
  35. 35. UI Performance Techniques <ul><li>80% of time spent rendering the page </li></ul><ul><li>Yahoo currently lists 34 best practices </li></ul>
  36. 36. Minimize HTTP Requests <ul><li>Combined files </li></ul><ul><li>CSS Sprites http://spriteme.org/ </li></ul>
  37. 37. Use a CDN <ul><li>Move your content closer to end users </li></ul><ul><li>Reduce latency </li></ul><ul><li>Every resource except for dynamic HTML </li></ul><ul><li>Offloads 100s of gigabytes per day </li></ul>
  38. 38. Expiry, Compression and Minification <ul><li>Expiry headers instruct the browser to use a cached copy </li></ul><ul><li>> 2 days considered “Far Future” </li></ul><ul><li>Use versioning techniques to allow forced upgrades </li></ul><ul><li>Compressing reduces page weight </li></ul><ul><li>Minifying may still reduce size by 5% even with compression </li></ul>
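Far-future expiry only works if a changed file gets a new URL, which is what the versioning bullet refers to. A minimal sketch of both halves follows. `StaticAssetPolicy` is a hypothetical name, the version-in-filename scheme is one common choice rather than the deck's stated method, and the header uses `Cache-Control: max-age` as the modern equivalent of an `Expires` date.

```java
import java.time.Duration;

final class StaticAssetPolicy {
    // Inserts a version token before the file extension, so each release
    // produces a new URL and bypasses copies cached under the old one.
    // Assumes the last '.' in the path belongs to the file name.
    static String versionedUrl(String path, String version) {
        int dot = path.lastIndexOf('.');
        if (dot < 0) {
            return path + "." + version;
        }
        return path.substring(0, dot) + "." + version + path.substring(dot);
    }

    // Builds a Cache-Control value telling browsers and CDNs how long
    // they may reuse the response without revalidating.
    static String cacheControl(Duration maxAge) {
        return "public, max-age=" + maxAge.getSeconds();
    }
}
```

For example, `versionedUrl("/css/site.css", "v42")` yields `/css/site.v42.css`, and `cacheControl(Duration.ofDays(365))` yields `public, max-age=31536000`.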
  39. 39. Reduce DNS lookups <ul><li>Yahoo recommends 3 – 4 DNS lookups per page </li></ul><ul><li>Base page: www.bizrate.com </li></ul><ul><li>JavaScript & CSS: file01.bizrate-images.com </li></ul><ul><li>Static images: img01.bizrate-images.com </li></ul><ul><li>Dynamic images: image01.bizrate-images.com </li></ul><ul><li>3rd-party ads are a different story </li></ul>
  40. 40. Avoid Redirects <ul><li>Redirects delay your ability to serve content </li></ul><ul><li>We strive for zero redirects </li></ul><ul><li>Exceptions: </li></ul><ul><ul><li>Redirect after POST </li></ul></ul><ul><ul><li>Handling legacy-styled URLs </li></ul></ul><ul><ul><li>Links off-site for tracking purposes </li></ul></ul>
  41. 41. Use a Cookie-free domain <ul><li>Don’t send cookies when requesting static resources </li></ul><ul><li>Buy a separate domain name </li></ul><ul><ul><li>bizrate-images.com </li></ul></ul><ul><li>Saves many KB of upload bandwidth </li></ul><ul><li>Revenue increased by 0.8% </li></ul>
  42. 42. Do Not Scale Images in HTML <ul><li>Don’t request larger images only to shrink them </li></ul><ul><li>We utilize a dynamic image scaling server </li></ul><ul><li>CDN caches and delivers exact image size </li></ul>
  43. 43. Make favicon.ico small and cacheable <ul><li>Can interfere with download sequence </li></ul><ul><li>2 KB+ multi-layered version: </li></ul><ul><li>318-byte version: </li></ul><ul><li>We save 100s of megabytes per day </li></ul>
  44. 44. Flush the Buffer Early <ul><li>Delivers content to users sooner </li></ul><ul><li>By default Tomcat flushes every 8 KB of uncompressed content </li></ul><ul><li>Investigate more proactive flushing </li></ul>
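The effect of an 8 KB buffer on a small page head can be shown with plain `java.io`, without a servlet container. This is an analogy only: it does not use Tomcat or the Servlet API, and `EarlyFlushDemo` and its `HEAD` constant are hypothetical. A `BufferedOutputStream` plays the role of the response buffer and a `ByteArrayOutputStream` plays the network.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class EarlyFlushDemo {
    static final String HEAD = "<html><head><title>Digital Cameras</title></head>";

    // Returns how many bytes have reached the "network" right after the
    // head is written, with or without an explicit flush.
    static int bytesVisibleAfterHead(boolean flushAfterHead) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = new BufferedOutputStream(sink, 8192)) {
            out.write(HEAD.getBytes(StandardCharsets.US_ASCII));
            if (flushAfterHead) {
                out.flush(); // push the head out before the body is rendered
            }
            return sink.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Without the flush, nothing leaves the buffer until 8 KB accumulates or the stream closes, so the browser cannot start fetching the CSS and JavaScript referenced in the head while the server is still building the body.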
  45. 45. Web Performance Measurement <ul><li>Continuous monitoring of full page load performance </li></ul>
  46. 46. Can you spot our release dates? shopzilla.com 10/13/08 bizrate.com 10/17/08
  47. 47. Yeah, yeah… but did we make money?
  48. 48. Site conversion rates increased 7-12% Revenue = sessions x conversion % x CPC
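Because the revenue formula is a plain product, a relative lift in conversion rate carries through one-for-one to revenue when sessions and CPC are held fixed. A worked sketch, with symbols chosen here for illustration ($S$ for sessions, $c$ for conversion rate, $p$ for CPC, $\delta$ for the relative lift in conversion):

```latex
R = S \cdot c \cdot p
\qquad\Longrightarrow\qquad
\frac{R'}{R} = \frac{S \cdot (1+\delta)\,c \cdot p}{S \cdot c \cdot p} = 1 + \delta ,
\qquad \delta \in [0.07,\ 0.12]
```

Any simultaneous change in sessions multiplies into this ratio as well, so the lift in conversion alone is a lower bound on the revenue effect only if sessions and CPC do not fall.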
  49. 49. Performance Impacts Abandonment
  50. 50. Performance Penalties & bizrate.co.uk ~120% SEM Sessions
  51. 51. Performance Summary <ul><li>Conversion Rate: +7% to +12% </li></ul><ul><li>Page Views: +25% </li></ul><ul><li>US SEM Sessions: +8% </li></ul><ul><li>Bizrate.co.uk SEM Sessions: +120% </li></ul><ul><li>Infrastructure Required (US): -50% (200 vs 402 nodes) </li></ul><ul><li>Availability: 99.71% → 99.94% </li></ul><ul><li>Product Velocity: +225% </li></ul><ul><li>Release Cost: $1,000s → $80 </li></ul>
  52. 52. Is Performance Worth The Expense? YES!
  53. 53. Simplicity, quality, performance are design decisions
  54. 54. Questions?
  55. 55. Thank You! Blog: tech.shopzilla.com Email: [email_address] Jobs: jobs.shopzilla.com
  56. 56. References <ul><li>http://developer.yahoo.com/performance/ </li></ul><ul><li>http://www.oracle.com/technology/products/coherence/index.html </li></ul><ul><li>http://www.evidentsoftware.com/products/clearstone.aspx </li></ul><ul><li>http://www.yourkit.com/ </li></ul><ul><li>http://spriteme.org/ </li></ul><ul><li>http://www.keynote.com/ </li></ul><ul><li>http://jakarta.apache.org/jmeter/ </li></ul><ul><li>http://cxf.apache.org/ </li></ul><ul><li>http://graphite.wikidot.com/ </li></ul>