Performance By Design
A look at Shopzilla's path to high performance
Tim Morrow, Senior Architect, 11/3/2009
  • Today I’d like to share with you Shopzilla’s redesign of our consumer site and content delivery infrastructure.
  • I’ll talk about our company and why we started this project in the first place. I’ll tell you about what we built, which is one part software and two parts obsession with performance and measurement. Then a deep dive into the site architecture, how we’re evolving our architecture to maintain performance while scaling our data, and front-end performance techniques. Finally, we’ll look at our performance gains and our data, which show a direct and quantifiable correlation between speed and money.
  • Shopzilla is one of the largest and most comprehensive online shopping networks on the web, through our leading comparison shopping sites Bizrate.com and Shopzilla.com and the Shopzilla Publisher Program. We help shoppers find the best value for virtually anything, from thousands of retailers. Across our network we serve more than 100M impressions per day to 20-30M unique visitors, searching as many as 8,000 times per second across more than 105M products.
  • Our original comparison shopping marketplace launched on Bizrate.com in the summer of 2000. At that time, our business was relatively simple: Bizrate.com as comparison shopping in the US. Over the next several years our business evolved into something significantly more complex. First we added a number of co-brands and syndication partners through our API; then we introduced a new product of our own into the market, Shopzilla.com. Finally, just when all that worked, we took a software ecosystem designed to run in a single location and deployed it active-active to several datacenters across the US and Europe.
  • It’s a testament to a great engineering team that we were able to incrementally evolve our architecture to support our growth. We were always in a rush to market, and often took the quickest possible route as we built in all that change. A site codebase designed for a single brand in the US became a super-complex brand-management and content delivery system, never through a remodel, always through addition. We grew our user base 20% year over year, and our product inventory doubled or more every year. We had to develop proprietary systems to solve scalability problems, and we had to utilize exotic hardware to scale vertically.
  • The previous incarnation of our site was implemented in Perl running under mod_perl on Apache 1.3, largely a two-tier architecture. The site communicated with the search engine, metadata storage to enrich search results, and a database for other content and metadata. All DB access utilized stored procedures to simplify queries, and almost all reference data was cached during startup.
  • Many pages have a lot of content, requiring many calls to the database and metadata stores to pull in that content; the latency of a single request was poor. With no progressive rendering, the time to first byte / initial page display was very long. The process model and large memory consumption severely limited the number of requests served by a single server instance. We didn’t have great runtime instrumentation of request flow, so it was very difficult to understand what was going on beyond looking at SQL queries and log statements. Front-end pages also had a lot of content, requiring many requests to external resources to render a page.
  • We were serving 9 or more web sites, with different look and feel, localization, and business logic, from a single codebase. Making changes to one site carried a great risk of affecting other sites, there was a high rate of change on that single codebase, and teams couldn’t easily take their sites in different directions.
  • In 2007 we decided to rebuild our site. The first decision was: do we refactor our current software, or do we start over? We decided to start over. We had a fundamentally different business, so we needed fundamentally different software. Our design principles were pretty basic: simple is the new “clever”; performance and quality are design decisions; and you get what you measure. Shopzilla is a “scrum shop”. This thing was too big NOT to have continuous feedback, mainly because the site had grown over 7 years and nobody really even knew everything it did. We decided that our scrum sprints would be 2 weeks. Most importantly, we decided that we had to have continuous feedback from our users. This turned out to be a hugely important decision. First, it gave us a huge tool to manage risk: since we decided to maintain the compatibility of the URL structure, we used a proxy by A10 Networks to serve up our new site infrastructure, one page at a time. Second, it allowed us to keep up a constant drumbeat of progress for the company. Momentum was key for the company, and actual, live, production launches were key for the team. As a result we launched our first page for our first site in December of ’07. Of course, this wasn’t just a page; it was the first version of the site framework as well. Over the first two quarters of 2008 we gradually released more pages and increased the percentage of traffic we exposed to the new site, until the full launch of Shopzilla on July 1st. Since the sites were supposed to be functionally identical, we were able to monitor all our same business metrics as key indicators of any issue and course-correct along the way, KNOWING we could bail back to the old site with a simple configuration change.
  • With the release of Shopzilla in July we started the development of Bizrate. With the Bizrate release you’ll notice we had far fewer public releases; we were confident in our site framework, and our risk strategy shifted from proving the approach to getting Bizrate live by our holiday shopping peak. Finally, in mid-November, we shifted 100% of our US site traffic to our s2 platform. We’ll refer to this timeline again as we look at our performance gains.
  • I’ll dive into detail about all these topics: simplify the web application layer; decompose the site into functionally separate, individually testable, loosely coupled services; define performance SLAs; load test before every release (failure to meet an SLA is a defect); instrument and measure production code; cache where appropriate; and apply best-practice UI performance techniques.
  • Loosely coupled web services. Each layer is independently developed and tested, and independently scalable with redundancy built in. Hardware load balancers are used for every cluster.
  • The web application is Java 1.6 on Tomcat 6, with Spring MVC and a custom TAL templating engine. Services are JAX-RS, utilizing the Apache CXF framework. Database access is via Hibernate with Ehcache L2 caching, against an Oracle 10g database. We’re incorporating the Oracle Coherence data grid for distributed caching.
  • We picked 1.5 seconds full page load as an aggressive number based on the size and weight of our pages. With streaming HTTP responses, we figured an approximate 650ms server-side response time would still allow a 1.5s full page load.
  • The web application tier is a mashup of data from numerous sources. All network communication is via HTTP; there is no direct database access.
  • We utilized the Java Concurrency API to implement an asynchronous, concurrent service invocation framework. Independent services are invoked in parallel, and dependent service invocations may be chained. Future results are only used during rendering of the template, so there is no blocking until the results are actually required for rendering.
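A minimal sketch of that pattern using only java.util.concurrent (the service names, return values, and page-assembly code here are illustrative, not our actual framework):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ConcurrentInvocation {
    // Hypothetical stand-in for a real downstream service call.
    static String callService(String name) {
        return "data-from-" + name;
    }

    public static String renderPage() {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            // Independent services are invoked in parallel, up front.
            Future<String> search  = pool.submit(() -> callService("search"));
            Future<String> reviews = pool.submit(() -> callService("reviews"));
            // No blocking occurs until the template actually needs each result.
            return "<page>" + search.get() + "|" + reviews.get() + "</page>";
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(renderPage());
    }
}
```

Dependent calls can be chained by submitting a second task that blocks on the first Future; the key point is that the rendering thread never waits on a result before it needs it.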
  • Some pages may request data from up to 30 sources; this helps reduce the latency of a single request. Streamed HTTP responses ensure HTML is returned to clients as it becomes available.
  • Each service defines a coarse-grained API to return enriched data to the site; the service may consult numerous data sources to produce its results. Each service is implemented generically to work for all sites and countries. We defined a versioning strategy to permit incremental updates even with backwards-incompatible schema changes, which greatly increases parallel development. Each service defines its own SLA, dictated by all the site clients that require access to the service, and each service is tested and measured against its SLA.
  • Each service defines an XSD and produces a Maven jar artifact that provides a Java client API. We use Apache HttpClient with multi-threaded connection pools and stale connection checking enabled. We set connection and socket timeouts in many cases to ensure that even when a service chokes, the site may degrade gracefully by omitting the failed content and rendering the rest of the page. Service endpoints must support multiple inputs returning a list of results, so the number of service invocations is a constant instead of proportional to the number of data elements on the page. We use JAXB to unmarshal XML data to Java objects.
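The graceful-degradation policy can be sketched with the standard library alone (the real implementation sets timeouts on Apache HttpClient; fetchModule and the time budgets here are hypothetical): a call that blows its budget yields empty content instead of failing the whole page.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class GracefulDegradation {
    /** Run a service call with a hard time budget; on timeout or error,
     *  return empty so the page can render without that module. */
    public static Optional<String> fetchModule(Callable<String> call, long budgetMillis) {
        ExecutorService ex = Executors.newSingleThreadExecutor();
        try {
            return Optional.of(ex.submit(call).get(budgetMillis, TimeUnit.MILLISECONDS));
        } catch (Exception e) {
            return Optional.empty(); // omit the failed content, keep the rest of the page
        } finally {
            ex.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // A fast call succeeds; a slow call degrades to a placeholder.
        System.out.println(fetchModule(() -> "reviews-html", 200).orElse("<!-- omitted -->"));
        System.out.println(fetchModule(() -> { Thread.sleep(5000); return "slow"; }, 50)
                .orElse("<!-- omitted -->"));
    }
}
```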
  • So we’ve built an architecture that we believe to be performant and scalable. How do you go about testing this? There are a lot of moving parts: highly concurrent requests, dozens of services, resource accesses. Our strategy: each service is performance tested in isolation against its SLA, then the full stack is performance tested.
  • We have a pre-production environment that mimics production on a smaller scale. We replay URLs from our logs to generate sample load, and we utilize JMeter to generate load against the site. Scripts emit graphs showing 95th percentile response times and throughput at varying levels of concurrency. We designed to meet our seasonal peak load and then 50% more traffic on top of that, a significant increase for a mature web site. Each data center is built to carry 100% of traffic, but in practice it is split evenly across all data centers.
  • So what do you do if you’re not meeting performance and you’ve isolated the problem to a particular product? Emit as much information as you can to identify timings when crossing layer boundaries. Assign each request a unique id and write it to the access log. Write timing information to a performance log and correlate it to the unique id. Pass the unique id through HTTP headers to all downstream services to allow correlation across layers.
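A simplified sketch of the correlation scheme (the log format loosely follows the performance-log lines on the Analyzing Performance slide; the method name, field layout, and URL are illustrative):

```java
import java.util.UUID;

public class RequestCorrelation {
    // One line per downstream call, keyed by the request-scoped unique id,
    // so calls can be correlated across layers after the fact.
    public static String perfLogLine(String requestId, String url,
                                     long startMillis, long elapsedMillis) {
        return "start=" + startMillis + ";elapsed=" + elapsedMillis
             + ";requestId=" + requestId + ";url=" + url;
    }

    public static void main(String[] args) {
        String requestId = UUID.randomUUID().toString(); // assigned once, at the edge
        long start = System.currentTimeMillis();
        // ... invoke the downstream service here, forwarding requestId
        //     in an HTTP header so its logs carry the same id ...
        long elapsed = System.currentTimeMillis() - start;
        System.out.println(perfLogLine(requestId,
                "http://service.host/search/v5/product_search", start, elapsed));
    }
}
```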
  • We utilize YourKit Java Profiler to identify poorly performing requests and concurrency bottlenecks. Some lessons learned: turn console logging off and log to files instead; reduce logging output (ERROR level for us); use StAX instead of Xerces for XML parsing, since the latter has numerous synchronization blocks that cause it to choke under high concurrency; retest service layers to ensure they continue to meet their SLA; and ensure that sequential service dependencies do not cause high-latency requests, sometimes re-implementing endpoints to provide richer content.
  • We utilize JMX MBeans to emit performance information. Every service call is annotated to emit a moving average response time and a moving 95th percentile response time, and we graph service calls over time using Graphite. A small percentage of requests in production emit logging information describing service calls, so we can produce waterfall graphs of server-side service invocations.
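One way such an MBean can compute a moving average cheaply is an exponentially weighted average; this sketch is illustrative only, and the smoothing factor is not our actual configuration:

```java
public class MovingAverage {
    // Exponentially weighted moving average, one instance per service call.
    private final double alpha;   // smoothing factor in (0,1]; illustrative choice
    private double value = 0.0;
    private boolean seeded = false;

    public MovingAverage(double alpha) {
        this.alpha = alpha;
    }

    /** Fold one observed response time into the running average. */
    public synchronized void record(double responseMillis) {
        value = seeded ? alpha * responseMillis + (1 - alpha) * value : responseMillis;
        seeded = true;
    }

    /** Current average; in practice exposed as a read-only MBean attribute. */
    public synchronized double get() {
        return value;
    }

    public static void main(String[] args) {
        MovingAverage avg = new MovingAverage(0.5);
        avg.record(100);
        avg.record(200);
        System.out.println(avg.get()); // 0.5*200 + 0.5*100 = 150.0
    }
}
```

A moving 95th percentile needs a bounded window of samples rather than a single accumulator, but it is exposed through JMX the same way.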
  • We began by leveraging Ehcache as an L2 cache with Hibernate. Advantage: easy to get started. Disadvantages: each service instance caches the exact same data, so it does not scale well to large data sets; and query cache keys are essentially based on query inputs, which is not useful if there is a sparse mapping from inputs to data. We have some extremely large data sets that we wanted to cache, so we evaluated and subsequently implemented Oracle Coherence on a number of projects.
  • Oracle Coherence is an in-memory distributed data grid solution for clustered applications and application servers. It automatically and dynamically partitions data in memory across multiple servers, provides continuous data availability even in the event of a server failure, and can perform in-memory grid computations and parallel transaction and event processing.
  • We wanted to build an automated system that would allow us to create optimal URLs on our web pages to improve search engine optimization; any link rendered on our site might be subject to a rule. We needed a backoffice system to crunch large volumes of data to produce the rule set, and a solution that would allow us to publish an evolving data set of rules, made available to dozens of clients querying the rule set at a high rate.
  • We worked with Oracle consultants to develop a data model and an architecture for both parts of the solution. We built the backoffice system utilizing a small grid to compute the rule set, configured with write-behind to persist the data to a database. We publish new rules to the site grid as they become available. Dozens of stateless client processes access the distributed site grid simultaneously; we access the cache thousands, or perhaps tens of thousands, of times per second. We thought near-caching might be necessary; it turns out remote access is just fine.
  • The ability to specifically and directly manage our URLs has created significant opportunity for our business. It is one of our most successful SEO projects to date.
  • We maintain a repository of keywords that we bid on with the major search engines. It’s a map whose keys are simply unique IDs and whose values are rich objects describing the experience we’d like to direct the user to. Every paid ad clicked on requires us to consult the map, and the data has grown to more than 600 million entries. We started with a single in-memory cache on a single server; when we outgrew that, we devised our own partitioning scheme and fronted it with a (Java) service to provide seamless access to the distributed data set.
  • Our data was copied to our remote datacenters by scheduled jobs. It took a long time to publish new data sets and was prone to failure, so we ultimately reduced the number of times we published data to once per week. Making the data live required restarts of multiple cache processes, and the simple act of restarting something can often end in tears.
  • Coherence allowed us to scale our data beyond a single physical server using a distributed cache that automatically partitions our data. We implemented read-through to transparently cache new data, and configured the eviction policy to keep enough data to satisfy all the unique requests over a 90-day period. We have no batch processes to ship gigabytes of data, no delay in publishing new data, and the cache is always on.
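The read-through plus bounded-eviction behavior can be illustrated in a single JVM with a LinkedHashMap sketch (Coherence provides the distributed, partitioned equivalent; the class and method names here are illustrative, and we evict by recency here where the description above uses a time-based policy):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

public class ReadThroughLruCache<K, V> {
    private final Map<K, V> map;
    private final Function<K, V> loader; // stands in for the backing store

    public ReadThroughLruCache(int maxEntries, Function<K, V> loader) {
        this.loader = loader;
        // Access-ordered map: the least-recently-used entry is evicted first
        // once the cache grows past its bound.
        this.map = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Read-through: a miss transparently loads the value from the store. */
    public synchronized V get(K key) {
        return map.computeIfAbsent(key, loader);
    }

    public synchronized int size() {
        return map.size();
    }

    public static void main(String[] args) {
        ReadThroughLruCache<Integer, String> cache =
                new ReadThroughLruCache<>(2, k -> "value-" + k);
        cache.get(1);
        cache.get(2);
        cache.get(3);                  // third distinct key evicts the LRU entry
        System.out.println(cache.size()); // stays bounded at 2
    }
}
```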
  • Faster turnaround time for new paid placements. No restarts lowers the risk of errors resulting in bad user experiences, and we see fewer alarms. Performance is consistent, even when data is changing, and we have much less software to maintain: no batch jobs, no partitioning logic.
  • We operate one site grid per data center; the site grid has multiple functionally separate caches. We’re currently upgrading it to double its capacity: 6 physical instances (twin-socket, dual-core, low-power CPUs with 32 Gb RAM each), currently based on Coherence 3.4 and upgrading to Coherence 3.5, running 16 JVM nodes with a 1.5 Gb heap each (up from 8 JVMs on our previous grid). There are 40+ client nodes connecting to each site grid.
  • We develop against isolated instances of the grid, and performance test against a pre-production grid configured identically to production, but on a smaller scale. We’ve tested scenarios like killing an entire server and watching Coherence repartition backup data dynamically, to ensure no performance degradation.
  • Steve Souders indicated that only 10% of the time is spent on the server side; Netflix has documented 20% at most. It takes a lot of effort to get to that ratio, but front-end performance enhancements can give great improvements. I’m going to talk about some of the YSlow techniques that we’ve implemented.
  • We’ve only recently begun to tackle this in earnest. We took all our static images and sprited them to minimize the number of requests; if you need help with that, visit http://spriteme.org/. The next step will be to reduce CSS and JavaScript resources; there are tradeoffs between deferred loading and progressive enhancement.
  • In “It’s the Latency, Stupid” (1996), Stuart Cheshire noted that since the speed of light in fibre is 66% of the speed of light in a vacuum, the round-trip time from California to Boston can never be less than 43ms, and that doesn’t count equipment latency and packet loss. Move your content closer to your end users: every resource except dynamic HTML pages is served from a CDN, and we offload 100s of Gb of data transfer to our CDN.
  • Our origin server sets expiry headers; we’ve recently been working to increase them to far-futures. Resource requests include a unique id that allows us to effectively expire content during releases, to avoid browsers caching stale CSS or JS. Everything is compressed; our hardware load balancers gzip non-CDN content for us. CSS and JavaScript are minified.
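The versioning technique can be sketched as follows (the hostname style matches the DNS slide, but the release id format and helper are hypothetical): stamping a release identifier into every static-resource URL lets far-future expiry coexist with forced upgrades, because each release produces fresh URLs.

```java
public class VersionedUrls {
    /** Build a static-resource URL stamped with the current release id, so
     *  far-future-cached copies are naturally bypassed after a release. */
    public static String versionedUrl(String host, String releaseId, String path) {
        return "http://" + host + "/" + releaseId + path;
    }

    public static void main(String[] args) {
        // Hypothetical release id; the template engine would substitute the real one.
        System.out.println(versionedUrl("file01.bizrate-images.com", "r2184", "/css/site.css"));
    }
}
```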
  • Yahoo recommends 2-4 hostname lookups per page. We use 3 different hostnames for JS+CSS, static images, and dynamic images, plus 1 for the base page. Unfortunately, 3rd-party advertisements add in a bunch more.
  • We instituted a rule that requires every rendered link to result in zero redirects on navigation or resource downloads. 3 rd party advertisements do their own thing
  • A simple HTTP request might be 500 bytes, and our site utilizes a cookie to store around 1.5Kb of session data. On a page with 60 external resources, we’d be sending 90Kb of data upstream for no reason. Removing it improved our top-line revenue by 0.8%.
  • Our CDN origin server allows server-side dynamic resizing of images. We ensure our <img> elements always specify the exact width and height of the requested image.
  • We accidentally discovered we had a multi-layered ICO. We removed the layers, and it dropped from over 2Kb to 318 bytes.
  • Flushing the buffer early allows HTML to be shipped to the end user sooner. We haven't experimented much with this, but the default Tomcat configuration maintains an 8Kb (uncompressed) buffer, so we do get auto-flushing at 8Kb boundaries. We're looking into flushing more explicitly, such as post-header and after each major section on the page.
  • We’re currently implementing Keynote to continuously monitor our performance from outside the firewall. On a periodic basis it visits a variety of URLs on our site and measures full page load time from different geographic areas, on both backbone (T1) connections and real connections such as DSL and cable. It provides detailed waterfall graphs, screenshots in the event of errors, and alarms.
  • So, what was our actual user-experience page load time before and after? (Can you spot the releases?) We use Webmetrics as the external monitor. Performance times do not measure banner load time; SLAs for page load don’t include banners since, via iframes, our content is there for the user even while the banners are still loading.
  • Did we make any money?
  • Site conversion increased 7-12% (conversion slide).
  • Actually getting people to our site is obviously another big aspect of our financial performance (SEM session slide). We can look at our Google SEM sessions as a way to visualize the relationship between performance and abandonment.
  • With our launch of s2 on bizrate.co.uk we also learned the significance of performance in the SEM relevance algorithm (Bizrate UK slide). We launched on 5/18, and by 5/29 Google had figured out that our site was fast again. This isn’t organic improvement, obviously; we believe we were in a penalty box.
  • In addition to conversion rates and sessions, we saw a number of other benefits (summary slide). Page views are up by about 25%, and there were a number of other behind-the-scenes improvements: 50% less infrastructure, a significantly more available site, and, while keeping it up, the ability to change the product at more than twice our previous pace.
  • Is performance worth it? - YES
  • Simplicity, quality, performance are design decisions We Get What We Measure
  • Transcript of "Shopzilla - Performance By Design"
1. Performance By Design. A look at Shopzilla's path to high performance. Tim Morrow, Senior Architect, 11/3/2009

2. Agenda
- Our Company and what we do
- Design approach
- Detailed site architecture
- Architecture evolution
- Front end performance techniques

3. Shopzilla, Inc. - Online Shopping Network
- 100M impressions/day
- 20-29M UVs per month
- 8,000+ searches per second
- 100M+ products

4. Background
- In 2000, the business was simple: bizrate.com in the US
- Co-brands, API and shopzilla.com
- Multivariate testing, Europe and multiple data centers

5. Growth

6. Original Architecture

7. Performance and Scalability Issues
- High latency due to sequential resource access
- Data fetching sometimes O(n)
- Lengthy time to first byte
- Memory constrained
- Lacked visibility into performance issues

8. Development Issues
- Serving 9+ web site experiences
- Different look-and-feel; localization
- New features carry high risk
- Shared code-base limited team autonomy

9. Our Build Approach
- Start over, stay simple
- Build 2 weeks at a time, deliver every page ASAP
- Manage risk by managing exposure

10. New Design Principles
- Simplify layers
- Decompose architecture
- Define SLAs
- Continuous performance testing
- Utilize caching
- Apply best-practice UI performance techniques

11. New Site Architecture

12. Site Technologies
- Java 1.6, Tomcat 6
- Spring MVC
- TAL-like templating engine
- Services are JAX-RS implemented with Apache CXF
- Hibernate 3 with Ehcache
- Oracle 10g Database
- Oracle Coherence Data Grid
13. Performance SLAs

14. Web Application Tier Service Calls

15. Web Application Tier

16. www.bizrate.com/digital-cameras

17. Decomposed Services
- http://…/service/v3/reviews?pid=419943686&show=30&sort=newestFirst
- Example response:
  <ProdRevResponse>
    <ProductReviews>
      <ProductReview>
        <UserRating>8.0</UserRating>
        <Title>advanced and begining photographer alike will benefit</Title>
        ...
      </ProductReview>
    </ProductReviews>
  </ProdRevResponse>

18. Service Invocation
- Connection pooling
- Stale connection checking
- Hardware load balancers
- Connection and socket timeouts
- O(1) invocations on a single page
- JAXB XML-to-Java unmarshaling

19. How Do We Performance Test?
- Highly concurrent requests
- Dozens of services
- Database access
- Search engine invocations
- Meta-data lookups

20. Performance Testing Environment

21. Analyzing Performance
- Emit server-side performance information:
  10.61.35.25 198.133.178.17 - - [15/Jun/2009:18:48:13 -0700] "GET /digital-cameras/ HTTP/1.1" 200 195063 - - www.bizrate.com unique_id=c52efc5b-740b-44d4-8693-587b6b756564!rst=22
  start=1245116893247; elapsed=19; requestId=c52efc5b-740b-44d4-8693-587b6b756564; startDate=2009-06-15 18:48:13.247-0700; url=http:///services/content/v7/topsearchService/BR/US/402/0/21;
  start=1245116893248; elapsed=24; requestId=c52efc5b-740b-44d4-8693-587b6b756564; startDate=2009-06-15 18:48:13.248-0700; url=http:///search/v5/US/12/product_search?keyword=digital+cameras
- Build server-side call graphs
22. Profiling
- YourKit Java Profiler
- Identify synchronization bottlenecks

23. Production Performance Monitoring
- JMX MBeans
- Graphite graphing

24. Caching
- Started with a simple replicated local cache
- Cached data stored in the service process
- Cannot scale to large data sets
- Read-through caching not always suitable
- Moved to Oracle Coherence

25. Oracle Coherence
- Distributed data grid
- Dynamic cluster membership
- Automatic data partitioning
- Continuous availability
- Computation in the grid

26. Case Study #1 - URL Builder
- Rule set describing optimal URL structures
- High-frequency access from site
- Backoffice system to compute rules

27. Coherence Solution

28. Results
- Continue to meet performance SLAs
- In production for over 6 months
- Very successful project

29. Case Study #2 - Keyword Metadata
- Map of long IDs to object model describing landing page
- Over 600 million entries
- Entry point for the majority of our traffic
- Home-grown partitioned, distributed cache

30. Challenges
- Time consuming and intricate
- Restarts required

31. Solution
- Automatic read-through
- Automatic expiration
- Scalable

32. Results
- Faster acquisition of new paid placements
- No restarts
- Less software to maintain
- Great performance

33. Cache Architecture
- 6 physical instances
- 8-way, 32 Gb RAM
- 16 JVMs with 1.5 Gb heap
- Distributed cache
- Database read-through
- LRU expiry based on object count

34. Performance Testing Coherence
- Performance test application against the grid
- Test scenarios such as server loss and data population
35. UI Performance Techniques
- 80% of time spent rendering the page
- Yahoo currently lists 34 best practices

36. Minimize HTTP Requests
- Combined files
- CSS sprites: http://spriteme.org/

37. Use a CDN
- Move your content closer to end users
- Reduce latency
- Every resource except for dynamic HTML
- Offloads 100s of gigabytes per day

38. Expiry, Compression and Minification
- Expiry headers instruct the browser to use a cached copy
- > 2 days is considered "far futures"
- Use versioning techniques to allow forced upgrades
- Compressing reduces page weight
- Minifying may still reduce size by 5% even with compression

39. Reduce DNS Lookups
- Yahoo recommends 3-4 DNS lookups per page
- Base page: www.bizrate.com
- JavaScript & CSS: file01.bizrate-images.com
- Static images: img01.bizrate-images.com
- Dynamic images: image01.bizrate-images.com
- 3rd-party ads are a different story

40. Avoid Redirects
- Redirects delay your ability to serve content
- We strive for zero redirects
- Exceptions:
  - Redirect after POST
  - Handling legacy-styled URLs
  - Links off-site for tracking purposes

41. Use a Cookie-Free Domain
- Don't send cookies when requesting static resources
- Buy a separate domain name: bizrate-images.com
- Saves many Kb of upload bandwidth
- Revenue increased by 0.8%

42. Do Not Scale Images in HTML
- Don't request larger images only to shrink them
- We utilize a dynamic image scaling server
- CDN caches and delivers the exact image size

43. Make favicon.ico Small and Cacheable
- Can interfere with download sequence
- 2Kb+ multi-layered version:
- 318-byte version:
- We save 100s of megabytes per day

44. Flush the Buffer Early
- Delivers content to users sooner
- By default Tomcat flushes every 8Kb of uncompressed content
- Investigate more proactive flushing

45. Web Performance Measurement
- Continuous monitoring of full page load performance
46. Can you spot our release dates? shopzilla.com 10/13/08, bizrate.com 10/17/08

47. Yeah, yeah… but did we make money?

48. Site conversion rates increased 7-12%. Revenue = sessions x conversion % x CPC

49. Performance Impacts Abandonment

50. Performance Penalties & bizrate.co.uk ~120% SEM Sessions

51. Performance Summary
- Conversion rate: +7-12%
- Page views: +25%
- US SEM sessions: +8%
- Bizrate.co.uk SEM sessions: +120%
- Infrastructure required (US): -50% (200 vs 402 nodes)
- Availability: 99.71% → 99.94%
- Product velocity: +225%
- Release cost: $1,000s → $80

52. Is Performance Worth The Expense? YES!

53. Simplicity, quality, performance are design decisions

54. Questions?

55. Thank You! Blog: tech.shopzilla.com Email: [email_address] Jobs: jobs.shopzilla.com

56. References
- http://developer.yahoo.com/performance/
- http://www.oracle.com/technology/products/coherence/index.html
- http://www.evidentsoftware.com/products/clearstone.aspx
- http://www.yourkit.com/
- http://spriteme.org/
- http://www.keynote.com/
- http://jakarta.apache.org/jmeter/
- http://cxf.apache.org/
- http://graphite.wikidot.com/