ISM 6910 – Week 8: Testing & Optimization
Transcript

  • 1. Testing & Optimization (ISM 6910 – Week 8)
  • 2. Week 8 Topics
    • Testing
    • End Action
    • Attribution
    • Media Mix Modeling
  • 3. Testing
  • 4. Testing
    A/B Tests: two versions of a page or ad are compared head to head (A vs. B).
    Multivariate Tests: several elements are varied in combination and tested together.
    [Diagram: a grid of test cells combining message (Its Time, Expedia Hero, Numbers, Offer Only), call to action (Go, Search Now, Book Now), and offer (Up to 50% off; Book Together and Save; Start Saving). Shaded cells = test cells executed during the test; open cells = test cells evaluated but not executed.]
  • 5. A/B Test Example
    Videos continue to score higher in NSAT and PSAT, but underperform in conversions.
    NSAT Results – FYQ2 2011 (with video, results were statistically significantly higher):

    |                    | Unique Visitors | NSAT | PSAT | FPP Upgrade Conv. Rate | Avg. Revenue (All SKUs) |
    | Compare w/o video  | 186,330         | 108  | 134  | 0.35%                  | $1.13                   |
    | Compare with video | 187,185         | 124  | 136  | 0.30%                  | $0.95                   |
    | Lift               |                 | 16   | 2    | (0.05%)                | ($0.18)                 |

    *The PSAT lift only has a statistical significance of 91%; all others are +99%.
    FYQ2 results are in line with September's findings, which showed that adding a video to the compare page has had a positive impact on visitors' NSAT and PSAT scores.
    A possible downside to adding more videos is that they may serve as a distraction, causing visitors to miss the Buy Now button and lowering conversion rates.
    Numbers have been doctored to hide client-sensitive data.
  • 6. Multivariate Tests
    Full factorial – test every possible combination. For example, if you are testing three different elements, with four variations of one element and three variations of each of the other two, you are looking at 4 x 3 x 3 = 36 combinations.
    Partial factorial – partial factorial tests can be set up in a way that allows you to infer results for combinations you never run; the Taguchi method is probably the most commonly used approach.
    [Diagram: the same test-cell grid as slide 4, showing which cells were executed and which were evaluated but not executed.]
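The full-factorial arithmetic above can be sketched in a few lines of Python. The element names below are illustrative placeholders borrowed from the slide diagrams, not a real test plan:

```python
from itertools import product

# Hypothetical variations for three page elements, matching the
# slide's 4 x 3 x 3 example (names are illustrative only).
messages = ["Expedia Hero", "Its Time", "Offer Only", "Numbers"]      # 4
calls_to_action = ["Go", "Search Now", "Book Now"]                    # 3
offers = ["Up to 50% off", "Book Together and Save", "Start Saving"]  # 3

# Full factorial: every possible combination of the three elements.
cells = list(product(messages, calls_to_action, offers))
print(len(cells))  # 4 * 3 * 3 = 36 test cells
```

A partial-factorial design would run only a structured subset of these 36 cells and infer the rest.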
  • 7. Multivariate Example
    Shopping rate (per million cookies) for each variation of the four tested elements, reconstructed from the slide's charts (axis range 150–210):
    • Call to Action – Book Now: 165, Search Now: 191, Go: 180
    • Image – Tanning: 157, Beach Cruise: 200, Bay Bridge: 179
    • Message – Expedia as Hero: 176, Its Time: 159, Offer Only: 201, Numbers: 179
    • Offer – Book Together and Save: 174, 50% off Hotels: 183, Generic: 180
  • 8. Pros and Cons
    A/B Test
    • Pros: setup is relatively easy; analysis is easier; you don't need much of a stats background to interpret results.
    • Cons: it's easy to get sucked into testing too many things at once; A and B need to be different enough to get results; it's time consuming to test one element at a time.
    Multivariate Test
    • Pros: less political pushback, since everyone gets to test their idea; all of the analysis gets done in one shot.
    • Cons: easy to mess up; the tools are a black box, or it's do-it-yourself plus your best PhD stats buddy; you need a lot of volume or time.
  • 9. Testing Recommendations
    Start with high-impact tests:
    • Home/landing pages
    • Conversion points, i.e. sign-up forms, cart/purchase pages, etc.
    • Ad design
    • Price tests (a hard one politically to pull off)
    Other great things to test:
    • Landing pages/deep linking
    • Page heroes
  • 10. Testing Best Practices
    • Start with a hypothesis – don't just start testing random stuff like colors unless you have a good reason.
    • Set goals – e.g., looking to improve conversion rate by x%.
    • Decide what counts as significant – we're not testing drugs and no one's life is on the line, so 99.9% statistical significance is probably overkill; but what about, say, 60%?
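As a rough sketch of the "what is significant" question, a standard two-proportion z-test gives the confidence that two conversion rates actually differ. The visitor and conversion counts below are made-up illustrations, not numbers from the deck:

```python
import math

def ab_significance(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test: returns (z, two-sided confidence level)
    that the B conversion rate differs from the A conversion rate."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Confidence = 1 - two-sided p-value, via the standard normal CDF (erf).
    confidence = math.erf(abs(z) / math.sqrt(2))
    return z, confidence

# Illustrative: 120 vs. 151 conversions on 10,000 visitors per arm.
z, conf = ab_significance(120, 10_000, 151, 10_000)
print(f"z = {z:.2f}, confidence = {conf:.1%}")
```

With these toy numbers the result lands around 94% confidence: well short of drug-trial standards, but plenty for deciding which banner to ship.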
  • 11. More Testing Tips
    • Get help – setting up the test and digging through your old stats notes can be a challenge; don't be afraid to ask for help.
    • Make it fun/interesting – it takes a lot to pull off a good test: UX, the creative team, site dev, analysts, maybe more, plus someone's budget. Everyone has an opinion and/or theory, and you can use that to build momentum for a testing project.
    At Getty we held a company-wide contest to see who could pick the winner of a multivariate test. There were 300+ possible combinations, and everyone got to vote on which one they thought would win.
  • 12. End Action – Site Surveys
  • 13. How It Works
    Survey metrics captured:
    • NSAT
    • PSAT
    • Value Prop
    • Purch. Intent
  • 14. Attitudinal End Action Process
    EA captures both behavioral and attitudinal data and correlates shifts in attitude with end actions taken on site:
  • 15. What is it good for?
  • 16. Combining attitudinal & behavioral data
    The End Action scorecard was originally designed to value experiences based on shifts in attitudes. For Q4 2010, we added Microsoft Store purchase behavior as well:
  • 17. End Action conversion rates
    Using End Action cookie data, we can report a more accurate conversion rate.
    If we assume most site visits don't last longer than 30 minutes, we can conclude that less than half of Store buyers (43%) make a purchase during their first site visit. The remaining purchasers return later (sometimes days later) to complete their purchase. With End Action cookie data, site visitors who read a product review, leave the Shop page, and return later to finally make a purchase will still be counted when reporting on site visitors who read a product review and then made a purchase.
    Numbers have been doctored to hide client-sensitive data.
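The cookie-based counting described above can be sketched with simple set logic: match readers to buyers by cookie ID, regardless of which visit the purchase happened in. The event log below is a hypothetical illustration, not real End Action data:

```python
# Toy cookie-level event log: (cookie_id, action) records. Purchases
# may happen on a later visit than the "read review" end action.
events = [
    ("c1", "read_review"), ("c1", "purchase"),   # same-visit buyer
    ("c2", "read_review"),                        # abandoned
    ("c3", "read_review"), ("c3", "purchase"),   # returned days later
    ("c4", "purchase"),                           # bought without reading
]

readers = {c for c, a in events if a == "read_review"}
buyers = {c for c, a in events if a == "purchase"}

# Cookie-based conversion rate: buyers among readers, whether or not
# the purchase happened during the same visit.
rate = len(readers & buyers) / len(readers)
print(f"{rate:.0%}")  # 2 of 3 readers purchased
```

A session-based metric would only credit c1 here; the cookie-based metric also captures c3's delayed purchase.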
  • 18. Challenges
  • 19. Measurement overload
    Numbers have been doctored to hide client-sensitive data.
  • 20. EA measures correlations, not causation
    Example: people who watch 7-second demos have a 10% higher Win.com NSAT than people who don't watch demos.
    However, EA quantifies correlation, not causation:
    • We cannot immediately say that watching videos makes people 10% more satisfied.
    • That claim requires additional information, such as specific testing, observation, and insight.
  • 21. Survey timing can create respondent biases
    Site visitors are invited to take the EA survey as soon as they leave the Windows domain. So, as site visitors move further down the funnel, survey respondents start to look more like visitors who are abandoning their carts than like purchasers. This can be seen in the illustration below.
    In this example, Visitor A takes the survey and will be included in the NSAT results for the Visit Shop EA, but does not purchase, while Visitor B completes the purchase process but, by doing so, never receives a survey invite.
  • 22. Key Insights
  • 23. Compare Page Video
    Videos continue to score higher in NSAT and PSAT, but underperform in conversions. This slide repeats the FYQ2 2011 results table from slide 5: with video, the NSAT and PSAT lifts were statistically significant (PSAT at 91%, all others at +99%), while conversion rate and average revenue declined.
    Numbers have been doctored to hide client-sensitive data.
  • 24. Attitudes Influence Buying Behavior
    As site visitors move deeper into the site and further down the purchase funnel, we start to see an increase in both site satisfaction (NSAT) and the Windows 7 Upgrade conversion rate. Based on the EA survey data, we know we have some levers for improving site satisfaction: using video or interactive experiences, providing value-added downloads, etc. From this data, we can see that by first improving NSAT, we can push more people into a transactional mode on the site.
    (1) NSAT % ∆ from the FYQ4 EAA Scorecard
    (2) FPP Upgrade # ∆ from the FYQ4 Sales Trans Scorecard
    Note: Purchase NSAT and Video Conv. Rate were not statistically significant at +/-5%.
    *Conversion rate = purchasers who took the end action / count of unique cookies who took the end action.
    Numbers have been doctored to hide client-sensitive data.
  • 25. Target Content = Higher Scores
    Windows 7 visitors – visitors who visited the Compare, Anytime Upgrade, and Features pages had higher NSAT scores, while less relevant pages like the Upgrade Advisor and the Get Win7 default page scored lower.
    Vista visitors – Vista users who visited the Compare pages and Upgrade Advisor-related pages had higher NSAT scores. The less relevant Anytime Upgrade pages scored lower.
    XP visitors – similar to the Vista users: the Compare pages and Upgrade Advisor-related pages had higher NSAT scores, while the less relevant Anytime Upgrade pages scored lower.
    Numbers have been doctored to hide client-sensitive data.
  • 26. Multi-Touch Attribution
  • 27. Ad Conversions
    When a user clicks on an ad, they are redirected through Atlas to the destination page: Atlas records the click and then redirects the user (the creative itself is served by an ad server, image server, or CDN). If the site has an action tag (a 1x1 pixel) on the landing page, the visit can now be directly tied back to the ad.
    Atlas can then tie each ad impression and click back to the action tag, per cookie. This data is then used to optimize the ad campaign.
  • 28. GA Video: http://youtu.be/Cz4yHOKE5j8
  • 29. Advanced Attribution: Details
    Problem: ad-server rules are heavily biased in favor of click-based and "last-touch" exposures (i.e. branded search) and undervalue a person's history of exposure to display media.
    Objective: correct this bias by reallocating credit for conversions in proportion to the relative contribution of past exposures.
    Approach: model cookie-exposure history to estimate relative contribution. Use the model estimates to "score" individual placements, awarding each placement some, all, or no credit for a cookie's conversion.
    Action: media planners may optimize the online media budget, either during or after a campaign, toward those publishers and engagements that drive the greatest ROI.
  • 30. There are several approaches
    Method I: Even Distribution
    • Score = 1/n, where n is the total exposure frequency.
    • Simple, but flawed in that it's really a "welfare state" for media: it does not address relative efficacy.
    Method II: Recency-Weighted Attribution
    • Score is assigned according to each exposure's time distance from the conversion; special weight might be given to the first and last touch points.
    • More nuanced, differentiating by recency, but it does not account for relative performance differences between formats.
    Method III: Probabilistic Attribution
    • Weight is given according to the change in conversion probability contributed by each ad exposure; the probabilities are calculated from predictive models on ad frequency and attributes.
    • The most complex, performance-based approach: it uses the change in historical conversion probability per exposure to allocate credit.
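The three scoring rules can be sketched as small Python functions. The decay constant, touch-point names, and probability deltas below are illustrative assumptions, not values from an actual attribution model:

```python
def even_credit(touches):
    """Method I: each exposure gets 1/n of the conversion credit."""
    n = len(touches)
    return {t: 1 / n for t in touches}  # assumes unique touch labels

def recency_credit(touches, decay=0.5):
    """Method II: credit decays geometrically with distance from the
    conversion (last touch weighted highest), then is normalized."""
    weights = [decay ** (len(touches) - 1 - i) for i in range(len(touches))]
    total = sum(weights)
    return {t: w / total for t, w in zip(touches, weights)}

def probabilistic_credit(prob_deltas):
    """Method III: credit in proportion to each exposure's estimated
    change in conversion probability (deltas come from a model)."""
    total = sum(prob_deltas.values())
    return {t: d / total for t, d in prob_deltas.items()}

path = ["display_1", "display_2", "branded_search"]
print(even_credit(path))     # 1/3 each
print(recency_credit(path))  # search > display_2 > display_1
print(probabilistic_credit({"display_1": 0.02,
                            "display_2": 0.01,
                            "branded_search": 0.01}))
```

Note how Method III can reverse Method II's ordering: here the early display exposure gets half the credit because the model attributes the largest probability lift to it.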
  • 31. Outcome Example
    Under the attribution model's conversion rates, certain placements and networks look better or worse; this directly affects how and where the media team purchases ad placements.
  • 32. Incremental revenue from attribution
    The incremental revenue increase is calculated by comparing attribution-based media optimization against last-touch media optimization, and it varies with the degree of optimization shift from the least to the most efficient media:
    • 5% optimization: +$15 million (+2.67%) incremental revenue increase
    • 10% optimization: +$29 million (+5.09%) incremental revenue increase
    • 15% optimization: +$45 million (+7.95%) incremental revenue increase
    [Chart: revenue with optimization, standard last touch vs. Razorfish advanced attribution – Base: $564M vs. $564M; Lowest 5%: $576M vs. $591M (+$15MM); Lowest 10%: $579M vs. $608M (+$29MM); Lowest 15%: $582M vs. $628M (+$45MM).]
  • 33. Case Study
    48% lift in Paid Search click-through rate due to banner ad exposure: 1.02% for the test group vs. 0.69% for the control group.
    • The test group was exposed to client media when encountering campaign placements; the control group was exposed to PSA media when encountering the same placements.
    • Across clients and advertisers, banner exposure consistently drives incremental search clicks and conversions.
    • Clearly, some portion of the credit for a search conversion belongs to prior display (and other media) exposure.
    • Attribution quantifies the relative contribution of each touch point and allocates credit accordingly.
    The example is from an apparel retailer. We ran a "true lift test": we held out a random control group from all display media for a period and evaluated the performance differences between the control and exposed groups. These results are consistent with similar tests run for other clients.
  • 34. Media Mix Models
  • 35. Media Mix Models
    Problem: when multi-channel marketing efforts (TV, radio, display, mobile, cinema, print) run simultaneously, it can be hard to identify which channels are responsible for conversions. Answers are difficult to come by when direct measurement of individual-level exposure is not feasible (i.e. OOH, TV, etc.).
    Objective: create a model that accurately reflects how well each channel operates within the general business/marketing environment.
    Approach: use daily (or weekly) tracking data to specify the relationship between channel activity and conversion volume. Incorporate into the models channel-specific accumulation and decay effects as well as relevant macroeconomic indicators and historical events.
    Action: using the results to estimate each channel's point of diminishing returns, appraise the optimal spend per channel for future campaigns.
  • 36. Factors and media effects
    The most important aim of the attribution analysis is to get at the relationship between media spend and the KPI we are optimizing for. To get there, we need to understand how each media type impacts the KPIs and the other media types.
  • 37. Ad stocking effects
    Adding the ad-stocking effect of media to the model helps account for the diminishing effect of an ad over time. The chart below shows the approximate half-life of each media type modeled. Note that some media types have a longer half-life than others; the effect of a TV ad tends to last longer than that of a banner ad, for example.
    [Chart: typical half-lives by media type. Model flow: Ad Stocking Effects + Effectiveness Curves + Media Cost Curves + Total Budget → Optimizer.]
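A common way to express ad stocking is a geometric-decay transform, with the per-period decay derived from the half-life. This is a minimal sketch; the half-life values below are illustrative, not the ones from the deck's chart:

```python
def adstock(spend, half_life):
    """Geometric adstock: each period carries over a decayed stock of
    past media. The decay per period is chosen so that an ad's effect
    halves after `half_life` periods."""
    decay = 0.5 ** (1 / half_life)
    stock, out = 0.0, []
    for x in spend:
        stock = x + decay * stock
        out.append(stock)
    return out

# A single burst of media, modeled with a long half-life (e.g. TV)
# vs. a short one (e.g. a banner ad). Values are illustrative.
burst = [100, 0, 0, 0, 0]
print(adstock(burst, half_life=4))  # decays slowly; halves by period 4
print(adstock(burst, half_life=1))  # halves every period
```

The model then regresses conversions against these adstocked series rather than raw spend, so an ad's lingering effect is credited to the period it was aired in.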
  • 38. Media effectiveness curves
    The effectiveness of media diminishes as the volume of exposure increases. Eventually, an incremental change in media will have little to no effect on the reached audience; this is the saturation point. Each media type reaches its saturation point at a different level of exposure (GRPs).
    [Chart: diminishing returns by media type.]
  • 39. Media cost effects
    Media reach curves:
    • Inventory constraints for each media type.
    • Planner judgment on maximum feasible investment levels.
    Media cost curves:
    • These reflect how media costs scale as spend scales.
    • They need to capture realities such as increasing cost per reach point, seasonality, etc. in order to pragmatically reflect the media landscape.
  • 40. Budget effects
    Because the saturation point and the level of effectiveness change at a different rate for each media type, the optimal mix across channels shifts with the overall media budget. The example below shows how the optimal mix in spend moves from one media type to another depending on the level of spend.
    [Chart: diminishing returns by media spend, Budget A vs. Budget B.]
  • 41. Optimization
    The optimizer takes into account all of these factors (ad stocking, diminishing returns, cost, and inventory constraints) and, through an iterative process, chooses the optimal media channel for each incremental dollar spent.
    [Charts: diminishing returns by media spend; final optimized results.]
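The iterative "each incremental dollar" process can be sketched as a greedy marginal allocator over concave response curves. The log-shaped curves and the (scale, saturation) parameters below are hypothetical stand-ins for fitted effectiveness curves, and real optimizers would also enforce cost and inventory constraints:

```python
import math

def optimize_budget(channels, budget, step=1.0):
    """Greedy marginal allocation: each incremental dollar (step) goes
    to the channel with the highest marginal response. Response curves
    are concave (diminishing returns): scale * log1p(spend / sat)."""
    spend = {name: 0.0 for name in channels}

    def response(name, s):
        scale, sat = channels[name]
        return scale * math.log1p(s / sat)

    for _ in range(int(budget / step)):
        best = max(channels, key=lambda n: response(n, spend[n] + step)
                                           - response(n, spend[n]))
        spend[best] += step
    return spend

# (scale, saturation) per channel: hypothetical curve parameters.
channels = {"tv": (50.0, 200.0), "display": (30.0, 50.0), "search": (20.0, 10.0)}
alloc = optimize_budget(channels, budget=300, step=1.0)
print(alloc)  # allocation sums to the budget
```

Because search saturates quickly here, its early dollars are the most valuable, but the greedy loop shifts later dollars toward display and TV as search's marginal return falls, which is exactly the budget effect described on slide 40.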