Google_Controlled Experimentation_Panel_The Hive


  • At LinkedIn we A/B tested everything: new features, new algorithms, and user-experience changes (user flows, UI). Target populations ranged from simple random samples to highly targeted segments, such as all users who have visited the site in the last 30 days, work for US companies with at least 500 employees, and have not uploaded their email address book in the last 90 days. Demographics span job seekers, recruiters, outbound professionals, content providers, content consumers, networkers, and branders.
  • Metrics are complex and go beyond CTR: engagement components are context dependent, and short-term proxies stand in for long-running A/B tests. Each point is illustrated with real problems encountered at LinkedIn.
  • The same applies to cannibalization.
  • Social Referral: Leveraging Network Connections to Deliver Recommendations; 'wisdom of your friends' social proof.

    1. (A Few) Key Lessons Learned Building LinkedIn Online Experimentation Platform
       Experimentation Panel, 3-20-13
    2. Experimentation at LinkedIn
       • Essential part of the release process
       • 1000s of concurrent experiments
       • Complex range of target populations based on content, behavior, and social graph data
       • Cater to a wide demographic
       • Large set of KPIs
    3. The next frontier
       • KPIs beyond CTR
       • Multiple-objective optimization
       • KPI reconciliation
       • User visit imbalance
       • Virality-preserving A/B testing
       • Context-dependent novelty effect
       • Explicit feedback vs. implicit feedback
    4. Picking the right KPI can be tricky
       • Example: engagement measured by the number of comments on posts on a blog website
       • KPI1 = average number of comments per user – B wins by 30%
       • KPI2 = ratio of active (at least one post) to inactive users – A wins by 30%
       • How is this possible? Do you want a smaller, highly engaged community, or a larger, less engaged community?
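A toy simulation makes the paradox on this slide concrete (the comment counts below are invented for illustration, not LinkedIn data): a variant with fewer but heavier posters wins on average comments per user while losing on the active-to-inactive ratio.

```python
# Made-up comment counts per user for two variants of a blog site.
# Variant A: many lightly engaged users; Variant B: few heavy posters.
variant_a = [1] * 400 + [0] * 600   # 400 active users, 1 comment each
variant_b = [3] * 200 + [0] * 800   # 200 active users, 3 comments each

def kpi1(counts):
    """Average number of comments per user."""
    return sum(counts) / len(counts)

def kpi2(counts):
    """Ratio of active (>= 1 comment) to inactive users."""
    active = sum(1 for c in counts if c > 0)
    return active / (len(counts) - active)

print(kpi1(variant_a), kpi1(variant_b))  # B wins KPI1 (0.6 vs 0.4)
print(kpi2(variant_a), kpi2(variant_b))  # A wins KPI2 (0.67 vs 0.25)
```

Both KPIs are computed on the same users; they simply reward different shapes of engagement, which is the choice the slide poses.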
    5. Winback campaign
       • Definition – Returning to the web site at least once? Or returning with a certain level of engagement, possibly comparable to, more than, or a bit less than before the account went dormant?
       • Example: reminder email at 30 days after registration
       [Histogram: loyalty distribution (time since last visit) for a user registered 335 days ago who came back once at 30 days and then went dormant]
    6. Multiple competing objectives
       • Suggest relevant groups… that one is more likely to participate in
       • TalentMatch (top 24 matches for a posted job, sold as a product): suggest skilled candidates… who will likely respond to hiring managers' inquiries
       • Semantic + engagement objectives
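One simple way to combine the two objective families on this slide is a multiplicative blend of a semantic-match score and a predicted response probability. This is an illustrative sketch with invented candidates and a hand-picked exponent, not necessarily the formulation in the Rodriguez, Posse, and Zhang paper referenced at the end.

```python
def combined_score(relevance, p_response, alpha=0.5):
    """Blend a semantic relevance score with a predicted engagement
    probability; alpha=1 ranks purely by relevance, alpha=0 purely
    by likelihood of a response."""
    return (relevance ** alpha) * (p_response ** (1 - alpha))

# Hypothetical candidates: (semantic relevance, P(responds to InMail))
candidates = {"cand_a": (0.9, 0.1), "cand_b": (0.7, 0.6)}

ranked = sorted(candidates,
                key=lambda c: combined_score(*candidates[c]),
                reverse=True)
print(ranked)  # the well-matched but unresponsive cand_a drops below cand_b
```

The blend captures the TalentMatch tension: the best semantic match is worth little if the candidate never answers a hiring manager's inquiry.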
    7. TalentMatch use case
       • KPI: repeat TalentMatch buyers, measured over a 6-month to 1-year window!
       • Short-term proxy with predictive power:
         – Optimize for InMail response rate while controlling for booking rate and InMail sent rate
    8. KPI reconciliation
       • How do you compare apples and oranges?
         – E.g., swapping People recommendations for Job recommendations: X% lift in job applications vs. Y% drop in invitations
         – Value of an invitation vs. value of a job application?
       • Long-term cascading effect on a set of site-wide KPIs
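Reconciling apples and oranges ultimately means putting both KPIs on a common value scale. The sketch below uses placeholder per-event values and invented volumes; in practice these weights would be estimated from long-term downstream impact, which is exactly the hard part the slide points at.

```python
# Hypothetical value of each event in a common unit (e.g. long-term
# engagement contribution); these numbers are invented placeholders.
VALUE_JOB_APP = 1.0
VALUE_INVITATION = 0.4

def net_value(delta_job_apps, delta_invitations):
    """Net effect, in the common unit, of a change that lifts job
    applications while depressing invitations (deltas are event counts)."""
    return (delta_job_apps * VALUE_JOB_APP
            + delta_invitations * VALUE_INVITATION)

# E.g. +8% job apps on a base of 10,000/day vs. -3% invitations on 20,000/day
print(net_value(0.08 * 10_000, -0.03 * 20_000))  # positive -> swap pays off
```

Changing the placeholder weights can flip the sign of the answer, which is why the long-term cascading effects on site-wide KPIs matter.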
    9. User visit imbalance
       • Observed sample ≠ intended random sample
       • Consider an A/B test on the homepage lasting L days. The observed sample will likely have:
         – Many more than L observations for super power users
         – ≈ L observations for daily users
         – ≈ L/7 observations for weekly users
         – No observations for users who visit less often than every L days
       • Remedies: κ statistics, random-effects models
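The imbalance on this slide can be quantified with a back-of-the-envelope model. The sketch assumes independent daily visit decisions, which real (bursty) visit behavior violates, but it reproduces the pattern: power users contribute many repeated observations while infrequent visitors are mostly never observed.

```python
L = 14  # hypothetical test length in days

def expected_profile(rate_per_day, days=L):
    """Expected number of observations, and the probability of never
    being observed, for a user who visits with probability
    min(rate, 1) each day (an independence assumption)."""
    p_visit = min(rate_per_day, 1.0)
    return rate_per_day * days, (1 - p_visit) ** days

# Hypothetical per-day visit rates by segment
for name, rate in {"super power": 5.0, "daily": 1.0,
                   "weekly": 1 / 7, "monthly": 1 / 30}.items():
    n_obs, p_missed = expected_profile(rate)
    print(f"{name:12s} expected obs: {n_obs:5.1f}  P(never seen): {p_missed:.2f}")
```

The monthly segment mostly contributes zero observations, so a naive per-observation analysis silently drops it; random-effects models are one way to account for the per-user repetition.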
    10. Virality-preserving A/B testing
       • Random sampling destroys the social graph
       • Critical for social referrals
         – 'Warm' recommendations
         – 'Wisdom of your friends' social proof
       • Core + fringe sampling to minimize graph destruction
         – WWW '11 (Facebook); '12 Yahoo! group recommendations
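A toy graph shows why per-user randomization destroys the social graph. The "clusters" below are contiguous id blocks standing in for real graph communities, and assigning a whole cluster to one arm is only a crude analogue of the core + fringe designs cited on the slide, but the contrast in cut friendships is the point.

```python
import random

random.seed(7)

N = 1000
# Toy social graph: each user is friends with the next three user ids,
# so friendships are concentrated in tight id-neighborhoods
# (a stand-in for real clusters; not real network data).
edges = [(u, (u + k) % N) for u in range(N) for k in (1, 2, 3)]

def cut_fraction(assign):
    """Fraction of friendships whose endpoints land in different arms."""
    return sum(assign[a] != assign[b] for a, b in edges) / len(edges)

# Per-user random sampling: about half of all friendships are cut,
# so neither arm retains a realistic social graph.
per_user = {u: random.random() < 0.5 for u in range(N)}

# Cluster-level sampling: a whole neighborhood is assigned as one
# unit, so only friendships on cluster boundaries are cut.
arm_of_cluster = {c: random.random() < 0.5 for c in range(N // 100)}
per_cluster = {u: arm_of_cluster[u // 100] for u in range(N)}

print(cut_fraction(per_user))     # roughly 0.5
print(cut_fraction(per_cluster))  # at most a few percent
```

With most friendships intact inside each arm, treatment users can actually receive 'wisdom of your friends' social proof from treated friends, which is what makes the viral effect measurable.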
    11. Context-dependent novelty effect
       • Job recommendation algorithm A/B tests
         – First 2 weeks: 2X the long-term stationary lift
       • TalentMatch
         – No short-term novelty effect
    12. Explicit-feedback A/B testing
       • Enables you to understand the usefulness of a product/feature/algorithm with unequal depth
         – Text-based A/B test: sentiment analysis
       • Reveals unexpected complexities
         – E.g., 'local' means different things to different members
       • Prevents misinterpretation of implicit user feedback!
       • Helps prioritize future improvements
    13. References
       • C. Posse, 2012. A (Few) Key Lessons Learned Building Recommender Systems for Large-Scale Social Networks. Invited talk, Industry Practice Expo, 18th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Beijing, China.
       • M. Rodriguez, C. Posse, and E. Zhang, 2012. Multiple Objective Optimization in Recommendation Systems. Proceedings of the Sixth ACM Conference on Recommender Systems, pp. 11-18.
       • M. Amin, B. Yan, S. Sriram, A. Bhasin, and C. Posse, 2012. Social Referral: Leveraging Network Connections to Deliver Recommendations. Proceedings of the Sixth ACM Conference on Recommender Systems, pp. 273-276.
       • X. Amatriain, P. Castells, A. de Vries, and C. Posse, 2012. Workshop on Recommendation Utility Evaluation: Beyond RMSE. Proceedings of the Sixth ACM Conference on Recommender Systems, pp. 351-352.