Experimental Meta-analyses


Weitao Duan

Dominic Coey at LinkedIn Experimentation Meetup on Dec 13, 2018


Experimental Meta-analyses
Dominic Coey, coey@fb.com
Tom Cunningham, tomcunningham@fb.com
Facebook

Why Meta-analysis? Part I.
Suppose you have two metrics, A and B.
• Metric A is significant at the 5% level in 5% of all
experiments.
• Metric B is significant at the 5% level in 40% of all
experiments.
All else equal, which should you trust more?

Why Meta-analysis? Part I.
Metric A is significant no more often than chance alone would
produce (5% at the 5% level), which suggests no experiment ever
has any effect. All noise! Our best guess of the true effect = zero.
Metric B suggests some experiments have an effect. Our
best guess of the true effect = estimated effect x 0.82 (if
everything is normal and mean zero).
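As a rough sketch of where these two multipliers come from, the snippet below applies the appendix formula to each metric's rejection rate. It assumes the normal, mean-zero model from the appendix; the function name shrinkage_factor is made up for illustration, and the clamp at zero is an added safeguard against floating-point noise, not part of the talk.

```python
from statistics import NormalDist

# Two-sided 5% critical value of the standard normal, about 1.96.
Z_CRIT = NormalDist().inv_cdf(0.975)


def shrinkage_factor(rejection_rate):
    """Return vt / (vt + ve) implied by the share of experiments
    that are significant at the 5% level (normal, mean-zero model)."""
    if not 0.0 < rejection_rate < 1.0:
        raise ValueError("rejection_rate must lie strictly between 0 and 1")
    z = NormalDist().inv_cdf(rejection_rate / 2.0)
    # Clamp at zero: rates at or below 5% imply no detectable true effects.
    return max(0.0, 1.0 - (z / Z_CRIT) ** 2)


for name, rate in [("A", 0.05), ("B", 0.40)]:
    factor = shrinkage_factor(rate)
    print(f"Metric {name}: best guess = estimated effect x {factor:.2f}")
```

Under these assumptions Metric A's estimates are shrunk all the way to zero, while Metric B's are scaled by roughly 0.82.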

Why Meta-analysis? Part II.
Imagine this is the histogram
of estimated effect sizes from
historical experiments.
You see a 2% lift in your new
experiment. What should you
infer?

Why Meta-analysis? Part II.
Each observed effect, y, is the sum of
• the true treatment effect, t
• sampling error, e
If y is very large, that is likely due in part to a large draw
of e. So we should adjust y downwards ("shrink") to offset this
and get a better estimate of t.
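The shrinkage argument can be checked with a small simulation. The sketch below is illustrative only: it assumes t and e are independent normals with mean zero, picks arbitrary variances vt = ve = 1, and uses made-up helper names (simulate_experiments, mean_squared_error). It compares raw and shrunk estimates, then looks at experiments whose observed effect happened to be large.

```python
import random


def simulate_experiments(n, vt, ve, seed=0):
    """Draw n (true effect, observed effect) pairs with y = t + e."""
    rng = random.Random(seed)
    results = []
    for _ in range(n):
        t = rng.gauss(0.0, vt ** 0.5)  # true treatment effect
        e = rng.gauss(0.0, ve ** 0.5)  # sampling error
        results.append((t, t + e))
    return results


def mean_squared_error(results, factor):
    """Mean squared error of the estimate factor * y against the true t."""
    return sum((factor * y - t) ** 2 for t, y in results) / len(results)


vt, ve = 1.0, 1.0  # assumed variances, chosen only for illustration
results = simulate_experiments(20000, vt, ve)
factor = vt / (vt + ve)  # shrinkage multiplier under the normal model

mse_raw = mean_squared_error(results, 1.0)
mse_shrunk = mean_squared_error(results, factor)

# Among experiments with a large observed effect, the true effect is smaller.
big = [(t, y) for t, y in results if y > 2.0]
mean_y_big = sum(y for _, y in big) / len(big)
mean_t_big = sum(t for t, _ in big) / len(big)

print(f"MSE of raw estimate:    {mse_raw:.3f}")
print(f"MSE of shrunk estimate: {mse_shrunk:.3f}")
print(f"Mean observed effect when y > 2: {mean_y_big:.2f}")
print(f"Mean true effect when y > 2:     {mean_t_big:.2f}")
```

With equal variances the true effect behind a large observed lift averages about half the observed value, which is exactly the adjustment the shrunk estimate makes.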

Why Meta-analysis? Part III.
Consider a test which improves metric A ("comments") but
degrades a related metric B ("posts").
What, if anything, can we conclude from this?
Can the movement in B give us more information about the
movement in A?

Why Meta-analysis? Part III.
Silly example:
• the true lift in each experiment is t ~iid F, equal for both
metrics
• observed values are yA = t + eA, yB = t + eB, for
independent eA, eB
• metric B contains extra information about metric A
More generally, might have some joint distribution of (tA, tB,
yA, yB), estimated on past experiments.
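To make the silly example concrete, the sketch below works out the best guess of t from both metrics at once. It adds an assumption the slide does not make: the prior F is taken to be N(0, vt) and the errors are normal, so the posterior mean is a precision-weighted average. All function names and numbers are hypothetical.

```python
def posterior_mean_two_metrics(y_a, y_b, vt, ve_a, ve_b):
    """E(t | yA, yB) when t ~ N(0, vt), yA = t + eA, yB = t + eB,
    with independent normal errors of variance ve_a and ve_b."""
    precision = 1.0 / vt + 1.0 / ve_a + 1.0 / ve_b
    return (y_a / ve_a + y_b / ve_b) / precision


def posterior_mean_one_metric(y_a, vt, ve_a):
    """E(t | yA) using metric A alone: plain shrinkage of yA."""
    return y_a * vt / (vt + ve_a)


# Illustrative numbers: metric A moves up by 2, metric B moves down by 1.
y_a, y_b = 2.0, -1.0
vt, ve_a, ve_b = 1.0, 1.0, 1.0

print(f"Using A only:  {posterior_mean_one_metric(y_a, vt, ve_a):.3f}")
print(f"Using A and B: {posterior_mean_two_metrics(y_a, y_b, vt, ve_a, ve_b):.3f}")
```

In this toy setup the drop in B pulls the estimate of the shared effect well below what A alone would suggest, which is the sense in which B carries extra information about A.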

Conclusion
Tech companies run lots of experiments, but they often fall
into a small number of experiment types.
Ignoring the information in past, highly related experiments
is leaving a lot on the table!
Our paper on experiment splitting develops some of these
issues.

Appendix
Where does the 0.82 number come from?
Consider the model where
• the true effect t ~ N(0, vt)
• the sampling error e ~ N(0, ve)
• the observed outcome is y = t + e
Can show
• E(t | y) = y x vt/(vt + ve).
• If the fraction of rejections at the 5% level is p, then
vt/(vt + ve) = 1 - (Φ^{-1}(p/2)/1.96)^2.
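One way to fill in the "can show" step for the rejection-rate formula is sketched below. It assumes a two-sided test that rejects when |y| exceeds 1.96 standard errors, with the standard error equal to the square root of ve; the slides do not spell this out, so treat it as an assumption.

```latex
\begin{align*}
y &= t + e \sim N(0,\, v_t + v_e), \\
p &= \Pr\!\left(|y| > 1.96\sqrt{v_e}\right)
   = 2\,\Phi\!\left(-1.96\sqrt{\frac{v_e}{v_t + v_e}}\right), \\
\frac{v_e}{v_t + v_e} &= \left(\frac{\Phi^{-1}(p/2)}{1.96}\right)^{2}, \\
\frac{v_t}{v_t + v_e} &= 1 - \left(\frac{\Phi^{-1}(p/2)}{1.96}\right)^{2}.
\end{align*}
```

For p = 0.40, Φ^{-1}(0.20) is about -0.8416, so the ratio is 1 - (0.8416/1.96)^2, roughly 1 - 0.184, or about 0.82. For p = 0.05, Φ^{-1}(0.025) is about -1.96 and the ratio is zero.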
