Introduction to Principal Stratification.
Presented Papers:
Principal Stratification in Causal Inference (Frangakis & Rubin, 2002)
Estimation of Causal Effects via Principal Stratification When Some Outcomes Are Truncated by "Death" (Zhang & Rubin, 2003)
A Refreshing Account of Principal Stratification (Mealli & Mattei, 2012)
Presentation given to students in Harvard University STAT 286: Causal Inference
Matching Weights to Simultaneously Compare Three Treatment Groups: a Simulati... - Kazuki Yoshida
Presentation at the Epidemiology Congress of the Americas 2016.
https://epiresearch.org/2016-meeting/submitted-abstract-sessions/pharmacoepidemiology-estimation-of-treatment/
Paper: http://journals.lww.com/epidem/Abstract/publishahead/Matching_weights_to_simultaneously_compare_three.98901.aspx (email me at kazukiyoshida@mail.harvard.edu)
Simulation code: https://github.com/kaz-yos/mw
Tutorial: http://rpubs.com/kaz_yos/matching-weights
1) The document presents a statistical modeling approach called targeted smooth Bayesian causal forests (tsbcf) to smoothly estimate heterogeneous treatment effects over gestational age using observational data from early medical abortion regimens.
2) The tsbcf method extends Bayesian additive regression trees (BART) to estimate treatment effects that evolve smoothly over gestational age, while allowing for heterogeneous effects across patient subgroups.
3) The tsbcf analysis of early medical abortion regimen data found the simultaneous administration to be similarly effective overall to the interval administration, but identified some patient subgroups where effectiveness may vary more over gestational age.
Comparison of Privacy-Protecting Analytic and Data-sharing Methods: a Simulat... - Kazuki Yoshida
This document describes a simulation study comparing different privacy-protecting analytic and data-sharing methods in a distributed data network setting. The study aims to provide a framework for classifying previously suggested privacy-protecting methods and to assess their relative performance. Specifically, it examines how different levels of data sharing (e.g. individual-level data, risk-set data, summary tables) can affect analysis performance. The document outlines the classification of methods, levels of data sharing, simulation design, implementation, and assessment metrics that will be used to compare the performance of various methods.
This document discusses important concepts for screening data, including detecting and handling errors, missing data, outliers, and ensuring assumptions of analyses are met. It describes why data screening is important to obtain accurate results and avoid bias. Key topics covered include identifying patterns of missing data, different types of missing data (MCAR, MAR, MNAR), and various methods for treating missing values. Outliers are defined and their impact explained. Common transformations are presented to achieve normality, linearity, and homoscedasticity. Checklists are provided for conducting data screening.
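Two of the screening steps described (summarizing the pattern of missingness, then treating missing values) can be sketched in a few lines. This is a hypothetical toy illustration, not from the document; the dataset, column names, and choice of mean imputation (defensible under MCAR, not under MAR/MNAR) are all assumptions.

```python
# Hypothetical sketch: tabulate missingness per column, then mean-impute one
# column. Mean imputation is only reasonable under MCAR; it is shown here
# purely to illustrate the mechanics described in the document.
rows = [
    {"age": 34, "bmi": 22.1},
    {"age": None, "bmi": 27.5},
    {"age": 51, "bmi": None},
    {"age": 47, "bmi": 24.3},
]

def missing_counts(rows):
    """Count missing values per column to reveal the missingness pattern."""
    counts = {}
    for row in rows:
        for col, val in row.items():
            counts[col] = counts.get(col, 0) + (val is None)
    return counts

def mean_impute(rows, col):
    """Replace missing values in `col` with the mean of the observed values."""
    observed = [r[col] for r in rows if r[col] is not None]
    mean = sum(observed) / len(observed)
    return [{**r, col: (mean if r[col] is None else r[col])} for r in rows]

print(missing_counts(rows))       # {'age': 1, 'bmi': 1}
imputed = mean_impute(rows, "age")
```

In practice one would also inspect whether missingness in one column predicts missingness in another before choosing a treatment method.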
A Method for Meta-Analytic Confirmatory Factor Analysis - Kamden Strunk
Research presentation by Kamden Strunk on A Method for Meta-Analytic Confirmatory Factor Analysis. Originally presented at the Southwestern Psychological Association in 2013.
- A sample is a small group selected from a population to represent that population. Sampling provides benefits like being less time-consuming, less expensive, and allowing results to be repeated.
- There are two main types of samples: probability and non-probability. Probability samples include simple random, systematic, stratified, and cluster samples. Sample size is determined based on factors like the type of study, expected results, costs, and available resources.
- Inferential statistics allow generalization from a sample to a population through hypothesis testing and significance tests. Tests include t-tests, F-tests, chi-squared tests, and correlation/regression to analyze relationships between variables. Significant results suggest differences are likely not due to chance
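As a toy illustration of the significance tests listed above, here is a chi-squared test of independence on a 2x2 table computed from first principles. The counts are invented for the example, not taken from the presentation.

```python
# Illustrative chi-squared test of independence on an invented 2x2 table.
observed = [[30, 10],   # e.g. exposed:   outcome yes / no
            [20, 40]]   #      unexposed: outcome yes / no

row_totals = [sum(r) for r in observed]
col_totals = [sum(c) for c in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: (row total * column total) / n
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        exp = row_totals[i] * col_totals[j] / n
        chi2 += (obs - exp) ** 2 / exp

print(round(chi2, 2))  # 16.67, compared against the df = 1 critical value 3.84
```

Because 16.67 far exceeds 3.84 (the 5% critical value with one degree of freedom), the difference between groups would be judged unlikely to be due to chance.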
ENAR 2018 Matching Weights to Simultaneously Compare Three Treatment Groups: ... - Kazuki Yoshida
This document summarizes a simulation study comparing matching weights to other propensity score methods for simultaneously comparing three treatment groups. The simulation found that matching weights provided the best covariate balance, similarly small bias as matching, and smaller mean squared error than matching. Matching weights were also more robust to scenarios with rare events, unequal group sizes, and poor covariate overlap compared to other methods. An empirical example using Medicare data to compare opioids, COX-2 inhibitors, and NSAIDs further demonstrated the covariate balance and outcome results achieved with matching weights.
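The matching-weight idea can be sketched in a few lines: as I understand the definition, each patient's weight is the minimum of their generalized propensity scores across the three groups, divided by the score for the group actually received. The snippet below assumes the scores have already been estimated (e.g. by multinomial logistic regression); the values are invented.

```python
# Sketch of the matching-weight computation for three treatment groups,
# assuming generalized propensity scores are already estimated. Toy values.
def matching_weight(ps, treatment):
    """mw_i = min_k e_k(x_i) / e_{Z_i}(x_i): down-weights patients whose
    received treatment was much more likely than the least likely one."""
    return min(ps) / ps[treatment]

# Each entry: (propensity scores for groups 0/1/2, group actually received)
patients = [
    ((0.2, 0.3, 0.5), 2),
    ((0.6, 0.3, 0.1), 0),
    ((0.3, 0.4, 0.3), 1),
]

weights = [matching_weight(ps, z) for ps, z in patients]
print(weights)
```

Patients in regions of good covariate overlap (scores roughly equal across groups) get weights near 1, which is what produces the matching-like balance the simulation reports.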
Propensity Score Methods for Comparative Effectiveness Research with Multiple... - Kazuki Yoshida
My dissertation research (and a little more) as presented at the Study Design and Biostatistics Center, Department of Population Health Sciences, University of Utah.
GLMM in interventional study at Require 23, 2015-12-19 - Shuhei Ichikawa
This document provides an overview of generalized linear mixed models (GLMMs) for medical research:
- GLMMs combine generalized linear models, which handle non-normal data distributions using link functions, and linear mixed models, which incorporate random effects.
- When reporting GLMM analyses, it is important to adequately describe the statistical methods, research design, causal inference approach, model validation, and software/functions used. Previous reviews found much room for improvement in GLMM reporting quality.
- Guidelines for standardized GLMM reporting in medical journals could help ensure the validity of conclusions by documenting the analysis methods generating the results.
Final Presentation given at the conclusion of the 2018 IMSM by the Savvysherpa Student Working Group.
Group Members: Chixiang Chen, Ashley Gannon, Duwani Katmullage, Miaoqi Li, Mengfe Liu, Rebecca North and Jialu Wang
Statistical Methods for Removing Selection Bias In Observational Studies - Nathan Taback
The slide deck is from a talk I delivered at a Dana Farber / Harvard Cancer Center outcomes seminar. It presents an overview of currently available statistical methods to remove bias in observational studies.
This document outlines the concepts and methods of multiple-treatments meta-analysis (MTM). MTM allows for the simultaneous comparison of multiple interventions for a condition by combining both direct and indirect evidence from randomized controlled trials. Key advantages of MTM include the ability to rank treatments, comprehensively use all available data, and compare interventions not directly compared in trials. The document discusses MTM approaches using frequentist meta-regression and Bayesian statistics.
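The indirect-evidence step that MTM builds on can be shown with a small worked example: an adjusted indirect comparison of A versus B via a common comparator C. The numbers below are invented for illustration.

```python
# Illustrative indirect comparison via a common comparator C: the A-vs-B
# effect is the difference of the two direct effects, and its variance is
# the sum of their variances. Estimates and SEs are invented.
import math

# Direct estimates (e.g. log odds ratios) and standard errors from trials
d_AC, se_AC = -0.50, 0.15   # A vs C
d_BC, se_BC = -0.20, 0.20   # B vs C

# Indirect A-vs-B estimate through C
d_AB = d_AC - d_BC
se_AB = math.sqrt(se_AC**2 + se_BC**2)

print(round(d_AB, 2), round(se_AB, 3))  # -0.3 0.25
```

Note how the indirect standard error (0.25) exceeds either direct one, which is why MTM gains precision by combining direct and indirect evidence rather than relying on indirect comparisons alone.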
This document summarizes methods for subgroup identification in clinical trials. It begins by distinguishing predictive from prognostic biomarkers. It then provides a taxonomy of four main approaches to subgroup identification: global outcome modeling, global treatment effect modeling, modeling individual treatment regimes, and local treatment effect modeling (subgroup search). The document discusses several examples and methods under each approach. It concludes by noting important considerations for evaluating subgroup identification methods, such as the number of predictors handled, model complexity control, type I error control, and obtaining honest effect size estimates.
This document discusses investigating heterogeneity in meta-analyses through subgroup analysis and meta-regression. It outlines when and how to use these techniques to explore reasons for variability in study results. Key challenges include having enough studies, selecting explanatory variables carefully to avoid false positives, and accounting for confounding and aggregation bias in study-level data. Meta-regression allows for random effects but interpretation requires caution given observational relationships between study characteristics and effects.
Statistical modelling is of prime importance in every sphere of data analysis. This paper reviews the justification for fitting a linear model to collected data. An inappropriate fitted model may arise for two reasons: (1) a wrong choice of analytical form, or (2) the adverse effects of outliers and/or influential observations. The aim is to identify outliers using the deletion technique. The paper extends deletion diagnostics to the exchangeable model, reviews some results on checking the model's analytical form, and illustrates the technique through an example.
This document provides an overview of propensity score matching as a method to control for confounding in observational studies. It discusses estimating propensity scores using logistic regression, common matching methods like nearest neighbor and optimal matching within calipers, and assessing balance after matching. An example matches patients who received a blood thinner versus usual care alone after a medical procedure using a SAS macro for propensity score matching. Balance is improved after matching on propensity scores.
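The matching step described (nearest neighbor within a caliper) can be sketched as follows. This is a minimal greedy version, not the SAS macro the document uses; the scores and the 0.05 caliper are illustrative.

```python
# Minimal sketch of greedy nearest-neighbor matching within a caliper,
# assuming propensity scores were already estimated (e.g. via logistic
# regression). Data and the 0.05 caliper are invented.
treated  = {"t1": 0.31, "t2": 0.62, "t3": 0.90}   # id -> propensity score
controls = {"c1": 0.30, "c2": 0.58, "c3": 0.33, "c4": 0.95}

def greedy_match(treated, controls, caliper=0.05):
    """Match each treated unit to the nearest unused control within the caliper."""
    available = dict(controls)
    pairs = {}
    for t_id, t_ps in treated.items():
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_ps))
        if abs(available[c_id] - t_ps) <= caliper:
            pairs[t_id] = c_id
            del available[c_id]   # each control is used at most once
    return pairs

print(greedy_match(treated, controls))
```

After matching, balance would be checked by comparing covariate distributions (e.g. standardized differences) between the matched treated and control groups, as the document describes.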
The document provides guidance on improving the chances of getting a manuscript accepted for publication. It discusses key methodological concepts like uncertainty of measurement and sampling, statistical assumptions, multiplicity, and proper reporting of different study types including case reports, experiments, observational studies, and randomized trials. The key recommendations are to 1) clearly describe statistical methods and sample sizes, 2) present both data and interpretation of results considering clinical and statistical significance, and 3) properly adjust for confounding and comply with reporting standards for different study designs.
Does transfer to intensive care units reduce mortality for deteriorating ward... - cheweb1
1) The study uses instrumental variable methods to examine the effect of transfer to intensive care units (ICU) on mortality for deteriorating ward patients using data from the UK.
2) The instrumental variable is ICU bed availability, which is hypothesized to directly affect the probability of ICU transfer but be independent of patient mortality.
3) Propensity score matching is also used to increase the strength of the instrumental variable by matching patients exposed to many versus few available ICU beds who are similar on observed characteristics.
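The logic of points 1-2 can be illustrated with the simplest instrumental variable estimator, the Wald ratio: cov(Z, Y) / cov(Z, X), valid when the instrument Z (bed availability) affects the outcome only through treatment X (ICU transfer). The toy data below are invented and not from the UK study.

```python
# Sketch of the Wald ratio IV estimator on invented data. Z = ICU beds
# available, X = transferred (0/1), Y = outcome score. The ratio
# cov(Z, Y) / cov(Z, X) estimates the effect of X on Y when Z is a
# valid instrument (relevant, and excluded from the outcome equation).
def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

Z = [0, 0, 1, 1, 2, 2]
X = [0, 0, 0, 1, 1, 1]
Y = [5.0, 6.0, 5.5, 3.0, 2.5, 3.5]

beta_iv = cov(Z, Y) / cov(Z, X)
print(round(beta_iv, 3))  # -2.5
```

A full analysis would use two-stage least squares with covariates, but the ratio form makes the identifying assumption visible: all of Z's association with Y must flow through X.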
The Rothamsted school meets Lord's paradox - Stephen Senn
Lord's ‘paradox’ is a notoriously difficult puzzle that is guaranteed to provoke discussion, dissent and disagreement. Two statisticians analyse some observational data and come to radically different conclusions, each of which has acquired defenders over the years since Lord first proposed his puzzle in 1967. It features in the recent The Book of Why by Pearl and Mackenzie, who use it to demonstrate the power of Pearl's causal calculus, obtaining a solution they claim is unambiguously right. They also claim that statisticians have failed to get to grips with causal questions for well over a century, in fact ever since Karl Pearson developed Galton's idea of correlation and warned the scientific world that correlation is not causation.
However, only two years before Lord published his paradox John Nelder outlined a powerful causal calculus for analyzing designed experiments based on a careful distinction between block and treatment structure. This represents an important advance in formalizing the approach to analysing complex experiments that started with Fisher 100 years ago, when he proposed splitting variability using the square of the standard deviation, which he called the variance, continued with Yates and has been developed since the 1960s by Rosemary Bailey, amongst others. This tradition might be referred to as The Rothamsted School. It is fully implemented in Genstat® but, as far as I am aware, not in any other package.
With the help of Genstat®, I demonstrate how the Rothamsted School would approach Lord's paradox and come to a solution that is not the same as the one reached by Pearl and Mackenzie, although given certain strong but untestable assumptions it would reduce to it. I conclude that the statistical tradition may have more to offer in this respect than has been supposed.
Talk given at ISCB 2016 Birmingham
Where the indication and treatments permit their use, n-of-1 trials represent a promising means of investigating potential treatments for rare diseases. Each patient permits repeated comparison of the treatments being investigated, and this both increases the number of observations and reduces their variability compared to conventional parallel group trials.
However, whether the framework used for analysis is randomisation-based or model-based produces puzzling differences in inferences. This can easily be shown by starting, on the one hand, with the randomisation philosophy associated with the Rothamsted school of inference and building up the analysis through the block + treatment structure approach associated with John Nelder's theory of general balance (as implemented in GenStat®), or starting, on the other hand, with a plausible variance component approach through a mixed model. However, it can be shown that these differences are related not so much to the modelling approach per se as to the questions one attempts to answer: ranging from testing whether there was a difference between treatments in the patients studied, to predicting the true difference for a future patient, via making inferences about the effect in the average patient.
This in turn yields interesting insight into the long-run debate over the use of fixed or random effect meta-analysis.
Some practical issues of analysis will also be covered in R and SAS®, in which languages some functions and macros to facilitate analysis have been written. It is concluded that n-of-1 trials hold great promise in investigating chronic rare diseases, but that careful consideration of matters of purpose, design and analysis is necessary to make best use of them.
Acknowledgement
This work is partly supported by the European Union’s 7th Framework Programme for research, technological development and demonstration under grant agreement no. 602552. “IDEAL”
The document provides an overview of causal inference techniques including:
1. Simpson's paradox and how propensity score matching and inverse probability of treatment weighting can resolve it.
2. The Rubin causal model and potential outcomes framework for defining average treatment effects.
3. Propensity score theory and how propensity scores can create balanced groups to estimate causal effects.
4. Inverse probability of treatment weighting to eliminate confounding by reweighting samples.
5. Heterogeneous treatment effect estimation using single and two model approaches, and transformed outcome modeling.
6. Applications of causal inference techniques like instrumental variables and counterfactual frameworks for ranking and churn analysis.
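Point 4 above can be made concrete with a toy example: a confounder drives both treatment and outcome, so the naive group difference is biased, while inverse probability weighting recovers the true effect. The dataset and its true effect of +1 are constructed for illustration.

```python
# Hedged sketch of inverse probability of treatment weighting (IPW).
# A confounder C drives both treatment T and outcome Y; the true effect
# of T on Y is +1 by construction, but the naive difference is 6.
data = []  # (confounder C, treatment T, outcome Y)
data += [(0, 1, 1.0)] * 2 + [(0, 0, 0.0)] * 6     # C=0: mostly untreated
data += [(1, 1, 11.0)] * 6 + [(1, 0, 10.0)] * 2   # C=1: mostly treated

# Propensity score e(C) = P(T=1 | C), estimated by stratum proportions
e = {c: sum(t for c2, t, _ in data if c2 == c) /
        sum(1 for c2, _, _ in data if c2 == c) for c in (0, 1)}

def weighted_mean(rows):
    total_w = sum(w for w, _ in rows)
    return sum(w * y for w, y in rows) / total_w

treated = [(1 / e[c], y) for c, t, y in data if t == 1]
control = [(1 / (1 - e[c]), y) for c, t, y in data if t == 0]
ate = weighted_mean(treated) - weighted_mean(control)

naive = (sum(y for _, t, y in data if t == 1) / 8
         - sum(y for _, t, y in data if t == 0) / 8)
print(naive, round(ate, 6))  # 6.0 1.0
```

Reweighting makes each treatment arm resemble the full population on C, which is exactly how IPW eliminates confounding in the Simpson's-paradox examples the document opens with.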
The document discusses parametric and non-parametric tests. It provides examples of commonly used non-parametric tests including the Mann-Whitney U test, Kruskal-Wallis test, and Wilcoxon signed-rank test. For each test, it gives the steps to perform the test and interpret the results. Non-parametric tests make fewer assumptions than parametric tests and can be used when the data is ordinal or does not meet the assumptions of parametric tests. They provide a distribution-free alternative for analyzing data.
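The Mann-Whitney U statistic mentioned above is simple enough to compute by hand, which makes its rank-based, distribution-free character visible. The sample data are invented and assume no ties.

```python
# Mann-Whitney U from first principles on invented tie-free data. The
# statistic is compared against tabulated critical values (not shown here).
group_a = [3, 4, 2, 6]
group_b = [9, 7, 5, 10]

pooled = sorted(group_a + group_b)
ranks = {v: i + 1 for i, v in enumerate(pooled)}   # rank 1 = smallest

n1, n2 = len(group_a), len(group_b)
r1 = sum(ranks[v] for v in group_a)                # rank sum of group A
u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
u2 = n1 * n2 - u1
u = min(u1, u2)
print(u)  # 1.0
```

Because only ranks enter the calculation, the test is unaffected by monotone transformations of the data, which is why it suits ordinal or non-normal measurements.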
Ct lecture 7. Comparing two groups, cont data - Hau Pham
The document summarizes a workshop on analyzing clinical studies that compares two groups with continuous data. It discusses estimating differences between groups, hypothesis testing using t-tests, and interpreting the results, including presenting confidence intervals and p-values. R code is provided to demonstrate performing t-tests and other analyses. Non-parametric tests for continuous data are also briefly covered.
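The workshop's R demonstrations are not reproduced here, but the pooled two-sample t-test and its confidence interval can be sketched from scratch. The data are invented, and the hard-coded critical value t = 2.447 (the standard table value for a 95% interval with 6 degrees of freedom) is an assumption of this toy example.

```python
# From-scratch pooled two-sample t-test with a 95% CI on invented data.
# t_crit = 2.447 is the tabulated t_{0.975} value for df = 6.
import math

a = [5.0, 6.0, 7.0, 8.0]
b = [1.0, 2.0, 3.0, 4.0]

def mean(v):
    return sum(v) / len(v)

def ss(v):
    return sum((x - mean(v)) ** 2 for x in v)        # sum of squared deviations

n1, n2 = len(a), len(b)
pooled_var = (ss(a) + ss(b)) / (n1 + n2 - 2)         # pooled variance
se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))       # SE of the mean difference
diff = mean(a) - mean(b)
t_stat = diff / se

t_crit = 2.447                                       # t_{0.975}, df = 6
ci = (diff - t_crit * se, diff + t_crit * se)
print(round(t_stat, 3), [round(x, 2) for x in ci])   # 4.382 [1.77, 6.23]
```

Reporting the interval (1.77, 6.23) alongside the p-value follows the workshop's emphasis on presenting estimates with their uncertainty rather than test results alone.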
This document provides an outline for a presentation on biostatistics and epidemiology. It covers key principles of using biostatistics in research, including distinguishing different variable types, understanding data distributions, hypothesis testing, statistical tests, measures of association, regression, diagnostic tests, and systematic reviews. Statistical concepts like p-values, confidence intervals, and odds ratios are defined. Examples are provided for statistical tests like t-tests, chi-square tests, survival analysis, and diagnostic test metrics.
Subgroup identification for precision medicine. a comparative review of 13 me...SuciAidaDahhar
This document reviews and compares 13 methods for identifying patient subgroups in precision medicine based on their differential treatment effects. It summarizes the statistical properties and performance of each method based on simulations using seven criteria, including bias in variable selection, probability of false discovery, and subgroup stability. The results show that many methods fare poorly on at least one criterion. Interaction trees and SIDEScreen tend to have selection bias, while GUIDE and MOB are better able to identify predictive subgroups and variables without bias.
The document discusses quantitative synthesis and meta-analysis methods. It defines key terms like effect measures, heterogeneity, and fixed and random effects models. It also covers combining data across studies, including calculating weighted averages and addressing issues like Simpson's paradox and heterogeneity that can impact meta-analyses. Worked examples are provided for binary outcomes, risk ratios, and calculating treatment effects from studies.
Hypothesis testings on individualized treatment rulesYoung-Geun Choi
Invited talk in Joint Statistical Meetings 2017, Baltimore, Maryland.
Individualized treatment rules (ITR) assign treatments according to different patient's characteristics. Despite recent advances on the estimation of ITRs, much less attention has been given to uncertainty assessments for the estimated rules. We propose a hypothesis testing procedure for the estimated ITRs from a general framework that directly optimizes overall treatment benefit. Specifically, we construct a local test for testing low dimensional components of high-dimensional linear decision rules. Our test extends the decorrelated score test proposed in Nang and Liu (2017) and is valid no matter whether model selection consistency for the true parameters holds or not. The proposed methodology is illustrated with numerical study and data examples.
This document discusses estimating and optimizing composite outcomes from observational data when clinical decisions involve balancing multiple outcomes. It presents a method to estimate an underlying composite outcome function that clinicians aim to optimize, and determine which patient factors predict receiving optimal treatment. The method is demonstrated on data from a bipolar disorder study, finding history of substance abuse predicts less benefit from antidepressants. Estimated treatment policies improve upon standard care in optimizing the composite of depression and mania symptoms.
Individualized treatment rules (ITR) assign treatments according to different patients' characteristics. Despite recent advances on the estimation of ITRs, much less attention has been given to uncertainty assessments for the estimated rules. We propose a hypothesis testing procedure for the estimated ITRs from a general framework that directly optimizes overall treatment bene t equipped with sparse penalties. Specifically, we construct a local test for testing low dimensional components of high-dimensional linear decision rules. The procedure can apply to observational studies by taking into account the additional variability from the estimation of propensity score. Theoretically, our test extends the decorrelated score test proposed in Nang and Liu (2017, Ann. Stat.) and is valid no matter whether model selection consistency for the true parameters holds or not. The proposed methodology is illustrated with numerical studies and a real data example on electronic health records of patients with Type-II Diabetes.
A Causal Framework for Meta-Analysis, drafty Wei Wang
This document proposes a potential outcomes framework for causal meta-analysis. It discusses issues like heterogeneity between studies and presents assumptions needed for identification. Random effects models are commonly used but may not have a clear causal interpretation. The document demonstrates through an example how including individual-level data can explain away heterogeneity between studies. Overall, it argues for establishing a solid causal foundation in meta-analysis to better understand questions being asked.
The two statistical cornerstones of replicability: addressing selective infer...jemille6
Tukey’s last published work in 2020 was an obscure entry on multiple comparisons in the
Encyclopedia of Behavioral Sciences, addressing the two topics in the title. Replicability
was not mentioned at all, nor was any other connection made between the two topics. I shall demonstrate how these two topics critically affect replicability using recently completed studies. I shall review how these have been addressed in the past. I shall
review in more detail the available ways to address selective inference. My conclusion is that conducting many small replicability studies without strict standardization is the way to assure replicability of results in science, and we should introduce policies to make this happen.
This document discusses descriptive statistics and hypothesis testing. It defines key concepts in descriptive statistics like measures of central tendency, variability, and shape. It also defines key concepts in hypothesis testing like significance level, test statistics, p-values, and types of errors. It provides examples of using t-tests, z-tests, F-tests, and chi-square tests for hypothesis testing. It includes examples of finding descriptive statistics, testing for differences before and after treatment, comparing group averages, and testing association between variables.
Individualized treatment rules (ITR) assign treatments according to different patients' characteristics. Despite recent advances on the estimation of ITRs, much less attention has been given to uncertainty assessments for the estimated rules. We propose a hypothesis testing procedure for the estimated ITRs from a general framework that directly optimizes overall treatment bene t equipped with sparse penalties. Specifically, we construct a local test for testing low dimensional components of high-dimensional linear decision rules. The procedure can apply to observational studies by taking into account the additional variability from the estimation of propensity score. Theoretically, our test extends the decorrelated score test proposed in Nang and Liu (2017, Ann. Stat.) and is valid no matter whether model selection consistency for the true parameters holds or not. The proposed methodology is illustrated with numerical studies and a real data example on electronic health records of patients with Type-II Diabetes.
Medical research relies heavily on statistical inference for generalization of findings, for assessing the uncertainty in applying these findings on new patients. SPSS and similar packages has made complex statistical calculations possible with no or very little understanding of statistical inference. As a consequence, research findings are misunderstood, the presentation of them confusing, and their reliability massively overestimated.
International Journal of Mathematics and Statistics Invention (IJMSI) inventionjournals
International Journal of Mathematics and Statistics Invention (IJMSI) is an international journal intended for professionals and researchers in all fields of computer science and electronics. IJMSI publishes research articles and reviews within the whole field Mathematics and Statistics, new teaching methods, assessment, validation and the impact of new technologies and it will continue to provide information on the latest trends and developments in this ever-expanding subject. The publications of papers are selected through double peer reviewed to ensure originality, relevance, and readability. The articles published in our journal can be accessed online.
Modelling differential clustering and treatment effect heterogeneity in paral...Karla hemming
Cluster randomized trials are frequently used in health service evaluation. It is common practice to use an analysis model with a random effect to combine between cluster information about treatment effects. It is increasingly being acknowledged that intervention effects might vary across clusters, or the variation between clusters might differ across the randomized arms. It has been proposed in both parallel cluster trials, stepped-wedge and other crossover designs that this heterogeneity can be allowed for by incorporating additional random effect(s) into the model. Here we show that the choice of model parameterization needs careful consideration as some parameterizations for additional heterogeneity induce unnecessary assumptions. We suggest more appropriate parameterizations, discuss their relative advantages and demonstrate the implications of these model choices using practical examples of a parallel cluster trial and a simulated stepped-wedge trial.
Stability criterion of periodic oscillations in a (4)Alexander Decker
1) The authors establish that the distribution of the harmonic mean of group variances is a generalized beta distribution through simulation.
2) They show that the generalized beta distribution can be approximated by a chi-square distribution.
3) This means that the harmonic mean of group variances is approximately chi-square distributed, though the degrees of freedom need not be an integer. Using the harmonic mean in place of the pooled variance allows hypothesis testing when group variances are unequal.
P-Value: a true test of significance in agricultural researchJiban Shrestha
This document discusses the use of p-values and significance levels in statistical analysis. It explains that p-values represent the probability of obtaining results at least as extreme as the observed results of a study, given that the null hypothesis is true. A lower p-value indicates stronger evidence against the null hypothesis. By convention, p-values of 0.05 or lower are considered statistically significant. The document cautions that statistical significance does not necessarily imply practical or clinical significance. It also discusses the concept of least significant difference tests and notes some limitations of relying solely on p-values to guide decisions.
The ASA president Task Force Statement on Statistical Significance and Replic...jemille6
Yoav Benjamini's slides "The ASA president Task Force Statement on Statistical Significance and Replicability” for Special Session of the (remote) Phil Stat Forum: “Statistical Significance Test Anxiety” on 11 January 2022
Codeless Generative AI Pipelines
(GenAI with Milvus)
https://ml.dssconf.pl/user.html#!/lecture/DSSML24-041a/rate
Discover the potential of real-time streaming in the context of GenAI as we delve into the intricacies of Apache NiFi and its capabilities. Learn how this tool can significantly simplify the data engineering workflow for GenAI applications, allowing you to focus on the creative aspects rather than the technical complexities. I will guide you through practical examples and use cases, showing the impact of automation on prompt building. From data ingestion to transformation and delivery, witness how Apache NiFi streamlines the entire pipeline, ensuring a smooth and hassle-free experience.
Timothy Spann
https://www.youtube.com/@FLaNK-Stack
https://medium.com/@tspann
https://www.datainmotion.dev/
milvus, unstructured data, vector database, zilliz, cloud, vectors, python, deep learning, generative ai, genai, nifi, kafka, flink, streaming, iot, edge
Open Source Contributions to Postgres: The Basics POSETTE 2024ElizabethGarrettChri
Postgres is the most advanced open-source database in the world and it's supported by a community, not a single company. So how does this work? How does code actually get into Postgres? I recently had a patch submitted and committed and I want to share what I learned in that process. I’ll give you an overview of Postgres versions and how the underlying project codebase functions. I’ll also show you the process for submitting a patch and getting that tested and committed.
Generative Classifiers: Classifying with Bayesian decision theory, Bayes’ rule, Naïve Bayes classifier.
Discriminative Classifiers: Logistic Regression, Decision Trees: Training and Visualizing a Decision Tree, Making Predictions, Estimating Class Probabilities, The CART Training Algorithm, Attribute selection measures- Gini impurity; Entropy, Regularization Hyperparameters, Regression Trees, Linear Support vector machines.
Enhanced data collection methods can help uncover the true extent of child abuse and neglect. This includes Integrated Data Systems from various sources (e.g., schools, healthcare providers, social services) to identify patterns and potential cases of abuse and neglect.
We are pleased to share with you the latest VCOSA statistical report on the cotton and yarn industry for the month of May 2024.
Starting from January 2024, the full weekly and monthly reports will only be available for free to VCOSA members. To access the complete weekly report with figures, charts, and detailed analysis of the cotton fiber market in the past week, interested parties are kindly requested to contact VCOSA to subscribe to the newsletter.
A gentle exploration of Retrieval Augmented Generation
Basic Concepts in Principal Stratification
1. Basic Concepts in Principal Stratification
Kojin Oshiba & Wenshuo Wang
Harvard University
March 28, 2018
Kojin Oshiba & Wenshuo Wang (Harvard) STAT 286 March 28, 2018 1 / 43
2. Overview
1 Review of the papers
Principal Stratification in Causal Inference (Frangakis & Rubin, 2002)
Estimation of Causal Effects via Principal Stratification When Some Outcomes Are Truncated by "Death" (Zhang & Rubin, 2003)
A Refreshing Account of Principal Stratification (Mealli & Mattei, 2012)
2 Discussion
3. Review of the papers
4. Principal Stratification in Causal Inference
(Frangakis & Rubin, 2002)
5. Summary
Scholars have defined the net treatment effect using post-treatment variables, but this is not a causal effect.
Principal stratification lets us define the principal effect, a causal effect within each stratum.
One application of principal stratification is surrogate endpoints, which are useful when the outcome is too expensive to measure.
6. Definition of a Causal Effect
Units i = 1, 2, . . . , n ∈ A
Control (z = 0) or treatment (z = 1)
Yi (z): value of Y if unit i is assigned treatment z
Causal effect of assignment on the outcome Y is the comparison of:
{Yi (0) : i ∈ A} and {Yi (1) : i ∈ A}. (1)
7. Post-treatment Variables
Post-treatment variable S_i^obs: a variable observed after treatment assignment, in addition to the main outcome Y.
Assume S_i^obs is binary for simplicity.
8. Net Treatment Effect
Net Treatment Effect (NTE) is the comparison of:
Y_i^obs | S_i^obs = s, z_i = 0 and Y_i^obs | S_i^obs = s, z_i = 1, (2)
which, under complete randomization, reduces to
Y_i(0) | S_i(0) = s and Y_i(1) | S_i(1) = s. (3)
NTE is not a causal effect if treatment affects the post-treatment variable (post-treatment selection bias).
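The selection-bias point can be seen in a small simulation (a hypothetical data-generating process, not from the papers): treatment shifts who ends up with S^obs = 1, so conditioning on it compares different kinds of units even though the true effect on Y is zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Latent health u drives both S and Y; treatment has NO effect on Y here.
u = rng.normal(size=n)
z = rng.integers(0, 2, size=n)                 # completely randomized assignment

# Treatment raises the chance that S = 1, so the two arms' S = 1 groups
# contain different kinds of units.
s = (u + 0.8 * z + rng.normal(size=n) > 0).astype(int)
y = u + rng.normal(size=n)                     # Y_i(0) = Y_i(1): true effect is 0

# The randomized comparison recovers the true null effect.
ate = y[z == 1].mean() - y[z == 0].mean()

# The NTE conditions on the observed post-treatment S = 1 in both arms (eq. 2).
nte = y[(z == 1) & (s == 1)].mean() - y[(z == 0) & (s == 1)].mean()

print(f"randomized difference: {ate:+.3f}")    # close to 0
print(f"NTE at s = 1:          {nte:+.3f}")    # noticeably negative: selection bias
```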
9. Principal Stratification
Basic principal stratification P0: the partition s.t. all units i within any cell of P0 have the same vector (S_i(0), S_i(1)).
Principal stratification P: a partition whose cells are unions of cells of P0.
Example: P = {{i : S_i(0) = S_i(1)}, {i : S_i(0) ≠ S_i(1)}} (4)
10. Principal Effect
S_i^P: the stratum of P to which unit i belongs.
Principal Effect: a comparison of potential outcomes under control vs. treatment within a principal stratum θ of P:
{Y_i(0) : S_i^P = θ} and {Y_i(1) : S_i^P = θ}. (5)
The stratum S_i^P is unaffected by treatment for any principal stratification P.
Therefore, any principal effect is a causal effect.
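As an oracle-view sketch (hypothetical numbers; in practice (S_i(0), S_i(1)) is only partially observed), here is what principal effects look like when stratum membership is known: membership is unaffected by treatment, so each within-stratum contrast is a genuine causal effect.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 40_000

# Oracle draw of each unit's principal stratum (S_i(0), S_i(1)).
s0 = rng.integers(0, 2, size=n)
s1 = rng.integers(0, 2, size=n)

# Assumed unit-level effects: +1 where S is unaffected, +3 where it changes.
effect = np.where(s0 == s1, 1.0, 3.0)
y0 = rng.normal(size=n)
y1 = y0 + effect

# Principal effect within each stratum theta = (s0, s1).
# Strata with s0 == s1 print 1.00; the others print 3.00.
for a in (0, 1):
    for b in (0, 1):
        m = (s0 == a) & (s1 == b)
        print(f"stratum ({a},{b}): principal effect = {(y1[m] - y0[m]).mean():.2f}")
```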
11. Missing Data in Principal Strata
Usually, one of the two post-treatment values and one of the two potential outcomes are missing for each unit:
S^mis = {S_i(z) : all i; z ≠ Z_i},  Y^mis = {Y_i(z) : all i; z ≠ Z_i} (6)
Estimate using H^obs = (Y^obs, S^obs, z):
L(H^obs; θ^S, θ^Y) (7)
Additional assumptions/restrictions are needed for a unique MLE of (θ^S, θ^Y).
12. Surrogate Endpoints
The primary outcome Y may be too expensive or infeasible to obtain in a practical time span.
Surrogate variable: a post-treatment variable used as a "surrogate" for the treatment effect on Y. It should satisfy:
(Causal Necessity) A treatment effect on Y can occur only if there is a treatment effect on S.
(Statistical Generalizability) S^obs should predict Y^obs well in an application study.
13. Statistical Surrogate
S is a statistical surrogate if, for all fixed s,
Y_i^obs | S_i^obs = s, z_i = 0 ∼ Y_i^obs | S_i^obs = s, z_i = 1. (8)
Statistical surrogacy does not satisfy causal necessity.
14. Principal Surrogate
S is a principal surrogate if, for all fixed s,
Y_i(0) | S_i(0) = S_i(1) = s ∼ Y_i(1) | S_i(0) = S_i(1) = s (9)
or, under randomization,
Y_i^obs | S_i(0) = S_i(1) = s, Z_i = 0 ∼ Y_i^obs | S_i(0) = S_i(1) = s, Z_i = 1. (10)
Principal surrogacy satisfies causal necessity.
S being a statistical surrogate does not imply it is a principal surrogate, and vice versa.
15. Associative and Dissociative Effects
The dissociative effect is a comparison between
{Y_i(0) : S_i(0) = S_i(1)} and {Y_i(1) : S_i(0) = S_i(1)}. (11)
The associative effect is a comparison between
{Y_i(0) : S_i(0) ≠ S_i(1)} and {Y_i(1) : S_i(0) ≠ S_i(1)}. (12)
Comparing (11) and (12) measures the association between the surrogate endpoint and the treatment effect on the outcome. If the association is high, the surrogate is a good target.
16. Summary
Scholars have defined the net treatment effect using post-treatment variables, but this is not a causal effect.
Principal stratification lets us define the principal effect, a causal effect within each stratum.
One application of principal stratification is surrogate endpoints, which are useful when the outcome is too expensive to measure.
17. Estimation of Causal Effects via Principal Stratification
When Some Outcomes Are Truncated by ”Death”
(Zhang & Rubin, 2003)
18. Summary
Truncation by death is different from censoring by death and should be handled using principal stratification.
Using principal stratification, we can estimate a causal effect for the stratum without truncation by death.
We can derive upper/lower bounds for such a causal effect.
19. Truncation by Death
"Missing", "Censored" ≠ "Truncated"
For "censoring by death", the causal effect is defined on R.
For "truncation by death", the causal effect is defined on R ∪ {∗}, where ∗ means the outcome is undefined.
Previous approaches have treated "truncation" as "censoring":
Ignore truncated values.
Impute truncated outcomes in R.
Model a missing-data mechanism due to "censoring".
Principal stratification addresses this issue.
20. Example: Educational Program Assessment
Two educational programs: Treatment (T) and Control (C)
Graduation indicators: S_i(T), S_i(C) ∈ {G, D}
Principal stratification by the pair of graduation indicators (rows: S_i(T); columns: S_i(C)):

         C = G   C = D
T = G    GG      GD
T = D    DG      DD
21. Truncation and Causal Effect
The causal effect is not defined on GD and DG due to truncation.
Ȳ^obs(T) − Ȳ^obs(C) measures the effect for a mixture of strata, which is misleading if either GD or DG exists. We should condition on the pair of indicators (S_i(T), S_i(C)) instead.
What we want to know: Ȳ_GG(T) − Ȳ_GG(C).
22. Large-Sample Bounds
Unfortunately, we don't directly observe the principal strata. What we do observe are
OBS(T, G) = {i : Z_i = T, S_i^obs = G}
OBS(T, D) = {i : Z_i = T, S_i^obs = D}
OBS(C, G) = {i : Z_i = C, S_i^obs = G}
OBS(C, D) = {i : Z_i = C, S_i^obs = D}
Large-sample bounds for the average causal effect on Y in the GG principal stratum can be derived.
These can be sharpened with additional assumptions:
Assumption 1. (Monotonicity) No DG group.
Assumption 2. (Ranked average score) When assigned treatment, GG performs better than GD; when assigned control, GG performs better than DG.
23. Large-Sample Bounds - Calculation
What we want to know: Ȳ_GG(T) − Ȳ_GG(C).
OBS(T, G) is a mixture of GG and GD in proportions π_GG/(π_GG + π_GD) and π_GD/(π_GG + π_GD).
Ȳ_GG(T)'s upper (lower) bound can be found by averaging over the largest (smallest) π_GG/(π_GG + π_GD) fraction of OBS(T, G).
Ȳ_GG(C)'s bounds can be found analogously on OBS(C, G).
Together, these bound Ȳ_GG(T) − Ȳ_GG(C).
Additional assumptions tighten the bounds:
Monotonicity: π_DG = 0.
Ranked average score: Ȳ_GG(T) achieves its minimum when it equals Ȳ_GD(T); Ȳ_GG(C) achieves its minimum when it equals Ȳ_DG(C).
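A minimal sketch of the trimming calculation under monotonicity, on simulated data with assumed strata proportions (50% GG, 30% GD, 20% DD) and a true GG effect of +1. The mixing fraction π_GG/(π_GG + π_GD) is identified because, without DG, all control graduates are GG:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# Oracle strata (assumed): 50% GG, 30% GD, 20% DD; monotonicity => no DG.
stratum = rng.choice(["GG", "GD", "DD"], size=n, p=[0.5, 0.3, 0.2])
z = rng.integers(0, 2, size=n)                      # 1 = T, 0 = C

# Outcomes: GG has a true effect of +1 under T; GD (observed only under T)
# scores lower on average, which is what makes the naive comparison misleading.
base = rng.normal(size=n)
y = base + 1.0 * ((stratum == "GG") & (z == 1)) - 1.0 * ((stratum == "GD") & (z == 1))

# Who graduates (outcome observed): GG always, GD only under treatment.
graduated = (stratum == "GG") | ((stratum == "GD") & (z == 1))

# Identified quantities under monotonicity:
p_g_c = graduated[z == 0].mean()                    # = pi_GG
p_g_t = graduated[z == 1].mean()                    # = pi_GG + pi_GD
frac_gg = p_g_c / p_g_t                             # GG share of OBS(T, G)

# Trim OBS(T, G): mean of the largest / smallest frac_gg fraction.
y_tg = np.sort(y[(z == 1) & graduated])
k = int(round(frac_gg * len(y_tg)))
ub_t, lb_t = y_tg[-k:].mean(), y_tg[:k].mean()

ybar_c_gg = y[(z == 0) & graduated].mean()          # Ybar_GG(C), point-identified

print(f"bounds on Ybar_GG(T) - Ybar_GG(C): "
      f"[{lb_t - ybar_c_gg:.2f}, {ub_t - ybar_c_gg:.2f}]")
```

The printed interval covers the true GG effect of 1 by construction; the ranked-average-score assumption would tighten it further.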
24. Large-Sample Bounds - Summary
25. Summary
Truncation by death is different from censoring by death and should be handled using principal stratification.
Using principal stratification, we can estimate a causal effect for the stratum without truncation by death.
We can derive upper/lower bounds for such a causal effect.
26. A Refreshing Account of Principal Stratification
(Mealli & Mattei, 2012)
27. Summary
The paper formalizes the framework of principal stratification analysis.
Causal mediation analysis is used when the post-treatment variable can be intervened on.
In causal mediation analysis, we can potentially mix information
across principal strata to infer values of missing data.
28. Advantages of Principal Strata
Parsimony: classify units by principal strata instead of baseline features (Pearl 2011).
Principal strata are the coarsest choice of subpopulations that maintains ignorability of the treatment Z_i:
Y_i(0), Y_i(1) ⊥⊥ Z_i | S_i(0), S_i(1), X_i (13)
Considering Y_i(z) rather than Y_i(z, s) simplifies the estimation:
Assume that S is not manipulable.
Disregard "a priori counterfactuals" Y_i(z, S_i(1 − z)).
29. Reformalization of Effects on Principal Strata
From now on, S takes values in its support S (not necessarily binary).
Principal Causal Effect (PCE):
PCE(s_0, s_1) = E[Y_i(1) − Y_i(0) | S_i(0) = s_0, S_i(1) = s_1]. (14)
Principal Strata Direct Effect (PSDE) of Z on Y at level s ∈ S:
PSDE(s) = E[Y_i(1) − Y_i(0) | S_i(0) = S_i(1) = s]. (15)
PSDE is a.k.a. the dissociative effect.
30. Reformalization of Effects on Principal Strata (cont’d)
Average Natural Direct Effect:
NDE(z) = E[Yi (1, Si (z)) − Yi (0, Si (z))] (16)
Average Natural Indirect Effect:
NIE(z) = E[Yi (z, Si (1)) − Yi (z, Si (0))] (17)
Average Natural Direct Effect within a subpopulation P:
NDE^P(z) = Σ_{s_0 = s_1 = s} PSDE(s) π^P_{s,s} + Σ_{s_0 ≠ s_1} E[Y_i(1, S_i(z)) − Y_i(0, S_i(z)) | S_i(0) = s_0, S_i(1) = s_1] π^P_{s_0,s_1}, (18)
where π^P_{s_0,s_1} is the proportion of subjects with S_i(0) = s_0 and S_i(1) = s_1 in P.
PSDE^P(s) = 0 does not imply NDE^P(z) = 0.
31. Reformalization of Effects on Principal Strata (cont’d)
Average Total Causal Effect:
ACE = NDE(z) + NIE(1 − z) (19)
ACE = E[Y_i(1) − Y_i(0)] = Σ_{(s_0,s_1)} PCE(s_0, s_1) π_{s_0,s_1}
    = Σ_{s_0 = s_1 = s} PSDE(s) π_{s,s} + Σ_{s_0 ≠ s_1} PCE(s_0, s_1) π_{s_0,s_1}. (20)
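A numeric check of decomposition (20) with hypothetical proportions and principal causal effects; the diagonal (s_0 = s_1) terms are the PSDE contributions and the off-diagonal terms are the remaining PCEs:

```python
# (s0, s1) -> (assumed proportion pi, assumed PCE); pi values sum to 1.
strata = {
    (0, 0): (0.40, 0.0),
    (1, 1): (0.25, 0.5),
    (0, 1): (0.30, 2.0),
    (1, 0): (0.05, -1.0),
}

# ACE is the pi-weighted average of the principal causal effects.
ace = sum(pi * pce for pi, pce in strata.values())

# Split into the diagonal PSDE terms and the off-diagonal PCE terms, as in (20).
psde_part = sum(pi * pce for (s0, s1), (pi, pce) in strata.items() if s0 == s1)
pce_part = sum(pi * pce for (s0, s1), (pi, pce) in strata.items() if s0 != s1)

print(f"ACE = {ace:.3f}")                                    # 0.675
print(f"PSDE part + PCE part = {psde_part + pce_part:.3f}")  # 0.675
```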
32. CACE vs ACE
If S = {0, 1}, the Compliers Average Causal Effect (CACE) is
CACE = E[Y_i(1) − Y_i(0) | S_i(0) = 0, S_i(1) = 1]. (21)
To identify the ACE, we also need to extrapolate the CACE to non-compliers using additional assumptions.
33. Causal Mediation Analysis
Instead of Y_i(z), investigate the potential outcomes Y_i(z, s).
S is regarded as an additional treatment that can be intervened on.
A priori counterfactuals: Y_i(0, S_i(1)), Y_i(1, S_i(0))
The goal is to estimate the effect of intervening on S (the indirect effect) from data with no intervention on S.
34. Sequential Ignorability Assumptions
Sequential ignorability assumptions (Imai 2010):
{Y_i(z′, s), S_i(z)} ⊥⊥ Z_i | X_i = x (22)
Y_i(z′, s) ⊥⊥ S_i(z) | Z_i = z, X_i = x (23)
Under S.I.A., we can extrapolate the information on Y_i(z, S_i(z)) to Y_i(z, S_i(1 − z)).
Extrapolation across principal strata (even for units with little data) is possible.
But the treatment received, S, may be confounded, so S.I.A. may not be credible!
Instead, start from a preliminary principal stratification analysis and mix information across strata where reasonable.
35. Mixing Information Across Principal Strata: Si(0) = Si(1)
Simple case: when S_i(0) = S_i(1) for the subpopulation of units P, Y_i(z, S_i(1 − z)) can be observed. For this subpopulation,
NDE^P(z) = Σ_{s_0 = s_1 = s} PSDE(s) π^P_{s,s} + Σ_{s_0 ≠ s_1} E[Y_i(1, S_i(z)) − Y_i(0, S_i(z)) | S_i(0) = s_0, S_i(1) = s_1] π^P_{s_0,s_1}
         = Σ_{s_0 = s_1 = s} PSDE(s) π^P_{s,s}, (24)
which is a weighted average of the principal strata direct effects PSDE(s).
36. Mixing Information Across Principal Strata: Si(0) ≠ Si(1)
Harder case: when S_i(0) ≠ S_i(1) for the subpopulation of units P, Y_i(z, S_i(1 − z)) cannot be observed.
Let's think about NDE(0) first.
If we find that different principal strata have similar covariate distributions and/or similar outcome levels under one of the treatment levels, the problem can be simplified. For example, if we let uv = {i : S_i(0) = u, S_i(1) = v} for u, v ∈ {0, 1} and find evidence that
E[Y_i(0) | i ∈ 01] = E[Y_i(0) | i ∈ 00], (25)
then we might assume
E[Y_i(1, S_i(0)) | i ∈ 01] = E[Y_i(1, S_i(0)) | i ∈ 00], (26)
where the right-hand side can be estimated. Similarly,
E[Y_i(1, S_i(0)) | i ∈ 10] = E[Y_i(1, S_i(0)) | i ∈ 11]. (27)
37. Mixing Information Across Principal Strata
Under the assumptions on the previous slide, NDE_01(0) = NDE_00(0) and NDE_10(0) = NDE_11(0). Thus,
NDE(0) = NDE_00(0)π_00 + NDE_11(0)π_11 + NDE_10(0)π_10 + NDE_01(0)π_01
       = NDE_00(0)(π_00 + π_01) + NDE_11(0)(π_11 + π_10)
       = PSDE(0)(π_00 + π_01) + PSDE(1)(π_11 + π_10). (28)
NIE(1) = ACE − NDE(0) (29)
NIE(0) can be estimated analogously from NDE(1).
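A small worked example of (28) and (29) with assumed strata proportions, dissociative effects, and total effect:

```python
# Assumed strata proportions pi_uv, dissociative effects PSDE(s), and ACE.
pi = {"00": 0.35, "01": 0.25, "10": 0.10, "11": 0.30}
psde = {0: 0.2, 1: 1.0}
ace = 0.9

# (28): under the mixing assumptions, NDE(0) is a weighted average of PSDEs.
nde0 = psde[0] * (pi["00"] + pi["01"]) + psde[1] * (pi["11"] + pi["10"])

# (29): the natural indirect effect follows by subtraction from the ACE.
nie1 = ace - nde0

print(f"NDE(0) = {nde0:.2f}")   # 0.2*0.60 + 1.0*0.40 = 0.52
print(f"NIE(1) = {nie1:.2f}")   # 0.90 - 0.52 = 0.38
```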
38. Surrogate Endpoints Revisited
We did not quite understand...
What is full principal stratification?
What is an example of a ’surrogate paradox’?
39. Summary
The paper formalizes the framework of principal stratification analysis.
Causal mediation analysis is used when the post-treatment variable can be intervened on.
In causal mediation analysis, we can potentially mix information
across principal strata to infer values of missing data.
41. Discussion
Comparison of distributions versus comparison of means:
Rubin carefully defines "effect" using a generic term, a "comparison" between two distributions.
Page 8 of Mealli & Mattei: if PSDE(s) = 0 for each s ∈ S, then there is no evidence of a direct effect of the treatment after controlling for the mediator.
Isn't this an overstatement, since PSDE is defined only as an expected difference, not a comparison of the full distributions?
If the post-treatment S is continuous, then what?
When we stratify, do we split S into bins?
If so, what is the optimal splitting? How many bins?
42. Discussion
How do we assess whether interventions on S are conceivable?
If we don't have a "large sample", how can we get a limited-sample bound?
Will we estimate the π's using a validation experiment and then estimate bounds in an application experiment?