Modern methods for causal modeling in health science provide both opportunities and cautions. New tools like directed acyclic graphs and algorithmic treatment modeling can help identify bias sources and adjust for confounders, but require skill to apply well. Causal inference involves predicting outcomes under interventions, so both classical and machine learning methods apply, but current algorithms cannot match human cognition. Subjective elements and values inevitably influence statistical analyses and model choices.
2. New(‐ish) tools to aid causal inference
• To aid identification of bias sources and sets of adjustment covariates: DAGs.
• For adjustment of measured confounders: algorithmic treatment modeling (PS, IPTW, or OTW) combined with outcome modeling to achieve “double robustness”.
• To account for uncertainty about unmeasured confounders and other uncontrolled bias sources: bias analysis.
5 June 2013 Greenland, Modern methods 2
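The backdoor logic that a DAG encodes can be made concrete in a few lines. The sketch below is a toy check on the classic confounding graph Z → X, Z → Y, X → Y, not a general d-separation implementation; the graph and the simplifying assumption that paths contain no colliders are inventions for illustration.

```python
# Toy backdoor check on the DAG  Z -> X, Z -> Y, X -> Y.
# Valid only for collider-free paths (true in this small graph);
# a minimal sketch, not a general d-separation algorithm.

edges = [("Z", "X"), ("Z", "Y"), ("X", "Y")]   # directed parent -> child


def simple_paths(a, b):
    # all simple paths a..b in the undirected skeleton of the DAG
    out = []

    def walk(node, path):
        if node == b:
            out.append(path)
            return
        nbrs = ({v for u, v in edges if u == node}
                | {u for u, v in edges if v == node})
        for nxt in sorted(nbrs):               # sorted for determinism
            if nxt not in path:
                walk(nxt, path + [nxt])

    walk(a, [a])
    return out


def backdoor_paths(x, y):
    # paths whose first edge points INTO x (starts at a parent of x)
    parents_of_x = {u for u, v in edges if v == x}
    return [p for p in simple_paths(x, y) if p[1] in parents_of_x]


def blocked(path, adjust):
    # collider-free case: a path is blocked iff it passes through
    # an adjustment variable
    return any(node in adjust for node in path[1:-1])


bd = backdoor_paths("X", "Y")
print(bd)                                   # [['X', 'Z', 'Y']]
print(all(blocked(p, {"Z"}) for p in bd))   # True: {Z} closes the backdoor
print(all(blocked(p, set()) for p in bd))   # False: unadjusted, confounded
```

Adjusting for {Z} blocks the single backdoor path, which is exactly the adjustment-set identification the first bullet refers to.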
4. Background readings:
• Greenland S (2010). Overthrowing the tyranny of null hypotheses hidden in causal diagrams. Ch. 22 in: Dechter R, Geffner H, Halpern JY (eds.). Heuristics, Probabilities, and Causality: A Tribute to Judea Pearl. London: College Publications, 365–382.
• Greenland S (2012). Causal inference as a prediction problem: assumptions, identification, and evidence synthesis. Ch. 5 in: Berzuini C, Dawid AP, Bernardinelli L (eds.). Causal Inference: Statistical Perspectives and Applications. Chichester: Wiley, 43–58.
5. “Mathematics is one necessary tool [but] any statistician who actually practices his art must possess many additional resources…the mathematical tail has been allowed to wag the statistical dog for far too long…I think that the built‐in mathematical bias of many statistics departments and of much that we are presently teaching is not innocuous; it is in fact antiscientific.”
– George Box, Statistical Science 1990
6. Cautions (conclusions, 2011)
• Current formal “causal inference” approaches are mostly about modeling effects in single studies, and projection to conditionally exchangeable populations.
• As technically sophisticated as current causal‐inference methods may seem, they are far too simple to encompass the diversity of evidence that has to be synthesized in most real health and medical decision problems.
9. • Intuition is notoriously faulty, full of biases (some innate, some value‐driven), and is horrific at probability logic.
• Cognitive psychology and behavioral economics provide books full of dramatic examples, which can be used to recognize biases (e.g., double counting, confirmation bias, overconfidence, and wish bias):
“My colleagues, they study artificial intelligence; me, I study natural stupidity.”
– Amos Tversky
11. • Even in situations with clear risks, those with statistical sophistication have not outperformed those without (Susser, AJE 1977 gives classic health‐science examples).
• Similar lessons are seen in econometrics, where pseudo‐Nobel laureates with impressive mathematical skills have lost fortunes for investors (e.g., the 1998 LTCM fund disaster in which Merton and Scholes lost nearly $5 billion; the 2009 Trinsum bankruptcy; etc.): do(X=x), X = buy, sell
12. Subjective elements and values play a decisive role in all statistical analyses
• There is an illusory sense of objectivity induced when there is great overconfidence, as when individuals feel infallible or there is strong social agreement.
• Feelings of objectivity in turn feed back to create overconfidence. This is well illustrated historically by scientists, statisticians, and often entire fields being certain of hypotheses later refuted.
13. Classic statistician examples:
• Fisher against smoking causing lung cancer
• Jeffreys against continental drift
Classic clinician‐researcher examples:
• Feinstein & Horwitz against estrogen therapy causing much endometrial cancer
• Indiscriminate promotion of trans‐fat margarine and low‐fat diets in the 1970s and 1980s for weight loss and CHD prevention, along with dismissal of the sugar relation.
14. Some facts of statistical life:
• Data alone do not convey information; they are interpreted via models for their generation.
• Models are sets of assumptions about the data‐generation process (DGP).
• Models are analogous to language grammars: no model, no meaning.
• Unfortunately, unlike bad grammar, bad statistical modeling may not produce gibberish, even if the models and outputs are very wrong.
15. The classical tensions
• Bias vs. precision: Assumptions introduce bias to the extent they are incorrect, but increase precision to the extent that they exclude or down‐weight most alternatives.
• Procedures are “optimal” only under meta‐assumptions, some untestable.
• Models for causal inference always include untestable terminal randomization (no residual confounding, “ignorability”).
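The bias-vs.-precision tension in the first bullet can be made numeric. A minimal sketch with invented numbers: shrinking an unbiased sample mean toward a prior guess adds bias but cuts variance, and for moderate shrinkage the mean squared error improves.

```python
# Toy bias-variance tradeoff (all numbers invented): estimate a mean
# with the shrinkage estimator  shrink*guess + (1-shrink)*xbar,
# where xbar has variance sigma2/n.  MSE = bias^2 + variance.

sigma2, n = 4.0, 10
true_mean, guess = 1.0, 0.0


def mse(shrink):
    bias = shrink * (guess - true_mean)          # bias grows with shrinkage
    var = (1 - shrink) ** 2 * sigma2 / n         # variance falls with it
    return bias ** 2 + var


print(mse(0.0))   # unbiased sample mean: MSE ~ 0.400
print(mse(0.2))   # modest shrinkage:     MSE ~ 0.296 (better)
print(mse(1.0))   # pure assumption:      MSE = 1.0   (worse here)
```

The incorrect assumption (guess = 0 when the truth is 1) helps in moderation and hurts when imposed fully, which is exactly the tradeoff the bullet describes.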
17. Neyman’s (1923) potential‐outcome (“counterfactual”) causal meta‐model:
• Say X and Y are the treatment and outcome variables of interest. Then Y is replaced by a list (vector) of the outcomes that would follow under different treatments. So if X = 1 or 0, Y is replaced by the potential‐outcome vector (Y1, Y0), where Y1 = outcome if X is 1 and Y0 = outcome if X is 0.
• Yx can be replaced by a parameter θx, e.g., the outcome probability (risk).
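The bookkeeping in Neyman's notation can be made concrete with a toy finite population (all values invented for illustration): each unit carries both potential outcomes, but the data reveal only the one matching the treatment actually received.

```python
# A toy finite population in Neyman's potential-outcome notation.
# Each unit carries both (Y1, Y0); in any real study only the outcome
# matching the received treatment X is observed.  Numbers are made up.

population = [
    # (Y1, Y0, X)  -- X is the treatment actually received
    (1, 0, 1),
    (1, 1, 0),
    (0, 0, 1),
    (1, 0, 0),
]

n = len(population)
ace = sum(y1 - y0 for y1, y0, _ in population) / n   # average causal effect

# The observed outcome is Y1 if X=1, else Y0; the other is "missing".
observed = [(y1 if x == 1 else y0, x) for y1, y0, x in population]

print(ace)        # knowable only with BOTH potential outcomes per unit
print(observed)   # what the data actually contain: one outcome per unit
```

The true ACE here is 0.5, but it is computed from quantities the data never jointly reveal, which is the missing-data point made on the next slide.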
18. Causal inference under the potential‐outcome model becomes a prediction problem:
• Causal‐inference (CI) problems are isomorphic to missing‐data problems: at most one potential outcome is observed; the rest are missing (Rubin, Ann Stat 1978).
• Thus the vast predictive (imputational) machinery of statistics can be used for inference about causal parameters.
20. From consistency, we get a precise definition of sufficiency for confounding control:
A set of covariates Z is sufficient for control of confounding if the outcomes we observe when X = x follow the distribution of Yx given Z:
p(yobs|x,z) ≡ p(yx|x,z) = p(yx|z),
which is independence of X and Yx given Z:
for all x and z, X ╨ Yx | Z
(“no residual confounding”, “no unmeasured confounding”, “weak ignorability”); Z is also minimal sufficient if no subset of Z is sufficient.
21. Further insights from potential outcomes:
• By the 1960s, methodologists were developing methods for summarizing confounder sets using discriminant or regression scores. The performance of the various proposals was not clear, however.
• Rosenbaum & Rubin (1983) showed that, given a sufficient set Z, the conditional treatment distribution p(x|z) is itself sufficient to control confounding of marginal (total‐population) X effects by covariates in Z.
22. • For binary X, p(1|z) is usually called the “propensity score” (PS); control of this score will remove confounding when Z is sufficient.
• For other X, Robins, Mark & Newey (1992) showed that, when Z is sufficient, control of the regression score E(X|z) is sufficient for control of confounding of additive effects of X on Y. (Note: PS = E(X|z) when X is binary.)
Nonetheless, the missing‐data viewpoint leads to other, more general ways to adjust for confounding using treatment probabilities.
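The Rosenbaum–Rubin result for binary X can be checked numerically. A sketch with invented probabilities: two covariate strata that happen to share the same propensity score can be pooled, and standardizing the observed stratum-specific risks over PS strata reproduces the risk standardized over the full covariate Z.

```python
# Toy check (all probabilities invented) that for binary X the propensity
# score p(X=1|z) suffices: standardizing observed risks over PS strata
# reproduces the risk standardized over the full covariate Z.

p_z = {0: 0.5, 1: 0.3, 2: 0.2}          # p(Z=z)
ps = {0: 0.3, 1: 0.7, 2: 0.3}           # p(X=1|z); z=0 and z=2 share a PS
risk = {                                 # p(Y=1 | X=x, Z=z)
    (1, 0): 0.2, (0, 0): 0.1,
    (1, 1): 0.6, (0, 1): 0.5,
    (1, 2): 0.4, (0, 2): 0.2,
}


def z_standardized(x):
    # risk under X=x standardized to the full Z distribution
    return sum(risk[x, z] * p_z[z] for z in p_z)


def ps_standardized(x):
    # uses only observational quantities within each PS stratum
    out = 0.0
    for s in set(ps.values()):
        members = [z for z in p_z if ps[z] == s]
        p_stratum = sum(p_z[z] for z in members)
        p_x_given_z = {z: ps[z] if x == 1 else 1 - ps[z] for z in members}
        p_x_stratum = sum(p_x_given_z[z] * p_z[z] for z in members)
        # observed risk among the X=x units of this PS stratum:
        r_obs = sum(risk[x, z] * p_x_given_z[z] * p_z[z] / p_x_stratum
                    for z in members)
        out += r_obs * p_stratum
    return out


for x in (0, 1):
    assert abs(z_standardized(x) - ps_standardized(x)) < 1e-12
```

The check works because p(X=x|z) is constant within a PS stratum, so p(z|x, stratum) = p(z|stratum) and the observed stratum-specific risk is already confounding-free; with a heterogeneous stratum and no PS equality the two quantities would differ.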
23. • Inverse probability of treatment weighting (IPTW) was adapted from survey weighting ideas (Robins, Hernán, Brumback 2000).
• It can also be derived from classical direct standardization (Sato and Matsuyama 2003):
p(y|x) = ∑z p(y|x,z)p(z) = ∑z p(y,x,z)p(z)/p(x,z) = ∑z p(y,x,z)/p(x|z) = ∑z wz p(y,x,z), where wz = 1/p(x|z).
• Thus, if Z is sufficient, then IPTW removes marginal confounding by averaging using the same weights for all x (standardization).
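The derivation above can be checked numerically. With hypothetical cell counts (purely illustrative, not from the slides), weighting by wz = 1/p(x|z) reproduces the directly standardized risk exactly:

```python
from collections import Counter

# Hypothetical (z, x, y, count) cells for 100 subjects.
cells = [
    (0, 1, 1, 4), (0, 1, 0, 6), (0, 0, 1, 6), (0, 0, 0, 24),
    (1, 1, 1, 18), (1, 1, 0, 12), (1, 0, 1, 12), (1, 0, 0, 18),
]
n = sum(c for *_, c in cells)  # 100 subjects in total

n_zx, n_z = Counter(), Counter()
for z, x, y, c in cells:
    n_zx[(z, x)] += c
    n_z[z] += c
p_x_given_z = {(z, x): n_zx[(z, x)] / n_z[z] for (z, x) in n_zx}

def iptw_risk(x_t):
    """IPTW (Horvitz-Thompson form): (1/n) * sum over X=x_t of Y / p(x_t|z)."""
    return sum(y * c / p_x_given_z[(z, x)]
               for z, x, y, c in cells if x == x_t) / n

def standardized_risk(x_t):
    """Classical direct standardization: sum_z p(y=1|x_t,z) p(z)."""
    total = 0.0
    for z in n_z:
        cases = sum(c for zz, x, y, c in cells if zz == z and x == x_t and y == 1)
        total += (cases / n_zx[(z, x_t)]) * (n_z[z] / n)
    return total

for x_t in (0, 1):
    assert abs(iptw_risk(x_t) - standardized_risk(x_t)) < 1e-12
print(iptw_risk(1), iptw_risk(0))  # 0.52 0.32
```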
24. Despite PO/PS/IPTW theory providing landmark insights, it is far from complete for most health/med analyses:
• It does not say how to model treatment, but mismodeling can render the estimated PS insufficient and bias the effect estimate;
• It does not address sampling variation or how to balance bias vs. variance, e.g., in an RCT, the randomization indicator predicts treatment perfectly so controlling it yields infinite variance yet adjusts for no bias;
25. • It focuses on marginal (population-averaged) effects (ACE, LATE, CACE). It does not guide accurate estimation of effect heterogeneity (modification) or conditional effects (e.g., effects in men vs. women), which are essential for clinical practice;
• It defines but does not operationalize how to find a sufficient or minimal sufficient Z.
These deficiencies are largely traceable to omitting the outcome from modeling (which Rubin AAS 2008 strongly advises).
26. A simple solution: Treatment modeling followed by outcome modeling
Classical modeling for causal inference regresses the outcome Y on X and Z, for if Z is sufficient, E(Yobs|x,z) ≡ E(Yx|x,z) = E(Yx|z).
• The model for potential means E(Yx|z) is called a structural model or structural equation.
• This approach estimates conditional effects as well as marginal effects (by averaging over Z). As with PS, however, it will be biased by mismodeling.
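As a sketch of the outcome-modeling route: a saturated outcome model (empirical cell means, used here purely for illustration with hypothetical counts) yields stratum-specific (conditional) effects, and averaging them over the Z distribution yields the marginal effect:

```python
from collections import Counter

# Hypothetical (z, x, y, count) cells; a saturated outcome model just
# takes the empirical mean E(Y|x,z) in each (x, z) cell.
cells = [
    (0, 1, 1, 4), (0, 1, 0, 6), (0, 0, 1, 6), (0, 0, 0, 24),
    (1, 1, 1, 18), (1, 1, 0, 12), (1, 0, 1, 12), (1, 0, 0, 18),
]
n = sum(c for *_, c in cells)

n_zx, n_z, cases = Counter(), Counter(), Counter()
for z, x, y, c in cells:
    n_zx[(z, x)] += c
    n_z[z] += c
    cases[(z, x)] += y * c

mean_y = {zx: cases[zx] / n_zx[zx] for zx in n_zx}  # fitted E(Y|x,z)

# Conditional (stratum-specific) risk differences ...
cond_effect = {z: mean_y[(z, 1)] - mean_y[(z, 0)] for z in n_z}

# ... and the marginal effect by averaging over the Z distribution.
marg_effect = sum(cond_effect[z] * n_z[z] / n for z in n_z)
print(cond_effect, marg_effect)
```

Here both strata happen to show a risk difference of 0.2, so the marginal effect is also 0.2; with heterogeneity the conditional and marginal numbers would differ.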
27. By combining treatment modeling with outcome modeling, we can create estimates that are at least approximately doubly robust (DR): If Z is sufficient, the estimated effect of X on Y will be unconfounded if either of the models is correct.
The simplest DR approaches either
• regress Y on X, Z, and PS as a covariate,
• regress Y on X, Z in a PS-matched sample, or
• regress Y on X, Z using IPT or OT weights.
Each of these approaches has pros and cons.
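One way to see the DR property concretely is the augmented-IPTW combination of the two models, a close cousin of the approaches listed above (hypothetical counts; the misspecified treatment model is deliberate):

```python
# Hypothetical (z, x, y, count) cells; the standardized risk under X=1
# in this toy population is 0.52.
cells = [
    (0, 1, 1, 4), (0, 1, 0, 6), (0, 0, 1, 6), (0, 0, 0, 24),
    (1, 1, 1, 18), (1, 1, 0, 12), (1, 0, 1, 12), (1, 0, 0, 18),
]
n = sum(c for *_, c in cells)

# Correct (saturated) outcome model: E(Y|X=1, z) from the treated cells.
m1 = {}
for zv in (0, 1):
    tr = [(y, c) for z, x, y, c in cells if z == zv and x == 1]
    m1[zv] = sum(y * c for y, c in tr) / sum(c for _, c in tr)

def aipw_risk(e):
    """Augmented-IPTW estimate of the standardized risk under X=1,
    combining outcome model m1 with treatment model e(z)."""
    return sum(c * (m1[z] + x * (y - m1[z]) / e(z))
               for z, x, y, c in cells) / n

# Deliberately misspecified treatment model: constant PS of 0.3.
# The estimate still recovers 0.52 (up to rounding) because the
# outcome model is correct: that is the doubly robust property.
print(aipw_risk(lambda z: 0.3))
```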
28. Treating PS as a covariate:
• The relation of the PS to risk can be highly nonlinear and can be discontinuous when covariates are discrete. Thus entering the PS as a few terms may not retain sufficiency. Highly flexible formulations may be needed (e.g., many category indicators for the PS, or machine-learning procedures).
• The PS is a composite of Z; it thus can be highly collinear with Z terms in the outcome model, leading to imprecision.
30. Weighted outcome regression:
• Ordinary fitting methods for estimating treatment probabilities tend to produce very small values for some subjects, resulting in huge, highly unstable weights. There are several approaches to weight stabilization:
1. Restore the X margin: Robins and crew use wz = p(x)/p(x|z), but this weight may still be too unstable, leading to crude fixes like weight trimming to obtain sensible results.
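A toy illustration of why stabilization and trimming are used (the fitted probabilities and cap below are hypothetical):

```python
# Hypothetical fitted treatment probabilities p(X=1|z) for five treated
# subjects; one is near zero and blows up the unstabilized weight.
ps = [0.5, 0.4, 0.6, 0.45, 0.02]
p_marg = 0.45  # hypothetical marginal p(X=1)

unstab = [1 / p for p in ps]            # w = 1/p(x|z)
stab = [p_marg / p for p in ps]         # w = p(x)/p(x|z): restore the X margin
trimmed = [min(w, 10) for w in unstab]  # crude fix: cap (trim) extreme weights

# Stabilization shrinks the extreme weight (50 -> 22.5) but can still
# leave it influential, which is why trimming is sometimes added on top.
print(max(unstab), max(stab), max(trimmed))
```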
31. 2. Ridgeway & McCaffrey (2004, 2007) weight by the odds of X=1 vs. X=Xobs:
wz = 1 if X=1, wz = p(1|z)/p(0|z) if X=0.
• This odds-of-treatment weighting (OTW) standardizes to the treated (X=1), as in PS matching to the exposed.
• They fit these odds with a machine-learning algorithm (boosted lasso).
Their approach eliminates stability problems. Similar results have been reported using related algorithms to fit probabilities for IPTW.
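A sketch of OTW with hypothetical counts: weighting the untreated by p(1|z)/p(0|z) reproduces direct standardization of their risk to the treated Z distribution (an ATT-type contrast):

```python
from collections import Counter

# Hypothetical (z, x, y, count) cells, as in the earlier sketches.
cells = [
    (0, 1, 1, 4), (0, 1, 0, 6), (0, 0, 1, 6), (0, 0, 0, 24),
    (1, 1, 1, 18), (1, 1, 0, 12), (1, 0, 1, 12), (1, 0, 0, 18),
]

n_zx, n_z = Counter(), Counter()
for z, x, y, c in cells:
    n_zx[(z, x)] += c
    n_z[z] += c
p1 = {z: n_zx[(z, 1)] / n_z[z] for z in n_z}  # p(1|z)

def otw(z, x):
    """Odds-of-treatment weight: 1 for treated, p(1|z)/p(0|z) for untreated."""
    return 1.0 if x == 1 else p1[z] / (1 - p1[z])

# OTW-weighted risk among the untreated: reweights them to the treated
# Z distribution, i.e., standardizes to the treated.
num = sum(y * c * otw(z, x) for z, x, y, c in cells if x == 0)
den = sum(c * otw(z, x) for z, x, y, c in cells if x == 0)
risk0_in_treated = num / den

# Check against direct standardization to the treated Z distribution.
n_treated = n_zx[(0, 1)] + n_zx[(1, 1)]
direct = sum((n_zx[(z, 1)] / n_treated) *
             sum(y * c for zz, x, y, c in cells if zz == z and x == 0) / n_zx[(z, 0)]
             for z in n_z)
print(risk0_in_treated, direct)
```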
35. Directed acyclic graphs and causal diagrams
• A DAG shows the factors in the problem as nodes linked by arrows only, with no feedback loops.
• A graph is a causal diagram if the arrows are interpreted as links in causal chains (formalization is a bit controversial; R&R).
• Causal effects of one variable on another are transmitted by causal sequences, which are directed (head-tail) paths: X→Y→Z means X can affect Z.
2 Feb 2012 Greenland 35
37. Colliders vs. noncolliders on a path
• Paths are closed (blocked) at colliders: Associations cannot be transmitted across a collider (→C←) on a path unless we stratify (condition) on it or something it affects (such as F in C→F).
• Paths are open (unblocked) at noncolliders: Associations can be transmitted across a noncollider (a mediator →C→ or a fork ←C→) on a path unless we stratify on it completely.
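The collider rule can be verified by exact enumeration in a toy graph E→C←D (a hypothetical example, not from the slides): E and D are independent until we condition on the collider C.

```python
from itertools import product

# E and D: independent fair coins; C = 1 if E or D, a collider (E→C←D).
joint = {(e, d): 0.25 for e, d in product((0, 1), repeat=2)}

def cov_ed(pr):
    """Covariance of E and D under a distribution pr over (e, d)."""
    p_e = sum(p for (e, d), p in pr.items() if e == 1)
    p_d = sum(p for (e, d), p in pr.items() if d == 1)
    return pr.get((1, 1), 0.0) - p_e * p_d

assert abs(cov_ed(joint)) < 1e-12  # marginally independent: path closed at C

# Stratify (condition) on the collider value C = 1 and renormalize.
c1 = {(e, d): p for (e, d), p in joint.items() if (e or d) == 1}
tot = sum(c1.values())
c1 = {ed: p / tot for ed, p in c1.items()}

print(cov_ed(c1))  # negative (≈ -1/9): conditioning on C opens the path
```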
38. Think of associations as signals flowing through the graph
• A variable can transmit associations along some open (unblocked) directions but not along closed (blocked) directions.
• The open and closed directions are switched around by conditioning (stratifying) on the variable, and are partially switched by partially or indirectly conditioning.
40. “Control” of bias in causal modeling
• Target path: A path that transmits some of the effect we want to estimate; it is a directed path from cause to effect.
• Biasing path: Any other open path between the cause and effect variables.
• By judicious conditioning, we must close all biasing paths without closing target paths or opening new biasing paths. (This isn’t always possible with available data.)
41. Graphical sufficiency
• If conditioning on Z closes all biasing paths while leaving all target paths open, Z is sufficient for control of bias.
• If Z is sufficient (for control of bias) but no subset is sufficient, Z is minimal sufficient.
Like almost all graphical concepts and results, these are qualitative (topological); they do not address extent of bias. But they can aid initial covariate screening and more.
42. Example: inadequacy of statistical criteria
Among traditional statistical criteria for defining or detecting confounders are:
• C is associated with E and with D given E
• Adjustment for C changes the E-D association (noncollapsibility).
These are equivalent in linear systems. (Often added: C must precede E and D.)
Graphs illustrate how both criteria can fail, leading to adjustment that increases bias.
44. Instrumental variables in a linear system:
A and F assoc with E and D|E, yet worse bias if you adjust conventionally for A or F.
[DAG figure: nodes A, (B), F, E, D; A may be intent-to-treat]
49. Confounding paths from E to D: None!
[DAG figure: nodes A, [B], [C], F, E, D]
50. What if essential variables are not measured? (no sufficient Z available)
We then have to turn to sensitivity analysis of bias (bias analysis; see Ch. 19 of ME3) to get an idea of how much bias is left after adjustment for measured covariates, and how much uncertainty is appropriate.
• Ordinary statistics ignore uncertainty about unmeasured or mismeasured variables, and so are grossly overconfident (intervals much too narrow, P-values much too small).
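A minimal bias-analysis sketch, using the classical external-adjustment formula for a single unmeasured binary confounder U (all inputs below are hypothetical scenario values an analyst would vary):

```python
def externally_adjusted_rr(rr_obs, rr_ud, p1, p0):
    """Classical external-adjustment sensitivity analysis on the risk-ratio
    scale for one unmeasured binary confounder U:
        rr_obs = rr_adjusted * bias,
        bias = [p1*(rr_ud - 1) + 1] / [p0*(rr_ud - 1) + 1],
    where rr_ud is the U-D risk ratio and p1, p0 are the prevalences of
    U among exposed and unexposed (no modification of the E-D effect by U)."""
    bias = (p1 * (rr_ud - 1) + 1) / (p0 * (rr_ud - 1) + 1)
    return rr_obs / bias

# Hypothetical scenario grid: observed RR = 2.0; how much of it could an
# unmeasured U with various strengths and prevalence gaps explain?
for rr_ud in (2.0, 4.0):
    for p1, p0 in [(0.4, 0.2), (0.8, 0.2)]:
        print(rr_ud, p1, p0, externally_adjusted_rr(2.0, rr_ud, p1, p0))
```

In the strongest scenario (rr_ud = 4, prevalences 0.8 vs. 0.2) the adjusted RR drops below 1: exactly the kind of result that, per the slides, “can completely ruin any hint of decisiveness.”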
51. All the usual validity problems can be viewed as bias due to missing data
• Confounding: nonrandomly missing potential outcomes
• Selection bias: nonrandomly missing subjects
• Measurement error: missing actual variables of interest, so we use proxies in their place (which may produce bias even if the error is random)
5 June 2013 Greenland ‐ Bayes Workshop 51
52. This view enables use of imputation methods for bias analysis (Greenland, 2009):
Completed data = observed + imputed data
• To make any inference beyond what we see (the observed), we must have a model that projects from the observed data to the missing data (or to aspects of the data, like means) to get the completed data.
• In bias analysis, however, key parameters are not identified by the observations.
53. As a result, bias analysis can have far more impact on results than other methods. Yet it has seen the least adoption. Possible reasons:
• It requires far more investigator effort to specify the model and inputs (one group is trying to formulate guidelines to ease this),
• Once specified, it is nowhere near as easy to run with commercial software as other methods,
• It can completely ruin any hint of decisiveness or “significance” of results.
54. III. Conclusion: Some modern tools you should know
• For identification of bias sources and sufficient adjustment sets: DAGs.
• For adjustment of measured confounders: algorithmic treatment modeling (PS, IPTW, or OTW) combined with outcome modeling to achieve double robustness.
• To account for uncertainty about uncontrolled bias: bias analysis.