ICECAP Influence of Cooling Duration on Efficacy in Cardiac Arrest Patients
1. ICECAP
Influence of Cooling duration on Efficacy in
Cardiac Arrest Patients
Barsan Research Forum 2019
NIH SIREN Emergency Trials Network
2. Disclosures – William Meurer, MD, MS
• All non-profit except episodic medico-legal consulting
• Funding from NINDS - PI of clinical trials methodology course
– Funded co-investigator on NETT/SIREN-CCC and SHINE trial
• NIH-NIDCD (cluster RCT studying dizziness intervention)
• NIH-NIMHD - PI of trial of hypertension
– Co-investigator to improve stroke care in Flint
• NIH-NHLBI (co-investigator in cardiac arrest expedited transport/ECMO trial)
• AHRQ (dizziness self treatment)
• Massey Foundation (studying pupillary response in TBI)
• FDA and NIH (reviewer)
4. Why Clinical Trials Stink
ANALOGY*: Clinical Trial = Diagnostic Test
*Note: My analogy does not stink
Clinical Trial | Diagnostic Test
Looking for: Effective Treatment | Looking for: Disease
1 – Type II Error (Beta), “Power” | Sensitivity, or True Positive Rate
Type I Error (Alpha), Significance Level | 1 – Specificity, or False Positive Rate
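The analogy can be pushed one step further with a small calculation. The sketch below treats a "positive" trial like a positive diagnostic test and computes its positive predictive value. The prior probability that a treatment truly works is an assumption introduced here for illustration; it is not a number from the slides, and the function name is hypothetical.

```python
def trial_ppv(power: float, alpha: float, prior: float) -> float:
    """Probability that a 'positive' trial reflects a truly effective treatment.

    power : sensitivity of the trial (1 - Type II error)
    alpha : false positive rate of the trial (Type I error)
    prior : assumed chance the treatment works before the trial (illustrative)
    """
    true_positives = power * prior
    false_positives = alpha * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


# Same trial design, two different (assumed) pre-trial probabilities.
ppv_likely = trial_ppv(power=0.80, alpha=0.05, prior=0.50)
ppv_longshot = trial_ppv(power=0.80, alpha=0.05, prior=0.10)
print(round(ppv_likely, 3), round(ppv_longshot, 3))  # 0.941 0.64
```

As with a diagnostic test, the same sensitivity and specificity give a much less trustworthy "positive" when the pre-test probability is low.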
5. Clinical Trials are Models with Tons of Guesses
Assumptions
Dose from animal models is close
No heterogeneity of effect
Subgroups respond equally
Some subgroups excluded
Effect size to create “reasonable” sample size
“Noise” in outcomes can be understood and overcome
Duration of treatment practical
LESSON: Make many compromises to reduce number of
parameters to make model “solvable”
8. ICECAP
A randomized, response-adaptive, duration-finding, comparative
effectiveness clinical trial with blinded outcome assessment.*
Note: This trial DOES NOT stink
9. Ann Neurol 2006;59:467–477
1,026 Experimental treatments in acute stroke
[Figure: Level of Protection (%) plotted against Rank Order by Level of Protection]
Experimental Contrasts (n)
Focal Ischemia Models: 94 favors treatment, 28 neutral, 0 favors control
Global Ischemia Models: 77 favors treatment, 28 neutral, 0 favors control
Culture Models: 13 favors treatment, 3 neutral, 1 favors control
11. In preparing for battle I have always found that plans are useless, but planning is indispensable.
12. Specific Aims
In each of two populations of adult comatose survivors of cardiac arrest
(those with initial shockable rhythms and those with PEA/asystole)
• Can longer durations of hypothermia improve patient outcomes?
• Can the efficacy of hypothermia be confirmed by evaluating
duration response?
• Does duration have a differential effect on safety and quality of
recovery?
13. Design overview
• Initial randomization to 12, 24, or 48 hours
• Randomization adapts independently for each rhythm type:
– Allocation is to more successful original and interposed durations
– If curve is flat, randomization opens to 6 hr duration
– If curve is positive, randomization opens to 60 and 72 hr durations
• Ends early if duration-response is flat or negative
14. Primary Outcome
• Modified Rankin scale at 90 days after ROSC
• Functional outcome incorporating mortality
– Analyzed non-parametrically to explore both the proportion achieving a good
neurological outcome and deficit among those with good outcomes
• Blinded assessment
Primary Outcome for Resuscitation Science Studies Consensus Conference
May 5-6, 2008
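A rough sketch of what a non-parametric look at the modified Rankin scale might involve is shown below. It computes a rank-based probability that a patient in one arm has a better (lower) score than a patient in another, alongside the proportion with a good outcome. The scores are made up for illustration, the good-outcome cutoff of 2 is an assumption, and this is not the trial's actual analysis plan.

```python
from itertools import product


def prob_better(scores_a, scores_b):
    """Chance a random patient from A has a lower (better) mRS than one from B.

    Ties count as half. This is the Mann-Whitney 'probability of superiority'.
    """
    wins = sum(
        1.0 if a < b else 0.5 if a == b else 0.0
        for a, b in product(scores_a, scores_b)
    )
    return wins / (len(scores_a) * len(scores_b))


def good_outcome_rate(scores, cutoff=2):
    """Proportion at or below the cutoff (cutoff is an illustrative assumption)."""
    return sum(s <= cutoff for s in scores) / len(scores)


# Hypothetical 90-day mRS scores (0 = no symptoms, 6 = dead).
longer = [0, 1, 1, 2, 3, 4, 6, 6]
shorter = [1, 2, 3, 4, 5, 6, 6, 6]

print(prob_better(longer, shorter))                           # 0.6640625
print(good_outcome_rate(longer), good_outcome_rate(shorter))  # 0.5 0.25
```

The rank-based measure uses the whole scale, including deaths, while the proportion only captures which side of the cutoff each patient lands on.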
17. “I have always considered it more desirable to kill computer-generated patients than real ones when calibrating design parameters.”
Peter Thall
Chance 2001;14:23-8
19. Example Simulation
[Figure: two panels, Shockable and Non-shockable; horizontal axis is Duration (hrs)]
Grey bars are the number of randomized subjects at each duration
Blue x’s are randomization vectors for the next 50
Solid dots represent the current model, dashes show CI
Hollow dots are the observed responses so far
20. Example Simulation
[Figure: two panels, Shockable and Non-shockable; horizontal axis is Duration (hrs)]
Adaptive randomization has put subjects where they are needed
The response model closely fits the observations with narrow confidence limits
21. Example Simulation
[Figure: two panels, Shockable and Non-shockable; horizontal axis is Duration (hrs)]
Black lines show the “truth”, the scenario being simulated, which the model has estimated.
23. Conclusions
• Adaptive designs improve discovery through clinical trials*
• A LOT more planning is needed
• Our patients and our future patients need better treatments for
cerebral ischemia
• We must bring our A game when being stewards of limited resources
to learn
*Adaptive designs allow better calibration of the scientific question and
the trial conduct. They aren’t always smaller and aren’t always better. It
can be a challenge to disentangle the PLANNING from the PLAN
Editor's Notes
I want to thank all of you for having us here today. Thanks, in particular, to Jeremy Brown for being the catalyst and organizer.
Leaders from our team here today include my co-principal investigators, Dr. Romer Geocadin and Dr. Will Meurer on the clinical side, and Dr. Valerie Durkalski here and Dr. Ramesh Ramakrishnan on the phone from the data center. We also have Dr. Scott Berry, a key co-I working with the DCC.
Romer is a leader in the neurocritical care community and an accomplished clinical investigator. He is past president of the Neurocritical Care Society, and first author of the AAN clinical guideline on reducing brain injury after cardiopulmonary resuscitation published earlier this year, and co-author on prior AHA Scientific Statements in cardiac arrest.
Will is an emergency physician and part of the leadership of the Neurological Emergencies Treatment Trials network. He is also a leader in clinical trials methodology, the co-program chair of this year’s International Clinical Trials Methodology Conference, and co-PI of the NINDS clinical trials methodology course.
I am Robert Silbergleit. I am also an emergency physician, part of the leadership of the NETT, and a co-PI of the SIREN clinical coordinating center. I am a clinical trialist now, serving as a PI or leading co-I in several large multicenter trials, but I am also a recovering translational researcher with years of past experience working on animal models of cardiac arrest.
Valerie is an experienced clinical trial biostatistician, co-PI of the NETT and the SIREN data coordinating centers, and a PI on more clinical trials than I was able to easily count. She serves on FDA advisory panels and multiple DSMBs, and is a Director of the Society for Clinical Trials.
Ramesh is an experienced clinical trial biostatistician, and a co-I of the NETT SDMC. He has expertise in both frequentist and Bayesian methods, and has a special interest in adaptive study designs.
Scott is President & Senior Statistical Scientist at Berry Consultants, LLC, the premier Bayesian consulting company in the world. He is an international expert in innovative adaptive clinical trial designs and trial simulation. He was key to the design of the ICECAP trial, as a leading co-investigator in the FDA-NIH cooperative grant supporting that effort.
My co-PIs at the SIREN CCC are also here, Bill Barsan in person, and Clif Callaway on the phone.
Bill is a leader in the specialty of Emergency Medicine and emergency care research, he was a PI of the FDA-NIH cooperative grant from which the ICECAP trial was designed. He is a member of the National Academy of Medicine.
Clif has been a PI within both NETT and the ROC. He is a leader in Cardiac Arrest science internationally and within the AHA as well, including serving as chair of the AHA’s Emergency Cardiovascular Care Committee.
We also have Dr. Jan Claassen with us today. Jan is a NETT Hub PI and an internationally recognized expert in neurological intensive care, and an outstanding clinical investigator. His research characterizes physiologic changes following acute brain injury, focusing on novel treatment approaches to potentially improve patient outcomes.
In a commentary, Lewis Sheiner from UCSF describes what we are looking for in a clinical trial. This is a gross oversimplification, but you can think of the regimen axis as containing information on everything about the drug or device including dose, how long to treat, and concomitant therapies. The prognostic axis contains all the important ways that patients differ in their outcome either with or without the treatment. The vertical, benefit axis signifies the NET efficacy with toxicity subtracted off. This plane will be flat with zero benefit for a treatment that has no effect on anyone. It is important to point out that this is a biological description and when we conduct clinical trials we are trying to find parts of this surface, but we will never estimate something like this fully with a high degree of accuracy.
This meeting, with folks from both NHLBI and NINDS, is particularly exciting to us in the SIREN leadership because it represents the synergy upon which this new network is built. Cardiac Arrest is emblematic of that synergy because it truly and profoundly is both a cardiac emergency and neurological emergency. I’ve heard Walter describe cardiac arrest as the biggest stroke you can have. We expect to have a broad portfolio of clinical trials representing the full scope of both ICs, but cardiac arrest represents a core upon which we can create an identity for SIREN that readily embodies both institutes.
ICECAP is a proposed trial in comatose survivors of cardiac arrest to find out if longer durations of cooling can make more people better. It will also confirm whether or not therapeutic cooling works at all in patients with shockable rhythms, and separately in patients with initially non-shockable rhythms. The trial will use adaptive dose-finding methods to learn the shape of the curve that relates duration of induced hypothermia to the rate of good clinical outcome.
In terms of hypothermia, “dose” is mostly driven by how long you cool and how cold. Depth of cooling has a very limited tolerable range and doesn’t appear to be scientifically or clinically important, but duration can be varied quite a bit and appears mechanistically significant. This study looks at only duration.
Graphically, their data look like this. The interventions are ranked on effectiveness and the size of the bubble shows the number of trials analyzed. The large green dot represents hypothermia, with a high level of efficacy, only beaten by a few other interventions that were never ultimately reproduced.
The table on the right shows that efficacy was seen in most experiments in both focal and global cerebral ischemia animal models, and in a smaller number of cell culture models.
The papers they didn’t report on were basic science papers exploring mechanism rather than outcome. So what do those show?
From those questions we have developed these specific aims and study objectives.
In each of two populations of adult comatose survivors of cardiac arrest (those with initial shockable rhythms and those with PEA/asystole):
Can longer durations of hypothermia improve patient outcomes?
Can the efficacy of hypothermia be confirmed by evaluating duration response?
Does duration have a differential effect on safety and quality of recovery?
ICECAP is a duration-seeking design that will reveal the shape of the duration-response curve. It starts out by putting the first 150 subjects in each rhythm type on 12, 24, or 48 hours of cooling, and then looks at the shape of the response curve from those subjects. If any trend is emerging toward longer doses of cooling being more effective, then the next 50 patients are randomized to the original arms and some longer duration arms. If the trend continues, longer duration arms are added every 50 patients. If the emerging trend is a flat duration response, then future subjects are placed on a shorter duration arm and more interspersed durations. Subjects are allocated to find the “knee” of the duration response curve, the duration of cooling where the response plateaus. The start of that plateau is the optimal duration of cooling, a meaningful answer.
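The arm-opening logic described above can be sketched in a few lines. This is only an illustration: the starting arms and the 6 to 72 hour range come from the slides, but the full grid of candidate durations, the function name, and the rule of opening one longer arm per look are assumptions made for this example.

```python
BURN_IN_ARMS = [12, 24, 48]  # hours; the three initial arms
# Assumed grid of candidate durations between 6 and 72 hours (illustrative).
CANDIDATE_ARMS = [6, 12, 18, 24, 30, 36, 42, 48, 60, 72]


def open_arms(active, trend):
    """Return the arms open for the next batch of subjects.

    trend is 'positive' (longer looks better), 'flat', or anything else
    (no change). The real trial uses pre-specified statistical triggers.
    """
    arms = set(active)
    if trend == "positive":
        # Open the next longer duration, if any remain.
        longer = [d for d in CANDIDATE_ARMS if d > max(arms)]
        arms.update(longer[:1])
    elif trend == "flat":
        # Open the shortest arm and the durations interposed below the longest.
        lo, hi = min(CANDIDATE_ARMS), max(arms)
        arms.update(d for d in CANDIDATE_ARMS if lo <= d < hi)
    return sorted(arms)


after_look_1 = open_arms(BURN_IN_ARMS, "positive")
after_look_2 = open_arms(after_look_1, "positive")
print(after_look_1)                     # [12, 24, 48, 60]
print(after_look_2)                     # [12, 24, 48, 60, 72]
print(open_arms(BURN_IN_ARMS, "flat"))  # [6, 12, 18, 24, 30, 36, 42, 48]
```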
The shape of the duration response also answers our other question, does hypothermia work at all? Any positive dose response curve indicates that there is efficacy, even without a placebo group. A completely flat dose response curve, in which no duration of cooling is any more effective than the shortest possible duration, strongly implies that the therapy is ineffective. The trial ends early if a flat response is detected.
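To make the ideas of a "knee" and a "flat" curve concrete, here is a minimal sketch that reads both off a modeled duration-response curve. The tolerance of 0.02 and the example curves are hypothetical, and the real design would base these decisions on a statistical model with uncertainty rather than on point estimates.

```python
def plateau_start(curve, tolerance=0.02):
    """Shortest duration whose modeled response is within tolerance of the best."""
    best = max(curve.values())
    return min(d for d, rate in curve.items() if best - rate <= tolerance)


def is_flat(curve, tolerance=0.02):
    """True when no duration does meaningfully better than any other."""
    return max(curve.values()) - min(curve.values()) <= tolerance


# Hypothetical modeled rates of good outcome by cooling duration (hours).
rising_curve = {6: 0.30, 12: 0.34, 24: 0.41, 48: 0.45, 60: 0.455, 72: 0.45}
flat_curve = {6: 0.30, 24: 0.31, 72: 0.30}

print(plateau_start(rising_curve))  # 48
print(is_flat(rising_curve))        # False
print(is_flat(flat_curve))          # True
```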
This trial design learns, in an entirely pre-planned and programmed way, from accumulating data in the trial, and then adapts the randomization to put more subjects into the arms that are most informative. This prevents wasting a bunch of subjects on arms that end up being entirely uninformative. The design means that we learn more from each patient.
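One way to picture "putting more subjects where they are most informative" is the sketch below. It is not the ICECAP algorithm. It assumes independent Beta posteriors for each arm and an illustrative weighting that favors arms that are both promising and still uncertain; the function name, the priors, and the example counts are all assumptions.

```python
import random


def allocation_weights(successes, totals, draws=4000, seed=0):
    """Randomization probabilities for the next batch (illustrative rule).

    weight is proportional to sqrt(Pr(arm is best) * posterior variance / (n + 1))
    """
    rng = random.Random(seed)
    arms = list(totals)
    # Beta(1, 1) prior updated with the observed good outcomes per arm.
    post = {d: (1 + successes[d], 1 + totals[d] - successes[d]) for d in arms}

    # Monte Carlo estimate of the chance each arm has the highest true rate.
    best_counts = dict.fromkeys(arms, 0)
    for _ in range(draws):
        sample = {d: rng.betavariate(*post[d]) for d in arms}
        best_counts[max(sample, key=sample.get)] += 1

    raw = {}
    for d in arms:
        a, b = post[d]
        variance = a * b / ((a + b) ** 2 * (a + b + 1))
        raw[d] = (best_counts[d] / draws * variance / (totals[d] + 1)) ** 0.5
    total = sum(raw.values())
    return {d: w / total for d, w in raw.items()}


# Hypothetical good outcomes after 50 subjects per arm.
weights = allocation_weights(
    successes={12: 15, 24: 20, 48: 27},
    totals={12: 50, 24: 50, 48: 50},
)
print({d: round(w, 2) for d, w in weights.items()})
```

In this made-up example most of the next batch goes to 48 hours, while the weaker arms keep a small share so the curve can still be estimated.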
Schematically the trial design looks like this. All of the possible durations that will be tested are shown in these white boxes over on the left side. They range from 6 to 72 hours. The sequential columns of red circles represent progressive time and accrual in the trial. You can see that there are 3 arms in the burn-in period, but that shorter, longer, and interspersed arms are potentially added incrementally. The blue arrows at the top indicate automated “looks” at the data about every 4 weeks or 50 subjects. At each look, there is a test for futility and then assignment of a new batch of randomization vectors.
We simulated 47 scenarios 5000 times each. Each scenario represents one possible “truth” and for each simulation a random distribution of subjects probabilistically consistent with that truth is created. The simulation then shows how the study design and algorithm will perform for that set of conditions.
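The idea of simulating one "truth" many times can be sketched as follows. This is a much simplified stand-in: it uses fixed, equal allocation rather than the adaptive design, and the assumed truth, sample size, and success measure (how often the observed best arm matches the true best arm) are illustrative choices, not the trial's operating characteristics.

```python
import random


def simulate_scenario(truth, n_per_arm=100, n_sims=5000, seed=1):
    """Fraction of simulated trials whose observed best arm is the true best arm.

    truth maps each duration (hours) to its assumed true rate of good outcome.
    """
    rng = random.Random(seed)
    true_best = max(truth, key=truth.get)
    hits = 0
    for _sim in range(n_sims):
        # Generate random patients consistent with the assumed truth.
        observed = {
            d: sum(rng.random() < rate for _patient in range(n_per_arm))
            for d, rate in truth.items()
        }
        if max(observed, key=observed.get) == true_best:
            hits += 1
    return hits / n_sims


# One hypothetical scenario: longer cooling is truly better.
scenario = {12: 0.30, 24: 0.40, 48: 0.50}
hit_rate = simulate_scenario(scenario, n_sims=1000)
print(hit_rate)
```

Repeating this across many scenarios, with the real adaptive algorithm in place of the fixed allocation, is what shows how the design performs before any real patient is enrolled.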
Walking through just one simulation of one scenario can illustrate how the trial works. These figures represent each look at the data. The graphs show the duration arms along the horizontal axis, and the grey bar graph shows the number of subjects at each duration. The blue line shows how the next 50 subjects will be randomized. The open red circles are the accumulating observed outcomes, and the solid red circles and dotted red line show the emerging model of the duration response curve.
Here is look 1, and then look 2.
Here you can see how, with each sequential look, subjects are distributed by the model to where they are needed and will be most informative. The modeled response curves are becoming clearer and the confidence intervals tighter.
After 32 looks, most subjects in the shockable group are near the start of what appears to be the plateau. They are more broadly distributed in the non-shockable group, but are mostly allocated to the longer durations where there appears to be more efficacy.
Now we reveal the “true” duration response curves assumed in this scenario.
The true duration response curves assumed for this scenario are shown here as solid black lines. You can see that in this scenario, the modeled duration response curve gives an excellent approximation of the true curves.
By design, we can look at the entire space of cooling durations of interest, and still have excellent operating characteristics. By examining the entire space of cooling durations, we eliminate the need to do repeated trials of various tweaks of pairwise comparisons. This will be the only trial of duration of hypothermia that you’ll ever need to do to answer this question.
Whether positive or negative, the results of this trial will be meaningful.