The document discusses survival analysis techniques using SPSS. It defines key survival analysis terms and covers non-parametric and semi-parametric survival analysis methods like Kaplan-Meier analysis and Cox regression. For Kaplan-Meier analysis, it provides an example to compare the effect of two drugs on time to effect. For Cox regression, it demonstrates how to identify attributes associated with customer churn. The document also discusses how to address time-dependent covariates in Cox regression models.
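The Kaplan-Meier method summarized above is simple enough to sketch in plain Python; the data below are invented for illustration, not taken from the document. The survival curve drops by a factor of (1 - deaths/at-risk) at each distinct event time, and censored subjects leave the risk set without forcing a drop.

```python
# Kaplan-Meier product-limit estimator, a minimal pure-Python sketch.
# Each subject is (time, event): event=1 means the event occurred,
# event=0 means the observation was censored at that time.

def kaplan_meier(data):
    """Return [(time, survival probability)] at each distinct event time."""
    times = sorted({t for t, e in data if e == 1})      # distinct event times
    s = 1.0
    curve = []
    for t in times:
        at_risk = sum(1 for ti, _ in data if ti >= t)   # still under observation
        deaths = sum(1 for ti, e in data if ti == t and e == 1)
        s *= 1 - deaths / at_risk                       # product-limit step
        curve.append((t, s))
    return curve

# Example: 6 subjects; those with event=0 are censored
subjects = [(1, 1), (2, 1), (3, 0), (4, 1), (5, 0), (6, 1)]
print(kaplan_meier(subjects))
```

Note how the censored subjects at times 3 and 5 shrink the risk set for later event times without producing a step of their own.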
This document introduces Advanced Temporal Language Aided Search (ATLAS), a new interface for defining clinical phenotypes using natural language. ATLAS allows users to search for and write phenotype definitions step-by-step rather than using drop-down menus. The document demonstrates how common phenotypes like type 2 diabetes and acute myocardial infarction can be defined using ATLAS. It reports that a group was able to define 38 phenotypes in around 2 hours using ATLAS, much faster than with previous rule-based methods. The Common Data Model may be extended in the future to add support for ATLAS features such as time attributes and clinical note annotations.
This document discusses biological variation in clinical measurements. It aims to identify the nature of biological variation, appreciate its significance, and understand how to determine and apply indices of biological variation. Biological variation refers to components of variance in biochemical measurements determined by a subject's physiology. The sources, quantification, and practical applications of biological variation data are explored. Understanding biological variation is fundamental to developing reference data and interpreting clinical measurements over time.
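A standard practical application of biological variation data mentioned above is the reference change value (RCV): the smallest difference between two serial results in the same patient that analytical plus within-subject biological variation is unlikely to explain. A minimal sketch of the usual formula, with illustrative CV figures (not from the document):

```python
# Reference change value: RCV = sqrt(2) * Z * sqrt(CV_A^2 + CV_I^2),
# where CV_A is the analytical and CV_I the within-subject biological
# coefficient of variation (both in %). Z = 1.96 for 95% probability.
import math

def rcv(cv_analytical, cv_within_subject, z=1.96):
    return math.sqrt(2) * z * math.hypot(cv_analytical, cv_within_subject)

# Illustrative figures: CV_A = 3%, CV_I = 6%
print(round(rcv(3.0, 6.0), 1), "%")
```

A change between two consecutive results smaller than this RCV is plausibly just variation, not a real change in the patient's state.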
Basics of laboratory internal quality control, Ola Elgaddar, 2012
Total Quality Management (TQM) is a continuous approach to improve quality and performance. It requires integrating quality functions throughout an organization with involvement from management, employees, suppliers, and customers. For medical laboratories, quality control has three main stages - pre-analytical, analytical, and post-analytical. Analytical quality control involves internal quality control (IQC) using control materials and external quality assessment (EQA) to monitor quality and compare results between laboratories. IQC follows procedures like plotting daily control results on Levey-Jennings charts and evaluating them using Westgard rules to detect errors.
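The Levey-Jennings/Westgard workflow described above can be made concrete with a small sketch. The two rules implemented here (1-3s: one point beyond ±3 SD; 2-2s: two consecutive points beyond the same ±2 SD limit) are only a subset of the full Westgard multi-rule scheme, and the control data are invented:

```python
# Evaluate daily IQC results against two Westgard rules, using the
# laboratory's established control mean and SD (the Levey-Jennings
# chart center line and limits).

def westgard_violations(results, mean, sd):
    z = [(x - mean) / sd for x in results]          # results in SD units
    violations = []
    for i, zi in enumerate(z):
        if abs(zi) > 3:                             # 1-3s: reject run
            violations.append((i, "1-3s"))
        if i > 0 and z[i - 1] > 2 and zi > 2:       # 2-2s, high side
            violations.append((i, "2-2s"))
        if i > 0 and z[i - 1] < -2 and zi < -2:     # 2-2s, low side
            violations.append((i, "2-2s"))
    return violations

# Control material with target mean 100, SD 2 (illustrative values)
print(westgard_violations([100, 101, 105, 105, 93], mean=100, sd=2))
```

In a full implementation the 1-2s rule would act as a warning that triggers evaluation of the remaining rules, and a violation would prompt root cause analysis before patient results are released.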
1) Clinical trial design aims to quantify and reduce errors, eliminate bias, and yield clinically relevant estimates of treatment effects.
2) Key trial design elements include randomization, blinding, choice of control group, and trial type (e.g. parallel, crossover).
3) Randomization assigns participants to groups randomly to reduce bias while blinding conceals group assignments from participants and investigators.
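As a concrete illustration of point 3, permuted-block randomization is one common scheme: within each block, equal numbers of participants are assigned to each arm in random order, so group sizes stay balanced throughout enrollment while individual assignments remain unpredictable. This is a generic sketch, not a method taken from the document:

```python
# Permuted-block randomization with block size 4 (two A, two B per block).
import random

def block_randomize(n_participants, block=("A", "A", "B", "B")):
    assignments = []
    while len(assignments) < n_participants:
        b = list(block)
        random.shuffle(b)              # random order within each block
        assignments.extend(b)
    return assignments[:n_participants]

random.seed(0)                         # for a reproducible illustration
alloc = block_randomize(12)
print(alloc, "A:", alloc.count("A"), "B:", alloc.count("B"))
```

In a blinded trial the block size itself is usually concealed (or varied), since a known block size lets investigators predict the final assignments in each block.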
Data Con LA 2019 - Best Practices for Prototyping Machine Learning Models for... - Data Con LA
Medical institutions, universities and software giants like Google and Microsoft are dedicating increasing resources to machine learning for healthcare. This is a very exciting but relatively young field, and best practices for methods and reporting of results are not yet fully established. I have 2.5 years of experience as a data scientist at a national cancer center working on clinical data, evaluating external vendors and peer reviewing machine learning in healthcare papers. The talk gives an overview of best practices in prototyping machine learning models on data from the patient electronic health record (EHR). The topics addressed are: 1. Introduction to the EHR; 2. Overview of machine learning applications to the EHR; 3. Cohort definition for survival problems; 4. Data cleaning; 5. Performance metrics. Excerpts of papers from renowned institutions will be critically reviewed. The material is intended to be useful not only to machine learning for healthcare professionals, but to practitioners dealing with very unbalanced datasets in the temporal domain. For example, customer churn prediction can be modeled as a survival problem.
Dr. Marie Culhane - Increase the value of your diagnostics and your value as ... - John Blue
Increase the value of your diagnostics and your value as a diagnostician - Dr. Marie Culhane, Associate Clinical Professor, Veterinary Population Medicine, College of Veterinary Medicine, University of Minnesota, from the 2013 Allen D. Leman Swine Conference, September 14-17, 2013, St. Paul, Minnesota, USA.
More presentations at http://www.swinecast.com/2013-leman-swine-conference-material
Resolving e commerce challenges with probabilistic programmingLogicAI
This summer, during the third edition of Data Science Summit in Warsaw, Magdalena Wójcik (Senior Data Scientist at LogicAI) presented how we used Bayesian models in one of our projects.
(20180524) VUNO seminar: ROC and extension - Kyuhwan Jung
This document discusses receiver operating characteristic (ROC) curves and their use in evaluating diagnostic tests. It begins by defining sensitivity and specificity as metrics for diagnostic test performance. It then explains that ROC curves plot sensitivity vs 1-specificity for varying diagnostic thresholds. The area under the ROC curve (AUC) provides a single measure of test accuracy. Methods for calculating AUC include parametric and nonparametric approaches. The document also discusses extensions of ROC analysis like free-response ROC (FROC) curves, which evaluate tests with multiple lesion detections. It concludes by outlining a study that used JAFROC analysis to evaluate the effect of a computer-aided detection (CAD) system on radiologist performance in detecting lung nodules.
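The nonparametric route to AUC mentioned above has a particularly compact form: AUC equals the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counting one half (the Mann-Whitney U statistic scaled by the number of positive/negative pairs). A small illustrative sketch:

```python
# Nonparametric AUC via pairwise comparisons (Mann-Whitney form).

def auc(scores_pos, scores_neg):
    """Fraction of (positive, negative) pairs the positive wins; ties = 1/2."""
    wins = sum((p > n) + 0.5 * (p == n)
               for p in scores_pos for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))

# Diseased subjects tend to score higher than healthy ones (invented data)
print(auc([0.9, 0.8, 0.6], [0.7, 0.3, 0.2]))
```

This pairwise form is O(n_pos * n_neg); for large samples one would sort the pooled scores and use ranks instead, but the result is the same.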
This document outlines topics related to survival analysis, including its objectives and key methods. Survival analysis is used to analyze longitudinal data on events like death or disease onset over time. It accounts for censoring of data. The Kaplan-Meier method estimates survival rates without dividing time into intervals like life tables do. The log-rank test statistically compares survival curves between groups. Cox regression analysis examines the relationship between covariates and survival while allowing hazards to vary over time.
Binomial Distribution statistical methods for economics.pptx - kunal2422m
A random variable is defined as a variable whose value is determined by the outcome of a chance experiment. A random process assigns a random variable to each time point and represents phenomena that vary unpredictably over time. Examples of random experiments include coin tosses, dice rolls, and card draws from a deck. The outcome of such experiments is random and determined by chance. A random variable follows a probability distribution that describes the frequency or likelihood of different possible values. The binomial distribution specifically describes the number of successes in a fixed number of yes/no trials with constant success probability.
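The binomial probability mass function described above can be written out directly from the standard formula C(n, k) * p^k * (1-p)^(n-k); the coin-toss example below is illustrative:

```python
# Binomial PMF: probability of exactly k successes in n independent
# yes/no trials with constant success probability p.
from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Probability of exactly 5 heads in 10 fair coin tosses
print(binom_pmf(5, 10, 0.5))                           # 252/1024 = 0.24609375

# Sanity check: probabilities over all possible outcomes sum to 1
print(sum(binom_pmf(k, 10, 0.5) for k in range(11)))
```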
This document discusses polymeric drug delivery systems. It describes controlled release versus sustained release systems and how polymeric systems can incorporate a drug and release it at a known rate over a prolonged duration. It provides examples of commonly used polymers for drug delivery and examines the factors that determine drug release rates from matrix and reservoir devices using equations based on Fick's laws of diffusion. The document also addresses issues like burst and lag effects and discusses delivery systems for soluble drugs and proteins.
Microbiology (immunological method and application) - Osama Al-Zahrani
1. The document discusses several immunological methods and applications including ELISA, latex agglutination assay, immunochromatography assay, haemagglutination inhibition assay, immunofluorescence antibody technique, and Western blot.
2. Key principles involve antigen-antibody reactions to detect the presence of various antibodies or antigens through techniques like enzyme reactions, agglutination, fluorescent labeling, or protein separation/detection.
3. The methods can be used to diagnose conditions like hepatitis, rheumatoid arthritis, malaria, and HIV.
Dr. Jim Lowe - Proven Practical Strategies for PRRS Free Herd - John Blue
Part II Proven Practical Strategies for PRRS Free Herd, Dreaming of a World without PRRS - Dr. Jim Lowe, Carthage Veterinary Service, LTD., from the 2012 Iowa Pork Congress, January 24 - 26, Des Moines, IA, USA.
Experimental method of Educational Research - Neha Deo
The experimental method is the most challenging method in educational research. Different functional and factorial designs can be used, and the internal and external validity of the experiment must also be considered. This presentation discusses all of these points in detail.
The document discusses internal quality control procedures in a medical laboratory. It defines internal quality control and explains the three main stages - pre-analytical, analytical, and post-analytical - that need to be controlled. It describes the process for internal quality control, including using control materials, establishing statistical limits, and interpreting quality control data using rules like Westgard's multi-rules. The document emphasizes the importance of root cause analysis when quality control is out of control and comparing internal quality control with external quality assessment.
Development of health measurement scales – part 2Rizwan S A
This document discusses various methods for developing health measurement scales and assessing their validity and reliability. It begins by describing different scaling methods like categorical, continuous, Likert scales, and paired comparison methods. It then outlines topics like reliability, validity, measuring change and conclusions. Specific methods for assessing reliability are discussed in depth, including internal consistency using Cronbach's alpha, test-retest reliability, and inter-observer reliability which can be calculated using intraclass correlation coefficients. The document emphasizes that reliability is a necessary but not sufficient condition for validity, and different types of validity like content, criterion and construct validity are important to validate the inferences that can be made from scale scores.
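The internal-consistency measure mentioned above, Cronbach's alpha, follows directly from the item variances and the variance of the summed total score: alpha = k/(k-1) * (1 - sum of item variances / total-score variance). A small sketch with invented responses:

```python
# Cronbach's alpha from raw item scores. Population variance is used
# throughout; alpha is unchanged as long as the choice is consistent.
from statistics import pvariance

def cronbach_alpha(items):
    """items: one list of scores per scale item, aligned by respondent."""
    k = len(items)
    totals = [sum(vals) for vals in zip(*items)]   # each respondent's total
    item_var = sum(pvariance(v) for v in items)
    return k / (k - 1) * (1 - item_var / pvariance(totals))

# 3 items answered by 4 respondents; items move together -> high alpha
items = [[1, 2, 4, 5], [2, 2, 5, 5], [1, 3, 4, 4]]
print(round(cronbach_alpha(items), 3))
```

As the summary notes, a high alpha shows the items covary, which is necessary but not sufficient for the scale to be valid.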
This document discusses tests of significance and summarizes key concepts. It begins by describing qualitative and quantitative data and measures of central tendency. It then discusses sampling variation, the null hypothesis, p-values, and the standard error. The document outlines the steps in hypothesis testing and describes different types of tests including the standard error of difference between two proportions (SEDP) test and the chi-square test. Examples are provided to demonstrate how to calculate test statistics and determine significance. The limitations of the SEDP test are also noted.
This document discusses repeated measures ANOVA. It explains that repeated measures ANOVA is used when the same participants are measured under different treatment conditions. This allows researchers to remove variability caused by individual differences. The document outlines the components of the repeated measures ANOVA F-ratio, including the numerator which is the variance between treatments and the denominator which is the variance due to chance/error after removing individual differences. It also discusses how to conduct hypothesis testing and calculate effect size for repeated measures ANOVA.
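The F-ratio decomposition described above can be made concrete in a few lines: the between-subjects sum of squares is subtracted from the residual so that stable individual differences do not inflate the error term. Data are invented for illustration:

```python
# Repeated-measures ANOVA F-ratio for a single within-subjects factor:
# F = MS(treatments) / MS(error), with error = total - treatments - subjects.

def rm_anova_f(data):
    """data[i][j] = score of participant i under treatment condition j."""
    n, k = len(data), len(data[0])
    grand = sum(map(sum, data)) / (n * k)
    treat_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]
    ss_treat = n * sum((m - grand) ** 2 for m in treat_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)   # removed from error
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_treat - ss_subj
    df_treat, df_error = k - 1, (k - 1) * (n - 1)
    return (ss_treat / df_treat) / (ss_error / df_error)

scores = [[3, 4, 7], [2, 4, 6], [4, 5, 8]]    # 3 participants x 3 conditions
print(round(rm_anova_f(scores), 2))
```

Because every participant contributes to every condition, the large spread between participants ends up in ss_subj rather than the denominator, which is exactly the advantage the summary describes.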
Overview on crossover trials, statistical illustration in SPSS, continuous outcome - Nouran Hamza, MSc, PgDPH
A crossover design is a repeated/longitudinal measurements design: patients (the experimental units) cross over from one treatment to another during the course of the trial. This contrasts with a parallel design, in which patients are randomized to a treatment and remain on that treatment throughout the trial.
This document contains an agenda for an AI-Bio convergence training course that will take place from August to October 2022. The course will cover topics like Python and R for data analysis, statistical analysis techniques like ANOVA and multivariate analysis, genomics analysis including genome, transcriptome, epigenome and proteome, machine learning algorithms like linear models, clustering and association analysis, deep learning models like CNNs and RNNs, applying these techniques to medical data for tasks like predictive modeling and image analysis. It also includes sessions on computational chemistry, drug discovery, ontology and its applications in biology.
This document provides an overview of stochastic processes and Markov chains. It defines stochastic processes as families of random variables indexed by time. Markov chains are a type of stochastic process where the future state depends only on the present state, not on the past. The document discusses examples of Markov chains, transition matrices, classification of states as transient or persistent, and properties like irreducibility. It aims to introduce key concepts in stochastic processes and Markov chains.
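The Markov property described above can be shown with a tiny sketch: propagating a probability distribution through a row-stochastic transition matrix, which for this irreducible two-state example converges to its stationary distribution. The weather chain is a standard textbook illustration, not from the document:

```python
# One step of a Markov chain: next-state distribution depends only on
# the current distribution and the transition matrix P[i][j] = P(j | i).

def step_distribution(dist, P):
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

# Two-state weather chain: sunny stays sunny 90%, rainy stays rainy 50%
P = [[0.9, 0.1],
     [0.5, 0.5]]
dist = [1.0, 0.0]                    # start sunny with certainty
for _ in range(50):                  # iterate toward the stationary distribution
    dist = step_distribution(dist, P)
print([round(p, 4) for p in dist])   # approaches [5/6, 1/6]
```

The limit [5/6, 1/6] solves pi = pi P, and because the chain is irreducible and aperiodic the starting state does not matter.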
The document discusses key principles of experimental design, including replication, randomization, and local control. It then summarizes different types of experimental designs such as completely randomized design, randomized block design, Latin square design, and factorial designs. Key points about each design are highlighted, along with examples to illustrate how they are applied.
This document describes how modeling and simulation (M&S) can be used to project outcomes for clinical trials. M&S involves building statistical models based on incoming patient data and then simulating the remainder of the study multiple times. This allows researchers to predict milestones, test alternative scenarios, and validate study assumptions. The document provides examples of how M&S was used to accurately forecast timelines and inform decisions for trials experiencing issues with enrollment rates and event rates differing from initial assumptions. Management found the simulations to be very valuable for planning by providing projections when other methods would have involved guessing.
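The M&S loop described above (fit a model to accrual so far, then simulate the remainder of the study many times) can be sketched generically. The rates and targets below are invented, and weekly accrual is modeled with a simple Poisson draw rather than whatever model the original work used:

```python
# Monte Carlo projection of time to reach an enrollment target,
# starting from the patients already accrued.
import math
import random

def poisson(lam):
    """One Poisson draw via Knuth's algorithm (fine for small lam)."""
    L, k, p = math.exp(-lam), 0, 1.0
    while p > L:
        k += 1
        p *= random.random()
    return k - 1

def project_weeks(enrolled, target, rate_per_week, n_sims=2000):
    """Return (median, 95th percentile) of weeks until target is met."""
    out = []
    for _ in range(n_sims):
        n, weeks = enrolled, 0
        while n < target:
            n += poisson(rate_per_week)   # simulated weekly accrual
            weeks += 1
        out.append(weeks)
    out.sort()
    return out[len(out) // 2], out[int(0.95 * len(out))]

random.seed(1)
median, p95 = project_weeks(enrolled=120, target=300, rate_per_week=6.0)
print("median weeks:", median, " 95th percentile:", p95)
```

Re-running the projection as new accrual data arrive (and re-fitting the rate) is what turns this from a one-off guess into the rolling forecast the summary describes.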
This document introduces Tolstoy Targets, a visualization method using radial axes to provide a concise summary of multiple objectives or attributes. It discusses principles like using traffic light colors to indicate success or failure of predefined targets. Conventions are outlined, such as grouping attributes by direction and adding confidence ranges. Practical examples demonstrate comparing projects, mass screening of enzymes, transfusion risks for multiple patients, and assessment scores. The document concludes by providing contact information for the author.
This document describes how modeling and simulation (M&S) can be used to project timelines and resource needs for clinical trials. M&S involves building statistical models based on incoming patient data and then simulating the remainder of the study multiple times. This allows researchers to predict milestones, test alternative scenarios, and validate study assumptions. The document provides examples of how M&S accurately predicted timelines for trials with complex multi-segment designs and competing risk events. Study managers found the projections from M&S to be very valuable for planning purposes.
This document discusses metrics for assessing the performance of randomization methods in clinical trials. It proposes measuring randomness using potential selection bias, which calculates how well an observer could guess the next treatment assignment based on previous assignments. It also considers periodicity to detect patterns. Balance is measured using efficiency loss, which quantifies the increase in variability due to imbalances. The document outlines a simulation study comparing randomization methods using these proposed metrics. Stratification factors are modeled using a Zipf-Mandelbrot distribution to generate realistic subgroup sizes. Randomness and balance metrics are calculated at interim analyses and summarized graphically.
This document discusses metrics for assessing the predictability and efficiency of covariate-adaptive randomization designs in clinical trials. It proposes measuring predictability using a modified Blackwell-Hodges potential selection bias metric that calculates how well an observer could guess the next treatment assignment. It also considers entropy and periodicity measures. Balance/efficiency is proposed to be measured using Atkinson's method of quantifying the loss of statistical power as an equivalent reduction in sample size due to treatment imbalances within subgroups. The document then outlines a simulation study to compare various randomization methods using these proposed metrics.
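The Blackwell-Hodges-style guessing metric described above can be illustrated with the classic "convergence" strategy: the observer always guesses the arm assigned less often so far, since restricted randomization tends to correct imbalances. The excess of correct guesses over the 50% chance level measures predictability. A hedged sketch, not the metric's exact form in the document:

```python
# Fraction of assignments a convergence-strategy observer guesses
# correctly; 0.5 means unpredictable, values near 1 mean highly
# predictable (e.g. deterministic alternation).
import random

def convergence_guess_rate(assignments):
    counts = {"A": 0, "B": 0}
    correct = 0
    for a in assignments:
        if counts["A"] < counts["B"]:
            guess = "A"                      # guess the under-represented arm
        elif counts["B"] < counts["A"]:
            guess = "B"
        else:
            guess = random.choice("AB")      # coin flip when tied
        correct += (guess == a)
        counts[a] += 1
    return correct / len(assignments)

random.seed(2)
# Strict alternation: every post-tie assignment is guessed correctly
print(convergence_guess_rate(list("ABABABABAB")))
```

Complete randomization would drive this rate toward 0.5 at the cost of worse balance, which is exactly the randomness-versus-efficiency trade-off these metrics are built to expose.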
Similar to Sweitzer,Simulating Multi Phase Studies
This document discusses methods for assessing randomization in clinical trials that use covariate-adaptive randomization designs. It presents metrics for measuring randomness, balance, and efficiency loss in randomization schemes. The document outlines a simulation approach and discusses results from comparing different randomization factors and sample sizes. It proposes future directions such as optimizing randomization parameters and exploring periodicity in system behavior.
- Simulations of clinical trial randomization methods showed consistent trade-offs between efficiency and unpredictability over different methods and parameters. No single best method optimized both metrics.
- Two metrics were used to evaluate predictability (potential for selection bias) and efficiency (loss of statistical power): simulations revealed clear trade-offs between higher predictability and lower efficiency.
- As sample size increased, most methods became more efficient while some also became more predictable and others less predictable, depending on the method. Permuted blocks, dynamic allocation, and complete randomization were among the methods evaluated.
The document discusses randomization in clinical trials. It explains that randomization is important to minimize biases and balance treatment groups. Different randomization methods are presented: complete randomization, minimization, and permuted blocks. Metrics for evaluating randomization like balance, predictability, and loss of power are covered. Simulations comparing methods in terms of confounding factors, overall performance, and discontinuing patients are described. The importance of balanced treatment groups for sufficient statistical power and avoiding light weight results is emphasized.
Splatter plots provide:
(1) A comprehensive yet reducible way to visualize data across multiple dimensions.
(2) Diagnostic insights are obvious and interpretable at a glance, with problem areas visually identified.
(3) Various symbols, colors and visual cues can be used depending on the type of data and desired level of precision needed.
1. Signs of the Timings: Predicting Time of Completion in Multiphase Survival Trials
Dennis Sweitzer
Ali Falahati
Delaware Chapter of the ASA
September 2006
3. The Protocols
Outcome:
• Time to Randomized Relapse
Open Label Phase
– Up to 36 weeks
– Patients must be stable for 12 weeks before randomization
– High withdrawal rate (30-70%)
– Assumed 50% randomize
Randomized Phase
– Up to 104 weeks
– High withdrawal rate
– Assumed 30% Relapse rate
– Trial could not end until the last patient had been randomized >28 weeks
4. Sensitivity to Relapse & Discontinuation Rates (1)
(Figure: cumulative patient statuses as the trial progresses)
• Low discontinuation rate relative to relapse rate: 100 Relapses reached ~Sep
• Wrong assumptions: wait longer
5. Sensitivity to Relapse & Discontinuation Rates (2)
• Higher event rates deplete the patient pool
– Plan to stop enrollment as soon as certain of reaching 100 Relapses: ~July
• Higher discontinuation rate, lower relapse rate
– Large delays; may never reach the goal
6. Stopping Enrollment
Stopping Criteria:
• At least 227 Relapses
• All patients still in Randomized Phase complete at least 28 weeks of treatment
Ideally:
• The 227th Relapse occurs shortly after all patients are randomized >28 wks (Per Protocol)
• Randomization is closed when all enrolled patients randomize or discontinue
The 28-week requirement was later dropped (Protocol Amendment)
Presentation: use 200 Relapses
7. The Problems
Long lead times:
• Up to 36 weeks before randomization
• Plus 28 weeks minimum randomization
Ideally: stop enrollment 64 weeks before reaching the target #Relapses
Must account for:
• Enrollment D/C (30%-70%)
• Randomized D/C (D/C Rate ≈ Relapse Rate)
• Relapse rates vary (higher Relapse rate early)
• Competing risks (D/C vs Relapse)
• Sensitivity to rates (close rates mean high variability)
8. Stopping Enrollment: Issues
Too Early:
• Fewer patients, fewer randomized
• Longer wait for target #Relapses
• May never reach target #Relapses
Too Late:
• Higher certainty of reaching goal
• Patients possibly in Open Label at end
• Excessive #Relapses at end of study
• Ethics of randomizing excess # of patients
9. Many Management Questions
• When do we stop enrollment while being sure of eventually getting the target # of Relapses?
• When can we stop randomization & ensure reaching the target?
• What's the earliest and latest we can expect to reach the target?
• When will all patients be randomized >28 wks?
• When can the trial be halted (required # of Relapses & all patients randomized >28 wks)?
• Estimated randomization rate?
• Estimated Relapse rate?
• How many active patients at the end?
• How well does the outcome match our assumptions?
• etc.
10. Simulation Solution
• Make a stochastic model of the trial
• Monthly:
– Base model parameters on blinded data observed to date
– Incorporate assumptions where data insufficient
– Incorporate uncertainty of parameters
– Execute 1000’s of simulations of the trial
– Compute statistics from the collection of simulated trials
– Repeat with new data
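The monthly loop above can be sketched in miniature. Everything here is a placeholder assumption: the patient count, the event probabilities, and the helper names are illustrative, and the time-to-event structure is collapsed to simple per-patient Bernoulli outcomes rather than the deck's survival-curve machinery.

```python
import random

def simulate_remaining_trial(n_remaining, p_randomize, p_relapse, rng):
    """One simulated completion: each remaining patient either discontinues
    in open label, or randomizes and then relapses or discontinues."""
    relapses = 0
    for _ in range(n_remaining):
        if rng.random() < p_randomize:      # survives open label to randomization
            if rng.random() < p_relapse:    # relapses before discontinuing
                relapses += 1
    return relapses

def monthly_projection(n_sims, n_remaining, p_randomize, p_relapse, seed=2006):
    """Repeat the simulated trial many times and summarize the spread of
    relapse counts (the monthly 'compute statistics' step)."""
    rng = random.Random(seed)
    counts = sorted(simulate_remaining_trial(n_remaining, p_randomize, p_relapse, rng)
                    for _ in range(n_sims))
    return {"p10": counts[n_sims // 10],
            "median": counts[n_sims // 2],
            "p90": counts[(9 * n_sims) // 10]}

# Protocol-style assumptions: 50% randomize, 30% relapse (placeholders)
summary = monthly_projection(n_sims=2000, n_remaining=1500,
                             p_randomize=0.50, p_relapse=0.30)
```

Re-running this each month with rates re-fit to the blinded data accumulated so far is what turns a one-off projection into the rolling forecast the slides describe.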
11. Advantages
Transparency
• Modeling assumptions can be: Specified -- Graphed -- Debated
Data Driven
• New data updates the model
• Existing active patients are simulated to the end
• Assumptions become less important as data accumulates
12. Vision of Output
• Simulation reports varied according to changing team needs:
– How many open label patients on June 1?
– When will we reach 150 Relapses?
– How many randomized patients at the time of the 200th Relapse?
– If we stop randomizing on May 15, how many open label patients will there be?
13. Stochastic Modeling Approach
1. Make a cartoon model of a patient's progress through the trial
2. What final outcomes are possible?
3. What could happen to the patient?
4. Identify States through which a patient passes
5. Identify Random Processes which take patients between states
14. Stochastic Model
(Diagram: Enrolling Patients → Open Label Patients → Randomized Patients → Relapses, with discontinuation possible from both the Open Label and Randomized Phases)
Continuous Time Markov Chain
• Markov States: the bubbles
• Transitions: the arrows
• Transition probabilities change with time in state
15. States
(Diagram: the five Markov states of the model)
2 Transitory Markov States:
• Open Label Phase
• Randomized Phase
3 Terminal Markov States:
• Discontinued from Open Label Phase
• Discontinued from Randomized Phase
• Randomized Relapses
16. Transition Processes
(Diagram: the random transitions between the five states)
5 Random Transition Processes:
1. Trial Enrollment (Start Open Label)
2. Discontinuation from Open Label Phase
3. Randomization (from Open Label Phase)
4. Discontinuation from Randomized Phase
5. Randomized Relapses
17. Trial Enrollment
(Diagram: Enrolling Patients → Open Label Patients)
For each simulated patient, generate a random length of time since the last patient:
• Pick an enrollment rate λ (based on history & judgment)
• Assume: #pts/mo ~ Poisson process with mean λ
• Time between patients ~ Exponential with mean 1/λ
Can expand the enrollment model to evaluate management options:
• Incorporate a mixture of site performances
• Adding/changing sites during the trial
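The enrollment process above can be sketched directly: exponential gaps between patients accumulate into arrival times. The rate and patient count below are illustrative, not the trial's values.

```python
import random

def simulate_enrollment(rate_per_month, n_patients, rng):
    """Enrollment as a Poisson process with mean `rate_per_month` patients
    per month: gaps between consecutive patients are Exponential with
    mean 1/rate, so cumulative sums give the arrival times."""
    t, arrival_times = 0.0, []
    for _ in range(n_patients):
        t += rng.expovariate(rate_per_month)  # random gap since last patient
        arrival_times.append(t)
    return arrival_times

rng = random.Random(42)
arrivals = simulate_enrollment(rate_per_month=50.0, n_patients=500, rng=rng)
# 500 patients at ~50/month should take roughly 10 months
```

Mixtures of site performances could be layered on by giving each site its own rate and merging the per-site arrival streams.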
18. Markov Process
(Diagram: from a given state, a patient either continues, discontinues, or relapses/randomizes)
P_ij(s,t): probability of transitioning from state i to state j between times s and t
19. Aalen-Johansen Estimator of Transition Probabilities
• Aalen-Johansen estimator of the transition probability matrices, built from two counts for each pair of states h ≠ j and visit time t:
– d_hj(t): # of observed direct transitions from state h to state j, visits 1 to t
– n_h(t): # of patients in state h just prior to visit t
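The estimator's formula itself did not survive extraction. Reconstructed here from its standard textbook form (not from the original slide), using the counts defined above, it reads:

```latex
\hat{P}(s,t) = \prod_{s < t_k \le t} \bigl( I + \Delta\hat{A}(t_k) \bigr),
\qquad
\Delta\hat{A}_{hj}(t_k) = \frac{d_{hj}(t_k)}{n_h(t_k^-)} \quad (h \ne j),
\qquad
\Delta\hat{A}_{hh}(t_k) = -\sum_{j \ne h} \Delta\hat{A}_{hj}(t_k)
```

where the product runs over the observed visit times $t_k$, $d_{hj}(t_k)$ counts observed direct $h \to j$ transitions at $t_k$, and $n_h(t_k^-)$ counts patients in state $h$ just prior to $t_k$.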
20. Aalen-Johansen & Kaplan-Meier
• Generalization of Kaplan-Meier estimation to non-homogeneous Markov Chains
• K-M estimators are easier:
– To program (already in SAS)
– To understand (intuitive)
– To explain (familiar)
21. Models
• Enrollment: Poisson Process
• Open Label Phase: Competing Risk Model
0 = Still in OL Phase
1 = Randomized
2 = Discontinued from OL Phase
• Randomized Phase: Competing Risk Model
0 = Still in Randomized Phase
1 = Manic event
2 = Depressed event
3 = Discontinued from Randomized Phase
22. Competing Risk Model
Mutually exclusive events (e.g., Relapse vs Discontinuation, …)
2 approaches (Pintilie, 2006):
• Jointly distributed random variables
• Latent failure times
– Assume both events eventually occur, but we only observe the first
– Use only the marginal distributions, assuming independence between events
– Cannot test for independence if only the 1st event is observed
– Independence: face validity & the simplest assumption
23. Kaplan-Meier Simulation
• Assume events are independent
• Model each process separately using Kaplan-Meier estimators
• Censor on the other event at the current time in trial
• Simulate each event separately
• The earliest of the 2 simulated processes is taken as the simulated outcome
• Caveat: assumes independent processes
Intuitive, easy to understand, easy to explain
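The competing-risk recipe above can be sketched as follows. The two survival curves are made-up placeholders, not the trial's K-M estimates, and the step-function sampling is a simplification of the interpolation described later in the deck.

```python
import random

def sample_event_time(survival_curve, rng):
    """Inverse-transform sample from a step K-M curve given as
    [(t, S(t)), ...] with S decreasing: draw p uniform and return the
    first time at which the curve drops to or below p."""
    p = rng.random()
    for t, s in survival_curve:
        if s <= p:
            return t
    return float("inf")   # event not reached within observed follow-up

# Illustrative curves (weeks, survival prob.) -- NOT the trial's estimates
relapse_curve = [(4, 0.90), (12, 0.70), (28, 0.50), (52, 0.35)]
dropout_curve = [(4, 0.95), (12, 0.85), (28, 0.75), (52, 0.65)]

def simulate_patient(rng):
    """Competing risks under independence: simulate each process
    separately and keep whichever simulated event comes first."""
    t_relapse = sample_event_time(relapse_curve, rng)
    t_dropout = sample_event_time(dropout_curve, rng)
    if t_relapse == t_dropout == float("inf"):
        return ("censored", float("inf"))
    if t_relapse <= t_dropout:
        return ("relapse", t_relapse)
    return ("dropout", t_dropout)
```

Taking the minimum of two independently simulated times is exactly the "earliest of the 2 simulated processes" rule, and it silently encodes the independence caveat the slide warns about.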
24. Open Label Transitions
(Diagram: Open Label Patients → Discontinued or Randomized)
2 Competing Processes: Discontinuation and Randomization
1. Generate a random Discontinuation time
2. Generate a random Randomization time
3. Use the earliest event
25. Randomized Phase Processes
(Diagram: Randomized Patients → Discontinued or Relapsed)
2 Competing Processes: Discontinuation and Relapse
Choose the event as previously described.
• Current Open Label patients are simulated to randomization or discontinuation
• If randomization is simulated, then simulate Randomized Discontinuation or Relapse
26. Generic Transition Process
Q: When to make the transition? State
State
A: First: estimate random transition "A" "B"
function
1. Generate K-M Survival Functions from data
(censoring on all other events)
2. Make assumptions about Survival beyond last event
?
27. Simulated Patients
State State (p, t)
"A" "B" p
Q: When to make
the transition? t
A: Second: Simulate Trials
For Each Simulated Trial:
• For each simulated patient within a trial
– Pick a random p∈(0,1)
– Interpolate t from the graph, so that (p, t) is on graph
28. Simulating Active Patients
State State
q (q, s)
"A" "B"
(q*p, t)
q*p
For each simulated trial s t
• For each observed patient within state “A” for time s
– Interpolate q∈(0,1) from the graph, so that (q, s) is on graph
– Pick a random p∈(0,1)
– Interpolate t from the graph, so that (q*p, t) is on graph
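The q·p trick above conditions on the time a patient has already survived in a state. A minimal sketch, with an illustrative curve (not trial data) and simple linear interpolation as an assumed implementation detail:

```python
import random

def interp_surv(curve, time):
    """Survival probability at `time`, interpolating linearly on a curve
    [(t, S(t)), ...] that starts from S(0) = 1."""
    prev_t, prev_s = 0.0, 1.0
    for t, s in curve:
        if t >= time:
            frac = (time - prev_t) / (t - prev_t)
            return prev_s + frac * (s - prev_s)
        prev_t, prev_s = t, s
    return curve[-1][1]   # flat beyond the last observed event

def interp_time(curve, target):
    """Time at which the interpolated curve first reaches `target`."""
    prev_t, prev_s = 0.0, 1.0
    for t, s in curve:
        if s <= target:
            frac = (prev_s - target) / (prev_s - s)
            return prev_t + frac * (t - prev_t)
        prev_t, prev_s = t, s
    return float("inf")   # never reaches target within follow-up

def simulate_active_patient(curve, time_in_state, rng):
    """Condition on survival so far: read q at the observed time s, draw
    p uniform, and return the time at survival level q*p (never < s)."""
    q = interp_surv(curve, time_in_state)
    p = rng.random()
    return interp_time(curve, q * p)

curve = [(4, 0.90), (12, 0.70), (28, 0.50), (52, 0.35)]  # illustrative only
```

Because q·p ≤ q, the sampled time always lies at or beyond the patient's observed time in state, which is the whole point of the construction.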
29. Incorporating Parameter Uncertainty
For each simulated trial:
• Pick a random quantile r ∈ (0,1)
• Simulate all patients using the r%-tile confidence level of the Kaplan-Meier curve
Simulates combinations of high & low estimates of the Event and D/C survival curves
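One way to realize the random-quantile idea is to shift the whole curve to its r-quantile confidence level. The deck does not say which interval construction was used, so the plain normal approximation below, and the point estimates and standard errors themselves, are assumptions for illustration.

```python
import random
from statistics import NormalDist

# Illustrative K-M point estimates with Greenwood-style standard errors:
# (t_weeks, S_hat, se) -- placeholders, not the trial's data
curve = [(4, 0.90, 0.02), (12, 0.70, 0.03), (28, 0.50, 0.04), (52, 0.35, 0.04)]

def curve_at_quantile(curve, r):
    """Shift the whole survival curve to its r-quantile confidence level via
    a plain normal approximation S_r(t) = S(t) + z_r * se(t), clipped to
    [0, 1]. (An assumption; a log-log transform is another common choice.)"""
    z = NormalDist().inv_cdf(r)
    return [(t, min(1.0, max(0.0, s + z * se))) for t, s, se in curve]

def simulated_trial_curve(curve, rng):
    """For each simulated trial, pick one random quantile r and simulate all
    of that trial's patients from the r%-tile curve."""
    r = rng.uniform(0.001, 0.999)   # avoid the degenerate endpoints
    return curve_at_quantile(curve, r)

rng = random.Random(3)
trial_curve = simulated_trial_curve(curve, rng)
```

Drawing one quantile per trial, rather than per patient, is what makes whole simulated trials run systematically "fast" or "slow", reproducing the high/low combinations the slide mentions.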
30. Limitations
Requires representative data from all phases:
• K-M estimates extend only through the last event
• Assumptions must be made about the hazard rate after the last available event(s)
– If assumptions are correct, point estimates should be stable while confidence intervals narrow
• Up-to-date data
– Special reporting of Relapses (faxes with follow-up, monitoring)
– IVRS, EDC, monitoring reports
• Heterogeneity:
– Earliest sites may not be representative of all sites
– Procedures may change (hopefully improve) over time
– Regional differences (standards of care, patient attitudes, etc.)
31. Why Not a Parametric Model?
Trial structure complicates any parametric form:
• Continuous time? Events tend to occur at visits
• Discrete time? Visits vary in spacing
• Active Tx implies a mixture model
• Hazard changes over time
• Must make & defend simplifying assumptions
32. Diagnostic: Does It Fit?
Survival curves of:
Observed data vs. Simulated Data
(Censored Observed, Active OL Pts, Active Rand. Pts, Entirely simulated Pts)
33. Diagnostics
Plot K-M curves for each event, time in each phase
• Review assumptions (long term behavior)
• Identify data anomalies
• Identify simulation problems
34. Example: Regional Heterogeneity
Regional modeling (Trials A & B): parameters varied by region more than by trial
– Estimate parameters within regions
– Simulate patients with Trial and Region
– Summarize results by Trial
In addition to simulations which ignored region
(Figure: survival curves followed 2 patterns by region & trial)
35. Reporting the Simulations
For each simulated Trial:
• Sort Patient Events by occurrence date (enrollment,
randomization, relapse, etc)
For each scenario
• Summarize over Event records which fit scenario
Examples:
• Summarize over all patients enrolled before a potential
enrollment cutoff date.
• … over all patients randomized before a cutoff date
• Summarize with and without a subset of sites
36. Changing Questions
• Early in Trial
– Are the protocol assumptions accurate?
– When to stop enrollment?
– Expected # patients (enrolled, randomized, etc)
Identify problems & Evaluate fixes
• Mid-Trial
– When to stop randomizing patients?
– Are the revised assumptions accurate?
– Were changes effective?
• Late Trial
– When will the last Relapse occur?
– How many patients will be active in various phases?
Plan for Closeout & Database lock
37. Early Trial
For each Enrollment Cutoff Date:
• Summarize each trial for all patients enrolled before that date
• Compute statistics over simulated trials
Some trial outcomes:
• Dates of: last Relapse; all patients randomized >28 weeks; Per-Protocol completion (>target & >28 weeks)
• Event & patient counts at each of the above milestones
• % of simulations with ≥200, 190, 180, … Relapses at milestones
• # of active patients (open label or randomized) at given dates & milestones
Pick a cutoff date accordingly (e.g., minimize resources with the least risk of running late)
38. Trial Completion (1)
≥227 Relapses & all active patients randomized 28 wks
Earliest completion:
• 75% certainty: 2 August cutoff for Nov 2006 completion
• 90% certainty: 1 Sep cutoff for Dec 2006 completion
39. Trial Completion (2)
≥227 Relapse events & all active patients randomized 28 wks
Earliest completion:
• 75% certainty: ~1775 enrolled for Nov 2006 completion
• 90% certainty: ~1850 enrolled for Dec 2006 completion
45. Example: Will a trial end?
Study E:
• Endpoint: 300 type 1, 300 type 2 events
• Slower & Fewer than expected
• Simulation predicted:
– 10% chance of 300 of each
– 87% chance of 600 total
• Interim Analysis
– Supported by simulation
– 300 total expected April 25
46. Mid-Trial Output
For each Randomization Cutoff Date
• Summarize each trial for all patients
randomized before that date
• Compute statistics over simulated trials
• Generates same statistics per trial
• Summarize for each randomization cutoff date
Essentially, replace “enrollment” with
“randomization” & execute as before
48. Late-Trial Output
Refine estimates of the last Relapse, etc.
For milestones & calendar dates:
• Estimate # of patients in each stage (e.g., how many patients will be active at the end?)
Caveats:
• Corrected (or just collected) data may change estimates
• Old, unreported Relapses may be discovered
Useful:
• Predict the time from a milestone to the end
• Add that prediction to the best-guess milestone
49. Bottom line: How accurate?
Not bad:
• Actual Date of 200th Relapse covered by
predicted 80% C.I.
• Width of C.I.s narrowed over time
53. Value Added
Early Refinement of Protocol Assumptions
• Protocol: 50% randomized, 30% Relapse rates
• Trial A: 33%, 37% Trial B: 55%, 41%
Early Identification of problems
Quick Response to problems
• Changed procedures to improve retention in Trial A
• Added sites to Trial B after delays in starting up sites
Better allocation of resources
54. Trials C, D
Mid trial: regulators requested an analysis of late Relapses
• Enrollment had already ended for Trial C
– Enough patients to reach 88 late Relapses?
• Enrollment was still ongoing for Trial D
– Extend enrollment how long? Add sites?
• How would this affect timelines?
57. Dirty Data Problems
• Some known Relapses are not usable due to missing data
(e.g., unknown randomization date)
• Corrected (or lately collected) data may change estimates
• Old, unreported Relapses may be discovered
• Data may be collected or corrected irregularly
– Separate data sources (e.g., Relapse log + IVRS)
– Drift & shift over time
– End of trial data clean up
Solutions:
• Estimate time between milestones
– Anchor to a known, early milestone
Future solutions:
• Estimate missing data effects
– Use time between Relapse occurrence and Relapse reporting
– Estimate number of missing Relapses from times in past
58. Some Feedback
• "… the simulations are very valuable and the only way we have to plan our timelines. As it has turned out, your simulations seems to be pretty accurate as we have increased the mood event rate significantly … as predicted…"
• "… We would have been guessing and spinning our wheels without them."
• "Could you simulate trials xyz & uvw?"
60. Diagnostics: Cohort Analysis
Cohorts:
• By month enrolled
• By month Randomized
Calculate randomization & Relapse rates
Easy to understand
Multiple Estimates which must be reconciled
Doesn’t Provide Time to Relapses
Useful reality check on Simulation
Point Estimates insufficient: need C.I.s
65. Open Label Cohorts (Cumulative Statuses)
• ~ 64% eventually Randomize & >40% Relapse Rate
• # open label pt < N/(0.64*0.40)
66. Solutions?
Crude Relapse Rates of all Patients in Phase
Mixture of patients:
• Relapse & D/C rates change with exposure
• Mixture of Pt. Exposures changes with time
Cohorts: Track Relapses over Time
Easy to understand
Multiple Estimates which must be reconciled
Doesn’t Provide Time to Relapses
Useful reality check on Simulation
Point Estimates insufficient: need C.I.s
67. Clinical Trial Management
Planning Trials (future)
• Is a trial feasible?
• Sensitivity to assumptions?
• Costs: # Pts, # Pt-mos, #visits, #Sites, #Site-mos
Trial Execution (current)
• Anticipate delays
• No information on outcome
• Could be added to simulation
Program Planning (future)
• Replace “Trial Phases” with “Toll Gates”
• Enhance modeling of Trial Enrollment process
68. Example: Adding Sites
(Diagram: New Sites → Enrolling Patients → Randomized Patients → Relapses/Discontinued)
• Drop the OL Phase, expand the enrollment process
• Simulate the time to start up a new site, pts/mo at a new site, etc.
• Report by # of additional sites instead of cutoff dates