Generating survival data with a clustered, multi-state structure is useful for studying multi-state models, competing risks models, and frailty models. Simulating such data is not straightforward, because one must introduce dependence between the times of different transitions while controlling the probability of each competing event, the median sojourn time in each state, the effects of covariates, and the type and magnitude of heterogeneity.
Here we propose a simulation procedure based on Clayton copulas for the joint distribution of the times within each block of competing events. It allows the marginal distributions of the time variables to be specified, while their dependence is induced by the copula. Moreover, although dependence is obtained among all the time variables, only a few joint distributions need to be handled. The simulation parameters are chosen by numerical minimization of a criterion function based on the ratios of target to observed values of the median times and of the probabilities of the competing events.
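The core sampling step can be sketched as follows. This is a generic Marshall-Olkin sampler for a bivariate Clayton copula with Weibull margins, not the authors' actual implementation; the shape and scale values, and the variable names t_relapse and t_metastasis, are illustrative assumptions only.

```python
import numpy as np

def clayton_copula_sample(n, theta, rng):
    """Marshall-Olkin sampling: V ~ Gamma(1/theta), U_j = (1 + E_j/V)^(-1/theta)."""
    v = rng.gamma(shape=1.0 / theta, scale=1.0, size=n)
    e = rng.exponential(size=(n, 2))
    return (1.0 + e / v[:, None]) ** (-1.0 / theta)

def weibull_inverse_cdf(u, shape, scale):
    """Map copula uniforms to event times via the Weibull inverse CDF."""
    return scale * (-np.log(1.0 - u)) ** (1.0 / shape)

rng = np.random.default_rng(0)
u = clayton_copula_sample(100_000, theta=2.0, rng=rng)   # Kendall's tau = theta/(theta+2) = 0.5
t_relapse = weibull_inverse_cdf(u[:, 0], shape=1.2, scale=24.0)     # hypothetical margins
t_metastasis = weibull_inverse_cdf(u[:, 1], shape=0.9, scale=36.0)
```

Each margin can be specified freely, as in the text, while the copula parameter theta alone controls the strength of dependence between the two times.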
The proposed method further allows discrete and continuous covariates to be simulated, with their effect on each transition specified in a proportional hazards fashion. A frailty term can also be added in order to induce clustering. No particular restrictions are placed on the covariate distributions, the frailty distribution, or the number and sizes of clusters.
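A minimal sketch of this covariate-and-frailty layer, assuming a Weibull baseline hazard, one binary covariate, and a mean-one gamma frailty shared within clusters; all parameter values here are hypothetical, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n_clusters, cluster_size = 50, 20
shape, scale = 1.3, 30.0           # Weibull baseline hazard, hypothetical values
beta = 0.7                         # log hazard ratio of a binary covariate
frailty_var = 0.4

# one shared gamma frailty (mean 1, variance frailty_var) per cluster
w = rng.gamma(1 / frailty_var, frailty_var, size=n_clusters)
frailty = np.repeat(w, cluster_size)
x = rng.binomial(1, 0.5, size=n_clusters * cluster_size)

# invert the cumulative hazard H(t) = frailty * exp(beta*x) * (t/scale)^shape
u = rng.uniform(size=x.size)
t = scale * (-np.log(u) / (frailty * np.exp(beta * x))) ** (1 / shape)
```

Because the frailty multiplies the hazard of every subject in a cluster, times within a cluster are positively associated, while the covariate acts proportionally on each transition hazard.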
An example is provided in which we simulate data mimicking those from an Italian multi-center study on head and neck cancer. The multi-state structure of these data arises from the interest in studying both time to local relapse and time to distant metastasis before death. We show that the proposed method converges very closely to the target values.
The document discusses inferring gene regulatory networks from time-course gene expression data. It presents the problem of inferring interactions between genes from high-dimensional and sparse time-course microarray data. It proposes using a Gaussian graphical model and introducing biologically grounded priors, such as sparsity and latent clustering of networks, to help address the scarcity of data. Several statistical models and algorithms are described for performing regularized inference on the networks with and without using a known or inferred latent clustering structure. The methods are evaluated on simulated time-course data and a real E. coli S.O.S DNA repair network.
Analyzing the effectiveness of different treatments for cancer typically involves collecting data from large clinical studies or animal testing. Such testing is always needed as a final validation for establishing treatment safety, but can be very costly and labor-intensive. Developing alternative testing approaches to be used in preliminary stages of evaluating new treatment strategies would be a great aid in speeding up research and development. We consider the use of mathematical models to describe the progression of cancer and how the influence of anti-cancer drugs can be incorporated into these models.
There are many different forms of cancer, but several types share similar mechanisms for how they start and spread. The basic understanding of metastatic cancer consists of the following general stages:
1. The disease starts from a single primary tumor which grows in one location.
2. The primary tumor will start to shed cancer cells which get carried to other parts of the body by the circulatory or lymphatic systems.
3. These cells will attach to other organs and start new secondary tumors, called metastases (metastatic tumors).
4. The metastases grow and will shed cancer cells to produce more tumors. Such rapid spreading of cancer (also called “progression”) typically leads to multiple organ failure and fatality.
While there are many different types of clinical studies of cancer, standards (RECIST) have been defined for many aspects of such studies, including how to measure tumors and what data should be recorded. These articles are also good sources for descriptions of the stages of progression of cancer and of the practical limitations on what data can be collected in clinical trials.

Figure 1: Two schematic representations of the spread of metastatic cancer: (left) from trialx.com; (right) from www.cancer8.com.
An important question about the RECIST standards is whether collecting more clinical data could provide better assessments of progression and lead to more accurate models and better treatment protocols. Statistics and mathematical analysis can be applied to address these issues. Practical factors (effort, expense, record keeping, intrusiveness) have shaped the current standards and limited the amount and type of data collected in current studies. If the benefits of increased data collection for guiding treatments could be demonstrated, this might lead to valuable improvements in the standards. Studies differ in their conclusions about how strongly the probability of fatality correlates with tumor growth, but for our work we will focus on the increase in total tumor mass as a general descriptor of the progression of the disease.
The document discusses Approximate Bayesian Computation (ABC), a computational technique for Bayesian inference when the likelihood function is intractable. ABC allows sampling from the likelihood and making inferences based on simulated data without calculating the actual likelihood. The technique originated in population genetics models where likelihoods for genetic polymorphism data cannot be calculated in closed form. ABC is presented as both an inference machine with its own legitimacy compared to classical Bayesian approaches, as well as a way to address computational issues with intractable likelihoods.
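A toy ABC rejection sampler makes the idea concrete; the model, prior, summary statistic, and tolerance below are all illustrative choices, not taken from the document.

```python
import numpy as np

rng = np.random.default_rng(2)
data = rng.normal(loc=3.0, scale=1.0, size=100)   # "observed" data, true mean 3.0
obs_mean = data.mean()

# ABC rejection: draw theta from the prior, simulate data of the same size,
# and keep theta whenever the simulated summary is close to the observed one
n_draws, eps = 50_000, 0.05
prior_draws = rng.uniform(-10.0, 10.0, size=n_draws)
sim_means = rng.normal(loc=prior_draws, scale=1.0,
                       size=(100, n_draws)).mean(axis=0)
accepted = prior_draws[np.abs(sim_means - obs_mean) < eps]
```

The likelihood is never evaluated: only the ability to simulate from the model is required. With a flat prior, the accepted draws concentrate around the observed sample mean, approximating the analytic posterior as the tolerance shrinks.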
This document provides an overview of digital modulation and coding fundamentals. It introduces key concepts such as lowpass and bandpass signals, signal space concepts, and orthogonal expansion of signals. Modulation and demodulation of bandpass signals is discussed through translating a baseband signal to a higher frequency bandpass signal and vice versa.
The document describes Bayesian inference for chemical reaction networks using approximate models. It discusses representing networks with reactions and rate constants, and using a Gibbs sampler to infer rate constants from time course data. However, exact inference does not scale well. The document proposes using a particle filter to perform approximate Bayesian filtering by simulating reaction paths between observations. This allows inference for realistic systems by treating concentrations continuously and using approximate simulators like SDEs or ODEs.
This document discusses an online EM algorithm and some extensions. It begins by outlining the goals of maximum likelihood estimation, good scaling, processing data incrementally without storage, and simple implementation. It then provides an overview of the topics covered, which include the EM algorithm in exponential families, the limiting EM recursion, the online EM algorithm, using online EM for batch maximum likelihood estimation, and extensions. The document uses a Poisson mixture model as a running example to illustrate the E and M steps of the EM algorithm.
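For the Poisson-mixture running example, the online E and M steps might look like the following sketch; the step-size schedule n^{-0.6} and the short burn-in before re-maximizing are common stochastic-approximation choices, not necessarily the document's.

```python
import math
import numpy as np

rng = np.random.default_rng(3)
# synthetic stream from a two-component Poisson mixture
true_w, true_lam = np.array([0.4, 0.6]), np.array([2.0, 10.0])
z = rng.choice(2, size=20_000, p=true_w)
ys = rng.poisson(true_lam[z])

w, lam = np.array([0.5, 0.5]), np.array([1.0, 5.0])   # initial guess
s1, s2 = w.copy(), w * lam                             # running sufficient statistics

for n, y in enumerate(ys, start=1):
    # E-step for one observation: posterior responsibilities under current params
    logp = np.log(w) + y * np.log(lam) - lam - math.lgamma(y + 1)
    r = np.exp(logp - logp.max())
    r /= r.sum()
    # stochastic-approximation update of the sufficient statistics
    gamma = n ** -0.6
    s1 += gamma * (r - s1)
    s2 += gamma * (r * y - s2)
    if n > 20:                                         # M-step after a short burn-in
        w, lam = s1 / s1.sum(), s2 / s1
```

Each observation is processed once and discarded, so memory use is constant in the length of the stream, matching the stated goals of incremental processing without storage.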
This document contains the homework assignment for Dr. Ashu Sabharwal's ELEC 430 class at Rice University due on February 19, 2009. It includes 3 exercises on topics related to signaling and detection:
1. The likelihood ratio test is derived for a binary communication system with additive white Gaussian noise. Conditional probability density functions are found and used to determine the probability of error as a function of threshold.
2. Matched filters are discussed for an antipodal signaling system. The impulse response and output of the matched filter are sketched. Expressions are derived for the noise variance and probability of error.
3. Orthogonal signal properties are explored for a set of signals that are modified by subtracting
This document presents a Bayesian joint model for longitudinal and time-to-event outcomes that allows for subpopulation heterogeneity using latent variables. It begins with motivational examples in HIV and prostate cancer research. It then reviews existing joint modeling approaches before introducing a new latent process model that models the longitudinal and survival outcomes conditional on a shared latent process. The document describes prior distributions, a Gibbs sampling algorithm, and simulation studies to evaluate the model under both Gaussian and non-Gaussian longitudinal distributions.
This document discusses applying renewal theorems to analyze the exponential moments of local times of Markov processes. It contains three main points:
1) If γ is greater than 1/G∞(i,i), the expected exponential moment grows exponentially over time.
2) If γ equals 1/G∞(i,i), the expected exponential moment grows linearly over time if H∞(i,i) is finite, and sublinearly otherwise.
3) If γ is less than 1/G∞(i,i), the expected exponential moment converges to a constant as time increases.
The analysis simplifies and strengthens previous results by framing the problem as a renewal
1) This document describes an optimal monetary policy model with 7 endogenous variables and 5 equilibrium conditions, leaving two degrees of freedom.
2) The model maximizes social welfare as the sum of period utilities from consumption and labor, subject to the equilibrium conditions.
3) The first order optimality conditions result in a system of 7 equations that can be solved using log-linearization methods around the non-stochastic steady state, similarly to previous examples.
1. The document provides an overview of Fourier analysis techniques including Fourier series, Fourier transforms, and their applications to signal representation and analysis.
2. Key concepts covered include representing periodic and aperiodic signals in the time and frequency domains, properties of linear and time-invariant systems, Parseval's theorem relating signal energy in the time and frequency domains, and the Fourier transforms of basic functions like impulses and complex exponentials.
3. The document establishes essential mathematical foundations for further study of analog and digital communications techniques that involve signal processing and transmission in the frequency domain.
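Parseval's theorem, mentioned above, is easy to verify numerically. Note that NumPy's unnormalized DFT convention puts a 1/N factor on the frequency side; the signal here is an arbitrary random test vector.

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.standard_normal(1024)
X = np.fft.fft(x)

# Parseval: energy in the time domain equals energy in the frequency domain
time_energy = np.sum(np.abs(x) ** 2)
freq_energy = np.sum(np.abs(X) ** 2) / x.size

# a circular time shift changes only the phase, not the magnitude spectrum
shifted_mag = np.abs(np.fft.fft(np.roll(x, 5)))
```

The second computation illustrates the time-shifting property: delaying a signal multiplies its transform by a complex exponential, leaving the energy at each frequency unchanged.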
The document discusses adaptive Markov chain Monte Carlo (MCMC) for Bayesian inference of spatial autologistic models. It notes that standard MCMC cannot be implemented when the likelihood function is unavailable or the completion step is too costly due to high dimensionality. Adaptive MCMC is proposed as an alternative that bypasses computation of the normalizing constant. Questions are raised about how to combine adaptations of the proposal distribution, tuning parameters, and sample sizes to improve the method.
Aristidis Likas, Associate Professor, and Christoforos Nikou, Assistant Professor, University of Ioannina, Department of Computer Science, Mixture Models for Image Analysis
The document introduces perturbation methods as a way to solve functional equations that describe economic problems. It presents a basic real business cycle model as an example problem that can be solved using perturbation methods. Specifically, it:
1) Defines the real business cycle model as a functional equation system that is difficult to solve directly.
2) Proposes using perturbation methods by introducing a small perturbation parameter (the standard deviation of technology shocks) and solving the problem when this parameter equals zero.
3) Expands the decision rules as Taylor series in terms of the state variables and perturbation parameter to build a local approximation around the deterministic steady state. This leads to a system of equations that can be solved order-by-order for
This document discusses heterogeneous agent models without aggregate uncertainty. It introduces a model with a continuum of agents who face idiosyncratic income fluctuations but no aggregate shocks. There is a unique stationary equilibrium with constant interest rates and wages. The document discusses the recursive competitive equilibrium, existence and uniqueness of the stationary equilibrium, transition functions, computation methods, and some qualitative results from calibrating the model.
The document discusses three examples of nonlinear and non-Gaussian DSGE models. The first example features Epstein-Zin preferences to allow for a separation between risk aversion and the intertemporal elasticity of substitution. The second example models volatility shocks using time-varying variances. The third example aims to distinguish between the effects of stochastic volatility ("fortune") versus parameter drifting ("virtue") in explaining time-varying volatility in macroeconomic variables. The document outlines the motivation, structure, and solution methods for these three nonlinear DSGE models.
This document discusses likelihood methods for continuous-time models in finance. It describes approximating the transition density function pX of a continuous-time process through a series of transformations to get closer to a normal distribution. This allows representing pX as a series expansion involving Hermite polynomials. Computing the expansion coefficients allows obtaining an explicit closed-form approximation to pX. Maximizing the approximate likelihood results in an estimator that converges to the true MLE as the number of terms increases.
Optimal control of coupled PDE networks with automated code generation - Delta Pi Systems
This document summarizes an approach for optimal control of coupled partial differential equation (PDE) networks using automated code generation. It discusses representing PDE networks as graphs, formulating the optimal control problem, deriving adjoint equations to compute gradients, discretizing control variables, and generating code to solve the direct and adjoint problems. Tools used include the DOT language for graph representation, SymPy for symbolic math, Cog for code generation, SfePy for PDE solvers, and SciPy for numerics.
The document summarizes the Wang-Landau algorithm and some of its improvements. The Wang-Landau algorithm is an adaptive Markov chain Monte Carlo method that iteratively estimates the density of states of a system. It partitions the state space into bins and iteratively adjusts estimates of the density within each bin so that the generated samples spend an equal amount of time in each bin. The algorithm has been improved through automatic binning methods, adaptive proposal distributions, and using parallel interacting chains. An example application to variable selection is also discussed.
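A toy Wang-Landau run makes the flat-histogram mechanics concrete. Here the "system" is just 8 independent binary spins with energy equal to the number of up spins, so the true density of states is the binomial coefficients; the 80% flatness criterion and the final modification factor are illustrative choices, not the document's.

```python
import numpy as np

rng = np.random.default_rng(5)
n_spins = 8
spins = rng.integers(0, 2, n_spins)      # toy system: 8 binary spins
e = int(spins.sum())                     # "energy" = number of up spins

log_g = np.zeros(n_spins + 1)            # running log density-of-states estimate
hist = np.zeros(n_spins + 1)
log_f = 1.0                              # modification factor, halved when flat

while log_f > 1e-4:
    i = rng.integers(n_spins)            # propose a single spin flip
    e_new = e + (1 - 2 * int(spins[i]))
    # accept with probability min(1, g(e_old) / g(e_new))
    if np.log(rng.uniform()) < log_g[e] - log_g[e_new]:
        spins[i] ^= 1
        e = e_new
    log_g[e] += log_f                    # penalize whichever bin is visited
    hist[e] += 1
    if hist.min() > 0.8 * hist.mean():   # flat-histogram check
        log_f /= 2.0
        hist[:] = 0

# normalize so the estimated state counts sum to 2**n_spins
g = np.exp(log_g - log_g.max())
g *= 2 ** n_spins / g.sum()
```

Because visits are penalized in proportion to the current estimate, the walk is pushed toward rarely visited energies, which is exactly what makes the samples spend roughly equal time in each bin.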
EM algorithm and its application in probabilistic latent semantic analysis - zukun
The document discusses the EM algorithm and its application in Probabilistic Latent Semantic Analysis (pLSA). It begins by introducing the parameter estimation problem and comparing frequentist and Bayesian approaches. It then describes the EM algorithm, which iteratively computes lower bounds to the log-likelihood function. Finally, it applies the EM algorithm to pLSA by modeling documents and words as arising from a mixture of latent topics.
This document discusses filtering and likelihood inference. It begins by introducing filtering problems in economics, such as evaluating DSGE models. It then presents the state space representation approach, which models the transition and measurement equations with stochastic shocks. The goal of filtering is to compute the conditional densities of states given observed data over time using tools like the Chapman-Kolmogorov equation and Bayes' theorem. Filtering provides a recursive way to make predictions and updates estimates as new data arrives.
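In the linear-Gaussian case this prediction/update recursion is the Kalman filter. A scalar sketch follows; the AR(1) state model and its parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)
# state space model: x_t = a x_{t-1} + w_t, w ~ N(0, q);  y_t = x_t + v_t, v ~ N(0, r)
a, q, r = 0.9, 0.5, 1.0
T = 500
x = np.zeros(T)
for t in range(1, T):
    x[t] = a * x[t - 1] + rng.normal(scale=np.sqrt(q))
y = x + rng.normal(scale=np.sqrt(r), size=T)

# Kalman recursion: Chapman-Kolmogorov prediction, then Bayes update
m, p = 0.0, 1.0                  # filtered mean and variance
means = np.zeros(T)
for t in range(T):
    m_pred, p_pred = a * m, a * a * p + q       # predict the next state
    k = p_pred / (p_pred + r)                   # Kalman gain
    m = m_pred + k * (y[t] - m_pred)            # update with the new observation
    p = (1 - k) * p_pred
    means[t] = m
```

The filtered estimates are a strictly better reconstruction of the latent state than the raw observations, and the filtered variance settles below the observation noise variance.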
This document reviews the Fourier transform and its properties. It defines the Fourier transform and inverse Fourier transform. The Fourier transform of a signal decomposes it into its frequency components. Properties covered include linearity, time/frequency shifting, modulation, convolution, and more. Examples of Fourier transforms are given for rectangular pulses and Dirac delta functions. Applications to signals like DC, complex exponentials, and sinusoids are described. Proofs can be found in the referenced textbook.
EGLA's cloud-based platform called MEDIAMPLIFY merges cloud and cable TV by bringing music and video/TV content from the cloud to devices like smart TVs, phones, and set-top boxes. Their solutions MEDIAMPLIFY and MEVIA provide entertainment and learning experiences for operators, universities, hotels, and businesses. MEDIAMPLIFY is a robust cloud platform for TV, video, and music content distribution to cable and telecom operators, while MEVIA is the consumer application available on smart devices and TVs. The company projects growing quarterly revenues from licensing MEVIA to subscribers.
Applying the Scientific Method to Simulation Experiments - Frank Bergmann
In this talk I would like to explore how to apply the scientific method to in silico experiments. How can we design these experiments so that they are independent of the software tool that gave rise to them? Over the past decade we have seen the rise of model exchange formats such as the Systems Biology Markup Language (SBML), which enable us to share models readily with colleagues and between applications.
Here I present the Simulation Experiment Description Markup Language (SED-ML) that aims to do the same thing for in silico experiments. After detailing its history, and where it currently stands, I will give a short overview of the growing tool support.
Eswaran Subrahmanian - Serious Games in Complex Design of Urban Systems and P...SeriousGamesAssoc
Presenter: Eswaran Subrahmanian, Research Professor, Carnegie Mellon University and Fields of View
The goal of this talk is to illustrate the use of Serious games for inclusive design of systems. The talk will take examples from a developing country context: India. The talk will illustrate how games enhances awareness and participation in the design process. The talk will also use a game on the design of operation of railways in a developed country. The talk will make the case that games are ideal way to deal with participation and design across functional divisions and also linguistic and social boundaries.
This document discusses using cellular automata (CA) and artificial neural networks (ANN) to model urban growth. It describes the structures of CA, ANN, and fuzzy sets and reviews previous studies applying CA, ANN, and combinations of the two (CA-ANN models) to simulation and forecasting of land use change. The document also presents a case study that develops a CA-ANN model for urban simulation, calibrates and validates the model, and achieves 87% goodness of fit when comparing model results to observed data.
The document reviews the use of computational fluid dynamics (CFD) simulation in urban design to analyze outdoor thermal comfort in hot and dry climates. CFD simulation can be used to understand factors like airflow, wind patterns, heat distribution and radiation that affect pedestrian thermal comfort. It allows designers to evaluate different design options and incorporate features to improve thermal comfort like shelters, vegetation and materials. While CFD provides accurate analysis of microclimate factors, its results require assumptions and simplifications that can limit effectiveness.
Simulation As a Method To Support Complex Organizational Transformations in H...Jos van Hillegersberg
How to Support Complex Organizational Transformations in Healthcare?
Using Realistic Simulations in Participatory Workshops
Contributions:
Functionalistic Approaches Dominate in design and implementation
-> Integration of Functionalistic and Interpretive approach
Simulation is traditionally used as a method to quantitatively analyse performance issues and improvement opportunities
-> The use of simulation in participative and collaborative workshop settings
Simulation traditionally uses mathematical models of reality with several assumptions and constraints
-> Higly accurate models based on real world data from planning systems and future organization layout
Based on:
Rothengatter, Diederik; Katsma, Christiaan; and Hillegersberg, Jos van, "Simulation As a Method To Support Complex
Organizational Transformations in Healthcare" (2010). AMCIS 2010 Proceedings. Paper 554.
http://aisel.aisnet.org/amcis2010/554
Simulation of Urban Mobility (Sumo) For Evaluating Qos Parameters For Vehicul...iosrjce
IOSR Journal of Electronics and Communication Engineering(IOSR-JECE) is a double blind peer reviewed International Journal that provides rapid publication (within a month) of articles in all areas of electronics and communication engineering and its applications. The journal welcomes publications of high quality papers on theoretical developments and practical applications in electronics and communication engineering. Original research papers, state-of-the-art reviews, and high quality technical notes are invited for publications.
CFMS is a not-for-profit organization committed to accelerating simulation design processes. It has established the CFMS Limited company and Advanced Simulation Research Centre facility to achieve its vision. The facility provides high-performance computing, visualization capabilities, and collaborative work spaces to support research and technology projects. CFMS engages organizations through its Associates Scheme, which aims to improve awareness of new technologies and methods through demonstrations, networking, and other activities.
1) The document discusses using the simulation game SimCity to teach sustainable urban planning. Students play SimCity as part of a course project to design sustainable cities.
2) Data shows students who used SimCity scored higher on learning outcomes related to evaluating sustainability and producing urban development plans compared to previous cohorts.
3) Students reported being immersed in their projects through SimCity and receiving immediate feedback to improve their designs, compared to traditional classroom instruction.
A High-speed Verilog HDL Simulation Method using a Lightweight TranslatorRyohei Kobayashi
The document discusses the benefits of exercise for mental health. Regular physical activity can help reduce anxiety and depression and improve mood and cognitive functioning. Exercise causes chemical changes in the brain that may help protect against mental illness and improve symptoms.
Games and Serious Games in Urban Planning: Study CasesBeniamino Murgante
The document discusses a serious game called the B3 Game that was developed to support public participation in urban planning. The game allows citizens of Billstedt, Hamburg to design their local marketplace virtually. It addresses issues like rational ignorance in participation. Research goals were to design a game that could support playful public participation and test it with users. Comparable examples and open questions about integrating games into planning processes are also mentioned.
1. The charge simulation method (CSM) simulates an actual electric field with a field formed by discrete charges placed outside the region where the field solution is desired.
2. CSM determines the values of discrete charges by satisfying boundary conditions at selected contour points on electrodes. Once charge values and positions are known, the potential and field distribution can be easily computed.
3. CSM describes surface charge on an electrode boundary using fictitious point, line, or ring charges in the electrode interior. Charge types and positions are determined first, then magnitudes are calculated to satisfy boundary conditions.
Architecture and urban planning (3 d) representationMaria Bostenaru
This document discusses the representation of architecture and urban planning in games and toys. It begins with an introduction on 3D viewing toys and board games that model construction management. It then reviews different types of games that feature architecture and urban planning, including playcards, toys, puzzle games, board games, role-playing games at the city scale, and computer games. Examples are provided for many of these categories. The document concludes by noting how games can be used for educational purposes and involve societal participation in urban planning decisions.
Parallel Simulation of Urban Dynamics on the GPU Ivan Blečić, Arnaldo Cecchi...Beniamino Murgante
1) The document discusses parallelizing urban simulation models on GPUs to reduce long computing times required for large-scale and complex models.
2) Two cellular automata models (constrained and unconstrained) for simulating land use change are implemented on GPUs using CUDA.
3) Computational results show the parallel GPU implementations achieve significant speedups over sequential CPU versions, enabling more accurate calibration of models operating at regional or larger scales.
Keynote address for the Community College of Aurora Faculty in-service on the use of simulation and educational games in immersive environments.
For more information, visit: my research and class blog at http://ctusoftware.blogspot.com
This document provides an overview and instructions for installing and using the open source traffic simulation software SUMO (Simulation of Urban MObility). It describes how to download and install SUMO, create road networks manually or using the NETGENERATE tool, import networks from formats like OSM, and model vehicle demand. Road networks consist of nodes and edges, and SUMO tools like NETCONVERT are used to convert files into the SUMO network format. Vehicle routes and types are defined to simulate traffic.
Empowering Stakeholders – Simulation Games As a Participatory Method - Jan Ec...ServDes
This document describes a simulation game called "Work A Round" developed by Dr. Jan Eckert to address challenges related to distributed and mobile knowledge work. The game aims to empower stakeholders like workers, companies, transportation and design firms. It uses a board with locations and task/action cards for players to match tasks and workplaces. The game is followed by a two-part debriefing to discuss strategies, work patterns, and insights. The game has been tested with companies and can provide research findings on distributed work practices and space needs.
The document describes a three-parameter generalized inverse Weibull (GIW) distribution that can model failure rates. Key properties of the GIW distribution include:
- It reduces to the inverse Weibull distribution when the shape parameter γ equals 1.
- Its probability density function, survival function, and hazard function are defined.
- Formulas are provided for the moments, moment generating function, and Shannon entropy of the GIW distribution.
- Methods are described for maximum likelihood estimation of the GIW distribution parameters from censored lifetime data.
lecture1 on survival analysis HRP 262 classTroyTeo1
1. Survival analysis is a set of statistical methods used to analyze longitudinal data on the occurrence of events such as death, disease onset, or recovery. It can accommodate data from randomized clinical trials or cohort studies.
2. Key concepts in survival analysis include the survival function, which gives the probability of surviving past a particular time, and the hazard function, which provides the instantaneous risk of an event at a particular time given survival up to that time.
3. Common distributions used in parametric survival analysis to model event times include the exponential distribution, which assumes a constant hazard over time, and the Weibull distribution, which allows the hazard to increase or decrease over time.
1. Consider experiments with the following censoring mechanism A gr.docxstilliegeorgiana
1. Consider experiments with the following censoring mechanism: A group of n units is observed from time 0; observation stops at the time of the rth failure or at time C, whatever occurs first. Show by direct calculation that the likelihood function is of the form L = Yn i=1 f(ti) δiS(ti+)1−δi , assuming that the units gave failure times which are i.i.d. with survivor function S(t) and p.d.f. f(t). (Hint: first define ti and δi .)
2. Suppose that T is a survival random variable with survival function S and cumulative hazard function H(t) = − log S(t). Show that H(T) ∼ exp(1).
3. Suppose that the lifetime Ti has hazard function hi(t) and that Ci is a random censoring time associated with Ti . Define λi(t) = lim ∆t→0 P(t ≤ Ti ≤ t + ∆t|Ti ≥ t, Ci ≥ t) ∆t (a) Show that if Ti is independent of Ci , hi(t) = λi(t). (b) Suppose that there exists an unobserved covariate Zi which affects both Ti and Ci , as follows: P(Ti ≥ t|Zi) = exp(−Ziθt), P(Ci ≥ t|Zi) = exp(−Ziρt), and Ti , Ci are independent, given Zi . Assume that Zi has a gamma distribution with density function g(z) = φ φ Γ(φ) z φ−1 e −φz(z > 0). Show that the joint survivor function for Ti , Ci is P(Ti ≥ t, Ci ≥ s) = 1 + 1 φ θt + 1 φ ρs−φ .
4. The lifetime of an article is thought to have an exponential distribution. Twelve such articles were selected at random and tested until nine of them failed. The nine observed failure times were 8, 14, 23, 32, 46, 57, 69, 88, 109. Assume that the data follow the exponential distribution. (a) Compute the maximum likelihood estimate of mean µ. (b) Compute the Fisher information for ˆµ. (c) Obtain a 90% confidence interval for µ by using the quantity Z = (ˆµ−µ)/se(ˆµ) where se(ˆµ) is the standard error for the estimate ˆµ.
.
call for papers, research paper publishing, where to publish research paper, journal publishing, how to publish research paper, Call For research paper, international journal, publishing a paper, IJEI, call for papers 2012,journal of science and technology, how to get a research paper published, publishing a paper, publishing of journal, research and review articles, engineering journal, International Journal of Engineering Inventions, hard copy of journal, hard copy of certificates, journal of engineering, online Submission, where to publish research paper, journal publishing, international journal, publishing a paper, hard copy journal, engineering journal
Sequential experimentation in clinical trialsSpringer
This chapter discusses time-sequential clinical trial designs where the primary endpoint is survival time. It begins with an overview of survival analysis methodology, which must be extended to account for the sequential nature of interim analyses in time-sequential trials. The seminal Beta-Blocker Heart Attack Trial (BHAT) is described as an example of an early time-sequential trial. Key developments following BHAT include methods that account for two time scales: the information accumulated over time and calendar time of interim analyses. Nelson-Aalen and Kaplan-Meier estimators are also summarized as tools for survival analysis in time-sequential settings.
Bayesian inference for mixed-effects models driven by SDEs and other stochast...Umberto Picchini
An important, and well studied, class of stochastic models is given by stochastic differential equations (SDEs). In this talk, we consider Bayesian inference based on measurements from several individuals, to provide inference at the "population level" using mixed-effects modelling. We consider the case where dynamics are expressed via SDEs or other stochastic (Markovian) models. Stochastic differential equation mixed-effects models (SDEMEMs) are flexible hierarchical models that account for (i) the intrinsic random variability in the latent states dynamics, as well as (ii) the variability between individuals, and also (iii) account for measurement error. This flexibility gives rise to methodological and computational difficulties.
Fully Bayesian inference for nonlinear SDEMEMs is complicated by the typical intractability of the observed data likelihood which motivates the use of sampling-based approaches such as Markov chain Monte Carlo. A Gibbs sampler is proposed to target the marginal posterior of all parameters of interest. The algorithm is made computationally efficient through careful use of blocking strategies, particle filters (sequential Monte Carlo) and correlated pseudo-marginal approaches. The resulting methodology is is flexible, general and is able to deal with a large class of nonlinear SDEMEMs [1]. In a more recent work [2], we also explored ways to make inference even more scalable to an increasing number of individuals, while also dealing with state-space models driven by other stochastic dynamic models than SDEs, eg Markov jump processes and nonlinear solvers typically used in systems biology.
[1] S. Wiqvist, A. Golightly, AT McLean, U. Picchini (2020). Efficient inference for stochastic differential mixed-effects models using correlated particle pseudo-marginal algorithms, CSDA, https://doi.org/10.1016/j.csda.2020.107151
[2] S. Persson, N. Welkenhuysen, S. Shashkova, S. Wiqvist, P. Reith, G. W. Schmidt, U. Picchini, M. Cvijovic (2021). PEPSDI: Scalable and flexible inference framework for stochastic dynamic single-cell models, bioRxiv doi:10.1101/2021.07.01.450748.
On estimating the integrated co volatility usingkkislas
This document proposes a method to estimate the integrated co-volatility of two asset prices using high-frequency data that contains both microstructure noise and jumps.
It considers two cases - when the jump processes of the two assets are independent, and when they are dependent. For the independent case, it proposes an estimator that is robust to jumps. For the dependent case, it proposes a threshold estimator that combines pre-averaging to remove noise with a threshold method to reduce the effect of jumps. It proves the estimators are consistent and establishes their central limit theorems. Simulation results are also presented to illustrate the performance of the proposed methods.
On Some Measures of Genetic Distance Based on Rates of Nucleotide SubstitutionJustine Leon Uro
The document presents a general DNA base-nucleotide substitution model and discusses three special cases: the three-substitution-type (3ST) model, two-substitution-type (2ST) model, and the Jukes-Cantor model. The 3ST model considers transitions and two types of transversions, while the 2ST model and Jukes-Cantor model further simplify the substitution rates. Differential equations are derived to model the change in base probabilities over time under each model.
Show that Greenwoods formula reduces to the binomial variance formul.pdfakshitent
Show that Greenwoods formula reduces to the binomial variance formula.
Solution
Let S(t) be the probability that an item from a given population will have a lifetime
exceeding t. For a sample from this population of size N let the observed times until death of N
sample members be Corresponding to each ti is ni, the number \"at risk\" just prior to time ti,
and di, the number of deaths at time ti. Note that the intervals between each time typically are not
uniform. For example, a small data set might begin with 10 cases, have a death at Day 3, a loss
(censored case) at Day 9, and another death at Day 11. Then we have (t1 = 3, t2 = 11), (n1 = 10,
n2 = 8), and (d1 = 1, d2 = 1). The Kaplan–Meier estimator is the nonparametric maximum
likelihood estimate of S(t). It is a product of the form When there is no censoring, ni is just the
number of survivors just prior to time ti. With censoring, ni is the number of survivors less the
number of losses (censored cases). It is only those surviving cases that are still being observed
(have not yet been censored) that are \"at risk\" of an (observed) death.[3] There is an alternative
definition that is sometimes used, namely The two definitions differ only at the observed event
times. The latter definition is right-continuous whereas the former definition is left-continuous.
Let T be the random variable that measures the time of failure and let F(t) be its cumulative
distribution function. Note that Consequently, the right-continuous definition of may be
preferred in order to make the estimate compatible with a right-continuous estimate of F(t)..
Poster for Information, probability and inference in systems biology (IPISB 2...Colin Gillespie
Interest lies in inference for the rate parameters in a complex stochastic biological model describing the aggregation of proteins within human cells. Protein aggregation is a factor in many age-related diseases such as Alzheimer's disease. Ideally time-course measurements on all chemical species in the model would be available. However, current experimental techniques only allow noisy observations on the proportions of cell death at a few time points.
Although the model has a large state space and is analytically intractable, realisations from the model can be obtained using a stochastic simulator. The time evolution of a cell can be repeatedly simulated giving an estimate of the proportion of cell death. Unfortunately, simulation from the model is too slow to be used in an MCMC inference scheme. A Gaussian process emulator, which is very fast, can be used to approximate the simulator.
An MCMC scheme can be constructed targeting the posterior distribution of interest, however evaluating the marginal likelihood is challenging. A pseudo-marginal approach replaces the marginal likelihood with an easy to construct unbiased estimate while still
targeting the true posterior.
The methods will be illustrated using a toy birth-death model, allowing comparison with the exact model.
The document presents a dynamic discrete choice model of demand for insecticide treated nets (ITNs) that accounts for time inconsistent preferences and unobserved heterogeneity. The model has three periods where agents make ITN purchase and retreatment decisions. Agents are either time consistent, "naive" time inconsistent, or "sophisticated" time inconsistent. The model is identified in two steps - first when types are directly observed using survey responses, and second when types are unobserved. Identification exploits variation from elicited beliefs about malaria risk. The model can point identify time preference parameters and utility functions up to a normalization.
This document discusses various statistical methods for analyzing survival data, including censoring, methods to assess survival like the Kaplan-Meier method, and models like the Cox proportional hazards model. It begins with definitions of survival analysis and censoring. It then describes the Kaplan-Meier method for estimating survival functions from censored data and the log-rank test for comparing survival curves. Finally, it discusses the Cox proportional hazards model for assessing the effects of covariates on the hazard function while leaving the baseline hazard unspecified.
Research Inventy : International Journal of Engineering and Scienceresearchinventy
Research Inventy : International Journal of Engineering and Science is published by the group of young academic and industrial researchers with 12 Issues per year. It is an online as well as print version open access journal that provides rapid publication (monthly) of articles in all areas of the subject such as: civil, mechanical, chemical, electronic and computer engineering as well as production and information technology. The Journal welcomes the submission of manuscripts that meet the general criteria of significance and scientific excellence. Papers will be published by rapid process within 20 days after acceptance and peer review process takes only 7 days. All articles published in Research Inventy will be peer-reviewed.
This document discusses regression with frailty in survival analysis using the Cox proportional hazards model. It introduces survival analysis concepts like the hazard function and survival function. It then describes how to incorporate frailty, a random effect, into the Cox model to account for clustering in survival times. The Newton-Raphson method is used to estimate model parameters by maximizing the penalized partial likelihood. A simulation study applies this approach to data on infections in kidney patients.
STATISTICAL TESTS TO COMPARE k SURVIVAL ANALYSIS FUNCTIONS INVOLVING RECURREN...Carlos M Martínez M
This document proposes statistical tests to compare k survival curves for recurrent event data. It begins with background on survival analysis and models for recurrent events. It then proposes a statistic to test whether k survival curves are equal, which is a linear combination of differences in observed and expected numbers of events between groups at different time points. Various choices of weights in this statistic generate tests that generalize classical survival analysis tests to the recurrent events setting, such as log-rank, Gehan, Peto-Peto, and Tarone-Ware tests. The proposal is applied to a dataset on tumor recurrence in bladder cancer patients under different treatments.
Dependent processes in Bayesian NonparametricsJulyan Arbel
This document summarizes dependent processes in Bayesian nonparametrics. It motivates the need for dependent random probability measures to accommodate temporal dependence structures beyond the exchangeability assumption. It describes modeling collections of random probability measures indexed by time as either discrete-time or continuous-time processes. The diffusive Dirichlet process is introduced as a dependent Dirichlet process with Dirichlet marginal distributions at each time point and continuous sample paths. Simulation and estimation methods are discussed for this model.
The tau-leap method for simulating stochastic kinetic modelsColin Gillespie
This document discusses approximate methods for simulating chemically reacting systems stochastically in a more computationally efficient manner than the direct method. It introduces the τ-leap method, where reactions are simulated in fixed time intervals (τ) by assuming reaction rates are constant over τ. It describes how τ can be chosen to satisfy a "leap condition" and minimize errors. The midpoint estimation technique is introduced to further reduce errors by estimating propensities at the midpoint of each τ interval. Examples applying these methods to a Lotka-Volterra system are provided to illustrate the techniques.
Considerate Approaches to ABC Model SelectionMichael Stumpf
The document discusses using approximate Bayesian computation (ABC) for model selection when directly evaluating likelihoods is computationally intractable, noting that ABC involves simulating data from models and comparing simulated and observed summary statistics, and that constructing minimally sufficient summary statistics is important for accurate ABC model selection.
Double Occupancy as a Probe of the Mott Transition for Fermions in One-dimens...Jorge Quintanilla
1) This document proposes measuring double occupancy as a probe of the Mott transition in a one-dimensional fermionic Hubbard model with an optical lattice.
2) It finds that the Mott phase exhibits inherent fluctuations in double occupancy that can be used to detect the Mott phase.
3) The double occupancy in the bulk can be determined from measurements in a trapped system using the local density approximation.
Sequential Monte Carlo algorithms for agent-based models of disease transmissionJeremyHeng10
This document discusses agent-based models for disease transmission and sequential Monte Carlo algorithms for statistical inference of these models. It begins with an overview of agent-based models and their use in epidemiology. It then describes an agent-based SIS model where each agent can be susceptible or infected. Observations are the number of reported infections over time. The likelihood of the model involves a sum over all possible state sequences, which is intractable for large populations. The document proposes using sequential Monte Carlo methods to approximate the likelihood, including the bootstrap particle filter and auxiliary particle filter.
Similar to A copula-based Simulation Method for Clustered Multi-State Survival Data (20)
Sequential Monte Carlo algorithms for agent-based models of disease transmission
A copula-based Simulation Method for Clustered Multi-State Survival Data
1. A copula-based simulation method for clustered multi-state survival data
F. Rotolo•, C. Legrand◦, I. Van Keilegom◦, M. Chiogna•
• Dipartimento di Scienze Statistiche, Università degli Studi di Padova
◦ Institut de Statistique, Biostatistique et Sciences Actuarielles, Université Catholique de Louvain
September 23, 2011
3. Clustered Multi-State Survival Data (F. Rotolo)
Survival Data
Time from an origin event until an event of interest.
Example: from birth to death, from the start of therapy until remission, etc.
[Timeline figure: an event observed at T = 5 on a 0–5 time axis.]
Censoring: for some subjects the event cannot be observed; the only available information is a lower bound on the event time.
Example: migration, change of therapy, loss to follow-up, etc.
[Timeline figure: follow-up ends at 3.25 before the event occurs, so only T > 3.25 is known.]
A copula-based simulation method for clustered multi-state survival data 2/22
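Right censoring as defined above is easy to emulate when generating data: draw a latent event time and an independent censoring time, then keep the minimum together with an event indicator. A minimal sketch (not from the slides; the distributions and names are illustrative), assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)

def right_censor(t, c):
    """Apply independent right censoring: observe y = min(T, C)
    and the event indicator delta = 1{T <= C}."""
    y = np.minimum(t, c)
    delta = (t <= c).astype(int)
    return y, delta

t = rng.exponential(scale=5.0, size=10_000)  # latent event times
c = rng.exponential(scale=5.0, size=10_000)  # censoring times (e.g. loss to follow-up)
y, delta = right_censor(t, c)
```

With equal exponential scales, P(T <= C) = 1/2, so roughly half of the observations end up censored.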
5. Modeling Survival Data
Because of this peculiarity, instead of modeling the density f(t) of T, the hazard is considered:
    h(t) = lim_{Δt→0} P[t ≤ T < t + Δt | T ≥ t] / Δt = f(t) / S(t) = −(d/dt) log S(t),
with S(t) = ∫_t^∞ f(u) du = P[T > t].
Note: S(t) = exp{−∫_0^t h(u) du}.
The basic regression model for the hazard is the Proportional Hazards (PH) Model (Cox, 1972):
    h(t|X) = h0(t) exp{β′X}.
A copula-based simulation method for clustered multi-state survival data 3/22
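The relation S(t) = exp{−∫ h(u) du} gives a standard inversion recipe for simulating from a PH model: if U is uniform on (0, 1), then T = H0⁻¹(−log U / exp(β′x)) has the required hazard. A hedged sketch with a Weibull baseline H0(t) = (t/scale)^shape (not part of the slides; function name and parameter values are illustrative), assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(42)

def sim_ph_weibull(x, beta, shape=1.5, scale=2.0, rng=rng):
    """Draw survival times from a PH model with Weibull baseline,
    via the inversion T = H0^{-1}(-log U / exp(beta'x)),
    where H0(t) = (t / scale)^shape is the baseline cumulative hazard."""
    u = rng.uniform(size=len(x))
    lin_pred = np.asarray(x) @ np.asarray(beta)
    return scale * (-np.log(u) / np.exp(lin_pred)) ** (1.0 / shape)

# one binary covariate doubling the hazard (beta = log 2)
x = rng.integers(0, 2, size=10_000).reshape(-1, 1)
t = sim_ph_weibull(x, beta=[np.log(2.0)])
```

Subjects with x = 1 face twice the hazard, so their simulated times are stochastically shorter than those of the x = 0 group.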
9. Survival Models
Extensions of the Cox model have been developed.
Frailty Models (FMs) account for overdispersion or clustering by means of random effects:
    h(t|Xij) = h0(t) Zi exp{β′Xij},
similar to a GLMM on the log scale:
    log h(t|Xij) = log h0(t) + Wi + β′Xij,  with Zi = e^{Wi}
(Duchateau & Janssen, 2008; Wienke, 2010).
Multi-State Models (MSMs) consider several events and their interactions
(Putter et al., 2007; de Wreede et al., 2010).
[Diagram: multi-state structure with states NED (no evidence of disease), LR (local relapse), DM (distant metastasis) and De (death); transitions T1: NED→LR, T2: NED→DM, T3: NED→De, T4: LR→De, T5: DM→De.]
Possible integration? → Simulation studies
A copula-based simulation method for clustered multi-state survival data 4/22
Simulation of data

A simulation method should be able to generate
the dependence of times of competing events,
the dependence of times of subsequent events,
the dependence between clustered observations,
the censoring due to competing events occurrence,
the censoring due to end of the study or loss to follow-up,
the event-specific covariate effects.

(State diagram: NED → LR (T1), NED → DM (T2), NED → De (T3), LR → De (T4), DM → De (T5).)
Outline
Clustered Multi-State Survival Data
Simulation Algorithm
Clustering
Choice of Parameters
Example
Copula Model

Marginal survival functions freely chosen: S1(t), S2(t) and S3(t).

Joint survival function by Clayton copula:
S123(t) = [S1(t1)^(−θ) + S2(t2)^(−θ) + S3(t3)^(−θ) − 2]^(−1/θ)

Conditional survivals from the joint:
S2|1(t2|t1) = [1 + S1(t1)^θ (S2(t2)^(−θ) − 1)]^(−1/θ−1)
S3|12(t3|t1, t2) = [1 + (S3(t3)^(−θ) − 1) / (S1(t1)^(−θ) + S2(t2)^(−θ) − 1)]^(−1/θ−2)
Algorithm

Data from the copula model (Kpanzou, 2007) are simulated as follows:
1. T1 = S1^(−1)(U1)
2. T2|t1 = S2|1^(−1)(U2|t1) = S2^(−1)( [ (U2^(−θ/(1+θ)) − 1) S1(t1)^(−θ) + 1 ]^(−1/θ) )
3. T3|t1, t2 = S3|12^(−1)(U3|t1, t2) = S3^(−1)( [ (U3^(−θ/(1+2θ)) − 1) (S1(t1)^(−θ) + S2(t2)^(−θ) − 1) + 1 ]^(−1/θ) )
C. TC = FC^(−1)(UC)
T = min(TC, T1, T2, T3),
with U1, U2, U3, UC i.i.d. U(0, 1).
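The steps above can be sketched numerically. A minimal Python sketch (not the author's code), using Weibull marginals as the freely chosen Si and placeholder parameter values:

```python
import numpy as np

rng = np.random.default_rng(1)

def weibull_inv_surv(s, lam, rho):
    # Invert S(t) = exp(-lam * t**rho): returns t such that S(t) = s
    return (-np.log(s) / lam) ** (1.0 / rho)

def simulate_first_transitions(n, theta, lam, rho, lam_c):
    """Steps 1-3 and C: sample (T1, T2, T3) from the trivariate
    Clayton copula by successive conditional inversion, plus an
    independent exponential censoring time TC."""
    u1, u2, u3, uc = rng.uniform(size=(4, n))
    s1 = u1                                   # since S1(T1) = U1
    t1 = weibull_inv_surv(s1, lam[0], rho[0])
    # Step 2: S2(T2) from inverting S_{2|1}(.|t1) at U2
    s2 = ((u2 ** (-theta / (1 + theta)) - 1) * s1 ** (-theta) + 1) ** (-1 / theta)
    t2 = weibull_inv_surv(s2, lam[1], rho[1])
    # Step 3: S3(T3) from inverting S_{3|12}(.|t1,t2) at U3
    s3 = ((u3 ** (-theta / (1 + 2 * theta)) - 1)
          * (s1 ** (-theta) + s2 ** (-theta) - 1) + 1) ** (-1 / theta)
    t3 = weibull_inv_surv(s3, lam[2], rho[2])
    tc = -np.log(uc) / lam_c                  # TC = FC^{-1}(UC), exponential
    t_obs = np.minimum.reduce([t1, t2, t3, tc])
    return t1, t2, t3, tc, t_obs
```

Each simulated survival value is passed through the inverse of the chosen marginal, so any marginal with a computable inverse survival function can be plugged in.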
Second transitions

For patients with a transition into state LR or DM, an analogous copula model is used for the second transition to state De.

The following conditional survivals can be obtained:
S4|1(t4|t1) = [1 + S1(t1)^θ (S4(t4)^(−θ) − 1)]^(−1/θ−1)
S5|2(t5|t2) = [1 + S2(t2)^θ (S5(t5)^(−θ) − 1)]^(−1/θ−1)

and the same algorithm is used to simulate second-transition times, conditionally on the first-transition ones.
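Both conditionals have the same bivariate Clayton form, so second-transition sampling reduces to the same inversion used in step 2 of the algorithm. A small sketch, with hypothetical S1(t1) values and illustrative Weibull parameters for S4:

```python
import numpy as np

rng = np.random.default_rng(2)

def clayton_cond_inverse(u, s_given, theta):
    """Invert the bivariate Clayton conditional S_{j|i}(t_j|t_i) = u
    for S_j(t_j), given s_given = S_i(t_i)."""
    return ((u ** (-theta / (1 + theta)) - 1) * s_given ** (-theta) + 1) ** (-1 / theta)

# Hypothetical S1(t1) values for patients who moved NED -> LR
s1_at_t1 = rng.uniform(0.2, 0.9, size=1000)
u4 = rng.uniform(size=1000)
s4 = clayton_cond_inverse(u4, s1_at_t1, theta=1.0)
# Illustrative Weibull inverse survival for S4 (parameters are placeholders)
t4 = (-np.log(s4) / 0.03) ** (1 / 1.1)
```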
Clustering

The algorithm allows the marginal survivals Si(t) to be freely specified. How can we insert clustering?

In a PH way:
hi(t|Z) = Z h0i(t),
with h0i(t) the baseline hazard for transition i.

Since S0i(t) = exp{−∫0^t h0i(u) du}, then
Si(t|Z) = exp{−Z ∫0^t h0i(u) du} = [S0i(t)]^Z.

The copula model can be used for the conditional survivals {Si(t|Z)}i∈{1,2,3,4,5}, and the same algorithm can be applied, conditionally on Z.
Clustering and covariates

The effect of covariates X can be inserted in an analogous way. The marginals are then
Si(t|X, Z) = S0i(t)^(Z exp{βi′X})
and simulation via the copula model is done conditionally on (X, Z).
The Clayton–Weibull model

Although the model is quite general, in the following we consider a particular case:
Ti ~ Wei(λi, ρi), i ∈ {1, 2, 3, 4, 5}
TC ~ Wei(λC, 1) ~ Exp(λC)
72 months (6 years) of administrative censoring.

This model
1. gives simple forms of the conditional distributions;
2. implies that Si|X,Z(t|x, z) = exp{−λi z exp{βi′x} t^ρi},
i.e. Ti|X, Z ~ Wei(λi z exp{βi′x}, ρi) is still a Weibull r.v.
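Because conditioning on (X, Z) only rescales the Weibull rate, frailties and covariate effects can be folded into per-subject rates before running the copula algorithm. A minimal sketch; the gamma parametrization of Z, the baseline Weibull parameters, and the fixed cluster size are illustrative assumptions, while the transition-1 coefficients are taken from the example later in the talk:

```python
import numpy as np

rng = np.random.default_rng(3)

n_clusters, cluster_size = 40, 25
# Shared gamma frailty per cluster (shape/scale parametrization assumed)
z = rng.gamma(shape=1.0, scale=0.5, size=n_clusters)
z_subj = np.repeat(z, cluster_size)          # same Z for all subjects in a cluster

age = rng.normal(60, 7, size=n_clusters * cluster_size)
treat = rng.binomial(1, 0.5, size=n_clusters * cluster_size)
x = np.column_stack([age, treat])

# Transition-1 effects from the example: beta_Age = log(0.8)/10, beta_Treat = log(1/3)
beta1 = np.array([np.log(0.8) / 10, np.log(1 / 3)])

lam1_base, rho1 = 0.05, 1.0                  # illustrative baseline parameters
lam1 = lam1_base * z_subj * np.exp(x @ beta1)  # Ti|X,Z ~ Wei(lam1, rho1)
t1 = (-np.log(rng.uniform(size=lam1.size)) / lam1) ** (1 / rho1)
```

The per-subject rates `lam1` then replace the constant rate in the copula sampling step, so the dependence structure is untouched while heterogeneity enters through the marginals.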
Choice of parameters

When simulating a dataset, one should be able to choose the parameters so as to obtain particular target values for
pi: the probabilities of LR, DM, De and censoring from NED;
mi: the medians of uncensored LR, DM and De times from NED;
pi: the probabilities of De and censoring from LR and from DM;
mi: the medians of uncensored De times from LR and from DM.

It is not possible to analytically express these quantities as functions of the parameters.
Criterion function

In order to find appropriate parameters for given target values {pi, mi}, we want to minimize the criterion function

Υ(Π) = Σ i∈{1,2,3,4,5} [ log(pi / p̂i(Π))² + log(mi / m̂i(Π))² ]
     = Υ123(Π123) + Υ4(Π4) + Υ5(Π5) ≥ 0,

with {p̂i(Π), m̂i(Π)} the values observed in a dataset simulated with parameters Π, and

Π = {λi}i∈{1,2,3,C,4,C4,5,C5} ∪ {ρi}i∈{1,2,3,4,5} ∈ R+^13
Π = Π123 ∪ Π4 ∪ Π5 ∈ R+^7 × R+^3 × R+^3.

A further reduction of the problem dimension gives
Π = Π123 ∪ Π4 ∪ Π5 ∈ R+^(4+3) × R+^(2+1) × R+^(2+1).
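The criterion is straightforward to evaluate from target and simulated summaries. A sketch; plugging in the first-transition targets and simulated values reported in the Results reproduces the Υ123(Π123) figure given there:

```python
import numpy as np

def upsilon(p_target, m_target, p_sim, m_sim):
    """Criterion: sum of squared log-ratios of target to observed
    probabilities and median times (always >= 0, zero at a perfect fit)."""
    lp = np.log(np.asarray(p_target, float) / np.asarray(p_sim, float))
    lm = np.log(np.asarray(m_target, float) / np.asarray(m_sim, float))
    return float(np.sum(lp ** 2) + np.sum(lm ** 2))

# First-transition targets vs. simulated values from the example
u123 = upsilon([0.34, 0.09, 0.07, 0.50], [6.00, 10.00, 3.00],
               [0.33, 0.12, 0.09, 0.46], [5.41, 9.33, 2.29])  # about 0.24
```

The log-ratio form makes over- and under-shooting a target by the same factor equally costly, which is why targets and observed values enter only through their ratio.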
Minimization of the criterion function

In order to further reduce the dimension of the problem, each of the parameter sets ΠK, K ∈ {{123}, {4}, {5}}, is split into the scale parameters {λi} and the shape parameters {ρi}. The optimization of the criterion function ΥK(ΠK) is iterated on each subset.

Example: algorithm for K = {123}
Set J = 1
λ(0) = {λi(0)}i∈{C,1,2,3} = {1, 1, 1, 1}
ρ(0) = {ρi(0)}i∈{1,2,3} = {1, 1, 1}
Repeat until J = maxit or Υ123(λ(J−1), ρ(J−1)) < th:
  obtain λ(J) by minimizing Υ123(λ, ρ(J−1)) over λ;
  obtain ρ(J) by minimizing Υ123(λ(J), ρ) over ρ;
  set J = J + 1,
where maxit and th are arbitrary termination parameters.
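The alternating scheme can be sketched as follows. The per-block optimizer here is a crude multiplicative coordinate search, a stand-in for whatever numerical minimizer is actually used; in practice Υ is evaluated on freshly simulated data, so the objective is noisy:

```python
import numpy as np

def block_search(f, x, factors=(0.5, 0.8, 1.0, 1.25, 2.0), sweeps=3):
    """Crude coordinate-wise search over multiplicative steps;
    a stand-in for a real numerical optimizer on one block."""
    x = np.array(x, dtype=float)
    for _ in range(sweeps):
        for i in range(x.size):
            cands = [x[i] * c for c in factors]
            vals = []
            for c in cands:
                x[i] = c
                vals.append(f(x))
            x[i] = cands[int(np.argmin(vals))]
    return x

def alternate_minimize(ups, lam0, rho0, maxit=10, th=0.1):
    """Alternate: minimize ups over the scale block lam with rho fixed,
    then over the shape block rho with lam fixed, until the criterion
    drops below th or maxit iterations are reached."""
    lam, rho = np.array(lam0, float), np.array(rho0, float)
    for _ in range(maxit):
        if ups(lam, rho) < th:
            break
        lam = block_search(lambda l: ups(l, rho), lam)
        rho = block_search(lambda r: ups(lam, r), rho)
    return lam, rho
```

The same structure applies to each block K ∈ {{123}, {4}, {5}}, with starting values λ(0) = ρ(0) = 1 as on the slide.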
An example

A dataset of size 44 is available from a multi-center study on head and neck cancer. Target values {pi} and {mi} are taken from the observed data.

(State diagram with observed counts out of 44 patients: NED → LR: 15, NED → DM: 4, NED → De: 3, censored in NED: 22; from LR: 8 deaths, 7 censored; from DM: 4 deaths, 0 censored.)

Frailty term: 40 hospitals of random sizes, Z ~ Gam(1, 0.5).

Covariates:
Age ~ N(60, 7), with βi,Age = log(0.8)/10 for i = 1, log(0.9)/10 for i = 2, log(1.2)/10 for i = 3, 4, 5;
Treat ~ Bin(0.5), with βi,Treat = log(1/3) for i = 1, 0 for i = 2, log(1.2) for i = 3, 4, 5.
Results

First transitions. The algorithm is run with datasets of size 10^4, maxit = 10 and th = 0.1. The execution time was 11 h 57 min.

NED → {LR, DM, De}: λ1 = 0.276, λ2 = 0.019, λ3 = 0.013, λC = 0.031; ρ1 = 0.851, ρ2 = 1.076, ρ3 = 0.569.

            pi                            mi
            LR     DM     De     C        LR     DM      De
Target      0.34   0.09   0.07   0.50     6.00   10.00   3.00
Simulated   0.33   0.12   0.09   0.46     5.41    9.33   2.29

Υ123(Π123) = 0.24
Second transitions. Conditionally on the first-transition data, the algorithm is run for the second transitions from LR and DM with maxit = 6 and th = 0.05. The execution times were 4 h 31 min and 3 h 57 min, respectively.

LR → De: λ4 = 0.029, λC4 = 0.099, ρ4 = 1.078
DM → De: λ5 = 0.192, λC5 = 0.039, ρ5 = 1.000

            LR → De                   DM → De
            pi             mi         pi             mi
            De     C       De         De     C       De
Target      0.53   0.47    3.25       0.95   0.05    0.50
Simulated   0.50   0.50    3.32       0.97   0.03    0.54

Υ4(Π4) = 0.0043, Υ5(Π5) = 0.0064
Conclusion

The proposed simulation procedure for clustered multi-state survival data allows one to
MSMs: generate dependence between the times of the same subject (between both competing and subsequent event times);
FMs: generate dependence between the times of clustered subjects (with an arbitrary number and size of groups and a free frailty distribution);
PH: insert covariates via proportional hazards;
parMod: choose the marginal distributions of the time variables;
automatically find appropriate parameters, given arbitrary target values for the probabilities of censoring and of competing events and for the medians of uncensored times;
generate censoring, both random and administrative.
References

Cox, D. R. (1972). Regression models and life-tables. Journal of the Royal Statistical Society, Series B (Methodological) 34, 187–220.
de Wreede, L. C., Fiocco, M. & Putter, H. (2010). The mstate package for estimation and prediction in non- and semi-parametric multi-state and competing risks models. Computer Methods and Programs in Biomedicine 99, 261–274.
Duchateau, L. & Janssen, P. (2008). The Frailty Model. Springer.
Kpanzou, T. A. (2007). Copulas in statistics. African Institute for Mathematical Sciences (AIMS).
Putter, H., Fiocco, M. & Geskus, R. B. (2007). Tutorial in biostatistics: competing risks and multi-state models. Statistics in Medicine 26, 2389–2430.
Wienke, A. (2010). Frailty Models in Survival Analysis. Chapman & Hall/CRC Biostatistics Series. Taylor and Francis.
F. Rotolo [federico.rotolo@stat.unipd.it – federico.rotolo@uclouvain.be]
PhD Student at the University of Padova and Visiting PhD Student at UCL,
under the supervision of
prof. C. Legrand, UCL
prof. I. Van Keilegom, UCL
prof. M. Chiogna, UniPd