Clinical Trials Versus Health Outcomes Research: SAS/STAT Versus SAS Enterprise Miner Patricia B. Cerrito [email_address] University of Louisville
Objectives To examine some issues with traditional statistical models and their basic assumptions To examine the Central Limit Theorem and its necessity in statistical models To look at the differences and similarities between clinical trials and health outcomes research
Surrogate Versus Real Endpoints Because clinical trials tend to be short term, they use high-risk patients and surrogate endpoints. Use of statins reduces cholesterol levels, but does it increase longevity and disease-free survival? Health outcomes data can examine real endpoints from the general population.
One Versus Many Endpoints Clinical trials generally have one survival endpoint: time to recurrence, time to death, or time to disease progression. Health outcomes can examine multiple endpoints simultaneously using survival data mining.
Homogeneous Versus Heterogeneous Data Clinical trials generally use inclusion/exclusion criteria to define a homogeneous sample. Health outcomes have to rely upon heterogeneous data. Population distributions are closer to gamma than to normal, and this must be taken into consideration.
Large Versus Small Samples Clinical trials tend to use the smallest sample possible to achieve the desired power. The trial database is designed for analysis, and the data are very clean. Health outcomes have an abundance of data and variables. Power is not an issue, but the data are very messy and require considerable preprocessing.
Rare Occurrences Clinical trials are not large enough to find all potential rare occurrences. Health outcomes have enough data to find rare occurrences and to predict the probability of occurrence. This requires modifications to standard linear models; predictive modeling is much better at actual prediction.
Example 1 Ottenbacher, Kenneth J.; Ottenbacher, Heather R.; Tooth, Leigh; Ostir, Glenn V. A review of two journals found that articles using multivariable logistic regression frequently did not report commonly recommended assumptions. Journal of Clinical Epidemiology. 57(11):1147-52, 2004 Nov. continued...
Example 1 Statistical significance testing or confidence intervals were reported in all articles. Methods for selecting independent variables were described in 82%, and specific procedures used to generate the models were discussed in 65%.  continued...
Example 1 Fewer than 50% of the articles indicated if interactions were tested or met the recommended events per independent variable ratio of 10:1.  Fewer than 20% of the articles described conformity to a linear gradient, examined collinearity, reported information on validation procedures, goodness-of-fit, discrimination statistics, or provided complete information on variable coding.
Example 2 Brown, James M.; O'Brien, Sean M.; Wu, Changfu; Sikora, Jo Ann H.; Griffith, Bartley P.; Gammie, James S. Title: Isolated aortic valve replacement in North America comprising 108,687 patients in 10 years: changes in risks, valve types, and outcomes in the Society of Thoracic Surgeons National Database. Source: Journal of Thoracic & Cardiovascular Surgery. 137(1):82-90, 2009 Jan. continued...
Example 2 108,687 isolated aortic valve replacements were analyzed. Time-related trends were assessed by comparing distributions of risk factors, valve types, and outcomes in 1997 versus 2006. Differences in case mix were summarized by comparing average predicted mortality risks with a logistic regression model. Differences across subgroups and time were assessed.   continued...
Example 2 RESULTS: There was a dramatic shift toward use of bioprosthetic valves. Aortic valve replacement recipients in 2006 were older (mean age 65.9 vs 67.9 years, P < .001), with a higher predicted operative mortality risk (2.75 vs 3.25, P < .001). Observed mortality and permanent stroke rates fell (by 24% and 27%, respectively). continued...
Example 2 Female sex, age older than 70 years, and ejection fraction less than 30% were all related to higher mortality, higher stroke rate and longer postoperative stay.  There was a 39% reduction in mortality with preoperative renal failure.
Central Limit Theorem As the sample size increases to infinity, the distribution of the sample average approaches a normal distribution with mean μ and variance σ²/n. As n approaches infinity, the variance approaches zero. Therefore, the distribution of the sample average starts to look like a vertical spike at the point μ when n is very large. continued...
Central Limit Theorem In addition, the sample mean is very susceptible to the influence of outliers.  Moreover, the confidence limits are defined based upon the assumption of normality and symmetry. Therefore, the existence of many outliers will skew the confidence interval.
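For reference, a compact statement of the theorem described above (a standard formulation added here, not taken from the original slides): for independent observations X₁, …, Xₙ with mean μ and finite variance σ²,

\[ \sqrt{n}\,(\bar{X}_n - \mu) \xrightarrow{d} N(0, \sigma^2), \qquad \text{so for large } n,\ \bar{X}_n \approx N\!\left(\mu, \tfrac{\sigma^2}{n}\right) \text{ and } \operatorname{Var}(\bar{X}_n) = \tfrac{\sigma^2}{n} \to 0. \]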
Nonparametric Statistics Many nonparametric models still require symmetry. Many populations are highly skewed, so these models also have problems.
Dataset We use data from the 2005 National Inpatient Sample: a stratified sample of approximately 1,000 hospitals in 37 states, with approximately 8 million inpatient stays.
Distribution of Patient Stays
Normal Estimate
Kernel Density Estimation Instead of assuming that the population follows a known distribution, we can estimate the distribution directly. Kernel density estimation is an excellent method for doing this. continued...
Kernel Density Estimation
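For reference, the univariate kernel density estimator computed by PROC KDE has the standard form below (this formula is added here and is not from the original slides); h is the bandwidth and K is a kernel function such as the Gaussian density:

\[ \hat{f}(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right) \]

In the PROC KDE code that follows, METHOD=SROT requests a simple rule-of-thumb bandwidth and BWM= applies a multiplier to that bandwidth.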
Proc KDE
proc kde data=nis.diabetesless50los;
   univar los / gridl=0 gridu=50 method=srot out=nis.kde50 bwm=3;
run;
Kernel Estimate of Length of Stay
Sampling from NIS Given that the National Inpatient Sample has 8 million records, we can treat it as an effectively infinite population. Therefore, we can sample from this population to see whether the sample mean behaves as the Central Limit Theorem predicts. We start by extracting 100 different samples of size N=5.
Examine Central Limit Theorem
proc surveyselect data=nis.nis_205 out=work.samples method=srs n=5 reps=100 noprint;
run;

proc means data=work.samples noprint;
   by replicate;
   var los;
   output out=out mean=mean;
run;
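A minimal sketch of how the distribution of the 100 sample means could then be plotted for the figures that follow (the plotting step is an assumption; it does not appear in the original slides):

proc sgplot data=out;
   histogram mean;                 /* distribution of the 100 sample means */
   density mean / type=kernel;     /* overlay a kernel density estimate */
run;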
Sample Size=5
Sample Size=30
Sample Size=100
Sample Size=1000
Confidence Limit The confidence interval for the mean excludes much of the actual population distribution.
Confidence Limit With Larger n
Discussion An over-reliance on the Central Limit Theorem can give a very misleading picture of the population distribution.  Kernel density estimation (PROC KDE) allows an examination of the entire population distribution instead of just using the mean to represent the population. Without the assumption of normality, we need to use predictive modeling.
Discussion This is true for both logistic and linear regression, where the assumption of normality is required. The two regression techniques do not work well with skewed populations. We first look at logistic regression for rare occurrences.
Problems With Regression Logistic regression is not designed to predict rare occurrences. With a rare occurrence, logistic regression will predict virtually all observations as non-occurrences. For example, with a 2% mortality rate, a model that classifies every patient as a survivor is 98% accurate yet identifies none of the deaths. The accuracy will be high, but the predictive ability of the model will be virtually nil.
Regression Equation
Threshold Value For logistic regression, a threshold value is defined: regression values above the threshold are predicted as 1, and regression values below the threshold are predicted as 0. The threshold value is chosen to optimize the error rate.
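A minimal sketch of fitting such a model in SAS/STAT and requesting a classification table at a chosen threshold; the dataset and variable names (died, age, los) are assumptions used only for illustration:

proc logistic data=nis.nis_205 descending;
   /* DIED is assumed to be a 0/1 mortality indicator; AGE and LOS are assumed predictors */
   model died = age los / ctable pprob=0.5;   /* classification table at threshold 0.5 */
run;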
Simple Regression
Classification Table
Classification With 3 Variables continued...
Classification With 3 Variables
Models
Linear regression: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k$
Logistic regression: $\log_e\bigl(p/(1-p)\bigr) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_n X_n$
Poisson regression: $\log_e(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_n X_n$
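A hedged sketch of how the Poisson model could be fit in SAS/STAT with PROC GENMOD; the dataset and variable names are hypothetical (one row per hospital, with the log of the number of stays as an offset):

proc genmod data=work.hospital_counts;
   /* DEATHS, AVG_AGE, PCT_FEMALE, and LOG_STAYS are hypothetical variable names */
   model deaths = avg_age pct_female / dist=poisson link=log offset=log_stays;
run;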
Poisson Distribution The parameter of the Poisson distribution, λ, will represent the average mortality rate, say 2%. Then the sample size times 2% gives the estimate for the number of deaths, say 1,000,000 × 0.02 = 20,000. However, the problem still persists, because a single λ masks the wide variation in rates across patient conditions: for example, septicemia has a 26% mortality rate while pneumonia has a 7.5% rate.
Parameters The three conditions include approximately 25% of total hospitalizations, leaving 75% not accounted for. The Poisson distribution can be accurate for those patients but cannot determine anything about the remaining 75%. If more patient conditions are added, the 25% will increase, but not to the point that the model will have good predictability.
Predictive Modeling Takes a different approach: uses equal group sizes (100% of the rarest level and an equal sample size of the other level), randomizes the selection of the sample, and uses prior probabilities to choose the optimal model.
50/50 Split in the Data Filter the data to the mortality outcome. Filter the data to the non-mortality outcome. Use PROC SURVEYSELECT to extract a subsample of the non-mortality outcome. Append the mortality outcome data to the subsample (see the sketch below).
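A minimal sketch of these four steps in Base SAS, assuming a 0/1 mortality indicator named died in the NIS data; the variable name, subsample size, and seed are assumptions:

data deaths nondeaths;
   set nis.nis_205;
   if died = 1 then output deaths;
   else output nondeaths;
run;

proc surveyselect data=nondeaths out=nondeaths_sub method=srs
                  sampsize=20000 seed=20090101 noprint;
   /* set SAMPSIZE= equal to the number of mortality records for a 50/50 split */
run;

data split5050;
   set deaths nondeaths_sub;   /* all deaths plus an equal-sized subsample of survivors */
run;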
75/25 Split in the Data
90/10 Split in the Data
Validation The reduced sample is partitioned into training/validation/testing sets. Only training and testing sets are needed for regression models. The model is validated on the testing set.
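A minimal sketch of one way to create such a partition in a DATA step; the 60/20/20 proportions and the seed are assumptions (in Enterprise Miner this step is handled by the Data Partition node):

data train valid test;
   set split5050;
   call streaminit(2009);              /* fix the seed for reproducibility */
   u = rand('uniform');
   if u < 0.6 then output train;       /* 60% training */
   else if u < 0.8 then output valid;  /* 20% validation */
   else output test;                   /* 20% testing */
   drop u;
run;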
 
Sampling Node
Misclassification in Regression
ROC Curves
 
Rule Induction Results
Variable Selection
 
ROC Curves
Decile Data are sorted and divided into deciles. True positive patients with the highest confidence come first, followed by positive patients with lower confidence. True negative cases with the lowest confidence come next, followed by negative cases with the highest confidence.
Lift Target density = the number of actually positive instances in a decile divided by the total number of instances in that decile. Lift = the ratio of the target density for the decile to the target density over all the test data. This is a way to find the patients most at risk for mortality (or infection).
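A numeric illustration with made-up numbers (not from the slides): if the top decile contains 1,000 cases of which 150 are actual deaths, and the overall mortality rate in the test data is 5%, then

\[ \text{target density} = \frac{150}{1000} = 0.15, \qquad \text{lift} = \frac{0.15}{0.05} = 3, \]

so patients in that decile are three times as likely to die as a patient chosen at random from the test data.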
Discussion Predictive modeling in Enterprise Miner has some capabilities that are possible, but extremely difficult, in SAS/STAT: sampling a rare occurrence to a 50/50 split, partitioning to validate the results, comparing multiple models to find the one that is optimal, and variable selection.
Summary Clinical trials do differ from health outcomes research, and the statistical techniques must be adapted to outcomes research. Model assumptions are important but too often ignored. We need to look at results in detail; superficial consideration of results can lead to very erroneous conclusions.
