Survival analysis on kidney failure of kidney transplant patients
“SURVIVAL ANALYSIS ON KIDNEY FAILURE FOR KIDNEY TRANSPLANT PATIENTS”
Summary
The accelerated failure time (AFT) model offers a way to describe and interpret survival regression
data easily. It approaches the data differently from the widely used and well-described Cox proportional
hazards (PH) model, by assuming that the covariates act multiplicatively on the failure time (additively
on the log failure time) rather than on the hazard function. In this report, we present semiparametric
methods (the Cox PH model) and parametric methods (the AFT model) for analyzing survival data. We have
a data set for 469 patients with kidney transplants, along with the graft survival and failure time.
Eight covariates have been identified that might be related to the survival experience.
Introduction
Survival analysis is a statistical method for data analysis where the outcome variable of interest is the
time to the occurrence of an event. Hence, survival analysis is also referred to as "time to event
analysis", which is applied in a number of fields such as medicine, public health, social science,
and engineering.
The Cox proportional hazards (PH) model is now the most widely used model for the analysis of survival
data in the presence of covariates or prognostic factors. It is the most popular model for survival
analysis because of its simplicity and because it is not based on any assumptions about the survival
distribution. The model assumes that the underlying hazard rate is a function of the independent
covariates, but no assumptions are made about the nature or shape of the hazard function.
The accelerated failure time (AFT) model is an alternative method for the analysis of survival
data. The AFT model assumes a certain parametric distribution for the failure times and that the effect
of the covariates on the failure time is multiplicative. The appeal of the AFT model lies in the ease of
interpreting the results, because the AFT model describes the effect of predictors and covariates
directly on the survival time instead of through the hazard function.
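For reference, the two model forms contrasted in this introduction are commonly written as below. The notation (T_i for the survival time of patient i, x_i for the covariate vector, sigma for the scale, epsilon_i for the error term, h_0 for the baseline hazard) is standard textbook notation assumed here; it is not taken from the report itself.

```latex
% AFT: covariates act additively on log survival time (multiplicatively on T_i);
% the distribution of \varepsilon_i determines the parametric family.
\log T_i = \beta_0 + \beta^\top x_i + \sigma\,\varepsilon_i

% Cox PH: covariates act multiplicatively on the hazard;
% the baseline hazard h_0(t) is left unspecified.
h(t \mid x_i) = h_0(t)\,\exp\!\left(\gamma^\top x_i\right)
```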
Purpose and Methods
The purpose of this report is to analyze the data using the Cox model and the AFT models. The methods
are applied to a real, randomized data set for 469 patients with kidney transplants.
We start with the AFT models, checking the AIC and BIC (goodness-of-fit criteria) for each of the
distributions with all the covariates included. We then briefly discuss the possible types of response
and prognostic variables. We select the covariates using the backward selection procedure, and the final
model is chosen based on this result.
The second part of the report covers Cox's PH model. We fit the data with Cox's PH model using
different methods for handling ties and compare the results with the exponential and Weibull parametric
models. This is also followed by selection of the significant covariates, and the final model is
chosen.
The SAS software package was used; proc lifereg and proc phreg were used to estimate the AFT and
Cox PH models, respectively.
Conclusion
We apply these methods to randomized data on 469 kidney transplant patients. Our conclusion is that Age,
Diabetes (DBT) and ALG (an immune suppression drug) are the most significant covariates, and thus the
variables most likely to be predictive of the outcome under study. A major goal of this report is also
to support the argument for considering the AFT model as an alternative to the PH model in the analysis
of some survival data, by means of this real data set.
In conclusion, although the Cox proportional hazards model tends to be more popular in the literature,
the AFT model should also be considered when planning a survival analysis. It should go without saying
that the choice should be driven by the desired outcome or the fit to the data, and never by which model
gives a significant P value for the predictor of interest. The choice should be dictated only by the
research hypothesis and by which model assumptions are valid for the data being analyzed.
Analysis, Procedure along with computation and output with interpretation
Data set
The following data are for 469 patients with kidney transplants. The primary interest was graft survival,
and time to graft failure was recorded in months (subject to right censoring). This study
included measurements of many covariates that may be related to the survival experience. Use both Cox's
PH model and the accelerated failure time model to analyze the data and write a report.
The 10 variables included are (eight covariates, followed by the survival time MONTH and the status
indicator FAIL):
AGE: Age at transplant, in years
SEX: 1=female, 0=male
DIALY: Duration of hemodialysis prior to transplant, in days
DBT: Diabetes; 1=yes, 0=no
PTX: Number of prior transplants
BLOOD: Amount of blood transfusion, in blood units
MIS: Mismatch score
ALG: Use of ALG, an immune suppression drug; 1=yes, 0=no
MONTH: Duration time starting from transplant, in months
FAIL: Status of the new kidney; 1=new kidney failed, 0=functioning
A. AFT Models-
Under AFT models we measure the direct effect of the explanatory variables on the survival
time instead of on the hazard, as we do in the PH model. This characteristic allows for an easier
interpretation of the results, because the parameters measure the effect of the corresponding
covariate on the mean survival time. Currently, the AFT model is not commonly used for the
analysis of clinical trial data, although it is fairly common in the field of manufacturing. Similar to
the PH model, the AFT model describes the relationship between survival probabilities and a set
of covariates.
Procedure-
Using proc lifereg (SAS code), we compute goodness-of-fit statistics for each distribution (Exponential,
Weibull, Lognormal, Gamma and Log-logistic).
This is done with all the covariates taken into consideration.
SAS output-
Exponential-
Fit Statistics
-2 Log Likelihood         1266.509
AIC (smaller is better)   1284.509
AICC (smaller is better)  1284.901
BIC (smaller is better)   1321.864
Weibull-
Fit Statistics
-2 Log Likelihood         1189.103
AIC (smaller is better)   1209.103
AICC (smaller is better)  1209.583
BIC (smaller is better)   1250.609
Lognormal-
Fit Statistics
-2 Log Likelihood         1171.515
AIC (smaller is better)   1191.515
AICC (smaller is better)  1191.996
BIC (smaller is better)   1233.021
Gamma-
Fit Statistics
-2 Log Likelihood         1166.625
AIC (smaller is better)   1188.625
AICC (smaller is better)  1189.202
BIC (smaller is better)   1234.281
Log-logistic-
Fit Statistics
-2 Log Likelihood         1182.578
AIC (smaller is better)   1202.578
AICC (smaller is better)  1203.059
BIC (smaller is better)   1244.085
Interpretation-
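From the above tables, the gamma distribution is the chosen one, with the lowest AIC (1188.625) among
the five distributions. Note that the lognormal distribution has a marginally lower BIC (1233.021
versus 1234.281).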
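The fit statistics above can be reproduced from the -2 log-likelihood values alone, which makes the ranking easy to check. The sketch below is only an illustration. The parameter counts k are an assumption (intercept plus eight covariates, plus one scale parameter for Weibull, lognormal and log-logistic, or scale and shape for gamma), and n = 469 is taken from the data set description.

```python
import math

N_OBS = 469  # number of patients stated in the data set description

# (-2 log-likelihood, assumed number of estimated parameters k) per full model.
FITS = {
    "exponential": (1266.509, 9),
    "weibull": (1189.103, 10),
    "lognormal": (1171.515, 10),
    "gamma": (1166.625, 11),
    "loglogistic": (1182.578, 10),
}

def information_criteria(neg2ll, k, n=N_OBS):
    """Return (AIC, AICc, BIC) from -2 log L, parameter count k and sample size n."""
    aic = neg2ll + 2 * k
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)
    bic = neg2ll + k * math.log(n)
    return aic, aicc, bic

criteria = {name: information_criteria(*fit) for name, fit in FITS.items()}
best_by_aic = min(criteria, key=lambda name: criteria[name][0])
best_by_bic = min(criteria, key=lambda name: criteria[name][2])

for name, (aic, aicc, bic) in criteria.items():
    print(f"{name:12s} AIC={aic:.3f} AICc={aicc:.3f} BIC={bic:.3f}")
print("lowest AIC:", best_by_aic, "| lowest BIC:", best_by_bic)
```

Under these assumed parameter counts the recomputed values agree with the SAS output to rounding, and they show that AIC and BIC do not pick the same distribution here.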
Choice of Covariates-
In this section we select the most significant covariates among the eight covariates that are
given in the data set. Covariates are selected using the backward selection procedure.
The backward selection procedure is an elimination process in which all the covariates are included in
the model at the beginning and are removed one by one according to a significance criterion. The
specific parameters that define the parametric model and the coefficients of all the covariates are
estimated first. Then the Wald test is used to examine each covariate.
We delete the predictor with the highest p-value and rerun the model, repeating this until all the
remaining predictors satisfy our significance criterion. The predictors that are left are our
significant variables.
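As an illustration of the Wald test used at each step, the sketch below recomputes the chi-square statistic and p-value for three rows of the exponential AFT output reported in this document. It assumes the usual one-degree-of-freedom Wald statistic, (estimate / standard error) squared. Because the inputs are the rounded printed values, the results match the SAS output only approximately.

```python
import math

def wald_test(estimate, std_error):
    """Wald chi-square statistic (1 df) and its p-value for H0: coefficient = 0."""
    chi_square = (estimate / std_error) ** 2
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value

# (estimate, standard error) as printed in the full exponential model table.
printed = {
    "AGE": (-0.0224, 0.0062),
    "DBT": (-0.5263, 0.1897),
    "ALG": (0.5028, 0.1688),
}

results = {name: wald_test(est, se) for name, (est, se) in printed.items()}
for name, (chi_square, p_value) in results.items():
    print(f"{name}: chi-square={chi_square:.2f}, p={p_value:.4f}")
```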
To be thorough with our selection of the covariates, we carried out the backward selection
procedure for each distribution.
Age, Diabetes (DBT) and ALG (an immune suppression drug) are the three significant covariates.
The following tables show in detail the selection process of the covariates for each distribution.
SAS tables-
Exponential distribution-
Analysis of Maximum Likelihood Parameter Estimates
Parameter      DF  Estimate  Standard Error  95% Confidence Limits  Chi-Square  Pr > ChiSq
Intercept       1    5.4088          0.3362     4.7500      6.0676      258.90      <.0001
AGE             1   -0.0224          0.0062    -0.0345     -0.0102       13.07      0.0003
SEX             1    0.0393          0.1494    -0.2535      0.3321        0.07      0.7924
DIALY           1   -0.0002          0.0002    -0.0005      0.0001        1.31      0.2516
DBT             1   -0.5263          0.1897    -0.8981     -0.1544        7.70      0.0055
PTX             1    0.0234          0.2159    -0.3998      0.4466        0.01      0.9137
BLOOD           1   -0.0018          0.0049    -0.0113      0.0078        0.13      0.7162
MIS             1   -0.1278          0.0906    -0.3053      0.0497        1.99      0.1582
ALG             1    0.5028          0.1688     0.1720      0.8335        8.87      0.0029
Scale           0    1.0000          0.0000     1.0000      1.0000
Weibull Shape   0    1.0000          0.0000     1.0000      1.0000
We remove PTX-
Analysis of Maximum Likelihood Parameter Estimates
Parameter      DF  Estimate  Standard Error  95% Confidence Limits  Chi-Square  Pr > ChiSq
Intercept       1    5.4182          0.3252     4.7807      6.0556      277.55      <.0001
AGE             1   -0.0224          0.0061    -0.0345     -0.0104       13.30      0.0003
SEX             1    0.0401          0.1492    -0.2523      0.3325        0.07      0.7880
DIALY           1   -0.0002          0.0002    -0.0005      0.0001        1.35      0.2444
DBT             1   -0.5270          0.1896    -0.8986     -0.1554        7.73      0.0054
BLOOD           1   -0.0016          0.0047    -0.0109      0.0076        0.12      0.7287
Interpretation- The above is the final table, showing all the covariates that are significant and will
be used in fitting the model.
AGE, DBT and ALG are chosen.
Goodness of fit test for the selected covariates and selection of model
In this section we compute the goodness-of-fit statistics once again, for the three covariates selected
in the above section (AGE, DBT and ALG).
We select the gamma distribution as our appropriate model, as it has the lowest AIC value compared with
the others.
The following tables show the detailed AIC and BIC values for each distribution.
SAS output-
Exponential distribution-
Fit Statistics
-2 Log Likelihood         1270.646
AIC (smaller is better)   1278.646
AICC (smaller is better)  1278.732
BIC (smaller is better)   1295.249
Weibull distribution-
Fit Statistics
-2 Log Likelihood         1192.098
AIC (smaller is better)   1202.098
AICC (smaller is better)  1202.227
BIC (smaller is better)   1222.851
Lognormal distribution-
Fit Statistics
-2 Log Likelihood         1172.735
AIC (smaller is better)   1182.735
AICC (smaller is better)  1182.865
BIC (smaller is better)   1203.488
Gamma distribution-
Fit Statistics
-2 Log Likelihood         1167.171
AIC (smaller is better)   1179.171
AICC (smaller is better)  1179.353
BIC (smaller is better)   1204.075
Log-logistic distribution-
Fit Statistics
-2 Log Likelihood         1184.615
AIC (smaller is better)   1194.615
AICC (smaller is better)  1194.745
BIC (smaller is better)   1215.368
Interpretation- The gamma distribution is selected, with the lowest AIC value of 1179.171 compared with
the others.
Fitting the appropriate model
The gamma AFT model has been selected as the appropriate model for the given data set.
For the gamma distribution we have the following table-
Analysis of Maximum Likelihood Parameter Estimates
Parameter      DF  Estimate  Standard Error  95% Confidence Limits  Chi-Square  Pr > ChiSq
Intercept       1    3.2692          0.5712     2.1498      4.3886       32.76      <.0001
AGE             1   -0.0197          0.0107    -0.0407      0.0014        3.36      0.0669
DBT             1   -0.5758          0.3218    -1.2065      0.0549        3.20      0.0736
ALG             1    1.1947          0.3403     0.5276      1.8618       12.32      0.0004
Scale           1    2.6330          0.1455     2.3628      2.9341
Shape           1   -1.0523          0.2170    -1.4777     -0.6269
Hence, if T_i is defined as the survival time, we can write the appropriate model as-
log(T_i) = 3.2692 - 0.0197 (AGE) - 0.5758 (DBT) + 1.1947 (ALG) + 2.6330 (Error)
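Because the AFT model is linear in log survival time, exponentiating a coefficient gives a "time ratio": the factor by which survival time is multiplied for a one-unit increase in that covariate, holding the others fixed. The following minimal sketch applies this standard interpretation to the fitted gamma coefficients. The ten-year age comparison is an illustrative choice, not something computed in the report.

```python
import math

# Coefficients of the fitted gamma AFT model, copied from the table above.
GAMMA_AFT = {"AGE": -0.0197, "DBT": -0.5758, "ALG": 1.1947}

def time_ratio(coefficient, delta=1.0):
    """Multiplicative change in survival time for a `delta`-unit covariate increase."""
    return math.exp(coefficient * delta)

time_ratios = {name: time_ratio(coef) for name, coef in GAMMA_AFT.items()}
age_10_years = time_ratio(GAMMA_AFT["AGE"], delta=10)  # ten-year age difference

for name, ratio in time_ratios.items():
    print(f"{name}: time ratio per unit = {ratio:.3f}")
print(f"AGE: time ratio per 10 years = {age_10_years:.3f}")
```

A ratio above 1 (ALG) corresponds to longer graft survival and a ratio below 1 (AGE, DBT) to shorter graft survival. The confidence limits for AGE and DBT in the table include zero, so those two ratios should be read with caution.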
B. Cox's PH Model
Non-parametric methods do not control for covariates, and they require categorical
predictors. When we have several prognostic variables, we must use multivariate approaches.
But we cannot use multiple linear regression or logistic regression, because they cannot deal
with censored observations. We need another method to model survival data in the presence
of censoring. One very popular model for survival data is the Cox proportional
hazards model.
Procedure-
We use proc phreg to estimate the regression coefficients (parameter estimates) based on
different methods for handling ties in the given data.
We then compare those results with the two parametric models, exponential and Weibull,
for further clarification.
Conclusion-
From the test results we can say that the regression coefficients of Age, duration of
hemodialysis (DIALY), Diabetes (DBT), Blood and MIS are all positive, so these covariates are
associated with a higher hazard of graft failure in kidney transplants.
The coefficients of SEX, PTX (number of prior transplants) and ALG (an immune suppression drug) are
all negative, indicating a lower hazard of graft failure in kidney transplantation.
It should also be noted that the exponential and Weibull models give coefficients with the opposite
signs: SEX, PTX and ALG have positive coefficients, whereas Age, DIALY, DBT, Blood and MIS have
negative coefficients. This is expected rather than contradictory, because proc lifereg models the
log survival time: a positive coefficient there means a longer survival time (lower hazard), and a
negative coefficient means a shorter survival time (higher hazard).
The following SAS tables explain our outcome in detail.
Breslow-
Analysis of Maximum Likelihood Estimates
Parameter  DF  Parameter Estimate  Standard Error  Chi-Square  Pr > ChiSq  Hazard Ratio
AGE         1             0.01732         0.00611      8.0468      0.0046         1.017
SEX         1            -0.02983         0.14882      0.0402      0.8411         0.971
DIALY       1           0.0000931       0.0001556      0.3581      0.5495         1.000
DBT         1             0.35004         0.19236      3.3115      0.0688         1.419
PTX         1            -0.07341         0.21763      0.1138      0.7359         0.929
Weibull-
Analysis of Maximum Likelihood Parameter Estimates
Parameter      DF  Estimate  Standard Error  95% Confidence Limits  Chi-Square  Pr > ChiSq
Intercept       1    5.7554          0.5405     4.6961      6.8147      113.39      <.0001
AGE             1   -0.0287          0.0099    -0.0482     -0.0093        8.37      0.0038
SEX             1    0.0449          0.2424    -0.4301      0.5200        0.03      0.8529
DIALY           1   -0.0002          0.0003    -0.0007      0.0003        0.50      0.4814
DBT             1   -0.5754          0.3101    -1.1831      0.0323        3.44      0.0635
PTX             1    0.1449          0.3570    -0.5548      0.8447        0.16      0.6848
BLOOD           1   -0.0073          0.0081    -0.0232      0.0085        0.82      0.3663
MIS             1   -0.1537          0.1495    -0.4467      0.1393        1.06      0.3038
ALG             1    0.9736          0.2852     0.4146      1.5327       11.65      0.0006
Scale           1    1.6326          0.1013     1.4457      1.8436
Weibull Shape   1    0.6125          0.0380     0.5424      0.6917
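The Weibull model is both an AFT and a PH model, so its log-time coefficients can be put on the hazard scale for a direct comparison with the Cox estimates. The sketch below uses the standard relationship, log hazard ratio = -(AFT coefficient) / scale. This conversion is an illustration added here, not a step carried out in the report. The Cox values are the Breslow estimates for the full model reported in this document.

```python
import math

# Weibull AFT output from proc lifereg, copied from the table above.
WEIBULL_SCALE = 1.6326
WEIBULL_AFT = {"AGE": -0.0287, "DBT": -0.5754, "ALG": 0.9736}

# Cox PH (Breslow, full model) estimates reported in this document.
COX_BRESLOW = {"AGE": 0.01732, "DBT": 0.35004, "ALG": -0.60697}

def weibull_aft_to_ph(aft_coefficient, scale):
    """Log hazard ratio implied by a Weibull AFT coefficient: -beta / scale."""
    return -aft_coefficient / scale

implied_ph = {
    name: weibull_aft_to_ph(beta, WEIBULL_SCALE) for name, beta in WEIBULL_AFT.items()
}

for name, log_hr in implied_ph.items():
    print(
        f"{name}: implied log HR={log_hr:.5f}, "
        f"Cox log HR={COX_BRESLOW[name]:.5f}, implied HR={math.exp(log_hr):.3f}"
    )
```

After the conversion the signs and magnitudes are close to the Cox estimates, which shows that the opposite signs in the two kinds of output reflect the scale being modeled (time versus hazard) rather than a disagreement between the models.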
Choice of Covariates
As we did for the AFT model, in this section we select only the significant covariates, using
Breslow's method (the default procedure) for ties, the backward selection method, and SAS proc phreg.
Conclusion- AGE, DBT and ALG are the three covariates that are significant according to our test results.
The following SAS tables give the detailed process of the backward selection procedure.
Using the Breslow approximation for ties and backward selection-
Analysis of Maximum Likelihood Estimates
Parameter  DF  Parameter Estimate  Standard Error  Chi-Square  Pr > ChiSq  Hazard Ratio
AGE         1             0.01732         0.00611      8.0468      0.0046         1.017
SEX         1            -0.02983         0.14882      0.0402      0.8411         0.971
DIALY       1           0.0000931       0.0001556      0.3581      0.5495         1.000
DBT         1             0.35004         0.19236      3.3115      0.0688         1.419
PTX         1            -0.07341         0.21763      0.1138      0.7359         0.929
BLOOD       1             0.00372         0.00497      0.5592      0.4546         1.004
MIS         1             0.08376         0.09196      0.8295      0.3624         1.087
ALG         1            -0.60697         0.17033     12.6983      0.0004         0.545
We remove SEX-
Analysis of Maximum Likelihood Estimates
Parameter  DF  Parameter Estimate  Standard Error  Chi-Square  Pr > ChiSq  Hazard Ratio
AGE         1             0.01731         0.00611      8.0406      0.0046         1.017
DIALY       1           0.0000910       0.0001551      0.3441      0.5574         1.000
DBT         1             0.35031         0.19234      3.3171      0.0686         1.420
PTX         1            -0.07537         0.21728      0.1203      0.7287         0.927
BLOOD       1             0.00359         0.00493      0.5305      0.4664         1.004
MIS         1             0.08476         0.09181      0.8524      0.3559         1.088
ALG         1            -0.60890         0.17001     12.8275      0.0003         0.544
We remove PTX-
Analysis of Maximum Likelihood Estimates
Parameter  DF  Parameter Estimate  Standard Error  Chi-Square  Pr > ChiSq  Hazard Ratio
AGE         1             0.01753         0.00609      8.2854      0.0040         1.018
DIALY       1           0.0000950       0.0001542      0.3799      0.5376         1.000
DBT         1             0.35394         0.19211      3.3945      0.0654         1.425
BLOOD       1             0.00309         0.00475      0.4232      0.5154         1.003
MIS         1             0.08957         0.09078      0.9735      0.3238         1.094
ALG         1            -0.59912         0.16774     12.7568      0.0004         0.549
We remove DIALY-
Analysis of Maximum Likelihood Estimates
Parameter  DF  Parameter Estimate  Standard Error  Chi-Square  Pr > ChiSq  Hazard Ratio
AGE         1             0.01835         0.00593      9.5785      0.0020         1.019
DBT         1             0.34675         0.19167      3.2728      0.0704         1.414
BLOOD       1             0.00393         0.00454      0.7474      0.3873         1.004
MIS         1             0.09187         0.09122      1.0142      0.3139         1.096
ALG         1            -0.60177         0.16772     12.8728      0.0003         0.548
We remove BLOOD-
Analysis of Maximum Likelihood Estimates
Parameter  DF  Parameter Estimate  Standard Error  Chi-Square  Pr > ChiSq  Hazard Ratio
AGE         1             0.01824         0.00592      9.4984      0.0021         1.018
DBT         1             0.33636         0.19123      3.0940      0.0786         1.400
MIS         1             0.09052         0.09121      0.9848      0.3210         1.095
ALG         1            -0.59750         0.16753     12.7202      0.0004         0.550
We remove MIS-
Analysis of Maximum Likelihood Estimates
Parameter  DF  Parameter Estimate  Standard Error  Chi-Square  Pr > ChiSq  Hazard Ratio
AGE         1             0.01827         0.00591      9.5362      0.0020         1.018
DBT         1             0.32304         0.19083      2.8656      0.0905         1.381
ALG         1            -0.58475         0.16708     12.2492      0.0005         0.557
This is the table for the final model with the significant covariates.
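The elimination sequence shown in these tables can be replayed with a small backward-selection loop. The sketch below is not the SAS procedure itself. The function `backward_eliminate` is a hypothetical helper, and instead of refitting a Cox model at each step it looks up the Wald p-values already reported in the Breslow tables above. The 0.10 threshold is the significance level stated for the final model.

```python
def backward_eliminate(covariates, p_values_for, alpha):
    """Drop the least significant covariate until every p-value is below alpha."""
    remaining = list(covariates)
    dropped = []
    while remaining:
        p_values = p_values_for(frozenset(remaining))
        worst = max(remaining, key=lambda name: p_values[name])
        if p_values[worst] < alpha:
            break  # every remaining covariate meets the criterion
        remaining.remove(worst)
        dropped.append(worst)
    return remaining, dropped

# Stand-in for refitting: p-values copied from the reported Breslow tables,
# one dict per model. A real run would fit the model here instead.
REPORTED = [
    {"AGE": 0.0046, "SEX": 0.8411, "DIALY": 0.5495, "DBT": 0.0688,
     "PTX": 0.7359, "BLOOD": 0.4546, "MIS": 0.3624, "ALG": 0.0004},
    {"AGE": 0.0046, "DIALY": 0.5574, "DBT": 0.0686, "PTX": 0.7287,
     "BLOOD": 0.4664, "MIS": 0.3559, "ALG": 0.0003},
    {"AGE": 0.0040, "DIALY": 0.5376, "DBT": 0.0654, "BLOOD": 0.5154,
     "MIS": 0.3238, "ALG": 0.0004},
    {"AGE": 0.0020, "DBT": 0.0704, "BLOOD": 0.3873, "MIS": 0.3139, "ALG": 0.0003},
    {"AGE": 0.0021, "DBT": 0.0786, "MIS": 0.3210, "ALG": 0.0004},
    {"AGE": 0.0020, "DBT": 0.0905, "ALG": 0.0005},
]
LOOKUP = {frozenset(table): table for table in REPORTED}

final, dropped = backward_eliminate(REPORTED[0], LOOKUP.__getitem__, alpha=0.10)
print("dropped in order:", dropped)
print("final model:", final)
```

Passing the p-value source in as a function keeps the loop independent of how the model is fitted, so the lookup could be swapped for an actual fitting routine without changing the selection logic.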
Fitting of the Model
Analysis of Maximum Likelihood Estimates
Parameter  DF  Parameter Estimate  Standard Error  Chi-Square  Pr > ChiSq  Hazard Ratio
AGE         1             0.01827         0.00591      9.5362      0.0020         1.018
DBT         1             0.32304         0.19083      2.8656      0.0905         1.381
ALG         1            -0.58475         0.16708     12.2492      0.0005         0.557
From the above table we obtain our significant covariates, and hence we can construct our final model.
The final model with significant (p < .10) covariates is-
log[h(t)/h0(t)] = 0.01827 AGE + 0.32304 DBT - 0.58475 ALG
Interpretation- The positive signs of the regression coefficients of AGE and DBT imply a higher hazard
of graft failure in kidney transplant patients, whereas the negative coefficient of ALG implies a
lower hazard.
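To make the final model concrete, the sketch below exponentiates the coefficients to recover the hazard ratios in the table and then compares two patients. The function `relative_hazard` and both patient profiles are hypothetical, invented only for illustration. The baseline hazard h0(t) cancels in the ratio, so it never needs to be estimated.

```python
import math

# Final Cox PH coefficients (log hazard ratios), copied from the table above.
COX_FINAL = {"AGE": 0.01827, "DBT": 0.32304, "ALG": -0.58475}

def relative_hazard(patient, reference, coefficients=COX_FINAL):
    """h(t | patient) / h(t | reference); the baseline hazard h0(t) cancels out."""
    log_ratio = sum(
        coefficients[name] * (patient[name] - reference[name]) for name in coefficients
    )
    return math.exp(log_ratio)

hazard_ratios = {name: math.exp(coef) for name, coef in COX_FINAL.items()}

# Hypothetical patients, invented purely for illustration.
reference = {"AGE": 40, "DBT": 0, "ALG": 0}
patient = {"AGE": 50, "DBT": 1, "ALG": 1}
example = relative_hazard(patient, reference)

for name, ratio in hazard_ratios.items():
    print(f"{name}: hazard ratio = {ratio:.3f}")
print(f"hypothetical patient vs reference: relative hazard = {example:.3f}")
```

In this made-up comparison the protective ALG effect slightly outweighs the combined effect of ten extra years of age and diabetes, so the relative hazard comes out just below 1.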