Survival Analysis of Determinants of Breast Cancer Patients at Hossana Queen Elleni Mohammad Memorial Referral Hospital, South Ethiopia: Bayesian Application of Hypertabastic Proportional Hazards Model
Breast cancer is one of the most severe diseases in the world and has become an everyday public concern in both developed and developing countries. The primary goal of this study was to identify the determinants of survival time of breast cancer patients at Hossana hospital, south Ethiopia. The Kaplan-Meier estimation method and a new two-parameter probability distribution called hypertabastic were used to model the survival time data. A simulation study was carried out with the R and SAS statistical software packages to evaluate the performance of the hypertabastic distribution in comparison with popular distributions. One-fourth (25%) of the patients survived for only 2 days; 31 (35.2%) were censored and 55 (62.5%) died. The hypertabastic survival model was found to fit the breast cancer data best, and age, level of education, family history, previous breast problem, high fat diet, first child at a late age, early menarche and late menopause were significant risk factors for the death of breast cancer patients. Awareness of the causes of breast cancer should be raised in the community, and screening and early detection policies for the highest-risk groups should be established.
Tekle and Dutamo 109
According to the literature, risk factors for breast cancer
can be divided into avoidable and non-avoidable factors.
Breast cancer, like other forms of cancer, can result from
multiple environmental and hereditary risk factors. The
term "environmental", as used by cancer researchers,
means any risk factor that is not genetically inherited
[https://en.wikipedia.org/wiki/Risk_factors_for_breast_can
ce].
Howlader (2012) described breast cancer as a multifaceted
disease that affects women and men of all ages and ethnic
groups. Despite decades of productive research on breast
cancer diagnosis and treatment, preventing this cancer is
the only way to reduce the human toll of a disease that
affects 1 in 8 women in their lifetime.
In the Cox model, the baseline hazard function is regarded
as a nuisance parameter, while in parametric models, the
hazard function reflects the time course of the process
under study. In this study, we introduce a new two-
parameter continuous probability distribution called
hypertabastic probability distribution. The hypertabastic
hazard function can assume a variety of shapes.
Some studies have tried to identify epidemiological risk
factors, but the cause of any individual breast cancer is not
known. Epidemiological research, although not informative
about incidence in individuals, clearly reveals patterns of
breast cancer incidence across certain populations. In this
regard, hereditary syndromes account for 5% of new breast
cancers, while 30% of the risk is attributable to
well-established factors [Madigan et al., 1995].
The exact cause(s) of the majority of breast cancers are not
known, so the major challenge for prevention is to
identify women at risk as precisely as possible and then to
apply measures such as chemoprevention and lifestyle
changes. Current models can predict the probable number of
breast cancer cases in specific risk factor strata, but have
modest discriminatory accuracy at the individual level
[Amir et al., 2010].
This study attempts to identify factors that have strong
associations with the survival experience of breast cancer
patients under treatment in one of the government
hospitals in the regional state of SNNPR at Hossana
Queen Elleni Mohammed Referral Hospital.
Hence, the current study is intended to answer the
following basic questions:
1. What potential risk factors for breast cancer can be
identified?
2. What is the distribution of breast cancer across selected
covariates?
3. What is the time beyond which 50% of the breast
cancer patients are likely to survive?
4. Which model better fits the data?
The primary goal of this study was to identify the
determinants of survival time of breast cancer patients at
HQEMMH, south Ethiopia.
METHODOLOGY
Description of Study Area and Period
Hadiya zone is one of 13 zones in SNNPR. There are 10
woredas and one town administration in Hadiya zone.
Hosanna town is the administrative center of Hadiya zone
and is 235 km from the capital city of the country.
The study was conducted in HQEMMH, South Ethiopia,
using data covering September, 2011 to February, 2017.
Sample size, Sampling Design and Technique
A total of 86 randomly selected breast cancer patients were
considered for the study. A retrospective study design was
employed. To select a fairly representative sample of the
population, simple random sampling was used.
Table 1: Covariates Considered in the Study
Covariates                        Codes
1.  Gender of patients            1 = female, 0 = male
2.  Age of patients               0 = <=18 yrs, 1 = 19-25 yrs, 2 = >26 yrs
3.  Breast ca. stage              0 = stg I, 1 = stg II, 2 = stg III, 3 = stg IV
4.  Family history                0 = no, 1 = yes
5.  Has breast problem            0 = no, 1 = yes
6.  High fat diet                 0 = no, 1 = yes
7.  Residence of patients         0 = rural, 1 = urban
8.  Marital status                0 = separated/divorced/widowed, 1 = never married, 2 = married
9.  Level of education            0 = secondary and above, 1 = no education, 2 = primary
10. Religion                      0 = Muslim, 1 = Orthodox, 2 = Protestant, 3 = others
11. Smoking                       0 = no, 1 = yes
12. First child at delayed age    0 = no, 1 = yes
13. Premature start of menarche   0 = no, 1 = yes
14. Late menopause                0 = no, 1 = yes
15. Stress                        0 = no, 1 = yes
16. Type of treatment             0 = surgery, 1 = chemotherapy, 2 = radiation therapy, 3 = hormone therapy
Int. J. Public Health Epidemiol. Res. 110
Survival Analysis (SA)
SA consists of a set of specialized statistical techniques
used to study time-to-event data. In analyzing such data,
the main objective is to determine the length of time
until an event occurs. Survival analysis is needed mainly
because of two distinguishing features of time-to-event
data. First, duration times are non-negative values that
usually exhibit a highly skewed distribution, so the
assumption of normality is violated. Second, censoring
may occur: the true duration is not always observed or
known, that is, some subjects are not observed for the
full time to failure.
The core characteristic of time-to-event data is the
existence of censoring, which occurs when the time to the
event cannot be completely observed for some individuals.
Censoring and truncation make these data unsuitable for
analysis with traditional regression methods, and hence the
appropriate technique is SA. Details on the various
estimation methods developed in survival data analysis
(SDA) that take censoring and truncation into account can
be found in [Hosmer and Lemeshow 1998].
In this study the Cox proportional hazards model (CPHM)
was used to examine the survival time of breast cancer
patients. Kaplan-Meier (KM) estimators were applied to
estimate the survival curves of breast cancer patients, and
the log rank test was used to compare the covariate
categories. We start by defining censoring, the KM
estimator and the CPHM; we then proceed to model building
and assessment.
Kaplan-Meier Estimation
Kaplan-Meier estimation is a product-limit estimation of the
survivorship function developed by Kaplan and Meier
(1958). The Kaplan-Meier (KM) estimator is used by most
software packages because of its simple step approach.
The KM estimator incorporates information from all of the
available observations, both censored and uncensored, by
considering any point in time as a series of steps defined
by the observed survival and censored times. When there is
no censoring, the estimator is simply the sample proportion
of observations with event times greater than t. The
technique becomes more complicated, but still manageable,
when censored times are included. The KM estimator is the
product of a number of conditional probabilities, which
yields an estimated survival function in the form of a step
function. It is a non-parametric estimator of the survivor
function S(t):
\[ \hat{S}(t) = \prod_{t_j \le t} \left(1 - \frac{d_j}{n_j}\right) \tag{1} \]
where d_j is the number of individuals who experienced the
event at time t_j, and n_j is the number of individuals at
risk just before t_j, that is, those who have not yet
experienced the event or been censored by that time.
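The product-limit formula in Eq. (1) can be illustrated with a minimal sketch. The function name `kaplan_meier` and the toy data below are hypothetical and are not the study data; the study itself used R and SAS.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t_j, S_hat(t_j))] at each distinct event time.

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if right censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    survival = 1.0
    curve = []
    for t_j in sorted(deaths):
        # n_j: subjects still under observation just before t_j
        n_j = sum(1 for t in times if t >= t_j)
        d_j = deaths[t_j]
        survival *= 1.0 - d_j / n_j  # one factor of the product in Eq. (1)
        curve.append((t_j, survival))
    return curve

# Hypothetical toy data (assumption, for illustration only)
toy_times = [1, 2, 2, 3, 4, 5]
toy_events = [1, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(toy_times, toy_events)
```

Censored subjects contribute to n_j up to their censoring time but never to d_j, which is how the estimator uses partial follow-up.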
Proportional Hazards Model (PHM)
The basic model for survival data considered in this
study is the PHM. It was proposed by David Cox (1972) and
has also come to be known as the Cox regression model.
His paper took a different approach from standard
parametric SA and extended the non-parametric KM and
life-table methods to a regression setting. Cox modeled the
survival time of individual subjects using only the
covariates associated with survival, leaving the baseline
hazard unspecified. Cox assumed only that the hazard
functions of different individuals remain proportional
over time, and he made no assumptions about the
baseline hazard of individuals.
The proportional hazards (PH) assumption refers to the
fact that the hazard functions are multiplicatively related.
That is, for any two individuals with covariates Xi and Xj
the ratio h(t|Xi)/h(t|Xj) is assumed to be constant over
survival time.
The Hazard Function
It gives an expression for the hazard at time t for an
individual with a specified set of covariate values
denoted by X, and it is usually written as follows:
\[ h(t, X_i, \beta) = h_0(t)\exp(\beta' X_i) \tag{2} \]
where h_0(t) is the baseline hazard function, obtained
when all X's are set to zero, X_i is the vector of covariate
values for the ith individual at time t, and β is the vector
of unknown regression parameters, assumed to be the same
for all individuals in the study, which measures the
influence of the covariates on the survival experience.
So, it can likewise be regarded as a linear model: the
logarithm of the hazard ratio (HR) is a linear combination
of the covariates,
\[ \log\frac{h(t, X, \beta)}{h_0(t)} = \beta' X \tag{3} \]
The cumulative hazard function (CHF) is put as:
\[ H(t, X, \beta) = H_0(t)\exp(\beta' X) \]
From the model \( \log\frac{h(t, X, \beta)}{h_0(t)} = \beta' X \), we obtain the
survivor function:
\[ S(t, X, \beta) = \left[S_0(t)\right]^{\exp(\beta' X)} \]
where S_0(t) is the baseline survival function.
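The step from the cumulative hazard to the survivor function is not spelled out in the text. A short derivation, assuming only the standard identity S(t) = exp(-H(t)), is:

```latex
\begin{aligned}
S(t, X, \beta) &= \exp\{-H(t, X, \beta)\} \\
               &= \exp\{-H_0(t)\exp(\beta' X)\} \\
               &= \bigl[\exp\{-H_0(t)\}\bigr]^{\exp(\beta' X)} \\
               &= \bigl[S_0(t)\bigr]^{\exp(\beta' X)}
\end{aligned}
```

The same identity also shows why the ratio h(t|X_i)/h(t|X_j) = exp(β'(X_i − X_j)) does not depend on t, which is the proportional hazards assumption stated above.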
Hypertabastic Proportional Hazards Model
Tabatabai et al (2007) proposed a new probability
distribution, hypertabastic distribution, and hypertabastic
survival model. Let T be a continuous random variable
representing the waiting time until the occurrence of an
event. The hypertabastic baseline survival function is
defined by (cited by Hong Li, 2017)
\[ S_0(t) = P(T > t) = \operatorname{sech}[W(t)] \tag{4} \]
where \( W(t) = \alpha\left(1 - t^{\beta}\coth(t^{\beta})\right)/\beta \), and α and β are the model
parameters, both positive. Correspondingly, the
hypertabastic baseline hazard function is given by
\[ h_0(t) = \alpha\left(t^{2\beta-1}\operatorname{csch}^2(t^{\beta}) - t^{\beta-1}\coth(t^{\beta})\right)\tanh[W(t)] \tag{5} \]
Under the proportional hazards assumption, the above
authors introduced the hypertabastic proportional hazards
model. The hazard function for this model is given by (cited
by Hong Li, 2017)
\[ h(t \mid X, \theta) = h_0(t)\, g(X \mid \theta) \tag{6} \]
where X is a p-dimensional vector of covariates, θ is a
vector of unknown parameters, and g(X | θ) is a non-negative
function of X satisfying the condition that g(0 | θ) = 1, with
\( g(X \mid \theta) = \exp\left[-\sum_{k=1}^{p}\theta_k X_k\right] \). Similarly, the hypertabastic
survival function for this model is defined as
\[ S(t \mid X, \theta) = \left[S_0(t)\right]^{g(X \mid \theta)} \tag{7} \]
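All the unknown parameters, α, β and θ, can be
estimated using the maximum likelihood method. If the
sample consists of only right censored data, the
hypertabastic proportional hazards log-likelihood function
with log time can be written as
\[
\begin{aligned}
LL(\theta, \alpha, \beta : x) = \sum_{i=1}^{n} \Bigl\{ & g(X_i \mid \theta)\,\ln \operatorname{sech}\!\left[\alpha\left(1 - t_i^{\beta}\coth(t_i^{\beta})\right)/\beta\right] \\
 & + \delta_i \ln\Bigl[ t_i \left(\alpha t_i^{-1+2\beta}\operatorname{csch}^2(t_i^{\beta}) - \alpha t_i^{-1+\beta}\coth(t_i^{\beta})\right) \\
 & \qquad \times \tanh\!\left(\alpha\left(1 - t_i^{\beta}\coth(t_i^{\beta})\right)/\beta\right) g(X_i \mid \theta) \Bigr] \Bigr\}
\end{aligned}
\tag{8}
\]
where
\[ \delta_i = \begin{cases} 0 & \text{if } t_i \text{ is a right censored observation} \\ 1 & \text{otherwise} \end{cases} \]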
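The baseline functions in Eqs. (4) and (5) and the right-censored log-likelihood can be sketched numerically as follows. This is a minimal illustration, not the authors' SAS code: all function names are hypothetical, the likelihood is written on the original time scale (so it omits the extra factor t_i that the log-time form in Eq. (8) carries), and the parameter values in the check are arbitrary.

```python
import math

def w(t, alpha, beta):
    # W(t) = alpha * (1 - t^beta * coth(t^beta)) / beta
    x = t ** beta
    return alpha * (1.0 - x / math.tanh(x)) / beta

def baseline_survival(t, alpha, beta):
    # Eq. (4): S0(t) = sech[W(t)]
    return 1.0 / math.cosh(w(t, alpha, beta))

def baseline_hazard(t, alpha, beta):
    # Eq. (5): h0(t) = W'(t) * tanh[W(t)]
    x = t ** beta
    csch2 = 1.0 / math.sinh(x) ** 2
    coth = 1.0 / math.tanh(x)
    dw = alpha * (t ** (2 * beta - 1) * csch2 - t ** (beta - 1) * coth)
    return dw * math.tanh(w(t, alpha, beta))

def g(x, theta):
    # Eq. (6): g(X | theta) = exp(-sum_k theta_k * X_k)
    return math.exp(-sum(th * xk for th, xk in zip(theta, x)))

def log_likelihood(theta, alpha, beta, times, covariates, delta):
    # Sum of g * ln S0(t_i) + delta_i * ln[h0(t_i) * g] over subjects
    total = 0.0
    for t, x, d in zip(times, covariates, delta):
        gi = g(x, theta)
        total += gi * math.log(baseline_survival(t, alpha, beta))
        if d == 1:  # observed event contributes the hazard term
            total += math.log(baseline_hazard(t, alpha, beta) * gi)
    return total

# Consistency check with arbitrary (assumed) parameter values:
# h0(t) should equal -d/dt ln S0(t).
t0, a0, b0, eps = 1.5, 0.5, 1.2, 1e-5
numeric_hazard = -(math.log(baseline_survival(t0 + eps, a0, b0))
                   - math.log(baseline_survival(t0 - eps, a0, b0))) / (2 * eps)
```

Both W(t) and its derivative are negative for t > 0, so their product with tanh[W(t)] gives a positive hazard, which the numerical check confirms for the chosen values.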
Bayesian Approach
The Bayesian method is based on specifying a probability
model for the observed data X, given a vector of unknown
parameters θ, leading to the likelihood function L(θ | X).
Posterior Distribution
The posterior distribution is obtained by multiplying the
prior distribution over all parameters θ by the full likelihood
function L(θ | X). All Bayesian inferential conclusions are
based on the posterior distribution of the model generated
(Asrat Demeke, 2015).
Inference concerning θ is then based on the posterior
distribution, which is obtained by Bayes' theorem. The
posterior distribution of θ is given by
\[ \pi(\theta \mid X) = \frac{L(\theta \mid X)\,\pi(\theta)}{\int L(\theta \mid X)\,\pi(\theta)\,d\theta} \tag{9} \]
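Combining the likelihood function with the prior distribution
on (β, σ²), the full conditional distributions for the unknown
parameters can be written as:
\[ \pi(\beta \mid \sigma^2, t, x) \propto \prod_{i=1}^{n}\left[f(t_i \mid x_i, \theta)^{I(\delta_i = 1)}\, S(t_i \mid x_i, \theta)^{I(\delta_i = 0)}\right] \pi(\beta \mid \sigma^2) \]
\[ \pi(\sigma^2 \mid \beta, t, x) \propto \prod_{i=1}^{n}\left[f(t_i \mid x_i, \theta)^{I(\delta_i = 1)}\, S(t_i \mid x_i, \theta)^{I(\delta_i = 0)}\right] \pi(\beta \mid \sigma^2)\, \pi(\sigma^2) \tag{10} \]
Here, consistent with the definition of δ_i above, the density
f contributes for observed events (δ_i = 1) and the survival
function S for right censored observations (δ_i = 0).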
The posterior distribution for the model specified above
does not have a closed-form solution for the parameters. For
these models, an MCMC Gibbs sampler was implemented
using the SAS software (Asrat Demeke, 2015).
Model Building Strategies for proportional Hazard
model
Achieving the "best" model requires a basic plan for
selecting the covariates and for assessing the adequacy of
the model, both in terms of the individual covariates and in
terms of the overall fit. Hosmer and Lemeshow (1998) also
highlight that successful modeling of a complex data set is
part science, part statistical methods, and part experience
and common sense.
Here model building starts from single-covariate
analysis, as recommended by Collet (1994). Collet
suggested first doing a single-variable analysis to "screen"
potentially significant covariates for consideration in the
multivariable model, in order to recognize the significance
of each covariate. All covariates that are significant at the
modest 25% level in the single-covariate regression model
are carried into the multivariable model.
Ethical Consideration
Ethical clearance was obtained from the WCU Research
and Community Engagement Vice President Office, and
official permission was obtained from the WCU Referral
Hospital medical director. Data collectors were carefully
recruited and trained.
RESULTS AND DISCUSSION
Of the patients, 11 (12.5%) had their first child at a late
age.
One-fourth (25%) of the patients survived for only 2
days. The median survival time of the breast cancer
patients was 2.5 days. Thus, the survival time of most
patients was short, as the disease is chronic and rapidly
fatal. The proportion of patients who were stressed or
depressed was 18 (20.5%), and 22 (25.0%) had late
menopause.
Forty randomly selected women aged 20 years or older
were assessed on their level of knowledge and practice
regarding risk factors, early detection measures and early
warning signs of breast cancer. Of these, 35 (87.5%) had
poor knowledge and practice of risk factors, early detection
measures and early warning signs of breast cancer.
Table 2: Descriptive Statistics (Total and Percentage for each Category of the Variables)
Variable                 Category                        Number of Event (%)  Number of Censored (%)  Total (%)
Sex                      Male                            1(1.1)               0(0)                    1(1.1)
                         Female                          54(63.53)            31(36.5)                85(96.6)
Smoking habit            Non-smoker                      49(64.47)            27(35.5)                76(83.37)
                         Smoker                          6(60)                4(40.0)                 10(11.63)
Religion                 Muslim                          13(61.90)            8(38.1)                 21(38.1)
                         Protestant                      17(73.91)            6(26.1)                 23(32.75)
                         Orthodox                        13(56.52)            10(43.5)                193(26.75)
                         Others                          19(63.16)            7(36.8)                 19(22.09)
Place/residence          Urban                           10(66.67)            5(33.3)                 15(17.44)
                         Rural                           45(63.38)            26(36.6)                71(82.56)
Breast problem before    No/Normal                       49(66.21)            25(33.8)                74(86.04)
                         Yes/Analgic, bleeding or other  6(50.00)             6(50.00)                12(13.95)
Stage                    I                               26(59.09)            18(40.9)                44(51.16)
                         II                              26(72.23)            10(27.8)                36(41.86)
                         III                             3(50.00)             3(50.00)                6(6.97)
Marital status           Single/unmarried                9(56.25)             7(43.8)                 16(18.60)
                         Married                         46(66.67)            23(33.3)                69(80.23)
                         Divorced/separated/widowed      0(0.00)              1(100.00)               1(1.16)
Level of education       Primary                         25(60.98)            16(39.0)                41(47.67)
                         Secondary                       19(82.60)            4(17.4)                 23(26.74)
                         Tertiary                        0(0)                 2(100.0)                2(2.33)
                         Other                           11(55.00)            9(45.0)                 20(23.26)
Family history           No                              53(63.85)            30(36.1)                83(96.51)
                         Yes                             2(66.67)             1(33.3)                 3(3.49)
Treatment taken          Radiotherapy                    10(66.67)            5(33.3)                 15(17.44)
                         Surgery                         17(80.95)            4(19.0)                 21(24.41)
                         Chemotherapy                    2(33.34)             4(66.7)                 6(6.97)
                         s/w or OR                       26(59.09)            18(40.9)                44(51.16)
High fat diet            No                              42(61.76)            26(38.2)                68(79.07)
                         Yes                             13(72.23)            5(27.8)                 18(20.93)
Early onset of menarche  No                              46(63.01)            27(37.0)                73(84.88)
                         Yes                             9(69.23)             4(30.8)                 13(15.11)
1st child at late age    No                              48(64.00)            27(36.0)                75(87.21)
                         Yes                             7(63.64)             4(36.4)                 11(12.79)
Stress/depression        No                              41(60.29)            27(39.7)                68(79.07)
                         Yes                             14(77.78)            4(22.2)                 18(20.93)
Knowledge & practice (of the 40 women surveyed): Very good 0(0.00); Good 5(12.50); Poor 35(87.5)
The Kaplan- Meier Estimate of Time-to-death of
Breast Cancer Patients
The mean survival time of women who smoked cigarettes
was 5.220 days [95% CI: 2.857, 7.583], which was longer
than that of non-smoking women, 4.676 days
[95% CI: 3.876, 5.475]. The mean survival time of patients
with no family history of breast cancer was longer than
that of patients with a family history: 4.814 days
[95% CI: 4.012, 5.615] versus 2.667 days
[95% CI: 1.912, 3.421].
Figure 1: Fitting survival time (in days) curve of the breast
cancer patients at HQEMMRH, south Ethiopia from
September, 2011 to February, 2017.
The survival curve in Figure 1 shows that the
probability of death is lower for patients who stayed
longer, and vice versa, as shown numerically in Table 8
in the appendix. The estimated probability of surviving
two days is 0.6887 (95% CI: 0.5919, 0.801), with 15
patients surviving to the second day of admission, while
the probability of surviving to the fourth day is 0.4720
(95% CI: 0.3648, 0.611), with only four patients surviving
to the fourth day of admission.
Figure 2: Kaplan-Meier curves by WHO stage of the cancer,
with CIs by the Greenwood method, comparing the survival
experience of breast cancer patients at HQEMMRH, south
Ethiopia from September, 2011 to February, 2017.
As shown in Table 4, Figure 2 shows the estimated survival
function Ŝ(t) for the CPHM of time to death of breast cancer
patients by WHO stage (I, II, III) of the cancer; the pattern
is similar for the other categorical variables. The broken
lines show a 95% confidence band about the survival
function. Figure 1 shows the same pattern.
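The caption of Figure 2 refers to the Greenwood method for the confidence limits. As a reminder, and assuming the standard textbook form (the source does not state which interval transformation its software applied), Greenwood's variance estimate and a log-transformed 95% interval for the KM estimator can be written as:

```latex
\widehat{\operatorname{Var}}\bigl[\hat{S}(t)\bigr] \approx \hat{S}(t)^2 \sum_{t_j \le t} \frac{d_j}{n_j (n_j - d_j)},
\qquad
\hat{S}(t)\,\exp\!\left(\pm 1.96 \sqrt{\sum_{t_j \le t} \frac{d_j}{n_j (n_j - d_j)}}\right)
```

Here d_j and n_j are the event and at-risk counts defined for Eq. (1). The log-transformed form keeps the limits positive, which matters when the estimated survival probability is small.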
Multivariate Analysis (Cox Proportional Hazards
Model)
The results in Table 3 give the estimated coefficients for
the predictors in the final Cox proportional hazards model,
along with the associated standard error, Wald/z statistic,
significance level Pr(>|z|), hazard ratio exp(-estimate)
and 95% CI for the HR. The predictors age, smoking habit,
level of education, family history, breast problem (Analgic,
bleeding or other), high fat diet, first child at a late age,
early menarche and late menopause passed the initial
screening of variables for the multiple-covariate analysis,
and the forward variable selection method was then used to
choose the covariates worth including in the CPHM. To
decide whether a covariate is important, the p-value
associated with each model parameter was examined;
variables with a p-value less than or equal to the 0.05 cut
point (5% significance level) were considered important
and hence were included in the final model.
Table 3: Parameter estimates of coefficients for the variables in the final model, along with the associated standard
error, Wald/z statistic, significance level Pr(>|z|), hazard ratio exp(-coef) and 95% CI for the HR.
Variable            Estimate     exp(Estimate)  SE         z       Pr(>|z|)  exp(-Estimate)  Lower .95  Upper .95
age                 -5.862e-02   9.431e-01      2.604e-02  -2.251  0.0244 *  1.060e+00       0.896      0.992
LeveleducSecon      8.426e-01    2.322e+00      3.878e-01  2.173   0.0298 *  4.306e-01       1.085      4.966
Breast problem Yes  -6.654e-01   5.141e-01      5.956e-01  -1.117  0.0029 *  1.945e+00       0.159      1.651
HighfatdietYes      2.883e-01    1.334e+00      6.223e-01  0.463   0.0170 *  7.496e-01       0.394      4.517
ChildlateageYes     -6.811e-01   5.061e-01      7.731e-01  -0.881  0.0321 *  1.976e+00       0.111      2.303
EarlymenarchYes     1.732e+00    5.650e+00      7.495e-01  2.310   0.0209 *  1.770e-01       1.300      24.54
LatemenopauseYes    -1.729e-01   8.413e-01      6.584e-01  -0.263  0.0192 *  1.189e+00       0.231      3.057
Table 4: Model Summary for Hypertabastic Proportional Hazards Model for breast cancer Patients at HQEMMRH
Parameter         Estimate    Approx Std Err  t Value    Gradient Objective Function
α                 4.235545    2.756014        1.53683*   -6.509327E-8
β                 1.300136    0.257030        5.05829*   -0.000000160
Age               -0.048327   0.022378        -2.15956*  -0.000014768
leveleducSecon    0.657296    0.710347        0.92531*   0.000000155
histroyYes        -0.200784   0.344273        -0.58321*  -0.000000607
bresatprblemYes   -1.107280   0.432010        -2.5630*   7.1233358E-8
HighfatdietYes    -0.766543   0.681431        -1.1249*   -0.000001279
childlateageYes   0.315151    0.470965        0.6691*    0.000003493
latemonepouseYes  -1.018009   0.562178        -1.810*    0.000003604
In the final Cox proportional hazards model (the model with
significant covariates), the survival time of breast cancer
patients was significantly affected by age, level of
education, family history, breast problem (Analgic,
bleeding or other), high fat diet, first child at a late age,
and late menopause. The Wald statistics for the β_i
coefficients indicate that the β_i values were significantly
different from zero at the α = 5% level of significance.
Table 5: Model comparison inputs for the Hypertabastic Proportional Hazards and Cox proportional Models for breast
cancer Patients at HQEMMRH
            Cox Proportional Hazards Model         Hypertabastic Proportional Hazards Model
Criterion   Without Covariates  With Covariates    Without Covariates  With Covariates
-2 LOG L    151.715             145.208            151.715             134.007
AIC         151.715             162.007            151.715             157.208
SBC         151.715             180.149            151.715             164.983
Hypertabastic Proportional Hazards Model Analysis
and Comparison with Cox Proportional Hazards Model
Table 5 above compares the two models applied to these
data. The comparison is based on the criteria -2 LOG L and
AIC, where the model with the smaller value fits the data
better. The Hypertabastic Proportional Hazards Model has
smaller criterion values than the Cox Proportional Hazards
Model, and hence it better fits the data on breast cancer
patients in this study.
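For readers unfamiliar with these criteria, the sketch below shows how AIC and SBC are usually obtained from -2 LOG L, assuming the common definitions AIC = -2 log L + 2k and SBC = -2 log L + k ln(n). The function names and the numbers in the example are hypothetical; the exact sample-size term n used by the software behind Table 5 is not stated in the source.

```python
import math

def aic(neg2_log_lik, k):
    """Akaike information criterion from -2 log L and k free parameters."""
    return neg2_log_lik + 2 * k

def sbc(neg2_log_lik, k, n):
    """Schwarz Bayesian criterion; n is the sample-size term (assumption)."""
    return neg2_log_lik + k * math.log(n)

# Hypothetical values for illustration only (not taken from Table 5)
example_aic = aic(100.0, 3)
example_sbc = sbc(100.0, 3, 86)
```

Both criteria penalize extra parameters, so a model with more covariates is preferred only if it lowers -2 LOG L by more than the penalty.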
Based on the results from the Hypertabastic Proportional
Hazards Model (Table 4), age of patient, level of
education (secondary), family history of cancer, earlier
breast problem (Analgic, bleeding or other), high fat diet
consumption, first child at a late age and late menopause
contribute significantly to the risk of breast cancer for
females in this study, which is similar to the result of the
Cox Proportional Hazards Model.
Simulation Study
To evaluate the performance of the hypertabastic model,
we conducted a simulation study in which we compared its
overall fit with that of the Cox Proportional Hazards model.
In Bayesian inference, uncertainty about the parameters is
quantified at any point in time by probability
distributions. This means that a distribution needs to be
specified for all parameters in advance. These prior
distributions reflect the a priori expectations about the
parameter values; in this study the Uniform distribution
was used as the prior distribution.
The number of iterations used in the analysis for all of the
parameters was set to 10,000 with a thinning number of 1;
the first 2,000 iterations were discarded (burn-in), and the
rest of the chains were used to summarize the posterior
distribution, so that 10,000 MC samples were used.
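The roles of burn-in and thinning can be illustrated with a minimal random-walk Metropolis sketch. This is deliberately not the authors' method: it swaps the Gibbs sampler in SAS for a simpler sampler, and the hypertabastic model for a toy exponential survival model with a Uniform prior. The total of 12,000 iterations is an assumption chosen so that 10,000 draws remain after a burn-in of 2,000; all names and data are hypothetical.

```python
import math
import random

def log_posterior(lam, times, events, upper=10.0):
    # Uniform(0, upper) prior: zero density outside the interval
    if not 0.0 < lam < upper:
        return -math.inf
    # Exponential model with right censoring: d * ln(lam) - lam * sum(t)
    return sum(events) * math.log(lam) - lam * sum(times)

def metropolis(times, events, n_iter=12000, burn_in=2000, thin=1,
               step=0.1, seed=0):
    rng = random.Random(seed)
    lam = 0.5
    current = log_posterior(lam, times, events)
    draws = []
    for _ in range(n_iter):
        proposal = lam + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, times, events)
        # Accept with probability min(1, posterior ratio)
        if rng.random() < math.exp(min(0.0, candidate - current)):
            lam, current = proposal, candidate
        draws.append(lam)
    # Drop the burn-in, then keep every `thin`-th draw
    return draws[burn_in::thin]

# Hypothetical toy data (not the study data)
toy_times = [2, 3, 1, 5, 4, 2, 6, 1]
toy_events = [1, 1, 0, 1, 0, 1, 1, 1]
samples = metropolis(toy_times, toy_events)
posterior_mean = sum(samples) / len(samples)
```

With a thinning number of 1 every post-burn-in draw is kept, so the retained sample size is simply the total number of iterations minus the burn-in.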
For the Bayesian model, the posterior means, standard
deviations and Monte Carlo (MC) errors for the beta and
sigma parameters were calculated to assess the accuracy of
the simulation.
The first step in evaluating the results is to review the
convergence diagnostics. Figures 3 to 8 (Appendix) display
the Bayesian diagnostic graphs for the age, level of
education (secondary), family history, earlier breast
problem, high fat diet, first child at a late age, and late
menopause coefficient parameters. The time series (chain)
plots given in the figures suggest that convergence has
been achieved for the estimated parameters, since all the
chains appear to overlap one another. Also, the kernel
density plots in the figures show smooth, unimodal,
bell-shaped posterior densities for the parameters. The
autocorrelation plots displayed in the figures decrease to
near zero, indicating efficient sampling.
More convincing evidence for convergence is provided by
the Brooks, Gelman and Rubin (BGR) diagnostic, which
compares the within-chain and the between-chain
variability. The figures indicate that the ratio of
between-chain to within-chain variability is, or converges
to, approximately one for each parameter, indicating good
convergence of the chains.
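The idea of comparing between-chain and within-chain variability can be sketched with the classic Gelman-Rubin potential scale reduction factor. This is an illustrative assumption: it is the variance-based statistic, not necessarily the exact BGR variant plotted by the software used in the study, and the function name and simulated chains are hypothetical.

```python
import math
import random

def gelman_rubin(chains):
    """Potential scale reduction factor for equal-length chains."""
    m = len(chains)
    n = len(chains[0])
    means = [sum(c) / n for c in chains]
    grand_mean = sum(means) / m
    # Between-chain variance B and mean within-chain variance W
    b = n / (m - 1) * sum((mu - grand_mean) ** 2 for mu in means)
    w = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
            for c, mu in zip(chains, means)) / m
    pooled = (n - 1) / n * w + b / n
    return math.sqrt(pooled / w)

rng = random.Random(1)
# Three chains drawn from the same distribution: ratio close to one
mixed_chains = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(3)]
# Three chains stuck around different values: ratio well above one
stuck_chains = [[rng.gauss(mu, 1.0) for _ in range(1000)] for mu in (0.0, 5.0, 10.0)]
r_mixed = gelman_rubin(mixed_chains)
r_stuck = gelman_rubin(stuck_chains)
```

Values near one indicate that the chains are sampling the same distribution; values clearly above one suggest that more iterations or a longer burn-in are needed.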
Table 6: Posterior Summary Statistics for parameters from Bayesian analysis of the Hypertabastic model
                                                            Quantiles                   95% Equal-Tail Interval  95% HPD Interval
Parameter               N      Mean   Standard Deviation   25%     50%      75%        Lower    Upper           Lower    Upper
Age                     10000  0.052  0.0267               -0.070  -0.0518  -0.0342    0.098    0.0016          0.1506   0.1141
Leveleduc 1 vs 2        10000  0.238  0.5783               -0.145  0.2373   0.6262     0.890    1.2649          1.454    3.584
History 0 vs 1          10000  2.939  4.9578               0.837   1.5690   3.1225     0.279    14.47           0.078    9.611
Breast problem 0 vs 1   10000  2.919  2.2615               1.515   2.3121   3.5670     0.695    8.777           0.375    7.077
Highfatdiet 0 vs 1      10000  1.181  0.8987               0.603   0.9351   1.4660     0.267    3.5687          0.134    2.851
Childlateage 0 vs 1     10000  2.100  2.1670               0.847   1.4719   2.5626     0.302    7.770           0.093    5.780
Latemenopause 0 vs 1    10000  1.956  1.7675               0.931   1.4603   2.3710     0.415    6.3862          0.269    5.046
Table 6: Posterior Summary Statistics for parameters from Bayesian analysis of the Hypertabastic model (Continued)
Maximum Likelihood Estimates
Parameter       DF  Estimate  MC St.Error  z        Pr > |z|
age             1   -0.0483   0.0255       -0.9230  0.0350
leveleduc2      1   0.1871    0.5499       0.7175   0.0020
history1        1   0.2036    0.9059       -0.5295  0.04560
bresatprblem1   1   0.6144    0.5856       1.2643   0.0029
Highfatdiet1    1   0.1473    0.6120       -1.0869  0.0170
childlateage1   1   -0.4471   0.7576       1.0198   0.3078
latemonepouse1  1   -0.2648   0.6482       -0.0826  0.0189
Table 7: Comparison of goodness-of-fit (based on DIC, pD) for the two models
Criterion                            Bayesian Cox Proportional Hazards Model  Bayesian Hypertabastic Proportional Hazards Model
AIC (smaller is better)              161.187                                  160.241
BIC (smaller is better)              188.400                                  187.603
DIC (smaller is better)              1132.411                                 1120.200
pD (Effective Number of Parameters)  275.343                                  271.139
Based on Table 7, the Bayesian Hypertabastic Proportional
Hazards Model fits the data better, as its criteria (DIC
and pD) are smaller than those of the Bayesian Cox
Proportional Hazards Model. Hence, the final inference in
this study is based on the Bayesian Hypertabastic
Proportional Hazards Model.
Parameter Interpretation
An extra year of age multiplies the daily hazard of
breast cancer by exp(-0.0483) = 0.9528479 on average,
that is, reduces it by 100% - (100% × 0.9528479) =
4.71521 percent, holding the other variables in the model
constant. Similarly, the hazard of breast cancer for a woman
with secondary education (95% CI: 0.1506, 0.1141;
P=0.0020) is 1.2058 times, or 20.58% higher than, that of
a woman with another educational level or no education.
The hazard of breast cancer for a woman with a previous
breast problem such as Analgic, bleeding or other
(95% CI: 0.375, 7.077; P=0.0029) is 1.848547 times that of
a woman with normal breasts. The hazard of breast cancer
for a woman consuming a high fat diet (95% CI: 0.134,
2.851; P=0.0170) is 1.1587 times that of a woman
consuming a fat-free diet. The hazard of breast cancer for a
woman with late menopause (95% CI: 0.269, 5.046;
P=0.0192) is 0.7674 times that of her counterpart.
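The arithmetic behind these interpretations can be reproduced with a short sketch. The coefficient values are the estimates listed in the continuation of Table 6; the function names are hypothetical, and exponentiating the estimate directly follows the calculation used in the text.

```python
import math

def hazard_ratio(coef):
    """Multiplicative change in hazard for a one-unit covariate increase."""
    return math.exp(coef)

def percent_change(coef):
    """Percentage change in hazard implied by the coefficient."""
    return 100.0 * (math.exp(coef) - 1.0)

# Estimates as reported in Table 6 (continued)
estimates = {
    "age": -0.0483,
    "leveleduc2": 0.1871,
    "bresatprblem1": 0.6144,
    "Highfatdiet1": 0.1473,
    "latemonepouse1": -0.2648,
}
ratios = {name: hazard_ratio(b) for name, b in estimates.items()}
changes = {name: percent_change(b) for name, b in estimates.items()}
```

A ratio below one (age, late menopause) corresponds to a negative percentage change, and a ratio above one to a positive change.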
Further Checking for the Model Adequacy
Statistical tests of the proportional hazards model
assumptions and model diagnostics were carried out using
numerical and graphical techniques.
The test of correlation (rho) is not significant, which
indicates that the proportional hazards assumption is
fulfilled, and the p-value of the global test is greater than
0.05, so the covariates in the model satisfy the assumption.
Hence, the non-parametric modeling approach was applied
in preference to the parametric models for this reason.
A q-q plot was made to check whether the Hypertabastic
Proportional model provided an adequate fit to the data,
using two different groups of the population. We checked
the adequacy of the model graphically by comparing the
significantly different groups of patients by stage, first
child at late age, and educational level. The points appear
to be roughly linear for all three variables. Therefore, the
Hypertabastic Proportional model appears to be the best in
explaining the survival time of the patients in this study.
DISCUSSION
The Hypertabastic Proportional Hazards model was fitted
using both the classical and the Bayesian approach. Both
methods give broadly consistent results, but most of the
parameters in the Bayesian approach had smaller standard
errors than in the corresponding classical model. Therefore,
the Bayesian Hypertabastic Proportional Hazards model
gives a better fit than its classical counterpart.
On average, the study subjects were young,
and age was found to be among the leading risk factors
for the disease, as the study results reveal. This result is
similar to a study of disease occurrence by age
among women in the UK in 2006-2008, which revealed that
the hazard of developing the disease increases with age.
Thus, a woman is more than 100 times more likely to
develop breast cancer in her 60s than in her 20s [Richard
et al., 2000; Ahammad Basha Shaik et al., 2015].
Apart from increasing age, among the strongest
recognized risk factors for a woman being diagnosed with
breast cancer is having a close female relative (i.e., mother,
sister, or daughter) with a history of breast cancer. This
relationship is associated in most studies with a twofold
increase in risk [Burstein H., 2008].
Int. J. Public Health Epidemiol. Res. 116
Men have a much lower risk of developing
the disease than women. This is supported by
other studies, which state that around 99 percent of
cases are detected in women in most developed
countries; the highest prevalence of
male breast cancer, in which men account for 5–15
percent of cases, has been reported in a few
African countries [Richard et al., 2000].
Consistent with the current study, a younger age at first
childbirth compared with the average age of 24
[http://www.medscape.com/viewarticle], having additional
children (about 7% lower risk per child), and
breastfeeding (4.3% per breastfeeding year, with an
average relative risk of around 0.7 [McTiernan and Thomas,
1986; Byers et al., 1985]) have all been associated with
lower breast cancer risk in large studies [Breast cancer
and hormone replacement therapy, 2008]. Breast cancer
risk increases with early menarche and late
menopause, and it is reduced by an early first full-term
pregnancy [Jatoi, 1999].
In 2009, the Canadian Expert Panel on Tobacco Smoke
and Breast Cancer Risk concluded that both active and
passive smoking increase breast cancer risk.
However, the current study does not support this result. The
absence of an association
between breast cancer and smoking in various studies
could be due to the potential anti-estrogenic
effect of smoking, which might counteract the
adverse effects of chemical carcinogens in the breast
[MacMahon B et al., 1982].
As in Ahammad Basha Shaik et al. (2015),
the survival patterns of the patients were examined,
survival estimates were computed using the KM method,
and the Hypertabastic Proportional Hazards model was used
to investigate the effect of covariates on the
survival time of the patients.
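The KM method mentioned above can be illustrated with a minimal sketch. It computes the product-limit estimate S(t) as the running product of (1 - d_i/n_i) over the distinct death times, where d_i is the number of deaths and n_i the number at risk, and reads off the median survival time as the first time at which S(t) falls to 0.5 or below. The function names and the follow-up data are hypothetical and are not taken from the patient records of this study.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct death time.

    times  : follow-up time of each patient
    events : 1 if the patient died, 0 if censored
    Patients censored at time t are counted as at risk at t.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which the estimated survival is 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached during follow-up


# Hypothetical follow-up times in days (illustration only).
days = [1, 2, 2, 3, 4, 5, 6, 8]
died = [1, 1, 0, 1, 1, 0, 1, 0]
km_curve = kaplan_meier(days, died)
for t, s in km_curve:
    print(f"day {t}: S(t) = {s:.3f}")
print("median survival time:", median_survival(km_curve))
```

Censored patients contribute to the number at risk until their last follow-up time but never trigger a drop in the curve, which is how the method uses incomplete observations.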
CONCLUSIONS AND RECOMMENDATIONS
Bayesian inference can be more powerful than its classical
counterpart in model analysis. In this study the
authors fitted both Bayesian and classical models to the
data, and the Bayesian approach fitted the data better.
The Hypertabastic PH model, in its Bayesian version (the final
model) and in its classical version (the other model),
was found to fit the data better than the Cox
PH model. Hence, model inference was based
on the Bayesian Hypertabastic PH model as the final
model.
In assessing the significant risk factors, the log-rank test
for both models revealed that age, level of education,
family history, previous breast problem (Analgic, bleeding or
other), high fat diet, first child at late age, and late
menopause had a significant
effect on the survival probability of patients
with the disease. It also showed that sex, place of
residence, stage of the disease, and marital
status were not significant for the survival probability of
patients with breast cancer.
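The log-rank comparison referred to above can be sketched for two groups as follows. At each distinct death time the observed number of deaths in one group is compared with the number expected under equal hazards, and the squared sum of these differences divided by its hypergeometric variance is referred to a chi-square distribution with one degree of freedom. This is a generic textbook form of the test, not the exact SAS or R routine used in the study; the function name and the data in the usage lines are hypothetical.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi_square, p_value), 1 df.

    events: 1 if the patient died, 0 if censored.
    """
    pooled = [(t, e, 0) for t, e in zip(times_a, events_a)]
    pooled += [(t, e, 1) for t, e in zip(times_b, events_b)]
    death_times = sorted({t for t, e, _ in pooled if e == 1})

    obs_minus_exp = 0.0
    variance = 0.0
    for t in death_times:
        n = sum(1 for u, _, _ in pooled if u >= t)            # at risk, both groups
        n_a = sum(1 for u, _, g in pooled if u >= t and g == 0)  # at risk, group A
        d = sum(1 for u, e, _ in pooled if u == t and e == 1)    # deaths, both groups
        d_a = sum(1 for u, e, g in pooled if u == t and e == 1 and g == 0)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("no informative death times to compare")
    chi_square = obs_minus_exp ** 2 / variance
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# Hypothetical survival times in days for two groups (illustration only).
chi2, p = logrank_test([1, 2, 3, 4, 5], [1, 1, 1, 1, 1],
                       [10, 11, 12, 13, 14], [1, 1, 1, 1, 1])
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

A p-value below 0.05 would be read as evidence that the two survival curves differ, which is the sense in which a factor is called significant in the paragraph above.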
The Hypertabastic PH model was fitted because its
assumption was satisfied. Using the Q-Q
plot technique, the Hypertabastic PH model was found to be
the best-fitting model for the breast cancer data. The results
showed that the predictors age, level of education, family
history, previous breast problem (Analgic, bleeding or other),
high fat diet, first child at late age, and late menopause were
significant risk factors for the death of patients from the
disease. Although not included in the model, knowledge and
practice of breast self-examination were very poor among
the patients. Thus, the risk factors, early detection
measures and early warning
signs of breast cancer were not recognized by the patients.
The median survival time of breast cancer patients
is 2.5 days. Thus, the survival time of most patients is
short, as the disease is chronic and rapidly fatal.
The ministry of health and policy makers should raise
awareness of the risk factors for the
disease, encourage patients to complete the prescribed
treatment rather than regarding breast cancer as an incurable
disease and to
follow up their cancer status to minimize the risk of death,
recognize breast cancer as an important health
problem, and establish screening and early
detection policies for the highest-risk groups.
In addition, it will be important to open cancer diagnosis
and treatment centers in each woreda of the zone.
Awareness should be raised in society about the causes of
breast cancer. The mass media can play an effective role
in this regard, and special attention should be given to older
women, because they are the group at highest risk of
breast cancer.
Limitation of the Study
As the data were gathered from patient cards,
many patients had insufficient
information. Although the number of patients was high,
returning to the hospital on the day of appointment is often
a physical and financial
burden for them; most patients did not return to the
hospital to be treated. The other limitation was the small
number of cases, because the hospital has no cancer
registry center to serve cancer patients in the
catchment area. Therefore, the study included only 86
patients, which was the total number of patients with complete
information.
ABBREVIATION
Cox Proportional Hazard Model = CPHM
Cumulative Hazard Function = CHF
Tekle and Dutamo 117
Hazard Ratio = HR
Hossana Queen Elleni Mohammad
Memorial Referral Hospital = HQEMMRH
Kaplan-Meier = KM
Proportional Hazards Model = PHM
Survival Analysis = SA
Wachemo University = WCU
Declaration of Agreement with Ethical Values
Ethical approval: All procedures performed in the research
involving human participants were in accordance with the
ethical standards of the institutional and/or general
research committee.
Informed consent: Not applicable
Funding: This study received no funding; it was carried
out at the authors' own expense.
Conflict of interest: The authors declare no conflict of
interest.
REFERENCES
Ahammad Basha Shaik, Venkataramanaiah M, S.C.
Thasleema (2015). Statistical Applications of Survival
Data Analysis for Breast Cancer Data. i-manager’s
Journal on Mathematics, Vol. 4, No. 2, April-June
2015.
A. Gebremedhin and M. Shamebo (1998). “Clinical profile
of Ethiopian patients with breast cancer,” East African
Medical Journal, vol. 175, no. 11, pp. 640–643.
Asrat Demeke (2015). Bayesian Analysis of Survival of
Breast Cancer Patients: A Case Study at Tikur Anbessa
Specialized Hospital, Addis Ababa, Ethiopia. Master’s
thesis.
Bast RC, Kufe DW, Pollock RE, et al (2011). Cancer
Medicine (e.5 ed.). Hamilton, Ontario: B.C. Decker.
ISBN 1-55009-113-1. Retrieved 27 January 2011.
Burstein H (2008). Malignant tumors of the breast. In:
DeVita, VT Jr., Lawrence TS, Rosenberg SA, DePinho
RA, Weinberg RA, editors. DeVita, Hellman, and
Rosenberg’s Cancer: Principles and Practice of
Oncology. 8th ed. Philadelphia: Lippincott Williams &
Wilkins; p. 1606-54.
Byers T, Graham S, Rzepka T, Marshall J (1985).
"Lactation and breast cancer. Evidence for a negative
association in premenopausal women". American
Journal of Epidemiology 121 (5): 664–74.
doi:10.1093/aje/121.5.664. PMID 4014158.
B.M. Wadler, C.M. Judge, M. Prout, J.D. Allen, and A.C.
Geller (2011). “Improving breast cancer control via
the use of community health workers in South Africa: a
critical review,” Journal of Oncology, Article ID 150423,
8 pages, 2011.
Collett, D. (1994). Modeling survival data in medical
research. London: Chapman & Hall.
http://www.medscape.com/viewarticle/517532
http://www.csrwire.com/.
https://en.wikipedia.org/wiki/Risk_factors_for_breast_can
ce.
Gebremariam A, Addissie A, Worku A, et al (2019). Breast
and cervical cancer patients’ experience in Addis
Ababa city, Ethiopia: a follow-up study protocol. BMJ
Open 2019; 9:e027034. doi: 10.1136/bmjopen-2018-027034.
Howlader N, Noone AM, Krapcho M, Neyman N, Aminou
R, Altekruse SF, et al (2009). SEER Cancer Statistics
Review, 1975-2009 (Vintage 2009 Populations)
[internet]. Bethesda (MD): National Cancer Institute;
c2009-3 [updated 2012 Aug 20; cited 2013 Jan 7].
Hosmer, D. and S. Lemeshow (1998). Applied survival
analysis: Regression Modeling of Time to Event
Data. Wiley, New York.
Jatoi, I. (1999). Breast cancer screening. American
Journal of Surgery, 177, 518-524.
Kaplan, E.L.; Meier, P. (1958). "Nonparametric estimation
from incomplete observations". J. Amer. Statist. Assn,
53(282): 457-481. JSTOR 2281868.
Li, H. (2017) Survival Analysis for a Breast Cancer Data
Set. Advances in Breast Cancer Research, 6, 1-15.
http://dx.doi.org/10.4236/abcr.2017.61001.
MacMahon B, Trichopoulos D, Cole P, Brown J (1982).
Cigarette smoking and urinary estrogens. N Engl J
Med.;307(17):1062-5.
Madigan MP, Ziegler RG, Benichou J, Byrne C, Hoover RN
(1995). "Proportion of breast cancer cases in the United
States explained by well-established risk factors". J.
Natl. Cancer Inst. 87 (22): 1681–5.
doi:10.1093/jnci/87.22.1681. PMID 7473816.
Margolese, Richard G, Bernard Fisher, Gabriel N
Hortobagyi, and William D Bloomer (2000). "118". In
McTiernan A, Thomas DB (September 1986).
"Evidence for a protective effect of lactation on risk of
breast cancer in young women. Results from a
case-control study". Am. J. Epidemiol. 124 (3): 353–8.
PMID 3740036.
Mohammad A Tabatabai (2007). Hypertabastic survival
model. Theoretical Biology and Medical
Modelling 2007, 4:40. doi:10.1186/1742-4682-4-40.
(http://creativecommons.org/licenses/by/2.0).
T. Ersumo (2006). “Breast cancer in an Ethiopian
population, Addis Ababa,” East Africa Journal of
Surgery, vol. 11, no. 1, pp. 81–86.