Risk Analysis DOI: 10.1111/risa.12561
A Generalized QMRA Beta-Poisson Dose-Response Model
Gang Xie,1,2,∗
Anne Roiko,2,3
Helen Stratton,2
Charles Lemckert,2,4
Peter K. Dunn,1
and Kerrie Mengersen5
Quantitative microbial risk assessment (QMRA) is widely accepted for characterizing the
microbial risks associated with food, water, and wastewater. Single-hit dose-response models
are the most commonly used dose-response models in QMRA. Denoting PI(d) as the probability
of infection at a given mean dose d, a three-parameter generalized QMRA beta-Poisson
dose-response model, PI(d|α, β, r∗), is proposed in which the minimum number of organisms
required for causing infection, Kmin, is not fixed, but a random variable following a geometric
distribution with parameter 0 < r∗ ≤ 1. The single-hit beta-Poisson model, PI(d|α, β), is a
special case of the generalized model with Kmin = 1 (which implies r∗ = 1). The generalized
beta-Poisson model is based on a conceptual model with greater detail in the dose-response
mechanism. Since a maximum likelihood solution is not easily available, a likelihood-free
approximate Bayesian computation (ABC) algorithm is employed for parameter estimation.
By fitting the generalized model to four experimental data sets from the literature, this study
reveals that the posterior median r∗ estimates produced fall short of meeting the required
condition of r∗ = 1 for the single-hit assumption. However, three out of four data sets fitted by
the generalized models could not achieve an improvement in goodness of fit. These combined
results imply that, at least in some cases, a single-hit assumption for characterizing the
dose-response process may not be appropriate, but that the more complex models may be difficult
to support, especially if the sample size is small. The three-parameter generalized model
provides a possibility to investigate the mechanism of a dose-response process in greater detail
than is possible under a single-hit model.
KEY WORDS: A generalized beta-Poisson model; approximate Bayesian computation; QMRA; single-
hit beta-Poisson models
1Faculty of Science, Health, Education and Engineering, Univer-
sity of the Sunshine Coast, Queensland, Australia.
2Smart Water Research Centre, Griffith University, Queensland,
Australia.
3Menzies Health Institute Queensland, Griffith University,
Queensland, Australia.
4School of Engineering, Griffith University, Queensland
Australia.
5Science and Engineering Faculty, Queensland University of
Technology, Queensland, Australia.
∗Address correspondence to Gang Xie, Building G51, Gold
Coast Campus, Griffith University, Queensland 4222, Australia;
g.xie@griffith.edu.au.
0272-4332/16/0100-0001$22.00/1 © 2016 Society for Risk Analysis
1. INTRODUCTION
Quantitative microbial risk assessment (QMRA) provides an alternative or supplementary
framework to epidemiological approaches for identifying potential excess risk for defined
pathways of selected pathogens from source to recipients.(1,2) The core part of the QMRA
framework is the dose-response analysis, which mathematically characterizes the relationship
between the dose administered and the probability of an adverse effect (typically, the
probability of infection, denoted by PI(·)) in the exposed population.(1,2) Among the different
microbial dose-response models proposed in the literature, the exponential and beta-Poisson
models are two of the most commonly used in QMRA applications.(1–3) A standard two-parameter
beta-Poisson dose-response model (hereafter referred to as “the single-hit beta-Poisson model”
and denoted by PI(d|α, β)) is a special case of a more general mechanistic framework (the
threshold models framework) as derived and discussed by Haas et al. (p. 277(1)). Possibly
because of the difficulty of model parameter estimation, however, few published dose-response
data analyses have implemented this generalized framework.
This study examines a three-parameter generalized QMRA beta-Poisson model (hereafter referred
to as “the generalized beta-Poisson model” and denoted by PI(d|α, β, r∗)) in which the
single-hit beta-Poisson model is a special case with r∗ = 1. The mathematical intractability
of this model makes parameter estimation via maximum likelihood or the method of moments
extremely difficult, if not impossible. On the other hand, the generation of simulation
samples from PI(d|α, β, r∗) is rather straightforward, so an approximate Bayesian computation
(ABC) algorithm(4) is employed for parameter estimation. The newly proposed generalized
beta-Poisson model is an implementation of the threshold models framework and provides greater
flexibility in describing the dose-response mechanism than is possible under a single-hit model.
The rest of the article is organized as follows.
Section 2 provides a description of the data sets ana-
lyzed and the notation scheme adopted in this study.
The generalized beta-Poisson model is developed
and specified in Section 3. The model parameter esti-
mation algorithms are presented in Section 4. Section
5 presents the model parameter estimation results
and discusses the interpretation, significance, and im-
plication of these results. Finally, Section 6 provides
a summary of our research findings.
2. DATA AND NOTATION
Well-studied dose-response experiment data sets provide a reliable baseline for validation of
our research findings. In this study, four experimental data sets from the literature are
analyzed for comparison between the single-hit beta-Poisson model PI(d|α, β) and the
generalized beta-Poisson model PI(d|α, β, r∗) with respect to parameter estimation and model
performance. The selected experiment data sets are: rotavirus (CJN) and infection in healthy
volunteers (Data set 1);(5,6) Campylobacter and infection in healthy volunteers (Data set
2);(5,6) Salmonella (nontyphoid strains) and infection in human volunteers (Data set 3)
(p. 399(7)); and Listeria monocytogenes and infection in mice (Data set 4).(2) These data sets
have been used for dose-response analyses in Refs. 2 and 5–7, although these references are
not the original data sources. These four data sets are chosen because of their variety of
coverage, e.g., virus and bacteria, few data points (e.g., ≤ 10) and many data points (> 40),
human volunteers and mice, and old (back to 1996) and recent research sources. The details of
these selected dose-response data sets are given in Table IV. For the purpose of a clear and
consistent model specification and easy comparison with the literature results, we have
adopted the notation scheme detailed in Table I throughout this article. The term “organism”
is used as a short name for pathogenic microorganism.
3. MODEL SPECIFICATION
This section presents the details of the development and specification of the generalized
beta-Poisson model. The connection between the generalized beta-Poisson model (PI(d|α, β, r∗))
and the single-hit beta-Poisson models (PI(d|α, β)) is established and the rationale for the
extension of the single-hit models is provided.
The conceptual dose-response process upon
which the generalized beta-Poisson model is devel-
oped is depicted in Fig. 1. This conceptual process
comprises three biologically plausible steps: (1) in-
gestion of organisms by a host individual (assuming
a known mean dose); (2) ingested organisms sur-
viving to target site; and (3) surviving organism(s)
resulting in infection. The generalized beta-Poisson
model is specified upon the assumptions that encom-
pass all three probabilistic elements involved in this
process: (1) the random number of organisms actu-
ally ingested; (2) the random number of organisms
surviving to the target site inside a host body; (3) the
probability of a surviving organism resulting in infec-
tion.
A generalized beta-Poisson dose-response rela-
tionship framework (under the name of “threshold
models”) was proposed by Haas et al. as specified by
Equation 8.25 of Ref. 1 (p. 277):
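$$P_I(d) = 1 - \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)} \sum_{k=0}^{K_{\min}-1} \frac{\Gamma(\alpha+k)}{\Gamma(\alpha+\beta+k)} \, \frac{d^{k}}{k!} \, {}_1F_1(\alpha+k,\ \alpha+\beta+k,\ -d), \quad \text{for } K_{\min} = 1, 2, \ldots, \qquad (1)$$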
Table I. Notation and Definition
Notation                     Definition
Ds = {D1, D2, . . . , Dm}    Mean dose levels (i.e., the average number of organisms per dose or average concentrations); where the meaning is clear, denote d ≡ Di for i = 1, 2, . . . , m
N = {N1, N2, . . . , Nm}     Ni is the number of individuals participating in the ith exposure group (m groups, each corresponding to one mean dose level Di)
y = {y1, y2, . . . , ym}     yi is the number of infected individuals in the ith exposure group
r                            Survival probability of each single organism ingested
r∗                           Probability of a single surviving organism resulting in infection
K                            Number of organisms surviving to the target site inside a host body
Kmin                         The threshold value of the surviving organisms, i.e., the minimum number of organisms required for causing an infection event
PI                           Probability of infection estimated by a dose-response model
π_i^o                        Observed proportion of infected individuals in the ith exposure group, i.e., π_i^o = yi/Ni for i = 1, 2, . . . , m
πi                           Predicted binary response by a model, e.g., if response is defined by probability of infection, πi = PI(Di) ≡ PI(d)
exp(·) and log(·)            exp(1) = e ≈ 2.71828 is the natural logarithm base such that log(e) = 1
Γ(·)                         The gamma function(8)
Fig. 1. A plausible three-step conceptual
dose-response process for derivation of
the generalized beta-Poisson model.
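where 1F1(·, ·, ·) is the Kummer confluent hypergeometric function(9) defined by:
$${}_1F_1(\alpha,\ \alpha+\beta,\ -d) = 1 + \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)} \sum_{j=1}^{\infty} \frac{\Gamma(\alpha+j)}{\Gamma(\alpha+\beta+j)} \, \frac{(-1)^{j} d^{j}}{j!}. \qquad (2)$$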
Since 1F1(α, α + β, −d) has no closed-form analytic solution, only approximate or asymptotic
solutions can be obtained.(9,10) For Equation (1), it is the combination of the mathematically
intractable nature of the hypergeometric function and the infinite summation series that makes
model parameter estimation extremely difficult, if not impossible, via standard statistical
procedures such as maximum likelihood estimation,(11) the method of moments,(11) or MCMC
simulation.(12)
Equation (1) is called a dose-response relationship framework, or a conceptual model, because
it is not a specification of a particular model, but represents a family of beta-Poisson
dose-response models. Equation (1) can be considered as a specific characterization of the
dose-response process depicted in Fig. 1 by defining (1) the actual number of ingested
organisms to follow a Poisson distribution with parameter d; (2) the number of surviving
organisms (at the target site inside the host individual) to be determined by the survival
probability r, a random variable following a beta distribution with parameters α and β; and
(3) the minimum number of organisms resulting in infection, Kmin, to be 1, 2, . . . , ∞. The
derivation of Equation (1) assumes that any single event of the organisms surviving and/or
resulting in infection is independent of any other such events.(1,13)
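Under these three assumptions, Equation (1) can be read as a beta mixture of Poisson tail probabilities. The following worked form is a sketch that uses only the definitions above: K is the number of surviving organisms (Table I), and B(·, ·) denotes the beta function, a symbol introduced here for convenience. The first line relies on Poisson thinning (each of the Poisson(d) ingested organisms survives independently with probability r), and the last step uses the standard integral representation of the Kummer function.

```latex
\begin{aligned}
K \mid r &\sim \mathrm{Poisson}(r d), \qquad r \sim \mathrm{Beta}(\alpha, \beta), \\
P_I(d) &= \Pr(K \ge K_{\min})
        = 1 - \sum_{k=0}^{K_{\min}-1} \int_0^1 \frac{(r d)^{k} e^{-r d}}{k!}\,
          \frac{r^{\alpha-1}(1-r)^{\beta-1}}{B(\alpha, \beta)}\, dr \\
       &= 1 - \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)} \sum_{k=0}^{K_{\min}-1}
          \frac{\Gamma(\alpha+k)}{\Gamma(\alpha+\beta+k)}\, \frac{d^{k}}{k!}\,
          {}_1F_1(\alpha+k,\ \alpha+\beta+k,\ -d).
\end{aligned}
```

With Kmin = 1 only the k = 0 term remains, and the sum collapses to 1F1(α, α + β, −d).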
When the minimum number of organisms causing infection is fixed to Kmin = 1, Equation (1)
reduces to the commonly used single-hit beta-Poisson dose-response model:
$$P_I(d) = 1 - {}_1F_1(\alpha,\ \alpha+\beta,\ -d). \qquad (3)$$
Conceptually, a more realistic representation of
the dose-response process could allow the threshold
value Kmin to vary according to different conditions
such as the host individual’s immunity status and
the surviving organism’s virulence. Therefore, a
natural and plausible generalization of the single-hit
beta-Poisson model is to allow Kmin to be a random
integer rather than a fixed value. This leads us to propose K∗ = Kmin − 1 as a random
variable following a geometric distribution with probability mass function:
$$\Pr(K^{*} = x \mid r^{*}) = r^{*} (1 - r^{*})^{x}, \qquad (4)$$
Table II. Details of the Fitted Models
               Single-hit beta-Poisson model (MLEs)    Generalized beta-Poisson model: ABC estimates (posterior medians, 95% interval)
               α̂        β̂        Y                     α̂                       β̂                       r̂∗
Data set 1     0.167    0.191    4.48                  0.172 (0.162, 0.182)    0.156 (0.135, 0.177)    0.574 (0.514, 0.621)
  search range                                         (0, 0.25)               (0, 0.35)               (0, 1)
Data set 2     0.145    8.01     2.42                  0.162 (0.155, 0.169)    7.62 (6.59, 8.54)       0.490 (0.422, 0.562)
  search range                                         (0, 0.88)               (0, 14.3)               (0, 1)
Data set 3     0.314    3100     53.2                  0.374 (0.333, 0.430)    3170 (2830, 3500)       0.380 (0.301, 0.461)
  search range                                         (0, 50)                 (0, 5500)               (0, 1)
Data set 4     0.253    19.4     8.77                  0.262 (0.247, 0.277)    16.2 (13.8, 19.1)       0.624 (0.562, 0.681)
  search range                                         (0, 0.75)               (0, 44)                 (0, 1)
Note: The summary statistic Y is calculated using Equation (5); 95% interval = 95% credible interval; the parameter estimation search
range specifies the lower and upper limits of a uniform distribution, which serves as a prior distribution in the ABC parameter estimation
procedure (Section 4). In each cell, the leading number is the parameter point estimate.
Table III. Comparison of the Conditional Probability of Infection Between the Single-Hit Models and the Generalized Models Based on
the Point Estimates of the Model Parameters
               Single-hit beta-Poisson model    Generalized beta-Poisson model
               Pr(infection | any one           Pr(infection | first        Pr(infection | more than
               surviving organism)              surviving organism)         one surviving organism)
Data set 1     1                                0.574                       0.426
Data set 2     1                                0.490                       0.510
Data set 3     1                                0.380                       0.620
Data set 4     1                                0.624                       0.376
Note: Pr(infection | first surviving organism) = Pr(infection | K∗ = 0) = r∗(1 − r∗)^0 = r∗; K∗ = 0 implies Kmin = 1 by definition.
for x = 0, 1, 2, . . . , and 0 < r∗ ≤ 1. There is a good reason for choosing a geometric
distribution to characterize Kmin with this parameterization. By standard definition (e.g.,
Ref. 11), the random number x in Equation (4) is typically interpreted as the number of
failures before the first success, where the probability of success on any trial is r∗. In
the dose-response model context, if causing infection is defined as a “success” event,
K∗ = Kmin − 1 would represent the number of surviving organisms not causing infection before
the first infection event, and the parameter r∗ could naturally be interpreted as the
probability of any single surviving organism resulting in infection. Note that if r∗ = 1,
Equation (4) is reduced to K∗ = 0, i.e., Kmin = 1 with probability one. Therefore,
conceptually, only a two-step process is required for deriving the single-hit beta-Poisson
model; namely, steps (2) and (3) in Fig. 1 can be combined into a single step in which the
parameter r represents the probability of surviving and causing infection as per the
literature to date. Therefore, Equations (1) and (4) completely define the generalized
beta-Poisson model, which provides greater detail in representing the dose-response mechanism
than is possible under a single-hit model.
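As a quick check of the parameterization in Equation (4), the sketch below uses base R's dgeom, whose mass function is prob × (1 − prob)^x for x = 0, 1, 2, . . . (the number of failures before the first success). The helper name kmin_pmf is hypothetical and not part of the authors' code; the value r∗ = 0.574 is the Data set 1 point estimate from Table II and is used only for illustration.

```r
# Probability mass function of Kmin = K* + 1, where K* ~ Geometric(r_star)
# as in Equation (4). kmin_pmf is an illustrative helper (an assumption of
# this sketch), not the authors' code.
kmin_pmf <- function(kmin, r_star) {
  stopifnot(all(kmin >= 1), r_star > 0, r_star <= 1)
  dgeom(kmin - 1, prob = r_star)  # equals r_star * (1 - r_star)^(kmin - 1)
}

r_star <- 0.574                    # posterior median for Data set 1 (Table II)
p_kmin1 <- kmin_pmf(1, r_star)     # Pr(Kmin = 1) = r_star
p_kmin2plus <- 1 - p_kmin1         # Pr(Kmin >= 2) = 1 - r_star

# With r_star = 1 all mass sits on Kmin = 1, i.e., the single-hit case.
pmf_when_r_star_is_1 <- kmin_pmf(1:3, 1)
```

The two probabilities computed for r∗ = 0.574 correspond to the Data set 1 row of columns 3 and 4 in Table III.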
In the same way as Haas et al.(1) derived Equation (1) for the beta-Poisson model, it is
straightforward to derive a similar exponential conceptual model.6 Details of a generalized
exponential dose-response model (equivalent to the specification of Equations (1) and (4))
are given in the Appendix.
4. PARAMETER ESTIMATION ALGORITHMS
To our knowledge, few analysis applications
of Equation (1) appear in the literature and no
maximum likelihood estimates (MLE) solution has
been found for the generalized beta-Poisson model.
There are many statistical models with no MLE solu-
tions available or for which parameter estimation is
difficult for other reasons. For the generalized
beta-Poisson model, other parameter estimation
6 Interested readers are referred to pages 276–277 in Ref. 1.
Table IV. Selected Dose-Response Experiment Data Sets from the Literature
Data set 1: Rotavirus (CJN) and infection in healthy volunteers(5,6)
Ds (mean dose)    N (total)    y (infected)
9×10^-3           5            0
9×10^-2           7            0
9×10^-1           7            1
9                 11           8
9×10^1            7            6
9×10^2            8            7
9×10^3            7            5
9×10^4            3            3
Data set 2: Campylobacter and infection in healthy volunteers(5,6)
Ds (mean dose)    N (total)    y (infected)
8×10^2            10           5
8×10^3            10           6
9×10^4            13           11
8×10^5            11           8
1×10^6            19           15
1×10^8            5            5
Data set 3: Salmonella (nontyphoid strains) and infection in human volunteers (p. 399(7))
Ds (mean dose)    N (total)    y (infected)
1.52×10^5         6            3
3.85×10^5         8            6
1.35×10^6         6            6
1.39×10^5         6            3
7.05×10^5         6            4
1.66×10^6         6            4
1.5×10^7          6            4
1.25×10^5         6            5
6.95×10^5         6            6
1.7×10^6          6            5
1.2×10^4          5            2
2.4×10^4          6            3
6.6×10^4          6            4
1.41×10^5         6            3
2.56×10^5         6            5
5.87×10^5         6            4
8.6×10^5          6            6
8.9×10^4          6            5
4.48×10^5         6            4
1.04×10^6         6            6
3.9×10^6          6            4
1×10^7            6            6
2.39×10^7         6            5
4.45×10^7         6            6
6.73×10^7         8            8
1.26×10^6         6            6
4.68×10^6         6            6
1.2×10^4          6            3
2.4×10^4          6            4
5.2×10^4          6            3
9.6×10^4          6            3
1.55×10^5         6            5
3×10^5            6            6
7.2×10^5          5            4
1.15×10^6         6            6
5.5×10^6          6            5
2.4×10^7          5            5
5×10^7            6            6
1×10^6            6            6
5.5×10^6          6            6
1×10^7            6            5
2×10^7            6            6
4.1×10^7          6            6
1.5×10^6          6            5
7.68×10^6         6            6
1×10^7            6            5
1.58×10^5         6            1
Data set 4: (292 and 295 pooling) Listeria monocytogenes and infection in mice(2)
Ds (mean dose)    N (total)    y (infected)
2                 6            0
5                 6            1
1.1×10^2          6            2
5.5×10^3          10           7
3.24×10^4         10           7
3.9×10^4          6            4
5.5×10^4          10           9
2.51×10^5         10           10
5.5×10^5          10           10
2.82×10^6         10           10
Note: Ds = {D1, D2, . . . , Dm} represents m mean dose levels (i.e., the average number of
organisms per dose or average concentrations) applied to m exposure groups; N = {N1, N2, . . . , Nm},
where Ni represents the number of individuals participating in the ith exposure group
(corresponding to mean dose level Di); y = {y1, y2, . . . , ym}, where yi represents the number
of infected individuals in the ith exposure group.
procedures, such as the method of moments(11) or a standard MCMC algorithm,(12) are either
impossible or computationally impractical to implement. On the other hand, the generation of
simulation samples from a generalized beta-Poisson model PI(d|α, β, r∗) is relatively
straightforward. These are the primary reasons for choosing an ABC algorithm to estimate the
model parameters α, β, and r∗ in PI(d|α, β, r∗). ABC is a likelihood-free method for
parameter estimation and is essentially an acceptance/rejection algorithm.(4) The two key
elements required for an ABC algorithm are (i) generation of random samples from a selected
model; and (ii) a good (ideally, sufficient) summary statistic for minimization/optimization.
In this article, the same optimization summary
statistic for parameter estimation is used for both the
single-hit and the generalized beta-Poisson models.
This is defined as:
$$Y = -2 \sum_{i=1}^{m} \left[ y_i \log\!\left(\frac{\pi_i}{\pi_i^{o}}\right) + (N_i - y_i) \log\!\left(\frac{1-\pi_i}{1-\pi_i^{o}}\right) \right]. \qquad (5)$$
Note that Equation (5) is a reproduction of Equation 8.33 in Haas et al. (p. 285(1)). In
Equation (5), m is the number of exposure groups in terms of the mean dose level Di; Ni is
the total number of individuals in the ith exposure group (m groups in total); yi is the
number of infected individuals in the ith exposure group; πi is the predicted probability of
infection of the ith exposure group; and π_i^o = yi/Ni is the ratio of the observed number of
infected individuals to the number of all individuals at risk at each mean dose level Di (see
Table I for the notation and definition details). The summary statistic Y defined by Equation
(5) is a measure of the discrepancy between the observed responses (π_i^o) and the
model-predicted responses (πi) of the dose-response data.(14) Note that by estimating
πi = PI(d) from Equation (3) and then substituting the results into Equation (5), the MLEs of
the model parameters, α̂ and β̂, can be obtained for a single-hit beta-Poisson model by
minimizing Y.(1,15)
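A minimal R sketch of Equation (5) follows. The function name deviance_Y and the handling of zero counts (terms with a zero count contribute nothing, the usual 0 · log 0 = 0 convention) are assumptions of this sketch rather than details stated in the text. The data are Data set 1 from Table IV.

```r
# Summary statistic Y of Equation (5).
# y, N: observed infected counts and group sizes; p_pred: model-predicted
# probabilities of infection (pi_i). Terms with a zero count are set to 0
# (assumed convention 0 * log(0) = 0).
deviance_Y <- function(y, N, p_pred) {
  stopifnot(length(y) == length(N), length(N) == length(p_pred))
  p_obs <- y / N
  term <- function(count, pred, obs) {
    ifelse(count == 0, 0, count * log(pred / obs))
  }
  -2 * sum(term(y, p_pred, p_obs) + term(N - y, 1 - p_pred, 1 - p_obs))
}

# Data set 1 (rotavirus) from Table IV
N <- c(5, 7, 7, 11, 7, 8, 7, 3)
y <- c(0, 0, 1, 8, 6, 7, 5, 3)

Y_saturated <- deviance_Y(y, N, y / N)             # predictions equal observations
Y_flat <- deviance_Y(y, N, rep(0.5, length(N)))    # a deliberately poor prediction
```

Y is zero when the predicted and observed proportions coincide and grows as they diverge, which is why smaller Y indicates a better fit.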
The evaluation of the Kummer confluent hypergeometric function (Equation (2)) is nontrivial
but can be obtained by using the R(16) function “hyperg_1F1” in the R package “gsl.”(17) The
built-in optimization function “optim” in R is employed to obtain the values of α̂ and β̂
that minimize Y.
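The fit described above can be sketched in base R as follows. To keep the example free of package dependencies, this sketch does not call hyperg_1F1; instead it uses the identity 1F1(α, α + β, −d) = E[exp(−r d)] for r following a beta distribution with parameters α and β, approximated by a midpoint rule over beta quantiles. That substitution, the function names, the grid size, the log-parameterization, the clamping of probabilities, and the starting values are all assumptions of this sketch, not the authors' implementation.

```r
# Single-hit beta-Poisson probability of infection, Equation (3).
# Assumption: 1F1(a, a + b, -d) = E[exp(-r * d)], r ~ Beta(a, b), evaluated
# by a midpoint rule over n_grid beta quantiles instead of gsl::hyperg_1F1.
p_single_hit <- function(d, a, b, n_grid = 2000) {
  u <- (seq_len(n_grid) - 0.5) / n_grid
  r <- qbeta(u, a, b)
  vapply(d, function(di) 1 - mean(exp(-r * di)), numeric(1))
}

# Summary statistic Y of Equation (5); zero counts contribute 0.
deviance_Y <- function(y, N, p_pred) {
  p_obs <- y / N
  term <- function(count, pred, obs) {
    ifelse(count == 0, 0, count * log(pred / obs))
  }
  -2 * sum(term(y, p_pred, p_obs) + term(N - y, 1 - p_pred, 1 - p_obs))
}

# Data set 1 (rotavirus) from Table IV
Ds <- c(9e-3, 9e-2, 9e-1, 9, 9e1, 9e2, 9e3, 9e4)
N <- c(5, 7, 7, 11, 7, 8, 7, 3)
y <- c(0, 0, 1, 8, 6, 7, 5, 3)

# Minimize Y over log(alpha) and log(beta) so both parameters stay positive.
objective <- function(par) {
  p <- p_single_hit(Ds, exp(par[1]), exp(par[2]))
  p <- pmin(pmax(p, 1e-12), 1 - 1e-12)   # guard against log(0)
  deviance_Y(y, N, p)
}

start <- log(c(0.2, 0.2))                # hypothetical starting values
fit <- optim(start, objective, method = "Nelder-Mead")
alpha_hat <- exp(fit$par[1])
beta_hat <- exp(fit$par[2])
Y_min <- fit$value
```

The resulting alpha_hat, beta_hat, and Y_min can be compared with the Data set 1 entries for the single-hit model in Table II, bearing in mind that the quadrature and optimizer settings used here are assumptions.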
The ABC algorithm used for the generalized beta-Poisson model parameter estimation is an
implementation of Algorithm 2 in Ref. 4. Given a dose-response data set, the ABC algorithm
applied in this study can be summarized as a four-step parameter estimation procedure:7
• Step 1: generation of S = 20,000 parameter sets (i.e., α, β, and r∗) from wide uniform
distributions;
• Step 2: generation of S simulated data sets (ysim) using the parameter sets obtained from
Step 1 (using R code in Table V);
• Step 3: calculation of the summary statistic Y for each of the S simulated data sets using
Equation (5), with π_i^o = yi/Ni and πi = ysim(i)/Ni;
• Step 4: identification of the top 1% best-fit samples (i.e., the 200 simulated data sets
with the smallest Y values) and the corresponding parameter sets. The median values of these
200 parameter sets are accepted as the point estimates of the model parameters (i.e., α̂, β̂,
and r̂∗).
7 The R source code is available on request by contacting the corresponding author.
Note that the posterior sampling distributions of the accepted parameter estimates can be
obtained through generation of a large number of sets of α̂, β̂, and r̂∗ by running the above
four-step estimation procedure many times (e.g., ≥ 100 runs). The medians of the
corresponding posterior distributions are considered as the “true” point estimates for α, β,
and r∗, respectively. The credible intervals (e.g., the 95% posterior quantile intervals) and
other inferences can also be made based on these posterior distributions.
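The four-step procedure and its repetition can be sketched in R as below. This is an illustration under stated assumptions, not the authors' source code: the function names (simulate_gbp, deviance_Y, abc_fit) are hypothetical, the simulator is a vectorized analogue of the per-host logic in Table V, zero counts in Equation (5) are treated as contributing 0, and S and the number of repeated runs are reduced from the values quoted in the text to keep the example quick. The prior limits are the Data set 1 search ranges from Table II.

```r
# Simulate infected counts per exposure group from the generalized
# beta-Poisson model (vectorized analogue of the Table V logic).
simulate_gbp <- function(Ds, N, a, b, r_star) {
  group <- rep(seq_along(N), N)              # exposure group of each host
  n <- length(group)
  ingested <- rpois(n, Ds[group])            # organisms actually ingested
  r <- rbeta(n, a, b)                        # per-host survival probability
  k_star <- rgeom(n, r_star)                 # K* = Kmin - 1, Equation (4)
  p_inf <- 1 - pbinom(k_star, ingested, r)   # Pr(survivors > K*)
  infected <- rbinom(n, 1, p_inf)
  as.vector(tapply(infected, group, sum))
}

# Summary statistic Y of Equation (5); zero counts contribute 0.
deviance_Y <- function(y, N, p_pred) {
  p_obs <- y / N
  term <- function(count, pred, obs) {
    ifelse(count == 0, 0, count * log(pred / obs))
  }
  -2 * sum(term(y, p_pred, p_obs) + term(N - y, 1 - p_pred, 1 - p_obs))
}

abc_fit <- function(Ds, N, y, prior_upper, S = 20000, keep = 0.01) {
  # Step 1: S parameter sets from uniform priors on (0, upper limit)
  theta <- cbind(alpha = runif(S, 0, prior_upper["alpha"]),
                 beta = runif(S, 0, prior_upper["beta"]),
                 r_star = runif(S, 0, 1))
  # Steps 2 and 3: simulate one data set per parameter set and score it
  Y <- vapply(seq_len(S), function(s) {
    y_sim <- simulate_gbp(Ds, N, theta[s, "alpha"], theta[s, "beta"],
                          theta[s, "r_star"])
    deviance_Y(y, N, y_sim / N)
  }, numeric(1))
  # Step 4: keep the best-fitting fraction and take parameter medians
  n_keep <- ceiling(keep * S)
  accepted <- theta[order(Y)[seq_len(n_keep)], , drop = FALSE]
  list(estimate = apply(accepted, 2, median), accepted = accepted)
}

# Data set 1 (rotavirus) from Table IV; priors from Table II
Ds <- c(9e-3, 9e-2, 9e-1, 9, 9e1, 9e2, 9e3, 9e4)
N <- c(5, 7, 7, 11, 7, 8, 7, 3)
y <- c(0, 0, 1, 8, 6, 7, 5, 3)
prior_upper <- c(alpha = 0.25, beta = 0.35)

set.seed(1)
fit <- abc_fit(Ds, N, y, prior_upper, S = 2000)   # text uses S = 20,000

# Repeating the procedure gives a sampling distribution of the estimates
# (the text suggests 100 or more runs; 10 are used here for speed).
runs <- replicate(10, abc_fit(Ds, N, y, prior_upper, S = 2000)$estimate)
posterior_median <- apply(runs, 1, median)
credible_95 <- apply(runs, 1, quantile, probs = c(0.025, 0.975))
```

Because simulated proportions of exactly 0 or 1 can make Y infinite when the observed counts disagree, such parameter sets sort to the end and are effectively rejected in Step 4.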
A key consideration in Step 1 above is the choice of the prior distributions for the
parameters, which is about the determination of the search range (i.e., the lower and upper
limits for defining a uniform distribution).(4) Although the range of the uniform distribution
can be specified to be extremely large, this can be severely computationally inefficient since
most proposed parameter values will be rejected. In this study, the results obtained for the
single-hit beta-Poisson models (Equation (3)) are used to broadly calibrate the priors for α
and β for the generalized model, and the prior for r∗ was set to the full range (0, 1).
The foremost element in the ABC estimation process is the generation of random samples of the
dose-response data (i.e., the simulated data set ysim) based on a selected model. Table V
contains the core R code used for generation of simulated samples based on a single-hit
beta-Poisson dose-response model PI(d|α, β) or on a generalized beta-Poisson model
PI(d|α, β, r∗).
5. RESULTS AND DISCUSSION
The estimated parameters and other model fitting information are presented in Table II. Note
that, for single-hit models, the parameter estimates (i.e., α̂ and β̂) are the MLEs; for the
generalized models, the parameter estimates (i.e., α̂, β̂, and r̂∗) are obtained by reading
the median values from the posterior distributions and the credible intervals are constructed
Table V. Core R Code for Generation of Simulation Samples Based on a Single-Hit Beta-Poisson Dose-Response Model PI(d|α, β) or a
Generalized Beta-Poisson Model PI(d|α, β, r∗)
Single-hit beta-Poisson dose-response model: inputs: Ds, N, a (≡ α), b (≡ β); output: y (the simulated data set ysim)
    nx <- length(N)                         # need to model the process
    n1 <- sum(N)                            # based on each individual host
    Ds1 <- rep(Ds, N)                       # exposed to the hazard
    y <- numeric(nx)                        # infected count per exposure group
    Nsum <- cumsum(N); si <- c(1, Nsum[-nx] + 1)   # last and first host index of each group
    yy1 <- Di <- numeric(n1)
    for (i in 1:n1) {
      Di[i] <- rpois(1, Ds1[i])             # organisms actually ingested
      rr <- rbeta(1, a, b)                  # survival probability r
      pinfe1 <- 1 - pbinom(0, Di[i], rr)    # Pr(at least one survivor)
      yy1[i] <- rbinom(1, 1, pinfe1)
    }
    for (j in 1:nx) y[j] <- sum(yy1[si[j]:Nsum[j]])
Generalized beta-Poisson dose-response model: inputs: Ds, N, a (≡ α), b (≡ β), rstar (≡ r∗); output: y (the simulated data set ysim)
    nx <- length(N)                         # need to model the process
    n1 <- sum(N)                            # based on each individual host
    Ds1 <- rep(Ds, N)                       # exposed to the hazard
    y <- numeric(nx)                        # infected count per exposure group
    Nsum <- cumsum(N); si <- c(1, Nsum[-nx] + 1)   # last and first host index of each group
    yy1 <- Di <- numeric(n1)
    for (i in 1:n1) {
      Di[i] <- rpois(1, Ds1[i])             # organisms actually ingested
      rr <- rbeta(1, a, b)                  # survival probability r
      Ki <- rgeom(1, rstar)                 # Ki = Kmin - 1, Equation (4)
      pinfe1 <- 1 - pbinom(Ki, Di[i], rr)   # Pr(more than Ki survivors)
      yy1[i] <- rbinom(1, 1, pinfe1)
    }
    for (j in 1:nx) y[j] <- sum(yy1[si[j]:Nsum[j]])
by covering the values between the 2.5% and 97.5%
quantiles of the posterior parameter estimates. Pos-
terior parameter estimate samples are generated by
performing 500 runs of the ABC estimation process
as described in Section 4 and the resulting posterior
distributions are presented in Fig. 2. The median val-
ues of these posterior parameter distributions are in-
dicated by the (red) vertical dashed lines in Fig. 2 and
the numeric results are presented in Table II.
Once the parameter estimates are obtained, the
natural questions to be asked are: Does this gener-
alized model improve the model fit? and What ex-
tra benefit do we gain from the generalized model?
Figs. 2 and 3 and Tables II and III present the anal-
ysis results that contain the information required for
answering these questions.
5.1. Estimated Parameters and Goodness of Fit
The goodness-of-fit performances of the single-hit beta-Poisson models and the generalized
beta-Poisson models are compared with respect to the sample size (Nsim) and the results are
shown graphically in Fig. 3. By “sample size,” we mean the number of individuals
participating in the ith exposure group (m groups, each corresponding to one mean dose level)
as defined for entry N = {N1, N2, . . . , Nm} in Table I. The notation Nsim is used here to
represent the simulation setting of the number of individuals in each exposure group.
Therefore, Nsim can be chosen arbitrarily without restriction to the actual number of host
individuals participating in a dose-response experiment. In the studies reported here, the
asymptotic approximations are calculated using the simulation setting Nsim = {Ni = 1,000}.
The model goodness of fit is measured by the summary statistic Y in the same way as specified
in Step 3 of the ABC algorithm (Section 4). For each sample size condition, 5,000 Monte Carlo
simulation runs are performed and the median values of the summary statistic Y are obtained
and plotted in Fig. 3.
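One way to read this comparison is sketched below in R: for each Nsim, data sets are simulated from a fitted model with Nsim hosts per dose level, Y is computed against the observed data, and the median over the Monte Carlo runs is recorded. The function names are hypothetical, the single-hit model is simulated as the r∗ = 1 case of the same simulator, zero counts in Equation (5) are treated as contributing 0, and the number of runs is reduced from 5,000 for speed; all of these are assumptions of this sketch. Parameter values are the Data set 1 point estimates from Table II.

```r
# Vectorized simulator for the generalized beta-Poisson model
# (r_star = 1 gives the single-hit model).
simulate_gbp <- function(Ds, N, a, b, r_star) {
  group <- rep(seq_along(N), N)
  n <- length(group)
  ingested <- rpois(n, Ds[group])
  r <- rbeta(n, a, b)
  k_star <- rgeom(n, r_star)
  infected <- rbinom(n, 1, 1 - pbinom(k_star, ingested, r))
  as.vector(tapply(infected, group, sum))
}

# Summary statistic Y of Equation (5); zero counts contribute 0.
deviance_Y <- function(y, N, p_pred) {
  p_obs <- y / N
  term <- function(count, pred, obs) {
    ifelse(count == 0, 0, count * log(pred / obs))
  }
  -2 * sum(term(y, p_pred, p_obs) + term(N - y, 1 - p_pred, 1 - p_obs))
}

# Median Y over n_runs simulated data sets for each simulated group size.
median_Y_by_nsim <- function(Ds, N, y, a, b, r_star, nsim_values, n_runs = 5000) {
  vapply(nsim_values, function(nsim) {
    N_sim <- rep(nsim, length(Ds))
    Y <- replicate(n_runs, {
      y_sim <- simulate_gbp(Ds, N_sim, a, b, r_star)
      deviance_Y(y, N, y_sim / nsim)
    })
    median(Y)
  }, numeric(1))
}

# Data set 1 (rotavirus) from Table IV
Ds <- c(9e-3, 9e-2, 9e-1, 9, 9e1, 9e2, 9e3, 9e4)
N <- c(5, 7, 7, 11, 7, 8, 7, 3)
y <- c(0, 0, 1, 8, 6, 7, 5, 3)
nsim_values <- c(20, 100, 1000)

set.seed(1)
med_single <- median_Y_by_nsim(Ds, N, y, 0.167, 0.191, 1, nsim_values, n_runs = 200)
med_general <- median_Y_by_nsim(Ds, N, y, 0.172, 0.156, 0.574, nsim_values, n_runs = 200)
```

For small Nsim the simulated proportions often hit 0 or 1, so the median Y can be infinite; increasing Nsim moves the simulated proportions toward the model probabilities.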
Fig. 3 shows some interesting model performance results. For Data set 1 (rotavirus [CJN]), as
shown in Fig. 3(a), the generalized model shows a consistent and clear improvement in goodness
of fit over the single-hit model across the full sample size range. At Nsim = 1,000, the
resulting summary statistic Y of the generalized model is substantially smaller than the MLE
result of the single-hit model. This confirms the potential of the generalized model to
improve on the model fit performance achievable by the single-hit model. However, Figs.
3(b)–(d) show that, asymptotically (i.e., as Nsim → ∞), this potential improvement is not
realized for the other three data sets. For Data set 3 (to a lesser extent) and Data set 4,
the model fit results for the generalized models are poorer at Nsim = 20, suggesting that the
model is overfitting and that the single-hit models are the better choices for these data
sets. This is in keeping with the principle of parsimony (i.e., given the same model fit
performance, the simpler model is the better model). Models with one, two, and three
parameters have been proposed for dose-response modeling for QMRA.(18) The single-hit
beta-Poisson model has been found to be the preferred QMRA dose-response model for its
plausibility in representing the real dose-response process and its goodness-of-fit
performance in fitting experimental data.(1,2,18) Therefore, one plausible interpretation is
that the goodness-of-fit results with Data sets 2, 3, and 4 indeed represent the cases in
which the single-hit models are the best-fit ones among all the candidate models.
Fig. 2. The histograms are the posterior sampling distributions of the parameter estimates of the generalized beta-Poisson model. The (red)
vertical dashed lines indicate the medians of the posterior distributions (color visible in on-line version).
Alternatively, the lack of improvement could be due to the relatively small sample sizes.
There are three parameters to be estimated for the generalized model (one parameter more than
the single-hit model) and a small sample size (e.g., max(N) < 20) will affect parameter
estimation. For example, in Data set 3, the number of individuals exposed to each mean dose
level is between five and eight (Table IV). This implies that, in the best case, the observed
response π_i^o = yi/Ni can only be one of nine possible outcomes, {0/8, 1/8, . . . , 8/8},
whereas in principle π_i^o could take any value between zero and one. The limited information
contained in the data may have made the more complex generalized model unnecessary. The
data-gaps issue in QMRA, especially for modeling the dose-response relation, is well
recognized and addressed by researchers in the field.(1,19–21) The fact that the generalized
beta-Poisson model does not improve the goodness of fit with some of the existing data sets
suggests the need for better dose-response data for testing the new model.
Fig. 3. Comparison of goodness of fit between
the single-hit (circle points), and the generalized
(triangle points) beta-Poisson models. The (red)
dashed lines represent the asymptotic summary
statistic Y levels obtained by MLE (i.e., when
sample size approaches infinity). The horizontal
axis is on the natural log scale for convenience of
displaying plotting positions.
Another possible cause for the lack of improvement could come from the model fitting
methodology. As the name implies, the ABC algorithm is an approximate model fitting procedure
and many things can compromise its accuracy. For example, the summary statistic for
minimization may not be a sufficient statistic as required in the ideal case and a set of
summary statistics may be needed. Different prior distributions and/or the tolerance level
for selecting the accepted posterior samples will have impacts on the ABC estimation results.
The high computational demand in the optimization process is also a drawback of ABC.(4,22)
These methodology concerns suggest the need to pursue improvements in the model fitting
procedure.
5.2. Estimated Parameters and Dose-Response Process
Because the generalized beta-Poisson model PI(d|α, β, r∗) is based on a conceptual model
with greater detail in the dose-response mechanism, the estimated parameters (i.e., α̂, β̂,
and r̂∗) could potentially provide a better characterization and interpretation of the actual
dose-response process. This is demonstrated and discussed through the comparison between the
single-hit models and the generalized models on what we can learn about the dose-response
process based on the estimated parameters in Table II, and the uncertainty in the estimated
parameters of the generalized models shown in Fig. 2.
As defined in Table I, the parameter r∗
is
the conditional probability, Pr(infection|any-single-
surviving-organism). In Section 3, we have shown
that the single-hit beta-Poisson model is a special
case of the generalized beta-Poisson model with r∗
= 1. However, the parameter estimates r∗ in Table
II show that the probabilities of resulting in infec-
tion by a single surviving organism are all clearly less
than one (minimum 0.380, maximum 0.624). The de-
gree of deviation between the two models is further
highlighted in Table III. While the single-hit mod-
els assume Kmin = 1 with probability 1 (column 2 of
Table III), the analysis results based on the general-
ized models (columns 3 and 4 of Table III) show a
different picture. For example, with rotavirus (Data
set 1), there is only a 57.4% chance that a single sur-
viving organism will result in infection, i.e., a 42.6%
chance that at least two surviving organisms are re-
quired to cause infection. The Salmonella data (Data
set 3) provide the most extreme example of deviating
from the single-hit condition: a 62.0% chance that at
least two surviving organisms are required to cause
infection.
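The percentages quoted above follow directly from the geometric distribution assumed for Kmin: if Kmin − 1 is geometric with parameter r∗, then Pr(Kmin = 1) = r∗ and Pr(Kmin ≥ 2) = 1 − r∗. The minimal Python sketch below reproduces that arithmetic. The function names are illustrative and not taken from the article, and the r∗ values 0.574 and 0.380 are those implied by the percentages given in the text for Data sets 1 and 3.

```python
def kmin_pmf(k, r_star):
    """Pr(Kmin = k) when Kmin - 1 is geometric with parameter r_star."""
    if k < 1 or not (0.0 < r_star <= 1.0):
        raise ValueError("need k >= 1 and 0 < r_star <= 1")
    return r_star * (1.0 - r_star) ** (k - 1)


def prob_kmin_at_least(k, r_star):
    """Pr(Kmin >= k) = (1 - r_star)^(k - 1) under the same assumption."""
    if k < 1 or not (0.0 < r_star <= 1.0):
        raise ValueError("need k >= 1 and 0 < r_star <= 1")
    return (1.0 - r_star) ** (k - 1)


# r* values implied by the text (assumed here to be the Table II point estimates).
examples = {"Data set 1 (rotavirus)": 0.574, "Data set 3 (Salmonella)": 0.380}

for label, r_star in examples.items():
    single_hit = kmin_pmf(1, r_star)             # one organism suffices
    two_or_more = prob_kmin_at_least(2, r_star)  # at least two are required
    print(f"{label}: Pr(Kmin = 1) = {single_hit:.3f}, "
          f"Pr(Kmin >= 2) = {two_or_more:.3f}")
```

Under the single-hit model the same functions give Pr(Kmin = 1) = 1 and Pr(Kmin ≥ 2) = 0, which corresponds to r∗ = 1.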
Recall that in Section 5.1 the analysis results
showed that the generalized model indeed improves
the goodness of fit with Data set 1 but not with Data
sets 2, 3, and 4. The single-hit probability results pro-
duced here by the generalized beta-Poisson models
(column 3 of Table III) are consistent in the case of
Data set 1, but contradictory in the cases of Data sets
2, 3, and 4 because an improvement in goodness of fit
is expected to be observed if the single-hit assump-
tion is indeed false. These results can be reconciled
to some extent by taking into account the uncertainties
around the point estimates, e.g., through a careful
examination of the r∗ posterior distributions
presented in Fig. 2. This finding is important because it
provides motivation for further investigation into the
true Kmin distribution from a microbiology or patho-
genesis perspective.
6. CONCLUSIONS
The generalized beta-Poisson model presented
here is a natural and nontrivial extension of the
commonly used single-hit dose-response models. The
generalized model has provided an opportunity to
examine the single-hit assumption through modeling
the available dose-response data, providing a com-
plement to any possible experimental approaches.
This study has shown that both the point and interval
r∗ estimates produced fall short of meeting the
required single-hit condition of r∗ = 1. It is noted,
however, that three out of four data sets fitted by the
generalized models could not achieve an improve-
ment in goodness of fit over the single-hit model.
These analysis results combined imply that, at least
in some cases, a single-hit assumption for character-
izing the dose-response process may not be appropri-
ate. Two possible reasons have been proposed for the
lack of improvement in the goodness-of-fit results:
(1) due to the small sample size of the experimen-
tal data; (2) due to the approximation accuracy limit
of the ABC algorithm (e.g., sufficiency of the sum-
mary statistic for minimization, impact of the prior
distribution, etc.). A lack of improvement in good-
ness of fit of a more general model has at least three
implications for future research: the need for bet-
ter dose-response data, improvement in model fitting
methodology, and greater understanding of the
underlying biological and physical processes. In summary,
this article provides the risk community with
a new dose-response model and a methodology for fitting
it to data. This generalization of the most commonly
used single-hit beta-Poisson model provides
an opportunity to investigate the dose-response process
in greater detail and has implications for the future
development of mechanistic dose-response models.
ACKNOWLEDGMENTS
This article is a part of the research results of
the pond project funded by Department of Science,
Information Technology and Innovation (DSITI)
of Queensland government, Australia. The authors
would like to thank Professor Charles Haas for his
valuable comments and advice on the article. Author
K.M. acknowledges research support from the QUT
Institute for Future Environments and the ARC
Centre of Excellence for Mathematical and Statisti-
cal Frontiers.
APPENDIX: THE GENERALIZED
EXPONENTIAL DOSE-RESPONSE MODEL
Similar to specification of the generalized beta-
Poisson model, a generalized exponential model can
be defined as:
$$P_I(d) = 1 - \sum_{k=0}^{K_{\min}-1} \frac{d^k r^k}{k!} \exp(-dr), \qquad \text{(A1)}$$
where r is a constant (0 < r ≤ 1), and
$$\Pr(K^* = x \mid r^*) = r^* (1 - r^*)^x, \quad \text{for } x = 0, 1, 2, \ldots, \qquad \text{(A2)}$$
where K∗ = Kmin − 1 and 0 < r∗ ≤ 1. As with the beta-
Poisson models, when K∗ = 0, i.e., Kmin = 1 (the
single-hit assumption) is true, the single-hit exponential
dose-response model falls out as:
$$P_I(d) = 1 - \exp(-rd). \qquad \text{(A3)}$$
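The relationship between Equations (A1) to (A3) can be checked numerically. The Python sketch below evaluates (A1) for a fixed Kmin and then averages it over the geometric distribution (A2). Averaging over Kmin in this way is our reading of how the two equations combine, not a formula stated in the appendix, and the function names are illustrative.

```python
import math


def p_inf_fixed_kmin(d, r, kmin):
    """Equation (A1): 1 - sum_{k=0}^{kmin-1} (d r)^k / k! * exp(-d r)."""
    if kmin < 1:
        raise ValueError("kmin must be at least 1")
    mu = d * r
    term = math.exp(-mu)  # Poisson probability of zero surviving organisms
    cdf = term
    for k in range(1, kmin):
        term *= mu / k    # Poisson probability of exactly k survivors
        cdf += term
    return 1.0 - cdf


def p_inf_generalized(d, r, r_star, tol=1e-12):
    """Average (A1) over Kmin, with Kmin - 1 geometric as in (A2).

    Assumption: the generalized model is taken as this mixture over Kmin.
    The sum stops once the remaining geometric weight is below tol.
    """
    if not (0.0 < r_star <= 1.0):
        raise ValueError("r_star must lie in (0, 1]")
    mu = d * r
    term = math.exp(-mu)
    cdf = term            # Pr(at most x survivors), starting at x = 0
    weight = r_star       # Pr(K* = x), starting at x = 0
    remaining = 1.0
    total = 0.0
    x = 0
    while remaining > tol:
        total += weight * (1.0 - cdf)  # Pr(K* = x) * Pr(more than x survivors)
        remaining -= weight
        x += 1
        term *= mu / x
        cdf += term
        weight *= 1.0 - r_star
    return total


if __name__ == "__main__":
    d, r = 2.0, 0.5
    print("single-hit (A3):      ", 1.0 - math.exp(-r * d))
    print("(A1) with Kmin = 1:   ", p_inf_fixed_kmin(d, r, 1))
    print("generalized, r* = 1:  ", p_inf_generalized(d, r, 1.0))
    print("generalized, r* = 0.4:", p_inf_generalized(d, r, 0.4))
```

With r∗ = 1 all the geometric weight sits on Kmin = 1, so the generalized value coincides with the single-hit form (A3). Any r∗ below 1 lowers the infection probability at the same dose.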
It is a well-established result in the QMRA
literature(1,23) that, in Equation (A3), if the
survival probability r is allowed to follow a beta
distribution with parameters α and β, the single-hit
beta-Poisson model of Equation (3) is obtained.
Therefore, the exponential dose-response model
(Equation (A3)) is considered as a special case of
the single-hit beta-Poisson model (Equation (3))
in the sense that, as the parameters α and β both
approach infinity, the expected value of the survival
probability r approaches a constant (0 < r ≤ 1) and
its variance approaches zero.(23)
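The limiting argument can be illustrated numerically. The Python sketch below assumes that r follows a beta distribution with parameters α and β (the mixing distribution that gives the beta-Poisson model its name) and averages Equation (A3) over that distribution by simple midpoint quadrature. It does not use the closed form of Equation (3), which is not reproduced in this section. The function names, the grid size, and the example values (mean survival probability 0.2, dose 5) are illustrative assumptions.

```python
import math


def beta_mean_var(alpha, beta):
    """Mean and variance of a Beta(alpha, beta) random variable."""
    s = alpha + beta
    return alpha / s, alpha * beta / (s * s * (s + 1.0))


def beta_mixed_exponential(d, alpha, beta, n_grid=20_000):
    """E[1 - exp(-r d)] for r ~ Beta(alpha, beta), by midpoint quadrature.

    Intended for alpha, beta >= 1, where the density is bounded on (0, 1).
    """
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    h = 1.0 / n_grid
    total = 0.0
    for i in range(n_grid):
        r = (i + 0.5) * h  # midpoint of the i-th subinterval of (0, 1)
        log_pdf = (log_norm + (alpha - 1.0) * math.log(r)
                   + (beta - 1.0) * math.log1p(-r))
        total += math.exp(log_pdf) * (1.0 - math.exp(-r * d))
    return total * h


if __name__ == "__main__":
    r0, d = 0.2, 5.0  # illustrative mean survival probability and dose
    print("exponential model (A3):", 1.0 - math.exp(-r0 * d))
    for scale in (10.0, 100.0, 1000.0, 10000.0):
        alpha, beta = r0 * scale, (1.0 - r0) * scale
        mean, var = beta_mean_var(alpha, beta)
        p = beta_mixed_exponential(d, alpha, beta)
        print(f"alpha={alpha:8.1f} beta={beta:8.1f} "
              f"mean={mean:.3f} var={var:.2e} P_I={p:.6f}")
```

As α and β grow with their ratio held fixed, the mean of r stays at 0.2, its variance shrinks toward zero, and the mixed response approaches the exponential value.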
REFERENCES
1. Haas CN, Rose JB, Gerba CP. Quantitative Microbial Risk
Assessment, 2nd ed. New York: John Wiley & Sons, Inc.,
2014.
2. Rose J, Haas C, Gurian P, Mitchell J, Weir M. QMRA
wiki: Center for Advancing Microbial Risk Assessment.
Quantitative Microbial Risk Assessment (QMRA) Wiki,
2015. Available at: http://qmrawiki.canr.msu.edu/index.php/Quantitative_Microbial_Risk_Assessment_%28QMRA%29_Wiki, Accessed April 2015.
3. The Interagency Microbiological Risk Assessment Guide-
line Workgroup. Microbial Risk Assessment Guideline,
Pathogenic Microorganisms with Focus on Food and Water.
U.S. Environmental Protection Agency (EPA) and U.S.
Department of Agriculture/Food Safety and Inspection Service
(USDA/FSIS), 2012.
4. Marin J-M, Pudlo P, Robert CP, Ryder RJ. Approximate
Bayesian computational methods. Statistics and Computing,
2012; 22(6):1167–1180.
5. Teunis P, Havelaar A. The beta Poisson dose-response model
is not a single-hit model. Risk Analysis, 2000; 20(4):513–520.
6. Teunis P, van der Heijden OG, van der Giessen JWB, Have-
laar A. The Dose-Response Relation in Human Volunteers
for Gastro-Intestinal Pathogens. The Netherlands: National
Institute of Public Health and the Environment (RIVM), 1996.
Contract No.: Report no. 284550002.
7. Haas CN, Rose JB, Gerba CP. Quantitative Microbial Risk
Assessment, 1st ed. New York: John Wiley & Sons, Inc., 1999.
8. Kreyszig E. Advanced Engineering Mathematics, 10th ed.
New York: John Wiley & Sons, Inc., 2011.
9. Muller KE. Computing the confluent hypergeometric func-
tion, M (a, b, x). Numerische Mathematik, 2001; 90(1):179–
196.
10. Butler RW, Wood ATA. Laplace approximations for hyperge-
ometric functions with matrix argument. Annals of Statistics,
2002; 30(4):1155–1177.
11. Casella G, Berger RL. Statistical Inference, 2nd ed. Belmont,
CA: Duxbury Press, 2002.
12. Gelman A, Carlin JB, Stern HS, Rubin DB. Bayesian Data
Analysis. London: Taylor & Francis, 2014.
13. Haas CN. Conditional dose-response relationships for mi-
croorganisms: Development and application. Risk Analysis,
2002; 22(3):455–463.
14. McCullagh P, Nelder J. Generalized Linear Models. London:
Chapman and Hall, 1989.
15. Morgan BJ. Analysis of Quantal Response Data. London:
CRC Press, 1992.
16. R Development Core Team. R: A Language and Environment
for Statistical Computing. Vienna, Austria: R Foundation
for Statistical Computing, 2014. Available at:
http://www.R-project.org.
17. Hankin RK. Package “gsl”, 2013.
18. Moon H, Chen JJ, Gaylor DW, Kodell RL. A comparison of
microbial dose-response models fitted to human data. Regula-
tory Toxicology and Pharmacology, 2004; 40(2):177–184.
19. Haas C. Progress and data gaps in quantitative microbial
risk assessment. Water Science & Technology, 2002;
46(11–12):277–284.
20. O’Toole J. Identifying Data Gaps and Refining Estimates
of Pathogen Health Risks for Alternative Water Sources.
Canberra, Australia: National Water Commission, Australian
Government, 2011.
21. Toze S. PCR and the detection of microbial pathogens in wa-
ter and wastewater. Water Research, 1999; 33(17):3545–3556.
22. Csillery K, Blum MG, Gaggiotti OE, Francois O.
Approximate Bayesian computation (ABC) in practice.
Trends in Ecology & Evolution, 2010; 25(7):410–418.
doi: 10.1016/j.tree.2010.04.001. PubMed PMID: 20488578.
23. Teunis P, Takumi K, Shinagawa K. Dose response for
infection by Escherichia coli O157:H7 from outbreak data.
Risk Analysis, 2004; 24(2):401–407.