This document describes a Bayesian approach to identifying individuals of high concern regarding infection in a population using contact network and epidemiological data. The approach models infection spread and uses Bayesian statistical methods to calculate the probability that an individual is infected but not yet presenting symptoms or will become infected soon. The approach was tested using a simulation model and showed reasonable results, though further work is needed to improve integration methods to allow inference on larger contact networks.
A Bayesian approach to identifying high-concern individuals in an infection-bearing population
Anqi Dong
March 9, 2012
1 Background
The delayed diagnosis and treatment of individuals who are carrying infectious diseases can place a large burden on the healthcare system [4, 8] and the general population. Such delays are especially worrisome for infections with prolonged asymptomatic periods, such as HIV, chlamydia, and gonorrhea; in environments where many persons have vulnerable immune systems, such as hospitals and nursing homes; and in environments characterized by high interpersonal contact rates, such as schools and prisons. Public health authorities use methods such as contact tracing to locate contacts of reported infected individuals and determine whether they are infected as well. Because an individual’s contact patterns do not directly reveal that individual’s infection state, contact tracing procedures often need to be fairly exhaustive to avoid missing infected individuals [9, 13].
In certain contexts, a great deal of data is collected about the individual-level characteristics of an infection within a population. A key type of such data describes potentially infection-transmitting person-to-person interactions. These interactions are generally represented using contact networks: graphs with vertices representing persons and edges connecting persons who interact with each other [3]. For large, well-mixed populations, such as those of cities, it is difficult, and often infeasible, to obtain useful sets of data [11]. However, traditional contact tracing provides data on individuals during outbreak investigations. In addition, for smaller, relatively closed settings, such as hospitals, nursing homes, and schools, it is possible to gather detailed epidemiological data by distributing on-body proximity sensors to the entire population to log persons’ contacts [6, 12].
Bayesian statistical approaches are valuable tools for the analysis of epidemiological data. Current literature on using Bayesian techniques in the context of infection spread on a heterogeneously mixing contact network generally focuses on inferring infection parameters such as the average probability of infection [1, 2, 5, 7, 10]. Current techniques assume that data on the structure of contact networks is virtually nonexistent, and therefore either ignore the effects of contact networks on the infection, or sample from generated networks of a simple family of graphs (for example, Bernoulli random graphs) as part of the Bayesian approach [2, 5]. As a result, there is very little work on inferring additional individual-level characteristics from epidemiological data, especially when knowledge of the contact network is good.
Analyses of available data on contact networks and individual epidemiological records could better inform healthcare systems about various infection patterns and trends, so that they can better target their limited efforts and resources. Such efficiencies include prioritizing contact tracing and testing to first assess persons who have not reported but who are likely to be infected, and first vaccinating those who are uninfected but who are at high risk of immediate infection.
Coupled with an automated, continuous data-gathering system such as iEpi [6], an inference system could provide quasi-real-time predictions in institutional settings, identifying unknown sources of infection or patients at high risk of becoming infected. The spatial data provided by such a data-gathering system could even allow the inference to consider hidden environmental pathogen reservoirs, such as a contaminated surface at a particular location.
2 Project goals
To implement an inference system that identifies persons of interest or concern in a contact network using an incomplete set of epidemiological data. These persons of interest include likely high spreaders of infection, persons who are probably infected but whose infection statuses are unknown, and persons who are probably not currently infected but are likely to become infected soon.
To design a general mathematical framework for performing inference on individual-level contact data and infection histories.
3 Methods
3.1 Epidemiological model
We consider a contact network $C$ of $n$ persons. In terms of infection spread, the network is idealized as a closed one, meaning that infection cannot enter $C$ except via a small set of persons: the index infectives (infectious individuals) of the population. This network can be heterogeneous: the persons in $C$ do not necessarily have the same number of contacts or patterns of connection.
Let $C$ be made up of persons $p_1, p_2, \ldots, p_n$. In $C$, each person $p_i$ has a set of contacts, $C(p_i)$. For each person $p_j$ that is an element of $C(p_i)$, we can define $c_{p_j \to p_i}(t)$, the rate of contact from $p_j$ to $p_i$ at time $t$. This rate is not symmetric, so $c_{p_j \to p_i}(t) \neq c_{p_i \to p_j}(t)$ in general. The function $c_{p_j \to p_i}(t)$ reflects the number of potentially infection-transmitting contacts from $p_j$ to $p_i$ per unit time. Depending on the pathogen, a “contact” can be an event such as sneezing, needle-sharing, or sexual contact.
We use the convention that $c_{p_j \to p_i}(t) = 0$ if $p_j$ is not contacting $p_i$ at $t$. This occurs, for example, if $p_j$ and $p_i$ do not contact each other, and also when $p_j$ is not infected. Per contact, an uninfected person has a certain probability $\beta$ of becoming infected as the result of that contact. Infection status is a boolean state: a person is either infected or not, and a person cannot be reinfected when already infected. We refer to the product of $\beta$ and the cumulative number of contacts $p_i$ experiences per unit time at some time as the “infection pressure” felt by $p_i$ at that time.
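The definition of infection pressure above can be sketched in a few lines of Python. This is only an illustration: the function name, the dictionary layout for the contact rates, and the numeric values are assumptions made for the example, not part of the original model description.

```python
def infection_pressure(beta, contact_rates, infected, person, t):
    """Return beta times the total contact rate that `person` receives at time t.

    contact_rates maps (source, target) -> function of t giving the contact rate.
    Following the convention in the text, a source who is not infected
    contributes a rate of zero.
    """
    total = 0.0
    for (src, dst), rate in contact_rates.items():
        if dst == person and src in infected:
            total += rate(t)
    return beta * total


# Hypothetical three-person network with constant, asymmetric contact rates.
rates = {
    ("A", "B"): lambda t: 2.0,
    ("B", "A"): lambda t: 0.5,
    ("C", "B"): lambda t: 1.0,
}
pressure_on_b = infection_pressure(0.1, rates, {"A"}, "B", t=0.0)
```

Only A is infected in this example, so C's contacts with B add nothing to the pressure on B.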
After becoming infected, each person is represented as presenting their infection following a second-order delay. In this paper, we make the simplifying assumption that all patients are treated for their infection upon presentation, meaning that they will not continue spreading infection after presentation.
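One common reading of a second-order delay is two first-order (exponential) stages in series, which gives an Erlang-2 distributed delay. The sketch below samples presentation times under that assumption; the report does not state the delay's exact form, and the function name and parameter values are hypothetical.

```python
import random


def sample_presentation_time(infection_time, mean_delay, rng):
    """Sample t_p = i_p + delay, with the delay modelled as the sum of two
    exponential stages, each of mean mean_delay / 2 (an Erlang-2 delay)."""
    stage_rate = 2.0 / mean_delay
    delay = rng.expovariate(stage_rate) + rng.expovariate(stage_rate)
    return infection_time + delay


rng = random.Random(0)
samples = [sample_presentation_time(1.0, 2.0, rng) for _ in range(20000)]
mean_presentation = sum(samples) / len(samples)
```

With an infection time of 1.0 and a mean delay of 2.0, the sample mean of the presentation times should lie near 3.0.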
We assume that a patient cannot naturally recover from an infection, that is, recover without healthcare intervention. We also assume herein that each person can become infected at most once.
In order to create a general inference framework, this model is not specific to a particular pathogen, but instead can be applied reasonably well to a range of microparasitic infections. Our model parameters are thus not necessarily representative of a specific disease.
3.2 Inference methods
Let $i_p$ and $t_p$ be respectively the infection and presentation times of some arbitrary person $p$. Consider
$$P(\{t_p\}, \{i_p\} \mid C), \tag{1}$$
the probability density of some presentation and infection times, given a contact network. For brevity, we omit some terms of this equation when discussing it below.
By integrating or summing over some range of values, and comparing the cumulative probability of this subset of values to the probability of the universal set of all permissible values, we can determine how likely the subset of data is to occur. As there are many types of data embedded in (1), the above probability density, with manipulation, provides a rich set of probabilistic information about the sets of infection and presentation times and $C$. Here, we focus on determining probabilities related to the infection time of a certain person $p$.
$P(i_p)$ means “the probability density of person $p$ becoming infected at time $i_p$”. This implies two things: that person $p$ was not infected before time $i_p$, and that person $p$ was infected exactly at time $i_p$. Moreover, $P(i_p)$ is a probability density. To find the probability of person $p$ becoming infected during the interval $[a,b]$, we integrate, finding the value of $\int_a^b P(i_p)\,di_p$. Alternatively, if we know that person $p$ must have been infected somewhere in the interval $[c,d]$, we can use the definition of conditional probability to find that the probability of $i_p$ being in the interval $[a,b]$ is
$$\frac{\int_a^b P(i_p)\,di_p}{P(U)} = \frac{\int_a^b P(i_p)\,di_p}{\int_c^d P(i_p)\,di_p},$$
where $U$ is the universal set.
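As a numerical illustration of this ratio of integrals, the following sketch evaluates it with a midpoint rule. The density used is a stand-in (a constant infection pressure of 0.5 per unit time, giving an exponential density); it is an assumption chosen because it can be checked in closed form, not the density from the model.

```python
import math


def integrate(f, lo, hi, steps=10000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(f(lo + (k + 0.5) * width) for k in range(steps))


def conditional_interval_probability(density, a, b, c, d):
    """P(a <= i_p <= b | c <= i_p <= d), assuming [a, b] lies inside [c, d]."""
    return integrate(density, a, b) / integrate(density, c, d)


def density(i):
    """Hypothetical infection-time density under a constant pressure of 0.5."""
    return 0.5 * math.exp(-0.5 * i)


prob = conditional_interval_probability(density, a=1.0, b=2.0, c=0.0, d=4.0)
exact = (math.exp(-0.5) - math.exp(-1.0)) / (1.0 - math.exp(-2.0))
```

Because the density appears in both numerator and denominator, any constant normalizing factor cancels, so the density does not need to be normalized before it is used this way.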
If we know the presentation or infection time for a person, we can simply insert that value into (1). However, epidemiological data is often scarcer, and many presentation and infection times are not available. In this case, we can marginalize the probability through integration. For example, for some person $q$, if $i_q$ is known to be in the range $[a,b]$, and $t_q$ is known to be within the range $[c,d]$, the probability that $i_q < k$ for some $k$, where $a < k < b$, can be calculated as
$$\frac{\int_a^k P(i_q)\,di_q}{\int_a^b P(i_q)\,di_q} = \frac{\int_a^k \int_c^d P(i_q, t_q)\,dt_q\,di_q}{\int_a^b \int_c^d P(i_q, t_q)\,dt_q\,di_q}. \tag{2}$$
Equation 2 has some important consequences: the probability that person $p$ is already infected but has not presented is equivalent to the probability that $i_p < T$ and $t_p > T$ (where $T$ is the current time). Also, the probability that $p$ is uninfected but will be infected “soon” is equivalent to the probability that $T < i_p < T + \Delta t$, where $\Delta t$ quantifies the duration of “soon”.
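The two events described above can be illustrated by counting samples of $(i_p, t_p)$ drawn from a toy single-person model. Everything specific here is an assumption made for the example: a constant infection hazard, an Erlang-2 presentation delay, and the chosen parameter values. A real inference would condition on the observed data instead of sampling an unconditioned model.

```python
import math
import random


def classify_at(T, dt, samples):
    """Estimate two probabilities from a list of (i_p, t_p) samples.

    hidden: infected before T but not yet presented (i_p < T < t_p).
    soon:   not infected at T but infected within dt (T < i_p < T + dt).
    """
    n = len(samples)
    hidden = sum(1 for i, t in samples if i < T < t) / n
    soon = sum(1 for i, t in samples if T < i < T + dt) / n
    return hidden, soon


rng = random.Random(1)
hazard, stage_rate = 0.4, 2.0  # hypothetical parameter values
samples = []
for _ in range(50000):
    i = rng.expovariate(hazard)
    t = i + rng.expovariate(stage_rate) + rng.expovariate(stage_rate)
    samples.append((i, t))

hidden, soon = classify_at(T=2.0, dt=0.5, samples=samples)
# Closed form for the "soon" event under a constant hazard.
soon_exact = math.exp(-hazard * 2.0) - math.exp(-hazard * 2.5)
```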
In our inference model, we additionally allow for the case where some individuals have been tested in the past for their “infection status” (infected/uninfected) at that time. We incorporate this testing data by enforcing additional bounds on the infection times. For example, if person $p$ tested uninfected at time $x_1$ and infected at time $x_2$ (where $x_2 > x_1$), we know that $x_1 < i_p < x_2$. We can perform similar bounding with a presentation time: if $p$ has not yet presented at the present time $T$, we know that $t_p > T$. However, the probability density function itself remains unchanged, as knowledge about infection status does not affect how the infection behaves.
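The bounding step can be sketched as a small helper that turns a test history into integration limits for $i_p$. The function name and the `(time, was_infected)` record layout are hypothetical; the logic follows the rule stated above, and relies on the model's assumption that an infected person does not recover untreated.

```python
import math


def infection_time_bounds(tests):
    """Turn a test history [(time, was_infected), ...] into bounds on i_p.

    The latest negative test gives a lower bound and the earliest positive
    test gives an upper bound. A missing bound is left infinite.
    """
    lower, upper = -math.inf, math.inf
    for time, was_infected in tests:
        if was_infected:
            upper = min(upper, time)
        else:
            lower = max(lower, time)
    if lower >= upper:
        raise ValueError("test history is inconsistent with the model")
    return lower, upper


bounds = infection_time_bounds([(1.0, False), (2.95, False), (4.2, True)])
```

The bounds only restrict the region of integration; the integrand stays the same, in line with the remark above that the density itself is unchanged.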
To calculate the numerical value of (1) and related equations, we factor the probability into a product of probabilities, with terms of the general form $P(t_p \mid i_p)\,P\big(i_p \mid \{i_q\}_{q \in C(p)}\big)$. Both of these probability terms are expressed in closed form using typical epidemiological representations of infection.
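The report does not give the closed forms, so the sketch below substitutes two plausible ones to show how a single factor could be evaluated: an Erlang-2 density for $P(t_p \mid i_p)$ and a hazard-times-survival density for the infection term, with constant contact rates. These forms, the function names, and the `(i_q, t_q, rate)` layout are all assumptions for illustration.

```python
import math


def infection_density(i_p, neighbours, beta):
    """Assumed density of person p being infected at time i_p.

    neighbours is a list of (i_q, t_q, rate) triples: neighbour q is taken to
    be infectious from its infection time i_q until its presentation time t_q,
    contacting p at a constant rate during that period.
    """
    hazard = beta * sum(rate for i_q, t_q, rate in neighbours if i_q <= i_p < t_q)
    exposure = beta * sum(
        rate * max(0.0, min(i_p, t_q) - i_q) for i_q, t_q, rate in neighbours
    )
    return hazard * math.exp(-exposure)


def presentation_density(t_p, i_p, stage_rate):
    """Assumed Erlang-2 density of presenting at t_p given infection at i_p."""
    delay = t_p - i_p
    if delay < 0:
        return 0.0
    return stage_rate ** 2 * delay * math.exp(-stage_rate * delay)


def person_term(t_p, i_p, neighbours, beta, stage_rate):
    """One factor of the product: P(t_p | i_p) * P(i_p | neighbours' times)."""
    return presentation_density(t_p, i_p, stage_rate) * infection_density(
        i_p, neighbours, beta
    )


# One neighbour, infected at time 0 and not yet presented.
term = person_term(2.0, 1.0, [(0.0, math.inf, 2.0)], beta=0.1, stage_rate=2.0)
```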
3.2.1 Infection time partial ordering
If, say, both $i_A$ and $i_B$ are unknown, and $A$ and $B$ are connected, we may not be able to determine the direction of infection pressure ($A \to B$ versus $B \to A$). To resolve this ambiguity, we marginalize the probability in Equation 1 as follows:
$$P(\{t_p\}, \{i_p\} \mid C) = \sum_{d \in D} P(\{t_p\}, \{i_p\}, d \mid C).$$
The directed acyclic graph (DAG) $d$ imposes a topological ordering on $C$. For each edge, $d$ specifies which person of the pair was infected first, thereby also specifying the directionality of infection pressure. The set $D$ contains all the permissible DAGs that contain all the vertices of $C$ and provide an ordering for all edges of $C$. DAGs that contradict other knowledge about the ordering of infection times are excluded from $D$.
When computing probabilities considering only infection and presentation times, $d$ is a nuisance parameter, and we mathematically rewrite the expressions to eliminate the use of $d$ and $D$ in the final integral to be evaluated.
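For a small network, the set $D$ can be built by brute force: try every orientation of the edges, keep the acyclic ones, and drop those that reverse a known infection order. The sketch below does this. It is an illustration only, since the report eliminates $d$ analytically and does not enumerate it; the names are hypothetical, and the `known_order` check covers only directly connected pairs.

```python
from itertools import product


def is_acyclic(nodes, directed_edges):
    """Kahn's algorithm: True if the directed graph has no cycle."""
    indegree = {v: 0 for v in nodes}
    for _, dst in directed_edges:
        indegree[dst] += 1
    ready = [v for v in nodes if indegree[v] == 0]
    seen = 0
    while ready:
        v = ready.pop()
        seen += 1
        for src, dst in directed_edges:
            if src == v:
                indegree[dst] -= 1
                if indegree[dst] == 0:
                    ready.append(dst)
    return seen == len(nodes)


def permissible_dags(nodes, edges, known_order=()):
    """Yield every acyclic orientation of `edges` that respects known_order.

    known_order holds (earlier, later) pairs of persons whose infection order
    is already known; an orientation that reverses such a pair is dropped.
    """
    forbidden = {(later, earlier) for earlier, later in known_order}
    for flips in product((False, True), repeat=len(edges)):
        oriented = [(b, a) if flip else (a, b) for (a, b), flip in zip(edges, flips)]
        if forbidden.isdisjoint(oriented) and is_acyclic(nodes, oriented):
            yield oriented


triangle = [("A", "B"), ("B", "C"), ("A", "C")]
all_dags = list(permissible_dags("ABC", triangle))
constrained = list(permissible_dags("ABC", triangle, known_order=[("A", "B")]))
```

The number of orientations doubles with every edge, so this enumeration is practical only for very small networks.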
3.3 Assessment of inference
Sources in the literature generally assess the accuracy and performance of their inference algorithms by running their inference models on historical datasets and discussing the plausibility of the results. While such demonstrations are valuable in showing the practicality of inference results, it is difficult to validate statistical measures derived from historical data.
To assess the performance of our inference algorithm, I instead developed and used a simulation model of infection spread. This model represents the infection mechanisms described above on a best-effort basis, simulating infection spread and testing for all individuals contained within a computer-generated contact network.
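A minimal discrete-time sketch of such a simulation is given below. It is not the author's simulation model. It assumes constant contact rates, an Erlang-2 presentation delay, treatment (and hence no further spread) at presentation, and a fixed time step; testing of individuals is omitted, and all names and parameter values are hypothetical.

```python
import math
import random


def simulate(contacts, beta, index_cases, mean_delay, horizon, dt, rng):
    """Return {person: (infection_time, presentation_time)} for infected persons.

    contacts maps (source, target) -> constant contact rate.
    """
    stage_rate = 2.0 / mean_delay

    def delay():
        return rng.expovariate(stage_rate) + rng.expovariate(stage_rate)

    history = {p: (0.0, delay()) for p in index_cases}
    persons = {p for pair in contacts for p in pair}
    t = 0.0
    while t < horizon:
        # Infectious persons are infected but have not yet presented.
        infectious = {p for p, (i, pres) in history.items() if i <= t < pres}
        for person in sorted(persons - set(history)):
            pressure = beta * sum(
                rate
                for (src, dst), rate in contacts.items()
                if dst == person and src in infectious
            )
            # Probability of at least one infecting contact during this step.
            if rng.random() < 1.0 - math.exp(-pressure * dt):
                infected_at = t + dt
                history[person] = (infected_at, infected_at + delay())
        t += dt
    return history


contacts = {("A", "B"): 3.0, ("B", "C"): 3.0}
history = simulate(contacts, 0.5, ["A"], 2.0, 10.0, 0.01, random.Random(2))
```

The simulated infection times in `history` play the role of the ground truth against which inferred probabilities can be compared.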
4 Results and discussion
My primary method of assessing the numerical behavior of the inference algorithm was to plot the probability of person $p$ becoming infected before time $x$ (given the contact network, some presentation times, and some patient history as recorded at some time $t$, where $t$ may be less than $x$) as a function of $x$. An example of this can be seen in Figure 1. I plotted this figure by manually splitting the desired integral along $i_p$ into several integrals over mutually exclusive regions.
Generally, the results produced by the inference appear reasonable, considering the additional data produced by the simulation model (data that was not used in the inference). That is, the time at which $p$ was infected in the simulation model is usually at or close to a part of the integral with a high rate of change. However, the results of the simulation model are not necessarily highly probable, and the probability that $i_p < t$ is not a perfect analogue to the probability density that $i_p = t$, so such a comparison is not definitive.
Making use of data on patients’ infection history can lead to the probability densities exhibiting subtle behavior. For example, if person $X$ was tested to be uninfected at time 2.95, inference will usually suggest that there is a low probability that $X$ was infected by time 3.00. However, if it was not known that $X$ was uninfected at time 2.95, the probability that $i_X < 3.00$ may be much higher, for there would then be no restriction that $i_X \geq 2.95$. Here, knowledge that $i_X \geq 2.95$ did not change the probability density that $i_X = 3.00$, but it did change the probability that $i_X < 3.00$. This further demonstrates that the probability that $i_p < t$ is not analogous to the probability density that $i_p = t$.
Monte Carlo numerical integration (MCI) techniques were used to evaluate the probabilities required for inference. As MCI is a stochastic technique, it is difficult to properly estimate the technique’s precision without detailed mathematical knowledge of the specific integrand. While error estimators such as the one in Mathematica proved to be inaccurate assessors of the MCI’s precision, testing smaller, symbolically integrable functions showed that the MCI implementation used was usually within an order of magnitude of the exact integration value. Considering that different intervals of integration routinely differ from each other by ratios of $10^{10}$ or more, order-of-magnitude precision is probably sufficient for most statistical uses.
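The check described above, comparing an MCI estimate against a symbolically integrable function, can be sketched with plain uniform sampling. This Python version is an illustration and not the Mathematica implementation used in the report. The test integrand $xy$ over the unit square (exact value $1/4$) and the sample count are arbitrary choices.

```python
import math
import random


def mc_integrate(f, bounds, n_samples, rng):
    """Plain Monte Carlo estimate of the integral of f over a box.

    bounds is a list of (lo, hi) pairs, one per dimension. Returns the
    estimate and a standard error computed from the sample variance.
    """
    volume = math.prod(hi - lo for lo, hi in bounds)
    total = total_sq = 0.0
    for _ in range(n_samples):
        value = f([rng.uniform(lo, hi) for lo, hi in bounds])
        total += value
        total_sq += value * value
    mean = total / n_samples
    variance = max(total_sq / n_samples - mean * mean, 0.0)
    return volume * mean, volume * math.sqrt(variance / n_samples)


rng = random.Random(3)
unit_square = [(0.0, 1.0), (0.0, 1.0)]
estimate, std_error = mc_integrate(lambda x: x[0] * x[1], unit_square, 100000, rng)
```

The standard error here comes from the sample variance, which can be misleading for sharply peaked integrands where few samples land in the region that matters.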
Increasing the number of individuals in $C$ leads to higher-dimensional integrals and a smaller integrand (in terms of absolute value). For larger graphs, because of the high variance observed when performing multiple evaluations of the same integral, the simple MCI techniques in Mathematica (the techniques that are currently used) will be inadequate when scaling up the inference algorithm to large contact networks. However, using a well-designed Markov chain process for point sampling during the integration would lead to better integrand stability and allow for inference to be performed on large contact networks in a reasonable amount of time. Work is being done towards implementing this feature in the inference algorithm.
Figure 1: A plot of the cumulative probability of each person in a four-person contact network being already infected, as a function of time. (Horizontal axis: model time, 0 to 6. Vertical axis: probability that the person has already become infected, 0% to 100%. Series: $i_{p_4}$, $i_{p_5}$, $i_{p_6}$, $i_{p_7}$.)
Inputting higher-degree contact networks into the inference model may lead to tighter inferred distributions, because larger contact networks generally embed more information and heterogeneity. The increased amount of available data means better statistics can be inferred. However, some reengineering of the integration mechanism is likely required before large-scale testing of higher-degree contact networks can be performed.
5 Conclusions
The inference algorithms I developed demonstrate that it is possible to infer distributions for the likelihood of becoming infected at a certain time from limited epidemiological data (contact network structure, some presentation times, and some infection testing history), even if this time is in the future. However, when using Bayesian probabilistic techniques, it is important to remember that they are not omniscient or failproof. The inference techniques described herein, while potentially very powerful, can be uninformative or even misleading if used to analyze significantly erroneous or insufficient data.
Though discussion of the required procedures is beyond the scope of this report, it is clear that our inference algorithm can be easily adapted mathematically to represent an even wider range of scenarios and to infer more types of data. Possible extensions of this inference model in the near future include representing natural recovery, allowing for dynamic (evolving) contact networks, modeling static sources of infection, and determining the probability of a particular directed edge spreading infection.
6 Acknowledgments
I would like to thank Dr. Michael Horsch and Dr. Nathaniel Osgood of the University of Saskatchewan for their oversight and their suggestions.
References
[1] T. Britton, T. Kypraios, and P. D. O’Neill. Inference for epidemics with three levels of mixing: Methodology and application to a measles outbreak. Scandinavian Journal of Statistics, 38(3):578–599, 2011.
[2] T. Britton and P. D. O’Neill. Bayesian inference for stochastic epidemics in populations with random social structure. Scandinavian Journal of Statistics, 29:375–390, 2002.
[3] K. T. D. Eames and M. J. Keeling. Contact tracing and disease control. Proceedings of the Royal Society of London B, 270:2565–2571, 2003.
[4] J. A. Fleishman, B. R. Yehia, R. D. Moore, K. A. Gebo, and HIV Research Network. The economic burden of late entry into medical care for patients with HIV infection. Med Care, 48(12):1071–1079, 2010.
[5] C. Groendyke, D. Welch, and D. R. Hunter. Bayesian inference for contact networks given epidemic data. Scandinavian Journal of Statistics, 38:600–616, 2011.
[6] M. Hashemian, K. G. Stanley, D. L. Knowles, J. Calver, and N. D. Osgood. Human network data collection in the wild: The epidemiological utility of micro-contact and location data. In Proceedings of the ACM SIGHIT International Health Informatics Symposium (IHI 2012), Miami, FL, January 28–30 2012.
[7] Y. Hosseinkashi. Statistical Inference on Stochastic Graphs. PhD thesis, Department of Statistics, University of Waterloo, 2011.
[8] H. B. Krentz, M. C. Auld, and M. J. Gill. The high cost of medical care for patients who present late (CD4 < 200 cells/µl) with HIV infection. HIV Medicine, 5:93–98, 2004.
[9] C. Mulder, C. G. M. Erkens, P. M. Kouw, E. M. Huisman, W. Meijer-Veldman, M. W. Borgdorff, and F. van Leth. Missed opportunities in tuberculosis control in the Netherlands due to prioritization of contact investigations. European Journal of Public Health (advance access), 2011.
[10] J. Ray and Y. M. Marzouk. A Bayesian method for inferring transmission chains in a partially observed epidemic. In Proceedings of the Joint Statistical Meetings, Denver, CO, 2010. Sandia National Laboratories.
[11] J. Read, K. Eames, and W. Edmunds. Dynamic social networks and the implications for the spread of infectious disease. Journal of the Royal Society Interface, 5:1001–1007, 2008.
[12] M. Salathé, M. Kazandjieva, J. W. Lee, P. Levis, M. W. Feldman, and J. H. Jones. A high-resolution human contact network for infectious disease transmission. Proceedings of the National Academy of Sciences of the USA, 107(51):22020–22025, 2010.
[13] J. Veen. Microepidemics of tuberculosis: the stone-in-the-pond principle. Tubercle and Lung Disease, 73:73–76, 1992.