HumphreyMisiri_Estimating HIV incidence from grouped cross-sectional data in settings where anti-retroviral therapy is provided

Estimating HIV incidence from grouped cross-sectional
data in settings where anti-retroviral therapy is provided
Humphrey Misiri
hmisiri@gmail.com
Public Health Department, College of Medicine, University of Malawi, Blantyre, Malawi
Key words: incidence, antiretroviral

Abstract
Prevalence and incidence are measures used for monitoring the occurrence of a
disease. Prevalence can be computed from readily available cross-sectional data, but incidence
is traditionally computed from data collected in longitudinal studies. Longitudinal
studies are beset by financial and logistical problems, whereas cross-sectional studies
are comparatively easy to conduct. This paper introduces a new method for estimating HIV incidence from
grouped cross-sectional sero-prevalence data from settings where antiretroviral therapy is
provided to those who are eligible according to recommended criteria for the administration
of such drugs.
Introduction
Antiretroviral therapy (ART) has helped to alleviate the suffering of AIDS patients in the
world. In many countries, patients have access to ART. In Malawi, ART is also available for
free but not all HIV positive persons have access to ART. By 2011, over 30% of HIV
positive persons were on ART [1].
Incidence is a very important measure of disease occurrence. If the incidence of HIV is
known, it is easy to monitor its spread. On the other hand, prevalence alone does not give
complete information about the magnitude of the spread of HIV or any disease in general.
Consider a virulent disease like Ebola which kills after just a few days from infection.
Individuals who are infected with the Ebola virus die after a very short illness if no
meaningful therapeutic intervention is available. In that case, prevalence can never give a true
picture of the extent of an Ebola epidemic since those who die from the disease are never
counted. As a result, a low prevalence of Ebola does not mean Ebola is about to be non-existent
or is almost eradicated from a community. On the other hand, the incidence of Ebola
is the best measure which can be used to monitor the disease since Ebola deaths are included
in its computation. Consequently, incidence gives a true picture of an Ebola epidemic. In the
same vein, HIV incidence gives a true picture of the spread of HIV in a community.
Traditionally, incidence is computed from data from longitudinal studies. Unfortunately,
there are many financial and logistical problems associated with conducting longitudinal
studies. To avoid these drawbacks, a viable alternative is to estimate incidence from data
from cross-sectional studies. Two good examples of methods for achieving this are the models by
Podgor and Leske (1986) and Misiri et al (2012) [3, 2]. These models produce estimates of
incidence which are adjusted for differential mortality. Both approaches apply to settings
where ART has not been widely rolled out in the community. It is also possible to
estimate the incidence of HIV from cross-sectional data from a population where ART is
provided.
The aim of this paper is to introduce a new method of estimating HIV incidence in settings
where ART is provided to HIV positive people who need it regardless of the extent of
coverage of such services. This method also adjusts for differential mortality.
Materials and Methods
Motivation
Podgor and Leske (1986) proposed a method for estimating incidence from grouped cross-sectional
data [3]. In the spirit of Podgor and Leske (1986), we proceed to motivate our
approach. Let $\lambda_1$ be the rate of natural mortality, $\lambda_2$ be the HIV incidence, $\lambda_3$ be the rate of
HIV mortality in the absence of ART, $\lambda_4$ be the rate of recruitment to ARV therapy and $\lambda_5$ be
the rate of mortality among ART recipients.
Let X1, X2, X3, X4 and X5 be independent random variables where X1 is the time to death
from natural causes, X2 is the time to HIV infection, X3 is the time to death whilst HIV
positive, X4 is the time to ART registration and X5 is the time to death whilst on ART. Taking
each rate to be constant over the interval, X1, X2, ... , X5 have exponential distributions with
parameters $\lambda_1, \lambda_2, \lambda_3, \lambda_4$ and $\lambda_5$ respectively.
We will proceed by dividing the population into three strata namely: HIV negative persons,
HIV positives on ART and HIV positives who are not on ART. Denote the total proportion
of HIV positives by P0, the proportion of positives who are not on ART by P01 and the
proportion of positives who are on ART by P02. Both P01 and P02 are proportions of the
population.
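Consider an interval [x, x+t], with N0 persons and HIV prevalence P0 at its beginning and N1 persons and HIV prevalence P1 at its end. The number of HIV positives at the end of the interval is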
N1P1=N0P0S1+ N0(1-P0)S2 (1)
where
S1 is the probability of surviving the interval given that one entered the interval already
infected.
S2 is the probability of being infected in the interval given that one was HIV negative at the
beginning of the interval.
Furthermore, the number of HIV negatives at the end of the interval is
N1(1-P1)=N0(1-P0)S3 (2)
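where S3 is the probability of surviving the interval without contracting HIV.
According to the relationship among these exponential random variables [3, 4],
$$S_1 = 1 - \int_0^1 (\lambda_3+\lambda_4)\,e^{-(\lambda_3+\lambda_4)t}\,dt = e^{-(\lambda_3+\lambda_4)},$$
$$S_2 = \int_0^1 \lambda_2\,e^{-(\lambda_1+\lambda_2-\lambda_3)t-\lambda_3}\,dt = \frac{\lambda_2\left(e^{-\lambda_3}-e^{-(\lambda_1+\lambda_2)}\right)}{\lambda_1+\lambda_2-\lambda_3}$$
and
$$S_3 = 1 - \int_0^1 (\lambda_1+\lambda_2)\,e^{-(\lambda_1+\lambda_2)t}\,dt = e^{-(\lambda_1+\lambda_2)}$$
[3].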
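The closed form for S2 follows from a single integration. The steps below are a sketch that reads the integrand as infection at time t followed by survival at rate $\lambda_3$ for the remaining 1 - t of a unit-length interval; this reading is an interpretation of the formula, not a statement taken from the source. It requires $\lambda_1+\lambda_2 \neq \lambda_3$.

```latex
\begin{aligned}
S_2 &= \int_0^1 \lambda_2\, e^{-(\lambda_1+\lambda_2)t}\, e^{-\lambda_3(1-t)}\,dt \\
    &= \lambda_2\, e^{-\lambda_3} \int_0^1 e^{-(\lambda_1+\lambda_2-\lambda_3)t}\,dt \\
    &= \lambda_2\, e^{-\lambda_3}\,
       \frac{1-e^{-(\lambda_1+\lambda_2-\lambda_3)}}{\lambda_1+\lambda_2-\lambda_3} \\
    &= \frac{\lambda_2\left(e^{-\lambda_3}-e^{-(\lambda_1+\lambda_2)}\right)}
            {\lambda_1+\lambda_2-\lambda_3}
\end{aligned}
```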
In the interval [x, x+t], some people may have just been registered to receive ART but some
were already registered prior to entering the interval. Therefore the formula in (1) above does
not capture the number of infected people in [x, x+t] in a setting where ART is provided. If
ART is provided, at the end of the interval there are two groups of HIV positive individuals
namely those who are not on ART and those who are on ART.
Not every infected person is eligible for ART. For example, an individual who gets infected
with HIV in a 5-year interval can never be eligible for ART as the therapy is for HIV
positives who are in a reasonably advanced stage of infection. Therefore, the number of HIV
positive individuals who are on ART at the end of the interval is the sum of old HIV positives
who entered the interval already on ART and those HIV positives who are newly registered
to receive ART. This can be denoted by N0P02S4 + N0P01S5 (4)
where
S4 is the probability of surviving to the end of the interval whilst on ART given that one
was already on ART at the beginning of the interval.
S5 is the probability of surviving to the end of the interval having been newly recruited to
receive ART given that one was not on ART at the beginning of the interval.
Using the relationship between independent exponential random variables as described in
Lagakos (1976) on pages 553 through 555 [4], these probabilities are defined as follows:
$$S_4 = e^{-\lambda_5}$$
$$S_5 = \int_0^1 \lambda_4\,e^{-(\lambda_3+\lambda_4-\lambda_5)t-\lambda_5}\,dt = \frac{\lambda_4\left(e^{-\lambda_5}-e^{-(\lambda_3+\lambda_4)}\right)}{\lambda_3+\lambda_4-\lambda_5}.$$
Therefore (4) becomes
$$N_0P_{02}\,e^{-\lambda_5} + N_0P_{01}\,\frac{\lambda_4\left(e^{-\lambda_5}-e^{-(\lambda_3+\lambda_4)}\right)}{\lambda_3+\lambda_4-\lambda_5}. \tag{5}$$
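As a minimal sketch of how expression (5) might be evaluated, the code below computes S4 and S5 from their closed forms and checks S5 against a midpoint-rule approximation of its integral. All numeric inputs are illustrative assumptions: the rate `lam4 = 0.05` in particular is hypothetical and is not the recruitment figure used in the paper, and the function names are invented for this example.

```python
import math

def s4(lam5):
    """Survival over a unit interval for someone already on ART."""
    return math.exp(-lam5)

def s5(lam3, lam4, lam5):
    """Closed form of S5; assumes lam3 + lam4 != lam5."""
    return lam4 * (math.exp(-lam5) - math.exp(-(lam3 + lam4))) / (lam3 + lam4 - lam5)

def s5_numeric(lam3, lam4, lam5, steps=10_000):
    """Midpoint-rule approximation of the integral that defines S5."""
    width = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * width
        # Recruited at time t, then survives on ART for the remaining 1 - t.
        total += lam4 * math.exp(-(lam3 + lam4) * t) * math.exp(-lam5 * (1 - t)) * width
    return total

def on_art_at_end(n0, p01, p02, lam3, lam4, lam5):
    """Expression (5): expected number on ART at the end of the interval."""
    return n0 * p02 * s4(lam5) + n0 * p01 * s5(lam3, lam4, lam5)

# Illustrative inputs only (lam4 is a hypothetical value).
lam3, lam4 = 0.0471, 0.05
lam5 = 0.9 * lam3
closed = s5(lam3, lam4, lam5)
approx = s5_numeric(lam3, lam4, lam5)
count = on_art_at_end(n0=3208, p01=0.020, p02=0.002, lam3=lam3, lam4=lam4, lam5=lam5)
print(closed, approx, count)
```

Agreement between the two S5 values is a quick guard against sign or index slips when transcribing the formula.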
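The number of HIV positives at the end of the interval is therefore
$$N_1P_1 = N_0P_0\,e^{-(\lambda_3+\lambda_4)} + N_0(1-P_0)\left[\frac{\lambda_2\left(e^{-\lambda_3}-e^{-(\lambda_1+\lambda_2)}\right)}{\lambda_1+\lambda_2-\lambda_3}\right] + N_0P_{02}\,e^{-\lambda_5} + N_0P_{01}\left[\frac{\lambda_4\left(e^{-\lambda_5}-e^{-(\lambda_3+\lambda_4)}\right)}{\lambda_3+\lambda_4-\lambda_5}\right]. \tag{6}$$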
The number of HIV negative persons at the end of the interval is
N1(1-P1) =N0(1-P0)S6 (7)
where S6 is the probability of surviving the interval whilst remaining HIV negative.
Now, $S_6 = 1 - \int_0^1 (\lambda_1+\lambda_2)\,e^{-(\lambda_1+\lambda_2)t}\,dt = e^{-(\lambda_1+\lambda_2)}$. Therefore the equation in (7) becomes
$$N_1(1-P_1) = N_0(1-P_0)\,e^{-(\lambda_1+\lambda_2)} \tag{8}$$
From (8) we have that $N_1 = N_0\,\frac{1-P_0}{1-P_1}\,e^{-(\lambda_1+\lambda_2)}$. Therefore the left hand side of equation (6)
becomes
$$N_0\,\frac{P_1(1-P_0)}{1-P_1}\,e^{-(\lambda_1+\lambda_2)}.$$
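Consequently, equation (6) becomes
$$N_0\,\frac{P_1(1-P_0)}{1-P_1}\,e^{-(\lambda_1+\lambda_2)} = N_0P_0\,e^{-(\lambda_3+\lambda_4)} + N_0(1-P_0)\left[\frac{\lambda_2\left(e^{-\lambda_3}-e^{-(\lambda_1+\lambda_2)}\right)}{\lambda_1+\lambda_2-\lambda_3}\right] + N_0P_{02}\,e^{-\lambda_5} + N_0P_{01}\left[\frac{\lambda_4\left(e^{-\lambda_5}-e^{-(\lambda_3+\lambda_4)}\right)}{\lambda_3+\lambda_4-\lambda_5}\right]$$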
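The Discussion refers to the roots of $f(\lambda_2)$, whose definition does not appear in this part of the text. One plausible reading, offered as an assumption, is to divide the equation above by $N_0$ and collect every term on one side, so that the incidence estimate is a value of $\lambda_2$ at which the function vanishes:

```latex
\begin{aligned}
f(\lambda_2) ={}& \frac{P_1(1-P_0)}{1-P_1}\,e^{-(\lambda_1+\lambda_2)}
                 - P_0\,e^{-(\lambda_3+\lambda_4)}
                 - (1-P_0)\,\frac{\lambda_2\left(e^{-\lambda_3}-e^{-(\lambda_1+\lambda_2)}\right)}
                                 {\lambda_1+\lambda_2-\lambda_3} \\
               & - P_{02}\,e^{-\lambda_5}
                 - P_{01}\,\frac{\lambda_4\left(e^{-\lambda_5}-e^{-(\lambda_3+\lambda_4)}\right)}
                                {\lambda_3+\lambda_4-\lambda_5}
\end{aligned}
```

With all other quantities fixed from data, $f$ depends on $\lambda_2$ alone, which is what makes a one-dimensional root finder applicable.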

Description of the data
The estimated population of Malawi in 2011 was 14,388,550[5]. The national prevalence of
HIV was 10% in 2010[6]. The provision of ARV therapy in Malawi is overseen by the HIV
Unit in the Ministry of Health and Population. By 2011, 382,953 people were on ARV
therapy [1]. The remaining 1,055,902 HIV positive persons were not on ARV therapy. In the same year, the number
of deaths due to HIV was 43,000 [1].
From the ARV Supervision database for 2004-2009 which was maintained by the HIV Unit,
in 2004 there were 3,262 ART registrations [7]. By the end of 2008, a total of 20,393 HIV
positive persons were recruited to receive ARV therapy. This gives a recruitment rate ($\lambda_4$) of
3,426 people per year on average. Studies [8, 9] conducted in Malawi found that ART
reduces mortality by 10%. Therefore, given HIV mortality rates, the rate of mortality
among those on ART is $\lambda_5 = 0.9\lambda_3$.
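The figures quoted in this section can be checked with simple arithmetic. The lines below are a sketch under two assumptions that the text does not state outright: that the number not on ARV therapy is the 10% national prevalence applied to the 2011 population, less those on therapy, and that the recruitment figure is the increase in registrations divided by five years. The value of $\lambda_3$ in the last line is the 15-19 entry of Table 2, chosen only for illustration.

```latex
\begin{aligned}
0.10 \times 14{,}388{,}550 - 382{,}953 &= 1{,}438{,}855 - 382{,}953 = 1{,}055{,}902 \\
\frac{20{,}393 - 3{,}262}{5} &= \frac{17{,}131}{5} \approx 3{,}426 \\
\lambda_5 = 0.9\,\lambda_3 &= 0.9 \times 0.0471 \approx 0.0424
\end{aligned}
```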
The age-specific HIV sero-prevalence data analysed for this paper are extracted from the
database of the Malawi Demographic and Health Survey (MDHS2010) which was conducted
in 2010. The data are in Table 1 below.
Table 1: Nationally representative HIV sero-prevalence data for Malawi, 2010
            HIV-                HIV+                Not on ARV          On ARV
Age group   Number   % (p03)    Number   %          Number   % (p01)    Number   % (p02)
15-19       3208     0.022      71       0.022      63       0.020      8        0.002
20-24       2370     0.051      122      0.051      114      0.048      8        0.003
25-29       2141     0.108      232      0.108      197      0.092      35       0.016
30-34       1560     0.181      283      0.181      227      0.146      56       0.036
35-39       1224     0.246      301      0.246      232      0.190      69       0.056
40-44       870      0.247      215      0.247      155      0.178      60       0.069
45-49       817      0.193      158      0.193      95       0.116      63       0.077
50-54       295      0.129      38       0.129      25       0.085      13       0.044
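The proportions in Table 1 can be recomputed from the counts. The sketch below assumes that each proportion is a count divided by the first numeric column of the same row; that choice of denominator is an inference from the numbers, not something the table states. The variable and function names are illustrative.

```python
# Rows copied from Table 1:
# (age group, first numeric column, HIV+ count, not on ARV, on ARV,
#  tabulated HIV+ proportion, tabulated p01, tabulated p02)
TABLE_1 = [
    ("15-19", 3208, 71, 63, 8, 0.022, 0.020, 0.002),
    ("20-24", 2370, 122, 114, 8, 0.051, 0.048, 0.003),
    ("25-29", 2141, 232, 197, 35, 0.108, 0.092, 0.016),
    ("30-34", 1560, 283, 227, 56, 0.181, 0.146, 0.036),
    ("35-39", 1224, 301, 232, 69, 0.246, 0.190, 0.056),
    ("40-44", 870, 215, 155, 60, 0.247, 0.178, 0.069),
    ("45-49", 817, 158, 95, 63, 0.193, 0.116, 0.077),
    ("50-54", 295, 38, 25, 13, 0.129, 0.085, 0.044),
]

def proportions(denominator, positive, not_on_arv, on_arv):
    """Assumed construction: each count over the row's first numeric column."""
    return tuple(round(count / denominator, 3)
                 for count in (positive, not_on_arv, on_arv))

for age, n, pos, off_arv, on_arv, p0, p01, p02 in TABLE_1:
    # The two ARV strata should add up to the HIV+ count.
    assert off_arv + on_arv == pos
    recomputed = proportions(n, pos, off_arv, on_arv)
    print(age, recomputed, recomputed == (p0, p01, p02))
```

Under this assumption the recomputed values agree with the tabulated ones to three decimal places, which is in line with the Discussion's remark that P01 and P02 are defined as proportions of the sample for each age group.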

In 1992, HIV was not as endemic as it is today, and mortality in general was mainly due to causes
other than HIV. As HIV spread throughout Malawi, it became the leading cause of
mortality. The provision of ART to HIV positive persons has since reversed this trend in mortality.
Therefore, the mortality estimates for 1992 are taken to represent natural mortality rates for Malawi
which are not contaminated by HIV mortality. The HIV mortality rates come from a study by
Crampin et al (2002), which reports mortality rates for HIV positive persons not on ARV
therapy in a rural setting representative of an average rural
area in Malawi [10]. These estimates represent HIV mortality rates in rural Malawi in the
absence of ARV therapy. Table 2 below contains the natural and HIV mortality rates.
Table 2: Natural and AIDS mortality rates for Malawi
Age group                Natural mortality rates ($\lambda_1$)   AIDS mortality rates ($\lambda_3$)
index (j)   Age group    Men       Women                         for men and women
1           15-19        0.0038    0.0053                        0.0471
2           20-24        0.0041    0.0036                        0.0593
3           25-29        0.0068    0.0068                        0.0675
4           30-34        0.0084    0.0072                        0.1354
5           35-39        0.0076    0.009                         0.1354
6           40-44        0.0101    0.0089                        0.1427
7           45-49        0.0097    0.0096                        0.1427
8           50+          0.0097    0.0096                        0.2339
Results
HIV incidence estimates for the 15-19, 20-24, 25-29, 30-34, 35-39, 40-44, 45-49 and 50-54 age
groups are in Table 3. The 95% confidence interval for each estimate is also presented. The
incidence estimates were obtained by using the Newton-Raphson method. The initial values
of $\lambda_2$ plugged into the Newton-Raphson algorithm were obtained by a combination of
methods which include inspection, use of the R function uniroot and numerical search
procedures.
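The root-finding step can be sketched as follows. This is an illustration under stated assumptions, not the author's implementation: it takes $f(\lambda_2)$ to be the difference between the two sides of the final model equation after dividing by N0, treats the interval as having unit length, uses a coarse grid scan as the "numerical search" for a starting value, and approximates the derivative by a central difference. The inputs are illustrative, and `lam4 = 0.05` is a hypothetical rate, so the result is not expected to reproduce Table 3.

```python
import math

def transition_prob(rate, exit_rate, next_rate):
    """Integral over [0, 1] of rate * exp(-exit_rate*t) * exp(-next_rate*(1-t))."""
    d = exit_rate - next_rate
    if abs(d) < 1e-12:  # removable singularity: use the limiting value
        return rate * math.exp(-next_rate)
    return rate * (math.exp(-next_rate) - math.exp(-exit_rate)) / d

def f(lam2, p0, p01, p02, p1, lam1, lam3, lam4, lam5):
    """Assumed objective: left side minus right side of the model equation."""
    lhs = (1 - p0) * p1 / (1 - p1) * math.exp(-(lam1 + lam2))
    rhs = (p0 * math.exp(-(lam3 + lam4))
           + (1 - p0) * transition_prob(lam2, lam1 + lam2, lam3)
           + p02 * math.exp(-lam5)
           + p01 * transition_prob(lam4, lam3 + lam4, lam5))
    return lhs - rhs

def newton_raphson(g, x0, tol=1e-12, max_iter=50, h=1e-6):
    """Newton-Raphson iteration with a central-difference derivative."""
    x = x0
    for _ in range(max_iter):
        slope = (g(x + h) - g(x - h)) / (2 * h)
        step = g(x) / slope
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton-Raphson did not converge")

# Illustrative inputs only; lam4 is hypothetical.
params = dict(p0=0.022, p01=0.020, p02=0.002, p1=0.051,
              lam1=0.0038, lam3=0.0471, lam4=0.05, lam5=0.9 * 0.0471)

def g(lam2):
    return f(lam2, **params)

# Coarse search: first grid cell on [0, 0.5] where g changes sign.
grid = [i / 1000 for i in range(501)]
x0 = next(a for a, b in zip(grid, grid[1:]) if g(a) * g(b) <= 0)
root = newton_raphson(g, x0)
print(round(root, 4), g(root))
```

If no sign change is found on the grid, the search raises `StopIteration`, which mirrors the situation described in the Discussion where $f(\lambda_2)$ has no usable root.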
Table 3: Nationally representative HIV incidence estimates for Malawi
                                   Incidence      95% CI for incidence
Age group   FOI       SE           per 5 years    Lower limit   Upper limit
15-19       0.0607    0.000358     61             60            61
20-24       0.0858    0.000779     86             84            87
25-29       0.1171    0.001806     117            114           121
30-34       0.1628    0.00157      163            160           166
35-39       0.1428    0.00108      143            141           145
40-44       0.1446    0.000897     145            143           146
45-49       0.1447    0.000755     145            143           146
50-54       -         -            -              -             -
The age group with the highest incidence is the 30-34 year age group. The smallest incidence
is for the 15-19 year age group. Although the 40-44 and 45-49 age groups show the same
rounded incidence estimate, the two estimates differ when compared to 6 decimal places. All the
standard errors of the FOI estimates are very small. Furthermore, the 95% confidence
intervals for the 15-19 through 45-49 age groups are very narrow.
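The interval limits in Table 3 appear consistent with a normal-approximation interval, FOI plus or minus 1.96 standard errors, scaled by 1000 and rounded to whole numbers. That construction is an assumption inferred from the tabulated values; the text does not state how the intervals were built. The sketch below recomputes each row under that assumption.

```python
# Rows copied from Table 3: (age group, FOI, SE, incidence, lower, upper).
TABLE_3 = [
    ("15-19", 0.0607, 0.000358, 61, 60, 61),
    ("20-24", 0.0858, 0.000779, 86, 84, 87),
    ("25-29", 0.1171, 0.001806, 117, 114, 121),
    ("30-34", 0.1628, 0.00157, 163, 160, 166),
    ("35-39", 0.1428, 0.00108, 143, 141, 145),
    ("40-44", 0.1446, 0.000897, 145, 143, 146),
    ("45-49", 0.1447, 0.000755, 145, 143, 146),
]

def interval_per_1000(foi, se, z=1.96):
    """Assumed construction: (FOI -/+ z * SE) * 1000, rounded to integers."""
    point = round(1000 * foi)
    lower = round(1000 * (foi - z * se))
    upper = round(1000 * (foi + z * se))
    return point, lower, upper

# Age groups whose tabulated values differ from the recomputed ones.
mismatches = [row[0] for row in TABLE_3
              if interval_per_1000(row[1], row[2]) != row[3:]]
print(mismatches)
```

An empty list means every tabulated incidence and limit is reproduced by the assumed formula.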
Discussion
This method provides a way of estimating incidence from cross-sectional data in settings
where ART is provided. It relies heavily on the existence of the roots of $f(\lambda_2)$. For the
50-54 year age group, no estimate of HIV incidence is possible because the structure
of the model does not permit it.

We tested the sensitivity of the method to the size of P01 and P02. According to our findings,
large values of P01 and P02 resulted in an $f(\lambda_2)$ whose roots were hard to estimate. To keep
P01 and P02 small enough for the Newton-Raphson method to converge efficiently,
both parameters must be defined as proportions of the sample for each age
group. In any case, the number of people on ART is bound to be small; as a fraction
of the sample for each age group, it therefore produces proportions which make it easy to achieve
convergence when using the Newton-Raphson algorithm.
The objective of the method is to produce incidence estimates. Therefore, defining P01 and
P02 as proposed above does not make the results of the current method unusable. The reader
who wants the proportions P01 and P02 to be defined otherwise can do so and can compute the
proportions from the data according to their own definitions [3].
The fact that all the confidence intervals were narrow can be explained by the size of the
samples for each age group. All sample sizes were large, and large samples yield small
standard errors, which in turn shrink the margin of error. Confidence intervals
computed from such standard errors are therefore narrow, which indicates
high precision in the estimation of FOI.
Conclusion
The novel method introduced in this paper offers an approach for estimating HIV
incidence from aggregated data collected from settings where ART is provided to HIV
infected individuals. This method is timely as it comes at a time when provision of ART is
widespread in many countries of the world.
Competing interests

There are no competing interests.
Acknowledgments
I am very grateful to ORC Macro International for allowing me to analyse the MDHS2010
data.
Authors' contributions
HM conceived the study, conceived the method, obtained the data, analyzed the data, drafted
the manuscript and revised it.
References
1. HIV Unit: 2012 Global AIDS Response Progress Report: Malawi Country Report
for 2010 and 2011. Lilongwe, Malawi: Ministry of Health, Malawi Government; 2012.
2. Misiri HE, Edriss A, Aalen OO, Dahl FA: Estimation of HIV incidence in Malawi
from cross-sectional population-based sero-prevalence data. Journal of the
International AIDS Society 2012, 15(14).
3. Podgor MJ, Leske M: Estimating incidence from age-specific prevalence for
irreversible diseases with differential mortality. Statistics in Medicine 1986, 5:573-578.
4. Lagakos SW: A stochastic model for censored-survival data in the presence of an
auxiliary variable. Biometrics 1976, 32(3):551-559.
5. "Population projections for Malawi."
[http://www.nso.malawi.net/index.php?option=com_content&view=article&id=134%3Apopulation-projections-for-malawi&catid=8&Itemid=3]
6. National Statistical Office (NSO), ORC Macro: Malawi Demographic and Health
Survey 2010. Zomba: National Statistical Office (NSO) and ORC Macro; 2010.
7. HIV Unit: ART Supervision Database. Lilongwe: Ministry of Health and Population,
Malawi Government; 2009.
8. Jahn A, Floyd S, Crampin AC, Mwaungulu F, Mvula H, Munthali F, McGrath N,
Mwafilaso J, Mwinuka V, Mangongo B, Fine PEM, Glynn JR: Population-level effect of HIV
on adult mortality and early evidence of reversal after introduction of
antiretroviral therapy in Malawi. Lancet 2008, 371:1603-1611.
9. Floyd S, Molesworth A, Dube A, Banda E, Jahn A, Mwafulirwa C, Ngwira B,
Branson K, Crampin AC, Zaba B, Glynn JR, French N: Population-level reduction
in adult mortality after extension of free Anti-Retroviral Therapy provision into
rural areas in Northern Malawi. PLoS ONE 2010, 5(10).
10. Crampin AC, Floyd S, Glynn JR, Sibande F, Mulawa D, Nyondo A, Broadbent P,
Bliss L, Ngwira B, Fine PE: Long-term follow-up of HIV-positive and
HIV-negative individuals in rural Malawi. AIDS 2002, 16:1545-1550.

The variance of P0 is $Var(P_0) = \frac{P_0(1-P_0)}{N_0}$. Similarly the variance of P1 is
$Var(P_1) = \frac{P_1(1-P_1)}{N_1}$.
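As a small numerical illustration of the variance formula, the sketch below computes the standard error of a prevalence. It assumes that N0 is the first numeric column of Table 1 for the 15-19 age group (3208) and that P0 is the corresponding tabulated proportion (0.022); both choices are made for illustration only, and how these variances feed into the standard errors of the FOI estimates is not shown in this part of the text.

```python
import math

def proportion_variance(p, n):
    """Binomial variance of a sample proportion: p(1 - p) / n."""
    return p * (1 - p) / n

def proportion_se(p, n):
    """Standard error as the square root of the variance."""
    return math.sqrt(proportion_variance(p, n))

# Illustrative inputs taken from the 15-19 row of Table 1.
se_15_19 = proportion_se(0.022, 3208)
print(f"{se_15_19:.5f}")
```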
