The Epidemiology of Tuberculosis in San Francisco -- A Population-Based Study Using Conventional and Molecular Methods

Peter M. Small, Philip C. Hopewell, Samir P. Singh, Antonio Paz, Julie Parsonnet, Delaney C. Ruston, Gisela F. Schecter, Charles L. Daley, and Gary K. Schoolnik

N Engl J Med 1994; 330:1703-1709. June 16, 1994. DOI: 10.1056/NEJM199406163302402

Tuberculosis and its recent resurgence are predominantly urban phenomena in the United States, where case rates in large cities are almost two and a half times higher than the national average [1]. A combination of biologic and social factors has been postulated to account for this situation. In many cities, the number of persons who are immunosuppressed by infection with the human immunodeficiency virus (HIV) and the prevalence of drug-resistant tuberculosis have increased in the face of deteriorating socioeconomic conditions and public systems of health care delivery [2]. As a result, important changes seem to have occurred in the patterns of Mycobacterium tuberculosis transmission. In particular, the long-held assumption that only 10 percent of tuberculosis cases are the result of recent infection needs to be reconsidered [3].

The combination of molecular fingerprinting of M. tuberculosis strains and conventional epidemiologic investigation has improved understanding of the transmission of tuberculosis. Molecular fingerprinting by restriction-fragment-length polymorphism (RFLP) analysis yields a unique, strain-specific pattern of bands (the “fingerprint”) that is stable for at least two years [4-8]. Comparison of M. tuberculosis fingerprints from tuberculosis strains isolated during circumscribed outbreaks has demonstrated matching patterns among persons who were clearly infected from a common source [4,8-19]. By showing that patients with no obvious epidemiologic relation are infected with the same strain, molecular fingerprinting has revealed that M. tuberculosis can be transmitted during brief contact between persons who do not live or work together [18,20,21]. Taken together, these studies suggest that patients with the same M. tuberculosis RFLP pattern constitute an epidemiologically linked cluster. Furthermore, because tuberculosis developed during a relatively short period in patients in a cluster, clustering indicates recent infection and rapid progression to clinical illness [22].
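The idea that patients whose isolates share a fingerprint form a cluster can be illustrated with a short sketch. It assumes, purely for illustration, that each fingerprint is recorded as a list of band sizes and that two fingerprints match only when their bands are exactly identical. The function name, patient labels, and band values are invented and are not study data.

```python
from collections import defaultdict


def cluster_by_fingerprint(isolates):
    """Group patient IDs whose fingerprints are exactly identical.

    `isolates` maps a patient ID to an iterable of band sizes
    (a hypothetical representation). Only groups with two or more
    patients are returned.
    """
    groups = defaultdict(list)
    for patient, bands in isolates.items():
        key = tuple(sorted(bands))  # band order should not affect matching
        groups[key].append(patient)
    return {key: sorted(ids) for key, ids in groups.items() if len(ids) >= 2}


# Invented example data (not from the study).
isolates = {
    "A": [1.2, 2.5, 4.0],
    "B": [4.0, 1.2, 2.5],  # same bands as A, listed in a different order
    "C": [1.2, 2.5],       # one band fewer, so no match
    "D": [0.9, 3.1, 5.6],
}
clusters = cluster_by_fingerprint(isolates)
print(clusters)  # {(1.2, 2.5, 4.0): ['A', 'B']}
```

Exact equality is a deliberate simplification; real band sizes are measured with error and would need a tolerance or visual confirmation.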
We conducted a population-based molecular epidemiologic study of tuberculosis in San Francisco. In addition to providing an estimate of the incidence of tuberculosis that results from recently transmitted infection, we identified some of the risk factors for the transmission of M. tuberculosis. Our results suggest that current tuberculosis-control strategies have important limitations in contemporary urban environments.

Methods

Patient Identification and Routine Data Collection

The population studied included all patients with tuberculosis who were reported to the San Francisco Department of Public Health, Division of Tuberculosis Control, between January 1, 1991, and December 31, 1992. The routine demographic data collected included age, sex, race or ethnicity, country of birth, number of years of residency in the United States, and address at the time of diagnosis. Specific information concerning tuberculosis included the date of diagnosis, site or sites of disease, results of chest radiographs, and results of microbiologic studies.

The registries for tuberculosis and the acquired immunodeficiency syndrome (AIDS) maintained by the San Francisco Department of Public Health were cross-matched to identify all patients reported to have both tuberculosis and AIDS as of September 1993. Confidentiality was ensured by having health department personnel remove all identifying information before the data analysis. The subjects' socioeconomic status was estimated by matching patients' addresses at the time of diagnosis to census-tract data (including indexes of unemployment, income, poverty level, education level, crowding, immigration status, and racial or ethnic distribution). Census-tract information was not included for 14 homeless persons.

Collection of M. tuberculosis Isolates and RFLP Analysis

Lowenstein-Jensen slant cultures used for mycobacterial identification and drug-susceptibility testing were prospectively collected for all microbiologically confirmed new cases of tuberculosis in San Francisco. RFLP analysis was performed with an internationally standardized method with internal molecular-weight standards [23]. The resulting autoradiographs were compared with the Bio Image Whole Band Analyzer, version 3.0 (Millipore, Ann Arbor, Mich.). All lanes that were found by computer analysis to have similar patterns were compared visually and classified as having matching RFLP patterns if the number and molecular weights of the bands were identical.

Microbiology records were scrutinized for all patients who had only a single positive culture for which a smear for acid-fast bacilli was negative. These cultures were considered to be false positive if they were processed in the microbiology laboratory on the same day as a specimen with a positive smear from another patient with the same RFLP pattern [24].

Epidemiologic Investigations
For all patients treated by the Division of Tuberculosis Control, an investigation of contacts was conducted by trained, multilingual disease-control investigators using standard methods [25]. For patients whose care was not managed by the Division of Tuberculosis Control, contact investigation was conducted either by the treating physician or by Tuberculosis Control personnel. In addition to the routine contact investigation, selected groups of patients infected with organisms with identical RFLP patterns were studied further by a more intensive review of the Division of Tuberculosis Control records. For patients in the largest cluster, all available clinic and hospital records were reviewed and the patients were interviewed.

Statistical Analysis

Data were entered and analyzed with FoxPro 2.5 (Microsoft, Redmond, Wash.), Epi Info (Centers for Disease Control and Prevention, Atlanta), Egret (Statistics and Epidemiology Research Corporation, Seattle), and PC SAS (SAS Institute, Cary, N.C.) computer programs. A cluster was defined as two or more patients with identical RFLP patterns. Patients with unmatched RFLP patterns were considered nonclustered.

Student's t-test and the chi-square test were used to assess univariate risk factors for being in a cluster. Risk factors for clustering identified by univariate analysis were then included in multivariate logistic-regression models, with clustered and nonclustered as the dependent outcomes. Because age appeared to be related to clustering in a nonlinear fashion, with a marked decrease in risk at the age of 60 years, age was categorized as either less than 60 years or 60 years or older. Odds ratios were calculated from regression estimates based on the chi-squared approximation for the likelihood-ratio statistic; 95 percent confidence intervals were based on the estimated variance of the regression coefficients [26]. The likelihood-ratio statistics were also used to contrast the relative goodness of fit between competing logistic-regression models. Tests for interaction were conducted for all likely interacting variables. Age, sex, and factors that remained significant after adjustment for related variables were included in a final model.

Results

Patient Population and RFLP Patterns Obtained

During 1991 and 1992, 688 cases of tuberculosis were reported to the Division of Tuberculosis Control, 585 of which were confirmed by the isolation of M. tuberculosis. Viable isolates of M. tuberculosis were not available from 89 patients. These patients were similar to the 496 patients included in this study except that they were slightly older (median age, 46 years; P = 0.02) and more likely to be Asian (RFLP data were not available on 20 percent of Asian patients, P = 0.003).

Nine of the 496 patients were excluded from further study because their culture results fulfilled the criteria for laboratory cross-contamination. RFLP analysis of the strains isolated from the remaining 487 patients identified 326 distinct patterns, 282 of which were found in only 1 patient.
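The patient counts in this subsection follow from simple subtraction, which the sketch below retraces. All numbers are taken from the text above; the variable names are illustrative only.

```python
# Counts as stated in the text.
reported = 688                 # cases reported during 1991 and 1992
culture_confirmed = 585        # confirmed by isolation of M. tuberculosis
no_viable_isolate = 89         # confirmed cases without a viable isolate
cross_contaminated = 9         # excluded as laboratory cross-contamination
distinct_patterns = 326        # distinct RFLP patterns among typed isolates
patterns_in_one_patient = 282  # patterns found in only 1 patient

with_isolate = culture_confirmed - no_viable_isolate
typed = with_isolate - cross_contaminated
shared_patterns = distinct_patterns - patterns_in_one_patient

print(with_isolate, typed, shared_patterns)  # 496 487 44
```

The last figure is the number of patterns found in more than one patient, that is, the patterns that define clusters.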
Previously published molecular biologic and epidemiologic studies have concluded that a clonal relation cannot be inferred to exist between strains of M. tuberculosis that have only one copy of IS6110 [27,28]. Accordingly, the 12 M. tuberculosis strains with only one copy of IS6110 were not included in the epidemiologic analysis. Consequently, the statistical analysis was based on 473 patients (Table 1, Analysis of Risk Factors for Clustering in 473 San Francisco Patients) and 324 RFLP patterns, of which 44 were found to be shared by at least 2 patients (i.e., they were in clusters). The 44 shared RFLP patterns were obtained from 191 patients (Table 2, Cluster Sizes and the Number of Clusters among 473 San Francisco Patients with Tuberculosis). The RFLP patterns of strains isolated from clusters containing three or more patients are shown in Figure 1 (Results of RFLP Analysis of M. tuberculosis Strains Isolated from Three or More Patients). Thus, 191 of the 473 patients (40 percent) were in 1 of the 44 clusters; the clusters ranged in size from 2 to 30 patients.

Identification of Risk Factors

To identify risk factors for recent infection with M. tuberculosis, the 191 patients in clusters were compared with the 282 patients not in clusters. Univariate analysis (Table 1) showed that patients in clusters were more likely to be male, young (mean age, 40.8 years, vs. 48.4 years for patients not in clusters; P<0.001), black or Hispanic, and born in the United States; to have AIDS; to have received care at the Division of Tuberculosis Control clinic; and to reside in a census tract with a poverty rate of more than 20 percent. In contrast, a history of tuberculosis and Asian race were associated with a significantly decreased risk of being in a cluster. Infection with drug-resistant M. tuberculosis and the level of crowding and education in the census tract were not associated with clustering (data not shown).

Multivariate analysis of the risk factors for clustering revealed significant differences between younger and older patients (Table 3, Analysis of Risk Factors for Clustering after Adjustment for Sex and Age at Diagnosis). For patients younger than 60 years, risk factors for clustering included Hispanic ethnicity (odds ratio, 3.3; P = 0.02), black race (odds ratio, 2.3; P = 0.02), birth in the United States (odds ratio, 5.8; P<0.001), and AIDS (odds ratio, 1.8; P = 0.04). In contrast, for patients 60 years of age or older, the only significant risk factor was having been cared for at the Division of Tuberculosis Control clinic (odds ratio, 5.7; P = 0.008). In the older age group, Asian race was again associated with a reduced risk of being in a cluster.

Epidemiologic Investigation of RFLP Clusters

Intensive epidemiologic investigations were conducted of the 3 largest clusters and the 20 clusters composed of only two patients. Thus, 23 of the 44 clusters (52 percent), or 108 of the 191 patients (56 percent) with isolates with identical RFLP patterns, were included in this analysis.

Routine investigation had established that 12 patients in the largest cluster (Table 2) were living in or employed by a residential facility for patients with AIDS [12]. Our RFLP analysis identified an additional 18 patients with isolates with the same fingerprint who were not previously known to have any association with the facility. Seven of these patients were available for interview, eight had died, and three could not be located or refused to be interviewed.

The apparent index patient in this cluster was a 38-year-old white man with AIDS who was receiving general assistance, was not compliant with antituberculous therapy, and had had positive sputum smears for approximately six months.
Specific transmission links could be established among nine of the patients who were not associated with the residential facility (Figure 2, Transmission Links Identified between Patients with Isolates in the Largest Cluster): two named one another as contacts, three were on the same hospital ward, and four were in the same general medical clinic at a time when it was reasonable to assume that transmission had occurred. Although seven additional patients were homeless, homosexual, or substance abusers, they were not otherwise linked epidemiologically. Three patients had no discernible connection with any of the other patients.

The second-largest cluster contained 23 patients who were primarily young (average age, 33 years), born in the United States (18 patients), and male (19 patients); 13 had AIDS, and 8 were substance abusers. The index patient was a 28-year-old white HIV-infected transsexual man who was an intravenous drug user and a prostitute. He had been found to have tuberculosis, with a positive sputum smear, shortly after moving to San Francisco and was noncompliant with therapy. The M. tuberculosis strain found in this patient was next isolated from four other young homeless HIV-infected intravenous drug users over a three-month period and subsequently from a more diverse group of patients.
The apparent index patient in the third-largest cluster (15 patients) was a 36-year-old HIV-seronegative black alcoholic man with cavitary pulmonary tuberculosis. He frequently used public facilities, including homeless shelters, detoxification centers, public clinics, and hospitals. This patient also was noncompliant with therapy and had had positive sputum smears for nine months. Most of the other patients in this cluster were also black (12 patients) and alcoholic (8 patients); only 5 of the 15 patients were recorded as having AIDS.

Efficacy of Contact Tracing

A conventional investigation of the patients' contacts identified connections among only 19 of the 191 patients (10 percent) found to be connected by RFLP analysis. Of the three largest clusters, contact tracing identified only the outbreak in the AIDS facility.

To examine further the relation between the patients' characteristics and the accuracy of conventional contact tracing, we studied the 20 clusters that contained only two patients each. Conventional contact tracing conducted before the RFLP results were available predicted transmission in only four of these clusters, all of them involving contact between an older patient who presumably had reactivated tuberculosis and a younger person in a traditional household setting. No instances of transmission between immigrants, transients, or patients with AIDS were predicted from the contact investigation.

Discussion

We used a systematic, population-based RFLP analysis of M. tuberculosis isolates in conjunction with conventional epidemiologic methods to describe the contemporary pattern of tuberculosis transmission in San Francisco. The information produced by this approach is consistent with that yielded by traditional reporting practices in that it enumerates and characterizes the cases that occurred during a given period in a single public health jurisdiction.
However, our data provide considerably more information about tuberculosis transmission in this urban area, including evidence that an important factor in the resurgence of tuberculosis, despite an efficient tuberculosis-control program, is the ongoing transmission of a few strains of M. tuberculosis in specific subgroups of the population.

The use of RFLP analysis to identify the pathways of tuberculosis transmission within a community is based on the premise that epidemiologically unrelated cases will have occurred as a result of the reactivation of latent infection and thus have unique RFLP patterns, whereas cases that are linked as a consequence of recent infection will have the same patterns (i.e., appear in a defined cluster). In this study, the first contention is supported by the vast diversity of RFLP patterns in San Francisco: 326 distinct patterns among the 487 strains analyzed. The second is supported by the congruence of the molecular-fingerprinting data and results of the epidemiologic study of tuberculosis outbreaks [4,8-21].
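Under this premise, the share of cases attributable to recent transmission can be worked out from the cluster counts reported in the Results (191 clustered patients, 44 clusters, 473 patients analyzed). The sketch below assumes that each cluster contains exactly one index patient with reactivated disease, so that a cluster of n patients contributes n - 1 recently infected patients. The function name is hypothetical.

```python
def recent_transmission_fraction(clustered, clusters, total):
    """Fraction of cases due to recent infection, assuming one index
    patient (reactivated disease) per cluster."""
    if total <= 0 or not (0 <= clusters <= clustered <= total):
        raise ValueError("expected 0 <= clusters <= clustered <= total and total > 0")
    return (clustered - clusters) / total


clustered, clusters, total = 191, 44, 473  # counts from the Results
in_cluster = clustered / total
recent = recent_transmission_fraction(clustered, clusters, total)

print(f"in a cluster: {in_cluster:.0%}")      # in a cluster: 40%
print(f"recently infected: {recent:.0%}")     # recently infected: 31%
```

The result is a lower bound in spirit: it counts only culture-confirmed cases that progressed to disease within the study window.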
We found that 191 of the 473 patients (40 percent) had 1 of 44 clustered RFLP patterns and thus may have been epidemiologically linked. Assuming that a typical cluster of n persons comprises one index patient with reactivated disease and n - 1 patients with recently acquired disease, we estimate that at least 31 percent ([191 - 44]/473) of the 473 cases were due to recent infection that had progressed to active disease during the two-year study period. Because RFLP analysis can only be used to analyze microbiologically confirmed cases, patients who became infected but whose infection remained latent during the course of the study were not identified. Reactivation of infection in these latently infected persons will continue to produce overt disease for decades. As a result, the true magnitude of the increased burden of tuberculosis due to recent M. tuberculosis infection in San Francisco is probably greater than our estimate of 31 percent.

A principal objective of this study was the identification of risk factors for recent infection. Because we focused only on cases reported during a two-year period, our analysis of risk factors encompassed only the subgroup of recently infected patients whose infection progressed to active disease during this interval. As a result, epidemiologic risk factors for transmission are necessarily combined with biologic risk factors that are associated with rapid progression.

For patients less than 60 years of age, a diagnosis of AIDS, birth in the United States, black race, and Hispanic ethnicity were found by multivariate analysis to be significant, independent risk factors. HIV seropositivity itself was not a significant risk factor for clustering in the patients for whom HIV serologic data were available (data not shown), probably reflecting the importance of the degree of immunosuppression in the development of tuberculosis. In contrast, patients with AIDS and severe immunosuppression are at increased risk of being in a cluster.
This probably reflects the combined effects of a shortened interval between infection and active disease and the tendency for patients with AIDS to be brought together in common medical or living facilities.

Being born in the United States also might act as a risk factor through a biologic mechanism, since most such persons will have a negative tuberculin test and thus lack the relative immunity associated with latent tuberculosis. In younger subjects, birth outside the United States protected against newly acquired infection. Even after adjustment for race and ethnicity, the immigrant population was significantly more likely to have reactivated disease (and was less likely to be in a cluster) than persons born in the United States. This may reflect the high rate of latent tuberculosis infection in children born in developing countries. If so, our results suggest that childhood infection both protects immigrants from new infection and places them at risk for reactivation.

Strikingly different risk factors were found for persons 60 years of age or older. In this age group, treatment at the municipal tuberculosis clinic was the only variable identified as a risk factor for clustering. Because most patients cared for in this clinic have already been given a diagnosis of tuberculosis, the clinic itself is unlikely to have been a locus for transmission. Instead, its use may be a proxy for the use of other social and medical facilities where transmission may have occurred. In the older age group, being Asian was a significant negative risk factor for clustering, probably because many older patients have latent infection that may become reactivated.

Epidemiologic investigation of the three largest clusters reconfirms that a single patient with highly infectious disease can have a major impact on urban programs of tuberculosis control. Each of the index patients had positive smears and was poorly compliant in taking the prescribed antimicrobial therapy. In the largest cluster the putative index patient, one of the few patients not treated successfully by the San Francisco Tuberculosis Control Program, apparently infected 29 additional patients. Thus, this one patient accounted for 6 percent of the cases evaluated in San Francisco during the study period. Data collected by the Centers for Disease Control and Prevention show that such noncompliant patients are uncommon in San Francisco, where during the study period at least 95 percent of patients completed their regimens of antituberculous drugs. The cumulative contribution of such persons may be much greater in areas where compliance rates are lower and multidrug-resistant tuberculosis is prevalent.

Overall, conventional contact tracing, conducted by an efficient tuberculosis-control program, identified only 10 percent of the patients in clusters. This low level of efficacy is best explained by the overrepresentation in clusters of unemployed and homeless persons, who may have become infected in settings determined primarily by lifestyle and by social subgroups. Contacts of this kind may have been multiple, transient, and difficult to reconstruct by routine tracing techniques. The overrepresentation of patients with AIDS may also have reduced the efficacy of contact tracing in this group, since the presumably increased susceptibility of such persons to tuberculosis may have permitted transmission to occur in settings where exposure is neither prolonged nor intense.
Casual transmission of this kind is hard to detect with current techniques of contact tracing.

This study has three major implications for urban tuberculosis control. First, because more cases of tuberculosis are arising as a result of recent infection with M. tuberculosis than has been heretofore appreciated, increased emphasis should be placed on the identification of sites of transmission and the application of environmental controls. Second, because a single infectious patient may have devastating effects on tuberculosis control, the treatment of patients with infectious tuberculosis must be prompt and effective. Third, because only 10 percent of the patients in clusters were identified by a conventional investigation of contacts, novel approaches to contact tracing may need to be developed and targeted to specific populations.

Supported in part by the Howard Hughes Medical Institute, grants from the National Institutes of Health (K08 AI01137-01 and R01 AI34238-01), and a grant from the Centers for Disease Control and Prevention (U52-CCU 900454).

We are indebted to the personnel of the San Francisco Department of Public Health Division of Tuberculosis Control, whose high quality of service and cooperation have made this work possible; to Aimee LaPerriere-Hunt for diligent research assistance; to Arthur Back (deceased), Anna Babst of the San Francisco Public Health Laboratory, Arthur Reingold, Gretchen Anderson, and the Western Consortium for Public Health, Bacterial and Mycotic Surveillance Project for assistance with the collection of M. tuberculosis; to Kevan Gross and Eric Preston for essential assistance with computer-software design; to Karl Reich for many thoughtful discussions regarding the molecular biology of M. tuberculosis; to Lorene Nelson and Jerry Halpern of the Stanford University Department of Health Research and Policy for important advice about statistical analysis; and to Dr. Nancy Krieger, Kaiser Permanente Division of Research, Oakland, Calif., for San Francisco County census-tract information.