Kellie Watkins 1
Final Project – Advanced Design Analysis
Introduction: In 2010, Texas had the fourth highest occurrence of new diagnoses of HIV/AIDS
and TB. The Texas Department of State Health Services (DSHS) identified HIV-infected
persons as a high-risk population for TB in Harris County. To explore this relationship
further, co-morbidity was examined through regression analyses.
Problem, Research Question, Hypothesis: A population-based ecological study was conducted
to identify areas with a high number of new TB and HIV diagnoses in Harris County, Texas from
2009 through 2010. Rates of new TB and HIV diagnoses were linked to socio-economic variables at
the census tract level. The independent variables include the following: housing size, American
Indian/Alaska Native, Asian, Associate's degree, Bachelor's degree, Black, divorced, English
(primary language), female, foreign born, graduate or professional degree, high school degree,
Hispanic, less than 9th grade education, male, married or separated, Native Hawaiian/Pacific
Islander, native born, never married, no educational diploma, other language (non-English
speakers), non-Hispanic, separated, some college-level education, White, other ethnicity, two
ethnicities or more, widowed, poverty, and unemployed. All variables are measured as a
percentage of the total population per census tract according to the 2010 census data; therefore,
each independent variable is continuous. The literature indicates that these variables might be
important risk factors for HIV and/or TB among the Harris County population, but initial
analysis indicated multicollinearity. Furthermore, particular variables might be
highly correlated (e.g., educational level). Factor analysis might reduce redundancy among these
highly correlated variables. Factor analysis was the primary focus of this project, although a final
logistic regression model was fit to analyze the results.
The average rate of new HIV diagnoses and the average rate of new TB diagnoses were
6.81 and 1.87 per 10,000 persons, respectively, in Harris County over the two-year period. A
co-morbidity variable was created that identified census tracts with above-average rates of HIV and
TB, and this variable represents the outcome of interest. Logistic regression analysis was used to
assess the relationship between the predictor variables and co-morbidity. The main research
question is whether known risk factors for HIV and known risk factors
for TB are significantly associated with the co-morbidity of HIV and TB at the population level.
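The construction of the binary co-morbidity outcome can be sketched as follows. This is a minimal illustration, not the code used in the project: the tract names and rates are invented, and the rule that a tract is flagged only when both rates strictly exceed the county averages is an assumption, since the report does not spell out the exact cutoff.

```python
# Hypothetical per-tract rates per 10,000 persons (illustrative values only).
tract_rates = {
    "tract_A": {"hiv": 9.2, "tb": 2.5},
    "tract_B": {"hiv": 3.1, "tb": 2.9},
    "tract_C": {"hiv": 8.0, "tb": 0.4},
    "tract_D": {"hiv": 1.5, "tb": 0.0},
}

# County-wide averages reported in the text (per 10,000 persons).
HIV_MEAN = 6.81
TB_MEAN = 1.87


def comorbidity_flag(hiv_rate, tb_rate, hiv_mean=HIV_MEAN, tb_mean=TB_MEAN):
    """Return 1 when both rates exceed their county averages, else 0.

    Assumption: "above average" means strictly greater than the mean for
    BOTH diseases; the report does not state the exact rule.
    """
    return int(hiv_rate > hiv_mean and tb_rate > tb_mean)


outcome = {
    name: comorbidity_flag(rates["hiv"], rates["tb"])
    for name, rates in tract_rates.items()
}
print(outcome)
```

Only the first hypothetical tract exceeds both averages, so it is the only one coded 1.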
Hypothesis: $H_0: \beta_m = \dots = \beta_{m+t} = 0$; $H_a$: at least one of $\beta_m, \dots, \beta_{m+t} \neq 0$
Description of Data: TB data are from the Department of State Health Services and originate
from the local TB control program. HIV data are from the Houston Department of Health and
Human Services (HDHHS). TB and HIV are reportable diseases, and reporting is required by law.
Although the data are de-identified and analyzed at the population level, HIV/TB data may not be
shared with unauthorized parties. The project received approval from relevant stakeholders and the
University of Texas Institutional Review Board.
Geocoding: Subject-level data served as the source data. The addresses of new HIV diagnoses
and the addresses of new TB diagnoses were aggregated at the census tract level through
geocoding. New HIV diagnoses were geocoded by staff at HDHHS through a Centers for
Disease Control and Prevention grant project. TB data were transmitted to HDHHS and included
all new diagnoses beginning in 2000 for the state of Texas. For the purposes of this project,
point level addresses were geocoded using the online address locator ArcGIS 9.3 North America
Geocode Service. A total of 735 TB case records from DSHS were located within Harris County
after selecting cases by attribute. There were 395 cases of TB in 2009 and 340 cases of TB in
2010. The average match score was 93.75 with a maximum of 100 and a minimum of 73.45.
Census Level: Individual-level residential addresses for HIV and TB needed to be linked to a
Harris County census tract shapefile at the polygon level. Summing all the points within an
individual census tract produced the total count per census tract. These counts,
representing new diagnoses of HIV and TB respectively, were divided by the total
census tract population for each tract to produce the rates for 2009 and 2010.
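The aggregation step can be illustrated with a short sketch. The tract IDs, case locations, and populations below are hypothetical, and the scaling to a rate per 10,000 persons is assumed from the units reported earlier; whether the project annualized the two-year counts is not stated, so no annualization is applied here.

```python
from collections import Counter

# Hypothetical geocoded cases: each entry is the census tract a case fell in.
case_tracts = ["4101", "4101", "4102", "4101", "4103"]

# Hypothetical total population per census tract.
population = {"4101": 4000, "4102": 2500, "4103": 8000, "4104": 5000}


def rates_per_10k(case_tracts, population):
    """Count cases per tract and divide by tract population (per 10,000)."""
    counts = Counter(case_tracts)
    # Tracts with no geocoded cases get a count, and therefore a rate, of zero.
    return {
        tract: 10000 * counts.get(tract, 0) / pop
        for tract, pop in population.items()
    }


rates = rates_per_10k(case_tracts, population)
print(rates)
```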
Census Variables: 2010 Census data were collected by the U.S. Census Bureau through the
decennial census questionnaire that is mandated for every household in the United States. Detailed
population and housing data were released on 21 December 2010. Census tracts represent statistical
subdivisions within each county. According to the U.S. Census Bureau, most tracts contain
2,500 to 8,000 individuals, although tract size varies widely because it depends on the density of
residents. Census tracts do not cross county boundaries; therefore, each census tract within Harris
County lies entirely within the county's borders. There are a total of 786 census tracts located within the
borders of Harris County. Census tracts are designed to be homogeneous across demographic
variables such as socioeconomic status and other population characteristics, making them ideal for
public health research focused on community-based intervention and outreach programs such as this
thesis.
The literature suggested that particular variables were potential risk factors for HIV or TB.
These variables include age, housing size, race, sex, educational level, employment status, poverty,
and marital status. These variables exist in various public databases available online through the
U.S. Census Bureau. Each potential variable of interest was extracted individually and linked to the
TB and HIV outcomes through ArcGIS using the census tract as the common factor. All variables
were continuous because of their aggregate nature. For instance, instead of an individual “race”
variable with subsequent divisions such as Black, Asian, White, etc., each individual race represents
a separate variable such as the percentage of Black residents in a census tract, the percentage of
White residents in a census tract, and so forth.
Data Analysis and Results: Initially, before the creation of the binary outcome, ordinary least
squares regression was attempted using ArcGIS 9.3. However, the program returned an error message
stating that multicollinearity was preventing it from running correctly. Despite
successive elimination of variables, multicollinearity remained. As a result, it was determined that
SAS 9.3 would be used to run additional analyses in hopes of resolving the multicollinearity
issue.
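One reason percentage variables of this kind tend to be collinear is that shares of the same population (for example, education categories) nearly sum to a constant. A common diagnostic, which was not part of the reported analysis, is the variance inflation factor (VIF). The sketch below assumes NumPy is available and uses synthetic data with invented variable names; it computes each VIF as the corresponding diagonal element of the inverse correlation matrix.

```python
import numpy as np

# Synthetic tract-level percentages (all values and names are hypothetical).
rng = np.random.default_rng(0)
n = 200
pct_hs = rng.uniform(10, 40, n)
pct_college = rng.uniform(5, 30, n)
# The third share is almost fully determined by the first two.
pct_no_diploma = 100 - pct_hs - pct_college + rng.normal(0, 0.5, n)
# An unrelated predictor for contrast.
pct_unemployed = rng.uniform(2, 15, n)

X = np.column_stack([pct_hs, pct_college, pct_no_diploma, pct_unemployed])


def variance_inflation_factors(X):
    """VIF for each column: the diagonal of the inverse correlation matrix."""
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))


vif = variance_inflation_factors(X)
for name, value in zip(["hs", "college", "no_diploma", "unemployed"], vif):
    print(f"{name}: VIF = {value:.1f}")
```

In this synthetic setup the three education shares have very large VIFs while the unrelated predictor stays close to 1, which is the pattern that would make an ordinary least squares fit unstable.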
An initial correlation matrix was produced that demonstrated the high degree of
correlation among the variables (see Appendix I for a partial representation of the correlation
coefficients). A remaining issue was that some variables were clearly independent of each other
while others, such as the “race” variables, essentially measured the same construct but were not
equivalent and, consequently, could not be combined. Furthermore, continuing the example with race, Black
and Asian were each risk factors for HIV and TB, respectively. Maintaining each as an independent
variable was deemed essential, and logistic regression was still the preferred method of analysis.
After consultation with a biostatistician, factor analysis was recommended for the variables that
were highly correlated and logically related. The goal was to create two subscales, education
and marital status, as predictors in the final regression model to reduce at least some of the
redundancy. The mean of the items in each factor, for education and marital status respectively,
was used as a continuous variable in the logistic regression model.
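The subscale construction described above amounts to averaging the item percentages within a tract. A minimal sketch, assuming equal weighting of items; the item names and values are hypothetical, and which items were actually retained for each subscale is not stated in the report.

```python
from statistics import mean

# Hypothetical tract record: marital-status items as percentages of the tract population.
tract = {"divorced": 10.0, "never_married": 30.0, "separated": 2.0, "widowed": 6.0}

# Hypothetical item list for a "single" subscale.
SINGLE_ITEMS = ["divorced", "never_married", "separated", "widowed"]


def subscale_score(record, items):
    """Unweighted mean of the listed item percentages for one tract."""
    return mean(record[item] for item in items)


single = subscale_score(tract, SINGLE_ITEMS)
print(single)
```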
Prior to analyzing the items related to education and marital status, an
initial factor method (principal components) was run with varimax rotation in the SAS
system using all variables of interest. There were a total of 32 eigenvalues with an average of 1.
For the initial factor method (principal components), 7 factors were retained by the mineigen
criterion of 1.0 (see Appendix II), where values greater than 0.35 within each factor were flagged
with an asterisk. The variance explained by each factor was 10.73 for Factor 1, 4.80 for Factor
2, 2.79 for Factor 3, 1.94 for Factor 4, 1.35 for Factor 5, 1.21 for Factor 6, and 1.06 for Factor 7.
Using the rotation method varimax, an orthogonal transformation matrix was produced (see
Appendix II). After rotation, the variance explained by each factor was 8.81 for Factor 1, 5.12 for
Factor 2, 2.55 for Factor 3, 2.36 for Factor 4, 2.07 for Factor 5, 1.69 for Factor 6, and 1.27 for Factor 7.
However, when viewing the seven factors and the variables with values greater than 0.35 per
factor, it was difficult to interpret the results. The resulting patterns did not
group variables in a way that could be easily or logically explained. While it was understood why
particular relationships or correlations between these variables might exist, the factors were not
suitable for producing a meaningful interpretation.
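The steps in this paragraph (principal components on the correlation matrix, retention of factors with eigenvalues above 1, varimax rotation, and flagging of values above 0.35) can be sketched in Python. This is an illustration under stated assumptions rather than a reproduction of the SAS run: it assumes NumPy is available, uses synthetic data with six hypothetical variables, and applies raw varimax without row normalization, which may differ from the SAS defaults.

```python
import numpy as np


def principal_component_loadings(X, min_eigen=1.0):
    """Eigen-decompose the correlation matrix and keep components whose
    eigenvalue exceeds min_eigen (the eigenvalue-greater-than-one rule)."""
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)      # returned in ascending order
    order = np.argsort(eigvals)[::-1]            # re-sort to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > min_eigen
    # Loadings: eigenvectors scaled by the square root of their eigenvalues.
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    return eigvals, loadings


def varimax(loadings, max_iter=100, tol=1e-8):
    """Raw varimax rotation (no Kaiser row normalization)."""
    p, k = loadings.shape
    rotation = np.eye(k)
    previous = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        col_ss = np.diag((rotated ** 2).sum(axis=0))
        u, s, vt = np.linalg.svd(loadings.T @ (rotated ** 3 - rotated @ col_ss / p))
        rotation = u @ vt
        if s.sum() < previous * (1 + tol):       # criterion stopped improving
            break
        previous = s.sum()
    return loadings @ rotation, rotation


# Synthetic data: six hypothetical variables driven by two latent factors.
rng = np.random.default_rng(1)
n = 500
f1, f2 = rng.normal(size=(2, n))
X = np.column_stack(
    [f1 + 0.3 * rng.normal(size=n) for _ in range(3)]
    + [f2 + 0.3 * rng.normal(size=n) for _ in range(3)]
)

eigvals, loadings = principal_component_loadings(X)
rotated, rotation = varimax(loadings)
flagged = np.abs(rotated) > 0.35   # analogous to the asterisk flag at 0.35

print("eigenvalue mean:", eigvals.mean())
print("factors retained:", loadings.shape[1])
print("variance explained before rotation:", (loadings ** 2).sum(axis=0))
print("variance explained after rotation: ", (rotated ** 2).sum(axis=0))
```

Two properties of the reported output follow from the method itself. The eigenvalues of a correlation matrix sum to the number of variables, which is why 32 eigenvalues average 1. An orthogonal rotation redistributes variance across factors without changing the total, which is consistent with the unrotated and rotated values in the text summing to nearly the same figure (23.88 and 23.87).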
Consequently, the same analysis was run but with fewer variables that were meaningfully
related. Because far fewer variables were analyzed in each factor analysis,
this type of analysis might not be the most desirable approach, but it was pursued in hopes of
reducing the number of variables by finding a meaningful measure of similar components.
For marital status, the percent divorced, married but separated, never married, separated, and
widowed were analyzed. To maintain consistency, the factor procedure with varimax rotation
was used in the SAS system. Three factors were produced, and within each factor the values
greater than 0.35 were flagged with an asterisk (see Appendix III). The variance explained by
each factor was 0.85 for Factor 1, 0.56 for Factor 2, and 0.33 for Factor 3. A new variable
named “single” was created based on the factors. For education, the percent with an
associate’s degree, bachelor’s degree, graduate or professional level education, high school
degree, less than 9th grade, no educational diploma, and some college education were analyzed.
To maintain consistency, the factor procedure with varimax rotation was used in the SAS
system. Three factors were produced, and within each factor the values greater than 0.35 were
flagged with an asterisk (see Appendix IV). The variance explained by each factor was 2.01 for
Factor 1, 1.78 for Factor 2, and 1.19 for Factor 3. A new variable named “education” was created
based on the factors. The goal of the factor analysis was to reduce the number of
variables in the final model through logical and statistical reasoning.
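The reported variances can be put on a common scale by dividing by the number of items analyzed. The sketch below uses only figures quoted in the text; it assumes each analysis was run on a correlation matrix, so that total variance equals the number of items (32 overall, 5 for marital status, 7 for education, as counted from the lists above).

```python
# (variance explained per factor, number of items analyzed), taken from the text.
FACTOR_VARIANCE = {
    "all variables (unrotated)": ([10.73, 4.80, 2.79, 1.94, 1.35, 1.21, 1.06], 32),
    "all variables (varimax)": ([8.81, 5.12, 2.55, 2.36, 2.07, 1.69, 1.27], 32),
    "marital status": ([0.85, 0.56, 0.33], 5),
    "education": ([2.01, 1.78, 1.19], 7),
}


def share_of_total_variance(variances, n_items):
    """Fraction of total standardized variance captured by the factors.

    Assumption: the analysis used a correlation matrix, so total variance
    equals the number of items.
    """
    return sum(variances) / n_items


shares = {
    name: share_of_total_variance(variances, n_items)
    for name, (variances, n_items) in FACTOR_VARIANCE.items()
}
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

Under that assumption the seven factors account for roughly 75 percent of the variance in the full variable set, the education factors for roughly 71 percent, and the marital status factors for roughly 35 percent.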
Model Building: A univariate analysis was run for each of the remaining variables of interest
(i.e., potential risk factors) and the binary co-morbidity outcome. Housing size, single,
education, poverty, unemployment, Asian, Black, non-English speakers, foreign born, and male
were significantly associated with HIV/TB at the 5% significance level in the univariate analyses.
Age (p-value of 0.0510) and Hispanic (p-value of 0.1060) were marginally significant and still
considered important risk factors for the model building process. Multiple
logistic regression models were fit, using the literature to determine which variables were best suited
as predictors of co-morbidity and verifying their association with the outcome variable in
the individual analyses. There was some residual correlation between predictors, such as
between Black and poverty, but each variable was retained because of its established
importance as an individual risk factor. Some interaction terms were considered but rejected based
on their lack of meaningful interpretation.
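The univariate screening step can be illustrated with a single-predictor logistic regression fit by Newton-Raphson, followed by a two-sided Wald test on the slope. This is a sketch with invented data, not the SAS procedure used in the project; the predictor values, outcomes, and function name are all hypothetical.

```python
import math


def fit_univariate_logit(x, y, n_iter=25):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson.

    Returns (b0, b1, se_b1, p_value), where p_value is the two-sided
    Wald test of b1 = 0 based on the normal approximation.
    """
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p                 # score for the intercept
            g1 += (yi - p) * xi          # score for the slope
            h00 += w                     # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta += inverse(information) @ score, written out for 2x2.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    se_b1 = math.sqrt(h00 / det)         # from the inverse information matrix
    z = b1 / se_b1
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return b0, b1, se_b1, p_value


# Hypothetical data: percent unemployed per tract and the binary outcome.
pct_unemployed = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
comorbid = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1]

b0, b1, se_b1, p_value = fit_univariate_logit(pct_unemployed, comorbid)
print(f"slope = {b1:.3f}, SE = {se_b1:.3f}, p = {p_value:.4f}")
```

A predictor would pass this kind of screen when its p-value falls below the chosen threshold (0.05 in the report), with borderline variables retained on substantive grounds as described above.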
Assessment and Discussion: After careful consideration and significant debate, it was
concluded that separate analyses for HIV and TB as individual, binary outcomes should be run
to determine which independent variables were significantly associated with above-average rates
of HIV at the census tract level and above-average rates of TB at the census tract level, for
comparison with the co-morbidity model. For the HIV model, significant variables (p-value<0.05)
were housing size, Asian, Black, foreign born, and age. Poverty (p-value of 0.07) was
marginally significant. For the TB model, significant variables (p-value<0.05) were poverty,
Asian, and Black. The direction of the coefficient for Asian reversed relative to the HIV model,
with an increased percentage of Asian populations associated with a higher rate of TB. Education
(p-value of 0.08) was marginally significant. The final co-morbidity model included the following
independent variables: housing size, single, education, poverty, unemployment, Asian, Black, Hispanic,
foreign born, male, and age. Age and male were considered, although they were not significant
in the single models, because the literature indicates that younger age and male sex are
risk factors for HIV and/or TB in Harris County. For the final model, housing size,
unemployment, Black, and foreign born were significant (p-value<0.05). Furthermore, the AIC
value was smaller for this model (620.264) than for the other models run previously
for the co-morbidity outcome. The significance of unemployment was unexpected because
poverty is often considered a substitute for this variable, yet poverty was not
significant in this model. For the significant predictors, the relationship indicated that an increased
percentage of smaller households was associated with lower co-morbidity, while increased
percentages of unemployed, Black, and foreign-born residents were associated with higher
co-morbidity. Asian was another variable that needs additional consideration because it appears to be a
protective factor against higher rates of HIV at the census tract level but a risk
factor for higher rates of TB at the census tract level. Results indicate that a higher percentage of
Asian populations might be associated with lower co-morbidity.
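The model comparison by AIC mentioned above follows the standard definition, AIC = 2k - 2 ln(L), where k is the number of estimated parameters and L is the maximized likelihood; smaller values are preferred. The sketch below uses invented log-likelihoods and parameter counts for three hypothetical candidate models and does not reproduce the reported value of 620.264.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2*ln(L)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical candidates: (maximized log-likelihood, number of parameters).
candidates = {
    "model_a": (-310.0, 6),
    "model_b": (-300.0, 12),
    "model_c": (-299.5, 15),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)
print(scores, "preferred:", best)
```

In this invented example the middle model is preferred: adding three more parameters improves the log-likelihood too little to offset the penalty.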
Limitations and Recommendations: Factor analysis was applied in an attempt to address the
multicollinearity issues involved in this procedure. However, there might have been other,
preferable methods to reduce the redundancy of variables, including a simple review of the
literature. In addition, creating subscales based on the factor loadings might not have been
effective or done correctly given the nature of factor analysis. It was judged to be
the best method to apply within the limited timeframe and tools available, but additional
possibilities should be pursued in the future. Furthermore, the data cleaning and geocoding
process should be reviewed for accuracy because of the complicated and extensive labor that was
required to create the final database. Additionally, the initial analysis was run before any
understanding of multilevel analysis was available. The next steps have already begun to pursue
this analysis using the subject-level TB data that were initially transmitted by DSHS. In this case,
both the subject level and the census tract level can be considered instead of aggregating the data.
APPENDIX I:
APPENDIX II: Initial Factor Method: Principal Components
APPENDIX III: Marital Status
APPENDIX IV: Education
APPENDIX V: HIV, TB, and HIV/TB Co-Morbidity Models
Outcome: HIV (binary)
Outcome: TB (binary)
Outcome: HIV/TB (co-morbidity, binary)

High Profile Call Girls Coimbatore Saanvi☎️  8250192130 Independent Escort Se...High Profile Call Girls Coimbatore Saanvi☎️  8250192130 Independent Escort Se...
High Profile Call Girls Coimbatore Saanvi☎️ 8250192130 Independent Escort Se...
 
Call Girls Cuttack Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Cuttack Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Cuttack Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Cuttack Just Call 9907093804 Top Class Call Girl Service Available
 
Bangalore Call Girls Hebbal Kempapura Number 7001035870 Meetin With Bangalor...
Bangalore Call Girls Hebbal Kempapura Number 7001035870  Meetin With Bangalor...Bangalore Call Girls Hebbal Kempapura Number 7001035870  Meetin With Bangalor...
Bangalore Call Girls Hebbal Kempapura Number 7001035870 Meetin With Bangalor...
 
Call Girls Darjeeling Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Darjeeling Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Darjeeling Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Darjeeling Just Call 9907093804 Top Class Call Girl Service Available
 
Escort Service Call Girls In Sarita Vihar,, 99530°56974 Delhi NCR
Escort Service Call Girls In Sarita Vihar,, 99530°56974 Delhi NCREscort Service Call Girls In Sarita Vihar,, 99530°56974 Delhi NCR
Escort Service Call Girls In Sarita Vihar,, 99530°56974 Delhi NCR
 
VIP Service Call Girls Sindhi Colony 📳 7877925207 For 18+ VIP Call Girl At Th...
VIP Service Call Girls Sindhi Colony 📳 7877925207 For 18+ VIP Call Girl At Th...VIP Service Call Girls Sindhi Colony 📳 7877925207 For 18+ VIP Call Girl At Th...
VIP Service Call Girls Sindhi Colony 📳 7877925207 For 18+ VIP Call Girl At Th...
 
VIP Call Girls Tirunelveli Aaradhya 8250192130 Independent Escort Service Tir...
VIP Call Girls Tirunelveli Aaradhya 8250192130 Independent Escort Service Tir...VIP Call Girls Tirunelveli Aaradhya 8250192130 Independent Escort Service Tir...
VIP Call Girls Tirunelveli Aaradhya 8250192130 Independent Escort Service Tir...
 
Call Girls Service Jaipur Grishma WhatsApp ❤8445551418 VIP Call Girls Jaipur
Call Girls Service Jaipur Grishma WhatsApp ❤8445551418 VIP Call Girls JaipurCall Girls Service Jaipur Grishma WhatsApp ❤8445551418 VIP Call Girls Jaipur
Call Girls Service Jaipur Grishma WhatsApp ❤8445551418 VIP Call Girls Jaipur
 
Call Girls Service Navi Mumbai Samaira 8617697112 Independent Escort Service ...
Call Girls Service Navi Mumbai Samaira 8617697112 Independent Escort Service ...Call Girls Service Navi Mumbai Samaira 8617697112 Independent Escort Service ...
Call Girls Service Navi Mumbai Samaira 8617697112 Independent Escort Service ...
 
Call Girls Ooty Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Ooty Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Ooty Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Ooty Just Call 9907093804 Top Class Call Girl Service Available
 
Best Rate (Hyderabad) Call Girls Jahanuma ⟟ 8250192130 ⟟ High Class Call Girl...
Best Rate (Hyderabad) Call Girls Jahanuma ⟟ 8250192130 ⟟ High Class Call Girl...Best Rate (Hyderabad) Call Girls Jahanuma ⟟ 8250192130 ⟟ High Class Call Girl...
Best Rate (Hyderabad) Call Girls Jahanuma ⟟ 8250192130 ⟟ High Class Call Girl...
 
♛VVIP Hyderabad Call Girls Chintalkunta🖕7001035870🖕Riya Kappor Top Call Girl ...
♛VVIP Hyderabad Call Girls Chintalkunta🖕7001035870🖕Riya Kappor Top Call Girl ...♛VVIP Hyderabad Call Girls Chintalkunta🖕7001035870🖕Riya Kappor Top Call Girl ...
♛VVIP Hyderabad Call Girls Chintalkunta🖕7001035870🖕Riya Kappor Top Call Girl ...
 
(👑VVIP ISHAAN ) Russian Call Girls Service Navi Mumbai🖕9920874524🖕Independent...
(👑VVIP ISHAAN ) Russian Call Girls Service Navi Mumbai🖕9920874524🖕Independent...(👑VVIP ISHAAN ) Russian Call Girls Service Navi Mumbai🖕9920874524🖕Independent...
(👑VVIP ISHAAN ) Russian Call Girls Service Navi Mumbai🖕9920874524🖕Independent...
 
Call Girls Service Surat Samaira ❤️🍑 8250192130 👄 Independent Escort Service ...
Call Girls Service Surat Samaira ❤️🍑 8250192130 👄 Independent Escort Service ...Call Girls Service Surat Samaira ❤️🍑 8250192130 👄 Independent Escort Service ...
Call Girls Service Surat Samaira ❤️🍑 8250192130 👄 Independent Escort Service ...
 
Russian Call Girls in Delhi Tanvi ➡️ 9711199012 💋📞 Independent Escort Service...
Russian Call Girls in Delhi Tanvi ➡️ 9711199012 💋📞 Independent Escort Service...Russian Call Girls in Delhi Tanvi ➡️ 9711199012 💋📞 Independent Escort Service...
Russian Call Girls in Delhi Tanvi ➡️ 9711199012 💋📞 Independent Escort Service...
 
Call Girls Coimbatore Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Coimbatore Just Call 9907093804 Top Class Call Girl Service AvailableCall Girls Coimbatore Just Call 9907093804 Top Class Call Girl Service Available
Call Girls Coimbatore Just Call 9907093804 Top Class Call Girl Service Available
 
Kesar Bagh Call Girl Price 9548273370 , Lucknow Call Girls Service
Kesar Bagh Call Girl Price 9548273370 , Lucknow Call Girls ServiceKesar Bagh Call Girl Price 9548273370 , Lucknow Call Girls Service
Kesar Bagh Call Girl Price 9548273370 , Lucknow Call Girls Service
 
💎VVIP Kolkata Call Girls Parganas🩱7001035870🩱Independent Girl ( Ac Rooms Avai...
💎VVIP Kolkata Call Girls Parganas🩱7001035870🩱Independent Girl ( Ac Rooms Avai...💎VVIP Kolkata Call Girls Parganas🩱7001035870🩱Independent Girl ( Ac Rooms Avai...
💎VVIP Kolkata Call Girls Parganas🩱7001035870🩱Independent Girl ( Ac Rooms Avai...
 
Bangalore Call Girl Whatsapp Number 100% Complete Your Sexual Needs
Bangalore Call Girl Whatsapp Number 100% Complete Your Sexual NeedsBangalore Call Girl Whatsapp Number 100% Complete Your Sexual Needs
Bangalore Call Girl Whatsapp Number 100% Complete Your Sexual Needs
 
Artifacts in Nuclear Medicine with Identifying and resolving artifacts.
Artifacts in Nuclear Medicine with Identifying and resolving artifacts.Artifacts in Nuclear Medicine with Identifying and resolving artifacts.
Artifacts in Nuclear Medicine with Identifying and resolving artifacts.
 

Sample, advanced epidemiology factor analysis

  • 1. Kellie Watkins 1 Final Project – Advanced Design Analysis Introduction: In 2010, Texas had the fourth highest occurrence of new diagnoses of HIV/AIDS and TB. The Texas Department of State Health Services (DSHS) identified HIV-infected persons as a high-risk population for TB in Harris County. To explore this relationship further, co-morbidity was examined through regression analyses. Problem, Research Question, Hypothesis: A population-based ecological study was conducted to identify areas with a high number of new TB and HIV diagnoses in Harris County, Texas from 2009 through 2010. Rates of new TB and HIV diagnoses were linked to socio-economic variables at the census tract level. The independent variables include the following: housing size, American Indian/Alaska Native, Asian, associate's degree, bachelor's degree, Black, divorced, English (primary language), female, foreign born, graduate or professional degree, high school degree, Hispanic, less than 9th grade education, male, married or separated, Native Hawaiian/Pacific Islander, native born, never married, no educational diploma, other language (non-English speakers), non-Hispanic, separated, some college level education, White, other ethnicity, two ethnicities or more, widowed, poverty, and unemployed. All variables are measured as a percentage of the total population per census tract according to the 2010 census data; therefore, each independent variable is continuous. The literature indicates that these variables might be important risk factors for HIV and/or TB among the Harris County population, but initial analysis indicated multicollinearity. Furthermore, particular groups of variables might be highly correlated (e.g., the educational level variables). Factor analysis might reduce redundancy among these highly correlated variables. Factor analysis was the primary focus of this project, although a final logistic regression model was fit to analyze the results. 
The average rate of new HIV diagnoses and the average rate of new TB diagnoses were 6.81 and 1.87 per 10,000 persons, respectively, in Harris County over the two-year period. A co-morbidity variable was created that identified census tracts with above average rates of both HIV and TB, and this variable represents the outcome of interest. Logistic regression analysis was used to assess the relationship between the predictor variables and co-morbidity. The main research question is whether known risk factors for HIV and known risk factors for TB are significantly associated with the co-morbidity of HIV and TB at the population level. Hypothesis: $H_0: \beta_m = \dots = \beta_{m+t} = 0$; $H_a$: at least one of $\beta_m, \dots, \beta_{m+t}$ is nonzero. Description of Data: TB data is from the Department of State Health Services and originates from the local TB control program. HIV data is from the Houston Department of Health and Human Services (HDHHS). TB and HIV are reportable diseases, and reporting is required by law. Although data is de-identified and analyzed at the population level, HIV/TB data may not be shared between unauthorized parties. The project received approval from relevant stakeholders and the University of Texas Institutional Review Board. Geocoding: Subject level data was the source data. The addresses of new HIV diagnoses and the addresses of new TB diagnoses were aggregated at the census tract level through geocoding. New HIV diagnoses were geocoded by staff at HDHHS through a Centers for Disease Control and Prevention grant project. TB data was transmitted to HDHHS and included all new diagnoses beginning in 2000 for the state of Texas. For the purposes of this project,
  • 2. Kellie Watkins 2 point level addresses were geocoded using the online address locator ArcGIS 9.3 North America Geocode Service. A total of 735 TB case records from DSHS were located within Harris County after selecting cases by attribute. There were 395 cases of TB in 2009 and 340 cases of TB in 2010. The average match score was 93.75 with a maximum of 100 and a minimum of 73.45. Census Level: HIV and TB individual level residential addresses needed to be linked to a Harris County census tract shapefile at the polygon level. Summing all the points within an individual census tract produced the total count per census tract. These counts, representing new diagnoses of HIV and TB respectively, were divided by the total population of each tract to produce the rates for 2009 and 2010. Census Variables: 2010 Census Data was collected by the U.S. Census Bureau through the decennial census questionnaire that is mandated for every household in the United States. Detailed population and housing data were released on 21 December 2010. Census tracts represent statistical subdivisions within each county. According to the U.S. Census Bureau, most tracts contain 2,500 to 8,000 individuals, although the size varies widely because it depends on the density of residents. Census tracts do not cross county boundaries; therefore, each census tract within Harris County lies entirely within the county borders. There are a total of 786 census tracts located within the borders of Harris County. Census tracts are designed to be homogeneous across demographic variables such as socioeconomic status and other population characteristics, making them ideal for public health research focused on community-based intervention and outreach programs such as this thesis. Literature suggested that particular variables were potential risk factors for HIV or TB. 
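The rate and outcome construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tract counts are invented, the function names are hypothetical, and "average" is taken to mean the unweighted mean of the tract rates, which the report does not specify.

```python
# Hypothetical tract counts for illustration only; not data from the study.
tracts = [
    {"tract": "A", "population": 4000, "hiv_cases": 4, "tb_cases": 1},
    {"tract": "B", "population": 5000, "hiv_cases": 2, "tb_cases": 0},
    {"tract": "C", "population": 2500, "hiv_cases": 5, "tb_cases": 2},
]


def rate_per_10k(cases, population):
    """New diagnoses per 10,000 residents."""
    return 10_000 * cases / population


def flag_comorbidity(rows):
    """Add HIV/TB rates and a binary flag for tracts above average on both."""
    rated = [
        {
            **r,
            "hiv_rate": rate_per_10k(r["hiv_cases"], r["population"]),
            "tb_rate": rate_per_10k(r["tb_cases"], r["population"]),
        }
        for r in rows
    ]
    # Assumption: "average" is the unweighted mean of the tract rates.
    mean_hiv = sum(r["hiv_rate"] for r in rated) / len(rated)
    mean_tb = sum(r["tb_rate"] for r in rated) / len(rated)
    for r in rated:
        r["comorbid"] = int(r["hiv_rate"] > mean_hiv and r["tb_rate"] > mean_tb)
    return rated


results = flag_comorbidity(tracts)
for r in results:
    print(r["tract"], r["hiv_rate"], r["tb_rate"], r["comorbid"])
```

In this toy input only tract C exceeds both averages, so it is the only tract flagged as co-morbid.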
These variables include age, housing size, race, sex, educational level, employment status, poverty, and marital status. These variables exist in various public databases available online through the U.S. Census Bureau. Each potential variable of interest was extracted individually and linked to the TB and HIV outcomes through ArcGIS using the census tract as the common factor. All variables were continuous because of their aggregate nature. For instance, instead of an individual “race” variable with subsequent divisions such as Black, Asian, White, etc., each race is represented by a separate variable, such as the percentage of Black residents in a census tract, the percentage of White residents in a census tract, and so forth. Data Analysis and Results: Initially, before the creation of the logistic outcome, ordinary least squares regression was attempted using ArcGIS 9.3. However, the program produced an error message stating that multicollinearity was preventing it from running correctly. Multicollinearity remained even after variables were eliminated one after another. As a result, SAS 9.3 was used to run additional analyses in hopes of resolving the multicollinearity issue. An initial correlation matrix demonstrated the high degree of correlation among the variables (see Appendix I for a partial representation of the correlation coefficients). A remaining issue was that some variables were clearly independent of each other while others, such as the “race” variables, essentially measured the same entity but were not equal and, consequently, could not be combined. Furthermore, continuing the example with race, Black and Asian were risk factors for HIV and TB, respectively. Maintaining each as an independent variable was deemed essential, and logistic regression was still the preferred method of analysis.
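One plausible contributor to the multicollinearity, offered here as an assumption rather than something the report identifies, is that complementary percentages (for example, percent male and percent female) sum to 100 within each tract and are therefore perfectly correlated. A minimal sketch with invented values and a hypothetical `pearson` helper:

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical tract percentages; not data from the study.
pct_male = [48.0, 51.5, 49.2, 50.3, 47.1]
# If the two categories exhaust the population, one determines the other.
pct_female = [100.0 - m for m in pct_male]

r = pearson(pct_male, pct_female)
print(round(r, 6))
```

When two predictors are related this way, a regression cannot separate their effects, and dropping one of the pair loses no information.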
  • 3. Kellie Watkins 3 After consulting with a biostatistician, factor analysis was recommended for the variables that were highly correlated and logically related. The goal was to create two subscales, education and marital status, as predictors in the final regression model to reduce at least some of the redundancy. The mean of the items in each factor, for education and marital status respectively, was used as a continuous variable in the logistic regression model. Prior to analyzing the related items within education and marital status, an initial factor method (principal components) was run with varimax rotation in the SAS system using all variables of interest. There were a total of 32 eigenvalues with an average of 1. For the initial factor method (principal components), seven factors were retained under the mineigen criterion of 1.0 (see Appendix II), where values greater than 0.35 within each factor were flagged with an asterisk. The variance explained by each factor was 10.73 for Factor 1, 4.80 for Factor 2, 2.79 for Factor 3, 1.94 for Factor 4, 1.35 for Factor 5, 1.21 for Factor 6, and 1.06 for Factor 7. Using varimax rotation, an orthogonal transformation matrix was produced (see Appendix II). After rotation, the variance explained by each factor was 8.81 for Factor 1, 5.12 for Factor 2, 2.55 for Factor 3, 2.36 for Factor 4, 2.07 for Factor 5, 1.69 for Factor 6, and 1.27 for Factor 7. However, when viewing the seven factors and the variables with values greater than 0.35 per factor, the results were difficult to interpret. The variables grouped within each factor did not form patterns that could be easily or logically explained. While it was understood why particular correlations between these variables might exist, the groupings were not suitable for producing a meaningful interpretation. Consequently, the same analysis was run with fewer variables that were meaningfully related. 
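The retention rule and the reported variances can be checked with a short sketch. It uses the figures quoted above and assumes that, with 32 standardized variables, the total variance is 32, which is consistent with the eigenvalues averaging 1. The function names and the eighth eigenvalue of 0.90 are hypothetical, added only to show the cutoff at work.

```python
N_VARIABLES = 32  # 32 eigenvalues averaging 1 imply a total variance of 32

# Variance explained per factor, as reported in the text.
unrotated = [10.73, 4.80, 2.79, 1.94, 1.35, 1.21, 1.06]
rotated = [8.81, 5.12, 2.55, 2.36, 2.07, 1.69, 1.27]


def retain_by_mineigen(eigenvalues, threshold=1.0):
    """Keep factors whose eigenvalue reaches the minimum-eigenvalue cutoff."""
    return [e for e in eigenvalues if e >= threshold]


def proportion_explained(variances, n_variables):
    """Share of total variance carried by each factor."""
    return [v / n_variables for v in variances]


# Hypothetical eighth eigenvalue below the cutoff, for illustration only.
candidates = unrotated + [0.90]
retained = retain_by_mineigen(candidates)
cumulative = sum(proportion_explained(retained, N_VARIABLES))

print(len(retained))         # number of factors kept
print(round(cumulative, 3))  # cumulative share of variance
print(round(sum(unrotated), 2), round(sum(rotated), 2))
```

Under these assumptions the seven retained factors account for roughly three quarters of the total variance, and the rotated and unrotated totals agree up to rounding, as expected because an orthogonal rotation redistributes variance among factors without changing the total.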
Because far fewer variables were analyzed in this factor analysis, it might not be the most desirable approach, but it was pursued in hopes of reducing the number of variables by finding a meaningful measure of similar components. For marital status, the percent divorced, married but separated, never married, separated, and widowed were analyzed. To maintain consistency, the factor procedure with varimax rotation was used in the SAS system. Three factors were produced, and within each factor the values greater than 0.35 were flagged with an asterisk (see Appendix III). The variance explained by each factor was 0.85 for Factor 1, 0.56 for Factor 2, and 0.33 for Factor 3. Based on the factors, a new variable named “single” was created. For education, the percent with an associate’s degree, bachelor’s degree, graduate or professional level education, high school degree, less than 9th grade, no educational diploma, and some college education were analyzed. To maintain consistency, the factor procedure with varimax rotation was used in the SAS system. Three factors were produced, and within each factor the values greater than 0.35 were flagged with an asterisk (see Appendix IV). The variance explained by each factor was 2.01 for Factor 1, 1.78 for Factor 2, and 1.19 for Factor 3. Based on the factors, a new variable named “education” was created. The goal of the factor analysis was to reduce the number of variables in the final model through logical and statistical reasoning. Model Building: A univariate analysis was run for each of the remaining variables of interest (i.e., potential risk factors) against the binary co-morbidity outcome. Housing size, single, education, poverty, unemployment, Asian, Black, non-English speakers, foreign born, and male were significantly associated with HIV/TB at the 5% significance level in the univariate analysis. 
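One way to read the subscale construction is sketched below: keep the items whose loading exceeds 0.35 on the chosen factor, then average those items within each tract. The loadings, tract values, and function names are all hypothetical, and the report does not state that the subscales were built exactly this way.

```python
LOADING_CUTOFF = 0.35

# Hypothetical loadings on one factor; not the values in Appendix III.
loadings = {
    "pct_divorced": 0.62,
    "pct_never_married": 0.71,
    "pct_separated": 0.48,
    "pct_widowed": 0.12,
}

# Hypothetical tract-level percentages.
tracts = [
    {"pct_divorced": 10.0, "pct_never_married": 30.0,
     "pct_separated": 2.0, "pct_widowed": 6.0},
    {"pct_divorced": 8.0, "pct_never_married": 40.0,
     "pct_separated": 3.0, "pct_widowed": 5.0},
]


def subscale_items(factor_loadings, cutoff):
    """Names of items whose absolute loading exceeds the cutoff."""
    return sorted(k for k, v in factor_loadings.items() if abs(v) > cutoff)


def subscale_score(row, items):
    """Mean of the selected items for one tract."""
    return sum(row[i] for i in items) / len(items)


items = subscale_items(loadings, LOADING_CUTOFF)
single = [subscale_score(t, items) for t in tracts]
print(items)
print(single)
```

An unweighted mean treats every flagged item equally; factor scores weighted by the loadings would be an alternative design.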
Age (p-value of 0.0510) and Hispanic (p-value of 0.1060) were marginally significant and still
  • 4. Kellie Watkins 4 considered important risk factors for the model building process. Multiple logistic regression models were fit, using the literature to determine which variables were best suited as predictors of co-morbidity and verifying their correlation with the outcome variable in the individual analyses. There was some residual correlation between predictors, such as between Black and poverty, but each variable was retained because of its established importance as an individual risk factor. Some interaction terms were considered but rejected because they lacked a meaningful interpretation. Assessment and Discussion: After careful consideration and significant debate, it was concluded that separate analyses should be run with HIV and TB as individual binary outcomes. These analyses determine which independent variables were significantly associated with above average rates of HIV at the census tract level and with above average rates of TB at the census tract level, for comparison with the co-morbidity model. For the HIV model, significant variables (p-value < 0.05) were housing size, Asian, Black, foreign born, and age. Poverty (p-value of 0.07) was marginally significant. For the TB model, significant variables (p-value < 0.05) were poverty, Asian, and Black. The direction of the coefficient for Asian reversed, with an increased percentage of Asian populations associated with a higher rate of TB. Education (p-value of 0.08) was marginally significant. The final co-morbidity model included the following independent variables: housing size, single, education, poverty, unemployment, Asian, Black, Hispanic, foreign born, male, and age. Age and male were considered, although they were not significant in the single models, because literature demonstrated that younger age and male sex were risk factors for HIV and/or TB in Harris County. For the final model, housing size, unemployment, Black, and foreign born were significant (p-value < 0.05). 
Furthermore, the AIC value was smaller for this model (620.264) than for the other models run previously for the co-morbidity outcome. The significance of unemployment was unexpected because poverty is often considered a proxy for this variable, yet poverty was not significant in this model. For the significant predictors, the relationship indicated that an increased percentage of smaller households was associated with lower co-morbidity, while increased percentages of unemployed, Black, and foreign born residents were associated with higher co-morbidity. Asian is another variable that needs additional consideration because it appears to be a protective factor against higher rates of HIV at the census tract level but a risk factor for higher rates of TB at the census tract level. Results indicate that a higher percentage of Asian populations might be associated with lower co-morbidity. Limitations and Recommendations: Factor analysis was applied in an attempt to address the multicollinearity issues in this analysis. However, there might have been other, preferable methods to reduce the redundancy of variables, including a simple review of the literature. In addition, creating subscales based on the factor loadings might not have been effective, or might not have been done correctly, given the nature of factor analysis. It was judged to be the best method to apply within the limited timeframe and tools available, but additional possibilities should be pursued in the future. Furthermore, the data cleaning and geocoding process should be reviewed for accuracy, given the complicated and extensive labor that was undertaken to create the final database. Additionally, the initial analysis was run before any understanding of multilevel analysis was available. Next steps are already under way to pursue this analysis using the subject level TB data that was initially transmitted via DSHS. 
In this case, both the subject level and the census tract level can be considered instead of aggregating the data.
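The AIC comparison used to choose among the co-morbidity models follows the standard definition AIC = 2k - 2 ln(L), where k is the number of estimated parameters and L is the maximized likelihood; smaller values are preferred. The sketch below uses invented log-likelihoods and hypothetical model names, not the values behind the reported 620.264.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical fits as (log-likelihood, number of parameters incl. intercept).
candidate_models = {
    "full": (-300.0, 12),
    "reduced": (-305.0, 8),
    "minimal": (-320.0, 4),
}

aic_by_model = {name: aic(ll, k) for name, (ll, k) in candidate_models.items()}
best = min(aic_by_model, key=aic_by_model.get)

for name, value in aic_by_model.items():
    print(name, value)
print("preferred:", best)
```

Because AIC penalizes each extra parameter by 2, a larger model is preferred only when it raises the log-likelihood by more than one unit per added parameter.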
  • 7. Kellie Watkins 7 APPENDIX II: Initial Factor Method: Principal Components
  • 10. Kellie Watkins 10 APPENDIX III: Marital Status
  • 11. Kellie Watkins 11 APPENDIX IV: Education
  • 12. Kellie Watkins 12 APPENDIX V: HIV, TB, and HIV/TB Co-Morbidity Models Outcome: HIV (binary)
  • 13. Kellie Watkins 13 Outcome: TB (binary) Outcome: HIV/TB (co-morbidity, binary)