SlideShare a Scribd company logo
Comparison of Genetic Risk Factors Between Two Type II
Diabetes Subtypes
Item type text; Electronic Thesis
Authors Schader, Lindsey Marie
Publisher The University of Arizona.
Rights Copyright © is held by the author. Digital access to this
material is made possible by the University Libraries,
University of Arizona. Further transmission, reproduction
or presentation (such as public display or performance) of
protected items is prohibited except with permission of the
author.
Downloaded 5-Feb-2017 22:02:43
Link to item http://hdl.handle.net/10150/595048
COMPARISON OF GENETIC RISK FACTORS BETWEEN TWO TYPE II DIABETES SUBTYPES
By
LINDSEY MARIE SCHADER
UND SDL
A Thesis Submitted to the Honors College
In Partial Fulfillment of the Bachelors degree
With Honors in
Biology
THE UNIVERSITY OF ARIZONA
DECEMBER 2015
Approved by:
. .
Dr. Yann Klimentidis
Department of Epidemiology and Biostatistics
Abstract
Type 2 Diabetes (T2D) is an extremely heterogeneous disease, and the heritability of T2D is not
fully accounted for. This study seeks to determine T2D subtypes based on clinical features
before T2D diagnosis, and to test whether genetic risk factors differ between the subtypes. A
sample of 13,459 White, GWAS study participants was obtained from FRAM, MESA, and ARIC.
This sample consisted of 832 cases (individuals who developed T2D during follow-up) and
12,066 controls (did not develop T2D). K-means clustering was used to cluster individuals in the
cases dataset based on metabolic and anthropometric characteristics. Cox proportional hazards
models were used to test whether T2D genetic risk factors differed between the groups. The
clustering analysis resulted in two clusters with cluster one consisting of a higher percentage of
women with higher WHR, lower HDL, and higher FI as compared to cluster two. There were no
statistically significant differences between the genetic risk factors of the two clusters. The
most significant differences in genetic risk factors were associated with adiposity, suggesting
some interaction between adiposity genes and the characteristic phenotypes of each cluster on
T2D development. Further research is needed to replicate subtypes and to find significant
genetic associations.
1
Introduction
Type 2 Diabetes (T2D) is an increasingly prevalent disease worldwide. In 2014 9.3% of the U.S.
population suffered from the disease (1), and it is projected that there will be around 300
million diabetes cases in 2025 (2). In addition, diabetes was the seventh leading cause of death
in the U.S. in 2010 (1). Diabetes along with its associated health complications cost the United
States a total of $245 billion in 2012, with $176 billion in direct costs (1). T2D is characterized
by the body’s inability to control blood glucose levels, which can lead to multiple health
problems including kidney failure, heart disease, hypoglycemia, hyperglycemia, eye problems,
and amputations, among others (1). Clearly diabetes is of epidemic proportions, and it is a
major health concern for the United States and the rest of the world.
In contrast to T1D which is characterized by the autoimmune destruction of pancreatic beta-
cells, T2D is an extremely heterogeneous disease characterized by either abnormal production
of insulin or insulin resistance (3). Some researchers consider the categorization of T2D as a
single disease entity to be a major error due to the large heterogeneity of the disease. T2D
exists on a continuum between insulin-resistant obese patients to insulin-deficient lean patients
(4), and patient phenotypes tend to differ in insulin dependency, metabolic characteristics, and
the presence of GAD antibodies (5). As a result of the phenotypic and genetic heterogeneity of
T2D, the disease is difficult to diagnose. T2D is typically diagnosed based on diabetes patients
simply not meeting the diagnosis criteria for other forms of the disease (6). Due to this
heterogeneity, researchers have suggested that there may be multiple subtypes of the disease.
2
Identifying such subtypes would improve diagnostic tools, and currently research is underway
to identify these subtypes.
Recent research has focused upon identifying T2D subtypes based on different physical
characteristics of patients. Faerch et. al grouped T2D patients based on their fasting insulin and
two hour glucose serum concentrations and found that these groups of patients had differing
trajectories of multiple phenotypic measurements such as beta-cell functioning and risk of
cardiovascular disease (7). More recently, Bapat et al. identified an increased level of Treg cells
(cells involved in immune responses and inflammation) in mice with age-associated T2D as
compared to healthy mice and mice with obesity-associated T2D. They also found that blocking
the growth of these Treg cells in mice prevented age-associated insulin resistance, suggesting a
major etiological difference between T2D patients (8). In addition to these studies, recent
research has used mathematical tools for grouping patients based on multiple characteristics.
One common method, called k-means clustering, has been used to classify different subtypes of
Parkinson’s disease as well as to group individuals based on their metabolic characteristics for
targeted nutritional advice (9,10). The use of clustering to subgroup T2D patients is very
limited. One study published in Science Translational Medicine used topological analysis to
create subgroups of T2D patients based on 73 clinical features. This process resulted in three
subgroups of T2D patients with differing genetic associations between the groups (11). Despite
this study, and other studies that focus on the genetics of T2D, only 15% of the heritability of
T2D has been explained, while around 80% of the heritability of T1D has been accounted for
(6). Even with these promising results on subtyping and the genetics of T2D, a comprehensive
picture is still lacking that includes both the genetics and the etiology of T2D.
3
The objective of our study is to cluster patients who continue on to develop T2D into distinct
subtypes and to test whether genetic risk factors differ between the two subtypes. This analysis
will allow us to identify genes that play a role in T2D development, but were previously
unidentified due to their effect on only one subgroup of T2D patients. This research is clinically
useful because it allows us to further understand the heterogeneity of T2D and thus to create
individualized treatments (4). In addition, the identification of genetic risk factors will allow us
to learn more about the etiology of T2D by performing research on the molecular functions of
any genes identified. Lastly, the ability to cluster patients based on phenotypic characteristics
before T2D development allows us to more accurately estimate disease risk by allowing us to
consider physical characteristics and genetics jointly for pre-diabetic patients. Currently, genetic
risk scores (GRS), or models that use genetic data as inputs to estimate disease risk, are used to
estimate patient risk for T2D. Some of these models also incorporate physical characteristics
(12). Our analysis allows us to more accurately describe how the physical environment and
genetics interact to lead to T2D development, by creating subtypes of patients based on
physical characteristics and by determining whether some genes cause increased risk of T2D for
a certain subtype.
Methods
Studies
We used data on 13,459 GWAS study participants obtained from the Framingham Heart Study
(FRAM), the MESA SHARe Study (MESA), and the Atherosclerosis Risk in Communities study
4
(ARIC). The first study, FRAM was conducted to identify risk factors for cardiovascular disease.
The data used in our study was from the offspring cohort which consisted of men and women
ages 30 to 62 years living in Framingham, Massachusetts. The phenotypic data used in our
study was from visit four of the FRAM Offspring cohort (13). The second study, MESA, was a
prospective cohort study with the purpose of investigating cardiovascular disease. The
genotyped cohort used in our study consisted of men and women ages 45 to 84 years from six
different communities across the United States (14). The last study, ARIC is another prospective
study with the aim of identifying the causes and clinical outcomes of atherosclerosis. The
cohort component used from this study consisted of men and women ages 45 to 64 years from
four communities across the United States (15).
The data on self-declared white participants from all three of these studies was compiled to
create our cohort. All individuals with prevalent T2D at the initiation of data collection were
excluded from the study, and subjects were divided into cases and controls. Cases consisted of
those individuals who developed T2D over the course of follow-up, while controls were subjects
who did not develop T2D over the course of follow-up. Among the cases we excluded
individuals on cholesterol-lowering or hypertension medication along with anyone with missing
values for the phenotypic variables of interest. The final cases dataset consisted of 178 FRAM
participants, 109 MESA participants, and 545 ARIC participants, for a total of 832 individuals.
The control dataset consisted of 2,602 FRAM participants, 2,119 MESA participants, and 7,345
ARIC participants, for a total of 12,066 individuals.
5
Phenotypes
The phenotypes chosen for clustering included the important metabolic and anthropometric
measurements that were available in all three studies. These variables included sex, body-mass
index (BMI), waist-to-hip ratio (WHR), triglycerides (TG), high-density lipoprotein (HDL), fasting
glucose (FG), fasting insulin (FI), total cholesterol (TC), systolic blood pressure (SBP), and
diastolic blood pressure (DBP). These phenotypes are common measurements used in risk
scores for predicting T2D development (12). Further glucose, total cholesterol, and triglyceride
levels have all been determined to be important indicators of metabolic health (10). The units
of measurement for TG, HDL, FI, and FG differed between the studies. To control for any
variation between studies, all phenotypic variables were scaled in the cohort dataset before
clustering analysis.
Genotypes
600 SNPs were selected for analysis along with 22 genetic risk scores (GRS). GRS were
calculated for the phenotypes of T2D, FI, FG, two hour glucose (THG), proinsulin (PRO), SNPs
relating to HbA1c (HBA), low levels of adiponectin (ADPN), BMI, WHR, TC, low-density
lipoprotein (LDL), HDL, adiposity (FAT), C-reactive protein (CRP), serum urate (URATE), blood
pressure (BP), insulin resistance (IR), beta-cell function (BC), WHR adjusted BMI, and
triglycerides (TG). GRS risk scores were calculated as weighted averages of alleles that have
been previously identified as risk alleles for the corresponding phenotypes.
6
Clustering Method
K-means clustering was used to create clusters of T2D patients from our cases cohort. The k-
means clustering method is a partitioning method where participants are grouped based on a
pre-specified number of clusters. In the initial stage, cluster membership is determined by
Euclidian distance from randomly chosen points. Then the mean of the clusters formed are
calculated and the shorter Euclidian distance from the cluster mean is used to determine new
clusters. This process is reiterated for a specified number of times or until stability (9). The
variables selected for the clustering analysis are listed in the phenotype section of this paper,
and all were measured before the patient was diagnosed with T2D. All phenotypic variables
were standardized, so one variable did not carry more weight than another in the analysis
based on the unit of measurement. Optimal cluster number was determined using the 2.0-10
version of the cascadeKM function from the package Vegan in R. The calinski method argument
was used, as determined most appropriate by Milligan et al. (16). An optimal cluster number of
two was determined by this method. For the clustering portion of analysis the k-means function
in R was used on our dataset of T2D cases only. The default algorithm of Hartigan and Wong
(1979) was used, and the argument of 25 repetitions was implemented in order to avoid local
optimal solutions.
To determine whether the clusters obtained were significantly different from one another a t-
test between means of the two clusters for all the phenotypic variables of interest was
performed.
7
Statistical Analyses
To test whether genetics has differing effects on T2D development between the two clusters a
Cox proportional hazards model was used to estimate the effects of genetics and cluster
membership upon the risk of T2D development over time. Two Cox proportional hazards
models were used for this analysis.
Primary Model
The first model was estimated on the cases only dataset. In this Cox proportional hazards
model, the outcome of hazard rate of T2D development was regressed upon cluster
membership, where cluster membership was included as an ‘as.factor’ variable, and an
interaction term between cluster membership and a genetic variable (SNP or GRS). This
interaction term was the coefficient of interest in our analysis as it conveys a difference in
genetic effects between the two datasets. The main effects of age, sex, and the genetic variable
were included in this model as control variables.
Secondary Model
coxph(Surv(time to development, diabetes incidence)~SNP (or GRS) + Age + Sex + as.factor(cluster)
+ as.factor(cluster)*SNP(or GRS), data = both clusters )
8
The second analysis consisted of running a Cox proportional hazards model on two separate
datasets. The first dataset consisted of members of the first cluster and all controls. The second
dataset consisted of members of the second cluster and all controls. A genetic variable (SNP or
GRS) was used as an input variable into the model, and the hazard rate of T2D development
was used as the output. Age and sex were controlled for. This model was estimated for both
datasets and the difference between their coefficients for the genetic variables was evaluated
to indicate the differing effects of SNPs and GRSs between the two clusters.
These models were run for all SNP and GRS of interest.
Results
A summary of phenotypic characteristics of the cases are shown in Table 1. The Calinski-index
algorithm determined that two clusters was the optimal cluster number for our cases dataset.
After performing k-means clustering in R, we obtained two clusters with significant differences
between cluster means for nine out of the eleven clustering variables as determined by a t-test
between cluster means (Table 2). The first cluster (C1) consists of mostly women with a higher
WHR, lower HDL, and higher FI than cluster 2 (C2) which consisted of mostly men with lower
WHR, higher HDL, and lower FI. The differences between the age and total cholesterol
phenotypes were insignificant between the clusters. Summary statistics of the phenotypic
characteristics of each cluster may be found in Table 3.
coxph(Surv(time to development, diabetes incidence)~ SNP(or GRS) +Age + Sex, data=cluster 1 or 2)
9
After clustering was performed, we fit our primary model to the cases only dataset. This model
consists of a Cox proportional hazards model with cluster membership coded for as an
‘as.factor’ variable. The most significant results of this analysis are listed in Table 4 (SNP) and
Table 5 (GRS). This model did not generate any statistically significant coefficients for the
interaction between cluster membership and genotype characteristic (SNP or GRS). The top five
most significant SNPs were rs1553318, rs731839, rs6882076, rs2294239, and rs2652834. These
SNPS relate to TG, TG, LDL, WHR, and HDL respectively. Furthermore, the most significant GRS
scores identified were FAT and TG. Our secondary Cox proportional hazards model was run on
the clustering datasets separately, to see what effects certain genetic characteristics had on
T2D development. The result of interest in this model is the coefficient corresponding to the
genetic characteristic in each cluster model and how they differ between the two clusters.
These results are summarized in Table 4 for the top 20 most significant SNPs identified by our
primary model and in Table 5 for all the GRS scores tested. The genes that the 20 most
significant SNPs correspond to are listed in Table 6.
Discussion
The objective of our study was to identify subtypes of individuals who go on to develop T2D and
to determine whether the role of specific genetic factors differs between the two groups. Our
cluster analysis resulted in two groups of phenotypically distinct patients. C1 was generally
characterized by women with higher WHR, lower HDL, and higher FI, as compared to C2 which
consisted mostly of men with lower WHR, higher HDL, and lower FI. Although there were no
10
statistically significant findings on differing genetic effects between the two groups, we did
identify some associations of interest that may be explored in future research. Both the most
significant SNPs and the most significant GRS scores relate to adiposity, suggesting a differing
interaction between the phenotypes that characterize each cluster and adiposity-related
genotypes.
Although our two clusters were distinct from one another, they do not reflect the disease
subtypes that have been found in recent research. One recent study identified that there is a
physiological difference between patients that have age-associated onset of T2D versus
obesity-associated onset of the disease. They found that altering Treg cells in mice prevented
age-associated onset of T2D while obesity-associated onset of T2D had no association with Treg
cells (8). Our clusters did not reflect these different subtypes of T2D. C1 had a slightly higher
age on average (55.06 vs. 54.86 years) but the difference is not significant. It is true that many
measurements of metabolic health differed between the two groups, but there is not a clear
distinction between late age of onset and obesity-related T2D. Another study, more similar to
our analysis, looked at subtyping T2D patients based on multiple phenotypic characteristics
using a topological approach. This study had more power to identify subtypes than our study
due to their wealth of phenotypic data (73 variables included in clustering) and large sample
size of 11,210 T2D patients. This analysis yielded three subtypes. The first subtype was
characterized by young, overweight patients, the second was characterized by low-weight
patients, while the third group of patients had high SBP, serum chloride, and troponin I levels.
These subtypes also do not reflect the subtypes discovered in our analysis or the analysis of
other researchers, and the detailed differences between their three subtypes included
11
metabolic measurements that were not available in our data, such as white blood cell count
and serum albumin (11). Current research is focusing on disease subtyping of T2D because the
disease is so heterogeneous, but our study, along with others does not create a clear consensus
on distinct subtypes.
In addition to the unique subtypes delineated by our analysis, some interesting genetic trends
were identified that reflect some current T2D research. The two most significant SNPs identified
in our analysis are both related to TG levels. The first SNP, rs1553318, is associated with the
HAVCR1 gene. This SNP increased the risk of T2D development in both clusters, but more so in
C2. The second SNP, rs731839, is near the PEPD gene. This SNP was associated with increased
T2D risk in C1 and decreased risk in C2. It is of interest that both SNPs are associated with TG
because the pre-diabetic state is characterized by elevated TG levels (a characteristic of
dyslipidemia). Furthermore, TG levels can be used to predict T2D risk, and one study found that
looking at the change in TG levels over time helped predict T2D risk in men (17). In addition to
the effects of TG on T2D risk, some research suggests that there is an association between a
certain variant of the PEPD gene and T2D development. A recent GWAS study on Chinese Hans
found that higher levels of n-3 fatty acids help mitigate the increased risk of T2D development
caused by the PEPD gene (18). This corresponds to our results where PEPD had a protective
effect against T2D development in C2, since C2 is characterized by lower TG levels ( associated
with high n-3 fatty acid consumption) (19).
The third most significant SNP in our analysis, rs6882076, is near the TIMD4 gene which affects
LDL levels. This SNP increased the risk of T2D development in C1 while decreasing risk in C2.
12
Although LDL is not a major characteristic used in calculating T2D risk (12) because LDL levels do
not differ significantly between diabetics and non-diabetics, research has found that LDL
particles in T2D patients are typically smaller than those in their non-diabetic peers (20).The
results of our analysis suggest that there may be some association between the TIMD4 gene
and the environment that affects T2D development.
There are far reaching implications for T2D research that incorporates both disease subtype
and genetic risk. First, by identifying distinct subtypes of patients, one may create more
accurate models of disease risk that incorporate both physical characteristics and genetics. The
interaction between genes and the environment is complex, and it is not fully detailed in a
mathematical analysis. By identifying characteristics of subgroups, researchers can gain a better
understanding of what traits may be involved in disease pathways and focus their research in
these areas. This may lead to a greater understanding of the disease itself and targeted
diagnosis and treatment based on disease subtype. The goal is to find distinct subtypes of the
disease that may be clearly defined and then to identify the metabolic pathways involved.
Our study had a number of limitations that prevented us from determining whether the
patients were clustered into truly distinct subtypes and which made it challenging to find
statistically significant results. First, we were not able to test whether the clusters could be
recreated in another dataset to confirm our findings. Further, our cohort was small for a genetic
analysis (only 832 T2D cases) because we removed all people on cholesterol lowering
medication and hypertension medication. Lastly, our cohort included only whites, so the results
may not be generalizable to other populations. Despite these weaknesses, our study has
13
multiple strengths including the inclusion of multiple studies which provided a wide range of
phenotypic and genetic data. In addition, the use of k-means clustering is not a common
method in the T2D literature, but its use has found successful subtypes for Parkinson’s disease
(9) and is therefore a promising method for disease subtyping.
Future research should focus upon replicating these subtypes in another cohort and including
more variables and subjects in the clustering analysis. Once clustering analyses produce
replicable results, thereby identifying distinct subtypes of T2D, animal models may be used to
further understand the disease etiology. There is still much to explore regarding how genetics
and the environment interact to influence the development of this disease.
14
References
1. National Diabetes Statistics Report, 2014. Centers for Disease Control and Prevention:
National Center for Chronic Disease Prevention and Health Promotion: Division of
Diabetes Translation. 2014;1-11.
2. King H, Aubert R, Herman W. Global Burden of Diabetes, 1995-2025. Diabetes Care.
1998;21:14141-1431.
3. Zimmet P, Albertit KGMM, Shaw J. Global and societal implications of the diabetes
epidemic. Nature. 2001;414(6865):782-787.
4. Gale EAM. Is type 2 diabetes a category error? The Lancet. 2013;381(9881):1956-1957.
5. Tuomi T, Santoro N, Caprio S, Cai M, Weng J, Groop L. The many faces of diabetes: a
disease with increasing heterogeneity. The Lancet. 2014;383(9922):1084-1094.
6. Groop L, Pociot F. Genetics of diabetes – are we missing the genes or the disease?
Molecular and Cellular Endocrinology. 2014;382(1):726-739.
7. Faerch K, Witte DR, Tabak AG, Perreault L, Herder C, et al. Trajectories of
cardiometabolic risk factors before diagnosis of three subtypes of type 2 diabetes: a
post-hoc analysis of the longitudinal Whiehall II cohort study. The Lancet Diabetes and
Endocrinology. 2013;1(1):43-51.
8. Bapat SP, Suh JM, Fang S, Liu S, Zhang Y, et al. Depletion of fat-resident Treg cells
prevent age-associated insulin resistance. Nature. 2015;528(7580):137-141.
9. Van Rooden SM, Heiser WJ, Kok JN, Verbaan D, van Hilten JJ, Marinus J. The
identification of Parkinson's disease subtypes using cluster analysis: a systematic
review. Movement Disorders. 2010;25(8):969–978.
15
10. O’Donovan CB, Walsh MC, Nugent AP, McNulty B, Walton J, et al. Use of metabotyping
for the delivery of personalized nutrition. Molecular Nutrition and Food Research.
2014;59(3):377-385.
11. Li L, Cheng W, Glicksberg BS, Gottesman O, Tamler R, et al. Identification of type 2
diabetes subgroups through topological analysis of patient similarity. Science
Translational Medicine. 2015;7(311): 311ra174-3.
12. Noble D, Mathur R, Dent T, Meads C, Greenhalgh T. Risk models and scores for type 2
diabetes: systematic review. BMJ. 2011;343:d7163.
13. Framingham Heart Study. NHBI. 2015.
https://www.framinghamheartstudy.org/index.php.
14. About MESA. MESA Coordinating Center: University of Washington. 2015.
http://www.mesa-nhlbi.org/aboutMESA.aspx.
15. Atherosclerosis Risk in Communities Study. Collaborating Studies Coordinating Center.
Department of Biostatistics Gillings School of Global Public Health. North Carolina
Chapel Hill. 2015. https://www2.cscc.unc.edu/aric/desc.
16. Milligan GW, Cooper MC. An examination of procedures for determining the number of
clusters in a data set. Psychometrika. 1985;50(2):159-179.
17. Tirosh A, Shai I, Bitzur R, Kochba I, Tekes-Manova D, Israeli E. Changes in triglyceride
levels over time and risk of type 2 diabetes in young men. Diabetes Care.
2008;31(10):2032+.
16
18. Zheng JS, Huang T, Li K, Chen Y, Xie H, et al. Modulation of the Association between the
PEPD variant and the risk of type 2 diabetes by n-3 fatty acids in Chinese Hans. Jounral
of Nutrigenetics and Nutrigenomics. 2015;8(1):36-43.
19. Harris WS, Bulchandani D. Why do omega-3 fatty acids lower serum triglycerides?
Current Opinion in Lipidology. 2006;17(4):387-393.
20. Nesto RW. LDL cholesterol lowering in type 2 diabetes: what is the optimum approach?
Clinical Diabetes. 2008;26(1):8-13.
17
Table 1: Baseline characteristics of participants in each of three cohorts.
ARIC FRAM MESA
n 545 178 109
Age (yrs) 54.02 (5.51) 54.92 (8.69) 59.87 (9.81)
% Female 59.82% 53.37% 52.29%
BMI (kg/m2
) 29.49 (4.76) 30.88 (5.84) 29.52 (6.40)
WHR 0.97 (0.06) 0.95 (0.09) 0.93 (0.08)
TG* 1.82 (1.04) 196.84 (128.23) 150.79 (95.71)
HDL** 1.12 (0.34) 43.19 (12.27) 49.58 (15.41)
FG**** 6.00 (0.54) 117.25 (33.65) 95.17 (13.13)
TC 5.50 (1.03) 208.77 (34.96) 205.39 (36.79)
SBP (mmHg) 121.07 (15.47) 132.88 (16.92) 125.95 (20.60)
DBP (mmHg) 73.73 (10.23) 78.92 (9.29) 71.10 (10.40)
FI*** 105.31 (60.86) 38.21 (16.28) 11.76 (8.36)
*TG units: ARIC - mmol/L; FRAM - Meq/L; MESA - mg/dL
**HDL units: ARIC - mmol/L; FRAM&MESA - mg/dL
***FI units: ARIC - pmol/L , FRAM - μmol/L, MESA - mU/L
****FG units: ARIC - mmol/L, FRAM - mg/dl, MESA - mg/dl
Table 2: Cluster phenotype comparisons via t-test.
Variable clust1mean clust1SD clust2mean clust2SD t-stat p-value
WHR_pheno 0.49058065 0.6669220 -0.72419048 0.9669827 20.0258690 2.456061e-67
HDL_pheno -0.46698033 0.6136599 0.68935191 1.0576818 -18.0838850 2.524972e-56
FI_pheno 0.38237082 1.0520284 -0.56445216 0.5506614 16.9133715 5.768705e-55
Sex 0.77620968 0.4172040 0.27678571 0.4480769 16.2165669 2.746904e-50
TG_pheno 0.31562379 1.1145907 -0.46592083 0.5232078 13.5651645 1.177836e-37
BMI_pheno 0.31943235 0.9644238 -0.47154299 0.8521667 12.4497415 1.460480e-32
DBP_pheno 0.29310623 0.9775874 -0.43268063 0.8646014 11.2642577 2.324341e-27
SBP_pheno 0.24978211 0.9528888 -0.36872597 0.9509042 9.1979556 3.866903e-19
FG_pheno 0.22054251 0.9732836 -0.32556276 0.9468849 8.0709721 2.857736e-15
TC_pheno 0.05963691 0.9818105 -0.08803544 1.0184503 2.0820662 3.769834e-02
Age 55.05443548 6.8291606 54.87500000 7.7795657 0.3426994 7.319345e-01
18
Table 3: Characteristics of each cluster organized by study.
cluster 1 Cluster 2
Study ARIC FRAM MESA ARIC FRAM MESA
N 332 98 68 213 80 41
Age (yrs) 54.33 (5.62) 54.67 (8.28) 59.18 (8.31) 53.54 (5.30) 55.23 (9.20) 61.02 (11.92)
% Female 79.00% 79.00% 71.00% 30.00% 22.00% 22.00%
BMI
(kg/m2
) 31.00 (4.60) 32.38 (5.52) 32.15 (6.26) 27.14 (4.00) 29.04 (5.71) 25.17 (3.73)
WHR 1.00 (0.04) 0.99 (0.06) 0.97 (0.06) 0.92 (0.06) 0.89 (0.08) 0.87 (0.08)
TG* 2.14 (1.13)
239.53
(151.69)
181.49
(108.30) 1.34 (0.60)
144.55
(59.68) 99.88 (28.76)
HDL** 0.97 (0.21) 36.49 (8.15) 42.71 (9.64) 1.35 (0.37) 51.39 (11.48) 60.98 (16.49)
FG**** 6.11 (0.50)
124.18
(39.78) 99.97 (11.71) 5.84 (0.57)
108.76
(21.48) 87.20 (11.45)
TC 5.56 (0.97)
209.60
(36.32)
209.65
(40.12) 5.42 (1.11)
207.75
(33.42)
198.34
(29.61)
SBP
(mmHg)
125.33
(15.03)
136.79
(16.07)
128.70
(17.66)
114.43
(13.73)
128.11
(16.81)
121.40
(24.28)
DBP
(mmHg)
76.69
(10.11) 82.21 (8.98) 73.29 (9.75) 69.11 (8.59) 74.89 (8.04) 67.48 (10.56)
FI***
127.72
(62.70) 44.60 (18.33) 15.13 (8.90) 70.37 (36.79) 30.39 (8.27) 6.18 (2.14)
*TG units: ARIC - mmol/L; FRAM - Meq/L; MESA - mg/dL
**HDL units: ARIC - mmol/L; FRAM&MESA - mg/dL
***FI units: ARIC - pmol/L , FRAM - μmol/L, MESA -
mU/L
****FG units: ARIC - mmol/L, FRAM - mg/dl, MESA -
mg/dl
19
Table 4: Most significant primary and secondary analysis results corresponding to SNP data.
SNP
Interaction
coefficient
for SNP and
cluster
membership
p-value
Hazard
Ratio
for
Cluster
1
Hazard
Ratio for
Cluster 2
Hazard
Ratio
Difference
rs1553318 0.3298372 0.0018933
2.48E-
03
0.0041739 1.69E-03
rs731839 -0.313511 0.0052293
-1.58E-
01
-0.055687 1.02E-01
rs6882076 0.2692967 0.0091396
1.58E-
02
-0.008271 2.41E-02
rs2294239 -0.267014 0.0093985
-8.76E-
02
-0.053721 3.39E-02
rs2652834 -0.307825 0.0115077
-2.15E-
02
0.0146121 3.61E-02
rs13139571 -0.315492 0.0115378
-1.52E-
01
-0.103202 4.84E-02
rs576674 0.3626719 0.0118196
-8.95E-
02
0.0898402 1.79E-01
rs4823006 -0.2515 0.0119255
-7.15E-
02
-0.013835 5.77E-02
rs849135 -0.236855 0.01251
-1.39E-
01
0.0084402 1.48E-01
rs6477694 0.2613947 0.0144687
5.31E-
03
0.0278352 2.25E-02
rs7998202 0.4151296 0.0148486
8.41E-
02
0.1806077 9.65E-02
rs1689800 -0.241809 0.0187326
6.38E-
02
0.0086597 5.51E-02
rs6804842 0.2314893 0.0218175
9.61E-
02
0.0206176 7.55E-02
rs7739232 -0.457728 0.0244218
8.93E-
02
0.0068656 8.25E-02
rs2779116 0.2566808 0.0257419
-6.03E-
02
-0.097148 3.69E-02
rs6795735 -0.225945 0.0266387
2.60E-
02
-0.022721 4.88E-02
rs2954022 0.231793 0.0267414
-4.37E-
02
0.0294779 7.32E-02
rs2078267 -0.228196 0.027513
-4.82E-
02
0.0031743 5.14E-02
rs2954029 0.2308958 0.0275867
-4.88E-
02
0.0217629 7.05E-02
rs12328675 0.3312844 0.0291462 -1.17E- -0.01858 9.84E-02
20
01
rs4865796 -0.241173 0.0300234
3.18E-
02
-0.051541 8.34E-02
rs11694172 -0.256426 0.0301398
3.00E-
02
-0.057769 8.77E-02
rs8182584 -0.2283 0.0331252
-1.00E-
01
-0.0663 3.38E-02
rs2290547 0.3144762 0.0451986
-1.89E-
01
0.1023083 2.91E-01
rs10929925 -0.216843 0.0457031
-3.29E-
02
-0.039378 6.48E-03
rs17367504 -0.281034 0.0471991
1.96E-
02
-0.021738 4.13E-02
rs7225700 -0.20684 0.0485637
1.79E-
02
-0.078474 9.64E-02
rs3780181 0.3922958 0.0497388
-4.78E-
02
0.0059285 5.38E-02
21
Table 5: Primary and Secondary analysis results corresponding to GRS data.
GRS
Interaction
coefficient
for GRS and
cluster
membership
p-value
Hazard
Ratio for
Cluster 1
Hazard
Ratio for
Cluster 2
Hazard
Ratio
Difference
FAT_GRS -1.035413 0.0464675 -0.125982 0.1108843 0.236866
TG_GRS -0.007944 0.0638599 -0.001277 -0.014616 0.013339
ADPN_GRS 1.7136247 0.0812667 1.0190954 0.9404897 0.078606
CRP_GRS 0.5235282 0.0934355 -0.138709 0.129723 0.268432
BMI_GRS 0.2146506 0.1011368 0.2438031 0.0369369 0.206866
PRO_GRS 1.4593472 0.1190019 0.5452933 -0.725566 1.270859
IR_GRS 0.0855827 0.1244277 0.0664784 0.0578434 0.008635
HDL_GRS -0.028909 0.142099 0.0128152 -0.032373 0.045188
URATE_GRS 0.3532343 0.1545706 -0.062828 0.139879 0.202707
HBA_GRS 1.3121146 0.1706568 0.4518716 1.6238976 1.172026
TG_GRS40 -0.431003 0.1836546 0.2381145 -0.864953 1.103068
BP_GRS 0.0507751 0.1887276 -0.032825 -0.02229 0.010535
FI_GRS 2.186906 0.1951746 2.8075312 2.6641247 0.143407
T2D_GRS 0.0767174 0.2964663 0.2875567 0.3214717 0.033915
LDL_GRS -0.008485 0.3233295 -0.006835 -0.010248 0.003413
TC_GRS -0.005176 0.4849864 -0.006943 -0.011592 0.004649
BC_GRS -0.027769 0.4905962 0.1140514 0.0997041 0.014347
WHR_GRS -0.648085 0.4958957 1.4135402 -0.021876 1.435416
FG_GRS 0.398858 0.5610801 1.7399193 2.3933806 0.653461
BMI96_GRS -0.163293 0.717723 0.5384569 0.209882 0.328575
WHRadjBMI48_GRS -0.212589 0.7464577 0.8287973 0.0777437 0.751054
THG_GRS 0.115274 0.796558 0.174948 0.0680792 0.106869
22
Table 6: Phenotypes and genes relating to the most significant SNPs in the primary analysis.
SNP
Associated
Trait
Effect
Risk
Allele
Gene (ID) Chromosome
rs1553318 TG 2.63 HAVCR1 5:157052312
rs731839 TG 0.022 PEPD 19:33408159
rs6882076 LDL 1.67 C TIMD4 5:156963286
rs2294239 WHR 0.025 ZNRF3 22:29053489
rs2652834 HDL 0.39 LACTB 15:63104668
rs13139571 BP 0.321259
GUCY1A3,
LOC105377506
4:155724361
rs576674 FG 0.016697 G
KL, ~36 kb
upstream
13:32980164
rs4823006 WHR 0.023 A ZNRF3 22:29055683
rs849135 T2D 1.11 G JAZF1 7:28156794
rs6477694 BMI C C9orf4 9:109170062
rs7998202 HBA 0.031 G ATP11AUN 13:112677554
rs1689800 HDL 0.47 G GLUL, ZNF648 1:182199750
rs6804842 BMI G LOC101927874 3:25064946
rs7739232 HIPadjBMI A Locus: KLHL31 6:53675537
rs2779116 HBA 0.024 T SPTA1 1:158615625
rs6795735 T2D 1.08 C ADAMTS9-AS2 3:64719689
rs2954022 TC 2.3 C LOC105375745 8:125470379
rs2078267 URATE 0.073 C SLC22A11 11:64566642
rs2954029 TG 5.64 A LOC105375745 8:125478730
rs12328675 HDL 0.68 T COBLL1 2:164684290
23
rs4865796 FI 0.015358 A ARL15 5:53976834
rs11694172 TC 0.028 G FAM117B 2:202667581
rs8182584 T2D 1.04 T PEPD 19:33418804
rs2290547 HDL -0.03 A SETD2 3:47019693
rs10929925 HIP C SOX11 2:6015425
rs17367504 BP 0.9030779 A MTHFR 1:11802721
rs7225700 LDL 0.87 C LOC102724508 17:47314438
rs3780181 TC -0.044 G VLDLR 9:2640759

More Related Content

What's hot

Physical activity and risk of cardiovascular disease—a
Physical activity and risk of cardiovascular disease—aPhysical activity and risk of cardiovascular disease—a
Physical activity and risk of cardiovascular disease—a
ArhamSheikh1
 
Cardiovascular diseases traditional_and_non-tradit
Cardiovascular diseases traditional_and_non-traditCardiovascular diseases traditional_and_non-tradit
Cardiovascular diseases traditional_and_non-tradit
ArhamSheikh1
 
1472 6874-4-s1-s15
1472 6874-4-s1-s151472 6874-4-s1-s15
1472 6874-4-s1-s15
ArhamSheikh1
 
Heart disease causes prevention and current
Heart disease causes prevention and currentHeart disease causes prevention and current
Heart disease causes prevention and current
ArhamSheikh1
 
Study on achievement of target LDC-C in Dyslipidimic patients
Study on achievement of target LDC-C in Dyslipidimic patientsStudy on achievement of target LDC-C in Dyslipidimic patients
Study on achievement of target LDC-C in Dyslipidimic patients
pharmaindexing
 
Barefoot_McGrath_Oliver_Poster_11032014
Barefoot_McGrath_Oliver_Poster_11032014Barefoot_McGrath_Oliver_Poster_11032014
Barefoot_McGrath_Oliver_Poster_11032014Danielle Barefoot
 
Total and Cause-Specific Mortality of U.S. Nurses Working Rotating Night Shifts
Total and Cause-Specific Mortality of U.S. Nurses Working Rotating Night ShiftsTotal and Cause-Specific Mortality of U.S. Nurses Working Rotating Night Shifts
Total and Cause-Specific Mortality of U.S. Nurses Working Rotating Night Shifts
Emergency Live
 
IJSRED-V2I1P1
IJSRED-V2I1P1IJSRED-V2I1P1
IJSRED-V2I1P1
IJSRED
 
Risk factors of chronic liver disease amongst patients receiving care in a Ga...
Risk factors of chronic liver disease amongst patients receiving care in a Ga...Risk factors of chronic liver disease amongst patients receiving care in a Ga...
Risk factors of chronic liver disease amongst patients receiving care in a Ga...
iosrjce
 
Acg guideline cdifficile_april_2013
Acg guideline cdifficile_april_2013Acg guideline cdifficile_april_2013
Acg guideline cdifficile_april_2013
cesar gaytan
 
Similarities between Type 1 Diabetes and Alopecia Areata
Similarities between Type 1 Diabetes and Alopecia AreataSimilarities between Type 1 Diabetes and Alopecia Areata
Similarities between Type 1 Diabetes and Alopecia Areata
National Alopecia Areata Foundation
 
The frailty syndrome final draft with references final draft
The frailty syndrome final draft with references final draftThe frailty syndrome final draft with references final draft
The frailty syndrome final draft with references final draftRuth Carry
 
Statistical analysis of risk factors associated with
Statistical analysis of risk factors associated withStatistical analysis of risk factors associated with
Statistical analysis of risk factors associated with
anamjavaid13
 
Childhood and later-onset vitiligo have diverse
Childhood  and later-onset vitiligo have diverseChildhood  and later-onset vitiligo have diverse
Childhood and later-onset vitiligo have diverse
tloanphan
 
The care of the sexual assault patient
The care of the sexual assault patientThe care of the sexual assault patient
The care of the sexual assault patient
TÀI LIỆU NGÀNH MAY
 

What's hot (16)

Physical activity and risk of cardiovascular disease—a
Physical activity and risk of cardiovascular disease—aPhysical activity and risk of cardiovascular disease—a
Physical activity and risk of cardiovascular disease—a
 
Cardiovascular diseases traditional_and_non-tradit
Cardiovascular diseases traditional_and_non-traditCardiovascular diseases traditional_and_non-tradit
Cardiovascular diseases traditional_and_non-tradit
 
1472 6874-4-s1-s15
1472 6874-4-s1-s151472 6874-4-s1-s15
1472 6874-4-s1-s15
 
Heart disease causes prevention and current
Heart disease causes prevention and currentHeart disease causes prevention and current
Heart disease causes prevention and current
 
Study on achievement of target LDC-C in Dyslipidimic patients
Study on achievement of target LDC-C in Dyslipidimic patientsStudy on achievement of target LDC-C in Dyslipidimic patients
Study on achievement of target LDC-C in Dyslipidimic patients
 
Barefoot_McGrath_Oliver_Poster_11032014
Barefoot_McGrath_Oliver_Poster_11032014Barefoot_McGrath_Oliver_Poster_11032014
Barefoot_McGrath_Oliver_Poster_11032014
 
Total and Cause-Specific Mortality of U.S. Nurses Working Rotating Night Shifts
Total and Cause-Specific Mortality of U.S. Nurses Working Rotating Night ShiftsTotal and Cause-Specific Mortality of U.S. Nurses Working Rotating Night Shifts
Total and Cause-Specific Mortality of U.S. Nurses Working Rotating Night Shifts
 
IJSRED-V2I1P1
IJSRED-V2I1P1IJSRED-V2I1P1
IJSRED-V2I1P1
 
MLGG_for_linkedIn
MLGG_for_linkedInMLGG_for_linkedIn
MLGG_for_linkedIn
 
Risk factors of chronic liver disease amongst patients receiving care in a Ga...
Risk factors of chronic liver disease amongst patients receiving care in a Ga...Risk factors of chronic liver disease amongst patients receiving care in a Ga...
Risk factors of chronic liver disease amongst patients receiving care in a Ga...
 
Acg guideline cdifficile_april_2013
Acg guideline cdifficile_april_2013Acg guideline cdifficile_april_2013
Acg guideline cdifficile_april_2013
 
Similarities between Type 1 Diabetes and Alopecia Areata
Similarities between Type 1 Diabetes and Alopecia AreataSimilarities between Type 1 Diabetes and Alopecia Areata
Similarities between Type 1 Diabetes and Alopecia Areata
 
The frailty syndrome final draft with references final draft
The frailty syndrome final draft with references final draftThe frailty syndrome final draft with references final draft
The frailty syndrome final draft with references final draft
 
Statistical analysis of risk factors associated with
Statistical analysis of risk factors associated withStatistical analysis of risk factors associated with
Statistical analysis of risk factors associated with
 
Childhood and later-onset vitiligo have diverse
Childhood  and later-onset vitiligo have diverseChildhood  and later-onset vitiligo have diverse
Childhood and later-onset vitiligo have diverse
 
The care of the sexual assault patient
The care of the sexual assault patientThe care of the sexual assault patient
The care of the sexual assault patient
 

Viewers also liked

HexA in pulping and bleaching
HexA in pulping and bleachingHexA in pulping and bleaching
HexA in pulping and bleaching
Suresh Kumar Nath
 
Chavez norma act2_recursos-actividades-aula-virtual
Chavez norma act2_recursos-actividades-aula-virtualChavez norma act2_recursos-actividades-aula-virtual
Chavez norma act2_recursos-actividades-aula-virtual
tutor3160
 
Identificacion de variables para los estudios de mercado tecnico
Identificacion de variables para los estudios de mercado  tecnicoIdentificacion de variables para los estudios de mercado  tecnico
Identificacion de variables para los estudios de mercado tecnicoBÖmböm ËlëctrÖö ZËsÖöal
 
Los conceptos de alfabetización crítica e inteligencia material: ¡un encuentr...
Los conceptos de alfabetización crítica e inteligencia material: ¡un encuentr...Los conceptos de alfabetización crítica e inteligencia material: ¡un encuentr...
Los conceptos de alfabetización crítica e inteligencia material: ¡un encuentr...
Conectarnos Soluciones de Internet
 
Dynamic data encryption using the concepts of ANN
Dynamic data encryption using the concepts of ANNDynamic data encryption using the concepts of ANN
Dynamic data encryption using the concepts of ANN
Rajan Brahma
 
ゼロから始める副業ブログ開設講座0213
ゼロから始める副業ブログ開設講座0213ゼロから始める副業ブログ開設講座0213
ゼロから始める副業ブログ開設講座0213
gorikidaisuke
 
Webquest
WebquestWebquest
Enseñanza virtual, universidad y EESS avances y limitaciones
Enseñanza virtual, universidad y EESS avances y limitacionesEnseñanza virtual, universidad y EESS avances y limitaciones
Enseñanza virtual, universidad y EESS avances y limitaciones
Conectarnos Soluciones de Internet
 

Viewers also liked (13)

HexA in pulping and bleaching
HexA in pulping and bleachingHexA in pulping and bleaching
HexA in pulping and bleaching
 
Chavez norma act2_recursos-actividades-aula-virtual
Chavez norma act2_recursos-actividades-aula-virtualChavez norma act2_recursos-actividades-aula-virtual
Chavez norma act2_recursos-actividades-aula-virtual
 
Word 2010 guía-columnas
Word 2010 guía-columnasWord 2010 guía-columnas
Word 2010 guía-columnas
 
Identificacion de variables para los estudios de mercado tecnico
Identificacion de variables para los estudios de mercado  tecnicoIdentificacion de variables para los estudios de mercado  tecnico
Identificacion de variables para los estudios de mercado tecnico
 
Los conceptos de alfabetización crítica e inteligencia material: ¡un encuentr...
Los conceptos de alfabetización crítica e inteligencia material: ¡un encuentr...Los conceptos de alfabetización crítica e inteligencia material: ¡un encuentr...
Los conceptos de alfabetización crítica e inteligencia material: ¡un encuentr...
 
Aventura en el duratón (jac oct. 2011)
Aventura en el duratón (jac oct. 2011)Aventura en el duratón (jac oct. 2011)
Aventura en el duratón (jac oct. 2011)
 
Dominios de internet
Dominios de internetDominios de internet
Dominios de internet
 
Oportunidad de negocio
Oportunidad de negocioOportunidad de negocio
Oportunidad de negocio
 
Chilean Capital Market
Chilean Capital MarketChilean Capital Market
Chilean Capital Market
 
Dynamic data encryption using the concepts of ANN
Dynamic data encryption using the concepts of ANNDynamic data encryption using the concepts of ANN
Dynamic data encryption using the concepts of ANN
 
ゼロから始める副業ブログ開設講座0213
ゼロから始める副業ブログ開設講座0213ゼロから始める副業ブログ開設講座0213
ゼロから始める副業ブログ開設講座0213
 
Webquest
WebquestWebquest
Webquest
 
Enseñanza virtual, universidad y EESS avances y limitaciones
Enseñanza virtual, universidad y EESS avances y limitacionesEnseñanza virtual, universidad y EESS avances y limitaciones
Enseñanza virtual, universidad y EESS avances y limitaciones
 

Similar to Schader_Honors_Thesis

RunningHead PICOT Question1RunningHead PICOT Question7.docx
RunningHead PICOT Question1RunningHead PICOT Question7.docxRunningHead PICOT Question1RunningHead PICOT Question7.docx
RunningHead PICOT Question1RunningHead PICOT Question7.docx
rtodd599
 
Evaluation of the risk factors for the development of metabolic syndrome in b...
Evaluation of the risk factors for the development of metabolic syndrome in b...Evaluation of the risk factors for the development of metabolic syndrome in b...
Evaluation of the risk factors for the development of metabolic syndrome in b...
Alexander Decker
 
Thesis On Psoriasis
Thesis On PsoriasisThesis On Psoriasis
Thesis On Psoriasis
Amanda Burkett
 
Cancer, alzheimers, diabetes – what do all have in common?
Cancer, alzheimers, diabetes – what do all have in common?Cancer, alzheimers, diabetes – what do all have in common?
Cancer, alzheimers, diabetes – what do all have in common?
morwenna2
 
T h e n e w e n g l a n d j o u r n a l o f m e d i c i n
T h e  n e w  e n g l a n d  j o u r n a l  o f  m e d i c i nT h e  n e w  e n g l a n d  j o u r n a l  o f  m e d i c i n
T h e n e w e n g l a n d j o u r n a l o f m e d i c i n
lisandrai1k
 
Prevalence of cvd risk factors among qatari patients with type 2 diabetes mel...
Prevalence of cvd risk factors among qatari patients with type 2 diabetes mel...Prevalence of cvd risk factors among qatari patients with type 2 diabetes mel...
Prevalence of cvd risk factors among qatari patients with type 2 diabetes mel...
Dr. Anees Alyafei
 
Medical Co Morbidities in Patients of Frontal Temporal Dementia -A Hospital B...
Medical Co Morbidities in Patients of Frontal Temporal Dementia -A Hospital B...Medical Co Morbidities in Patients of Frontal Temporal Dementia -A Hospital B...
Medical Co Morbidities in Patients of Frontal Temporal Dementia -A Hospital B...
CrimsonpublishersMedical
 
Comparison of autocorrelation between CV-RISK independent variables in groups...
Comparison of autocorrelation between CV-RISK independent variables in groups...Comparison of autocorrelation between CV-RISK independent variables in groups...
Comparison of autocorrelation between CV-RISK independent variables in groups...
YohanesFirmansyah1
 
subtypeoftype_2_diabetes.pptx
subtypeoftype_2_diabetes.pptxsubtypeoftype_2_diabetes.pptx
subtypeoftype_2_diabetes.pptx
Faculty of Medicine And Health Sciences
 
ANGINA: Treatment by Alternative Therapeutic Principal?
ANGINA: Treatment by Alternative Therapeutic Principal?ANGINA: Treatment by Alternative Therapeutic Principal?
ANGINA: Treatment by Alternative Therapeutic Principal?
Dr Tarique Ahmed Maka
 
The Metabolic Syndrome in a State Psychiatric Hospital Population 10.26.09
The Metabolic Syndrome in a State Psychiatric Hospital Population 10.26.09The Metabolic Syndrome in a State Psychiatric Hospital Population 10.26.09
The Metabolic Syndrome in a State Psychiatric Hospital Population 10.26.09
Leonard Davis Institute of Health Economics
 
A study on awareness of diabetic complications among type 2 diabetes patients
A study on awareness of diabetic complications among type 2 diabetes patientsA study on awareness of diabetic complications among type 2 diabetes patients
A study on awareness of diabetic complications among type 2 diabetes patients
iosrjce
 
Diabetes with Hypertension: Etiology, Pathogenesis and Management 443 ijiit
Diabetes with Hypertension: Etiology, Pathogenesis and Management 443 ijiitDiabetes with Hypertension: Etiology, Pathogenesis and Management 443 ijiit
Diabetes with Hypertension: Etiology, Pathogenesis and Management 443 ijiit
International Journal of Integrative sciences, Innovation and Technology (IJIIT) - AGSI
 
Burden of depressive disorders by country, sex, age, and year findings from t...
Burden of depressive disorders by country, sex, age, and year findings from t...Burden of depressive disorders by country, sex, age, and year findings from t...
Burden of depressive disorders by country, sex, age, and year findings from t...
Lilin Rosyanti Poltekkes kemenkes kendari
 
Presentation ppt.pptx
Presentation ppt.pptxPresentation ppt.pptx
Presentation ppt.pptx
NehaMasarkar1
 
Socio demographic profile of Diabetic cases attended at Diabetic clinic of a ...
Socio demographic profile of Diabetic cases attended at Diabetic clinic of a ...Socio demographic profile of Diabetic cases attended at Diabetic clinic of a ...
Socio demographic profile of Diabetic cases attended at Diabetic clinic of a ...
International Multispeciality Journal of Health
 
Factors associated with the presence of diabetic ketoacidosis at diagnosis of...
Factors associated with the presence of diabetic ketoacidosis at diagnosis of...Factors associated with the presence of diabetic ketoacidosis at diagnosis of...
Factors associated with the presence of diabetic ketoacidosis at diagnosis of...
松波動物病院メディカルセンター
 
A Study on Biomarkers in a Spatially Distributed Type – 2 Diabetes mellitus G...
A Study on Biomarkers in a Spatially Distributed Type – 2 Diabetes mellitus G...A Study on Biomarkers in a Spatially Distributed Type – 2 Diabetes mellitus G...
A Study on Biomarkers in a Spatially Distributed Type – 2 Diabetes mellitus G...
IOSRJPBS
 
obesidad en diabetes tipo 2
obesidad en diabetes tipo 2obesidad en diabetes tipo 2
obesidad en diabetes tipo 2
cesar gaytan
 
Nutritional assessment status of adult patients with multiple sclerosis: A na...
Nutritional assessment status of adult patients with multiple sclerosis: A na...Nutritional assessment status of adult patients with multiple sclerosis: A na...
Nutritional assessment status of adult patients with multiple sclerosis: A na...
Innspub Net
 

Similar to Schader_Honors_Thesis (20)

RunningHead PICOT Question1RunningHead PICOT Question7.docx
RunningHead PICOT Question1RunningHead PICOT Question7.docxRunningHead PICOT Question1RunningHead PICOT Question7.docx
RunningHead PICOT Question1RunningHead PICOT Question7.docx
 
Evaluation of the risk factors for the development of metabolic syndrome in b...
Evaluation of the risk factors for the development of metabolic syndrome in b...Evaluation of the risk factors for the development of metabolic syndrome in b...
Evaluation of the risk factors for the development of metabolic syndrome in b...
 
Thesis On Psoriasis
Thesis On PsoriasisThesis On Psoriasis
Thesis On Psoriasis
 
Cancer, alzheimers, diabetes – what do all have in common?
Cancer, alzheimers, diabetes – what do all have in common?Cancer, alzheimers, diabetes – what do all have in common?
Cancer, alzheimers, diabetes – what do all have in common?
 
T h e n e w e n g l a n d j o u r n a l o f m e d i c i n
T h e  n e w  e n g l a n d  j o u r n a l  o f  m e d i c i nT h e  n e w  e n g l a n d  j o u r n a l  o f  m e d i c i n
T h e n e w e n g l a n d j o u r n a l o f m e d i c i n
 
Prevalence of cvd risk factors among qatari patients with type 2 diabetes mel...
Prevalence of cvd risk factors among qatari patients with type 2 diabetes mel...Prevalence of cvd risk factors among qatari patients with type 2 diabetes mel...
Prevalence of cvd risk factors among qatari patients with type 2 diabetes mel...
 
Medical Co Morbidities in Patients of Frontal Temporal Dementia -A Hospital B...
Medical Co Morbidities in Patients of Frontal Temporal Dementia -A Hospital B...Medical Co Morbidities in Patients of Frontal Temporal Dementia -A Hospital B...
Medical Co Morbidities in Patients of Frontal Temporal Dementia -A Hospital B...
 
Comparison of autocorrelation between CV-RISK independent variables in groups...
Comparison of autocorrelation between CV-RISK independent variables in groups...Comparison of autocorrelation between CV-RISK independent variables in groups...
Comparison of autocorrelation between CV-RISK independent variables in groups...
 
subtypeoftype_2_diabetes.pptx
subtypeoftype_2_diabetes.pptxsubtypeoftype_2_diabetes.pptx
subtypeoftype_2_diabetes.pptx
 
ANGINA: Treatment by Alternative Therapeutic Principal?
ANGINA: Treatment by Alternative Therapeutic Principal?ANGINA: Treatment by Alternative Therapeutic Principal?
ANGINA: Treatment by Alternative Therapeutic Principal?
 
The Metabolic Syndrome in a State Psychiatric Hospital Population 10.26.09
The Metabolic Syndrome in a State Psychiatric Hospital Population 10.26.09The Metabolic Syndrome in a State Psychiatric Hospital Population 10.26.09
The Metabolic Syndrome in a State Psychiatric Hospital Population 10.26.09
 
A study on awareness of diabetic complications among type 2 diabetes patients
A study on awareness of diabetic complications among type 2 diabetes patientsA study on awareness of diabetic complications among type 2 diabetes patients
A study on awareness of diabetic complications among type 2 diabetes patients
 
Diabetes with Hypertension: Etiology, Pathogenesis and Management 443 ijiit
Diabetes with Hypertension: Etiology, Pathogenesis and Management 443 ijiitDiabetes with Hypertension: Etiology, Pathogenesis and Management 443 ijiit
Diabetes with Hypertension: Etiology, Pathogenesis and Management 443 ijiit
 
Burden of depressive disorders by country, sex, age, and year findings from t...
Burden of depressive disorders by country, sex, age, and year findings from t...Burden of depressive disorders by country, sex, age, and year findings from t...
Burden of depressive disorders by country, sex, age, and year findings from t...
 
Presentation ppt.pptx
Presentation ppt.pptxPresentation ppt.pptx
Presentation ppt.pptx
 
Socio demographic profile of Diabetic cases attended at Diabetic clinic of a ...
Socio demographic profile of Diabetic cases attended at Diabetic clinic of a ...Socio demographic profile of Diabetic cases attended at Diabetic clinic of a ...
Socio demographic profile of Diabetic cases attended at Diabetic clinic of a ...
 
Factors associated with the presence of diabetic ketoacidosis at diagnosis of...
Factors associated with the presence of diabetic ketoacidosis at diagnosis of...Factors associated with the presence of diabetic ketoacidosis at diagnosis of...
Factors associated with the presence of diabetic ketoacidosis at diagnosis of...
 
A Study on Biomarkers in a Spatially Distributed Type – 2 Diabetes mellitus G...
A Study on Biomarkers in a Spatially Distributed Type – 2 Diabetes mellitus G...A Study on Biomarkers in a Spatially Distributed Type – 2 Diabetes mellitus G...
A Study on Biomarkers in a Spatially Distributed Type – 2 Diabetes mellitus G...
 
obesidad en diabetes tipo 2
obesidad en diabetes tipo 2obesidad en diabetes tipo 2
obesidad en diabetes tipo 2
 
Nutritional assessment status of adult patients with multiple sclerosis: A na...
Nutritional assessment status of adult patients with multiple sclerosis: A na...Nutritional assessment status of adult patients with multiple sclerosis: A na...
Nutritional assessment status of adult patients with multiple sclerosis: A na...
 

Schader_Honors_Thesis

  • 1. Comparison of Genetic Risk Factors Between Two Type II Diabetes Subtypes Item type text; Electronic Thesis Authors Schader, Lindsey Marie Publisher The University of Arizona. Rights Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author. Downloaded 5-Feb-2017 22:02:43 Link to item http://hdl.handle.net/10150/595048
  • 2. COMPARISON OF GENETIC RISK FACTORS BETWEEN TWO TYPE II DIABETES SUBTYPES By LINDSEY MARIE SCHADER UND SDL A Thesis Submitted to the Honors College In Partial Fulfillment of the Bachelors degree With Honors in Biology THE UNIVERSITY OF ARIZONA DECEMBER 2015 Approved by: . . Dr. Yann Klimentidis Department of Epidemiology and Biostatistics
  • 3. Abstract Type 2 Diabetes (T2D) is an extremely heterogeneous disease, and the heritability of T2D is not fully accounted for. This study seeks to determine T2D subtypes based on clinical features before T2D diagnosis, and to test whether genetic risk factors differ between the subtypes. A sample of 13,459 White, GWAS study participants was obtained from FRAM, MESA, and ARIC. This sample consisted of 832 cases (individuals who developed T2D during follow-up) and 12,066 controls (did not develop T2D). K-means clustering was used to cluster individuals in the cases dataset based on metabolic and anthropometric characteristics. Cox proportional hazards models were used to test whether T2D genetic risk factors differed between the groups. The clustering analysis resulted in two clusters with cluster one consisting of a higher percentage of women with higher WHR, lower HDL, and higher FI as compared to cluster two. There were no statistically significant differences between the genetic risk factors of the two clusters. The most significant differences in genetic risk factors were associated with adiposity, suggesting some interaction between adiposity genes and the characteristic phenotypes of each cluster on T2D development. Further research is needed to replicate subtypes and to find significant genetic associations.
  • 4. 1 Introduction Type 2 Diabetes (T2D) is an increasingly prevalent disease worldwide. In 2014 9.3% of the U.S. population suffered from the disease (1), and it is projected that there will be around 300 million diabetes cases in 2025 (2). In addition, diabetes was the seventh leading cause of death in the U.S. in 2010 (1). Diabetes along with its associated health complications cost the United States a total of $245 billion in 2012, with $176 billion in direct costs (1). T2D is characterized by the body’s inability to control blood glucose levels, which can lead to multiple health problems including kidney failure, heart disease, hypoglycemia, hyperglycemia, eye problems, and amputations, among others (1). Clearly diabetes is of epidemic proportions, and it is a major health concern for the United States and the rest of the world. In contrast to T1D which is characterized by the autoimmune destruction of pancreatic beta- cells, T2D is an extremely heterogeneous disease characterized by either abnormal production of insulin or insulin resistance (3). Some researchers consider the categorization of T2D as a single disease entity to be a major error due to the large heterogeneity of the disease. T2D exists on a continuum between insulin-resistant obese patients to insulin-deficient lean patients (4), and patient phenotypes tend to differ in insulin dependency, metabolic characteristics, and the presence of GAD antibodies (5). As a result of the phenotypic and genetic heterogeneity of T2D, the disease is difficult to diagnose. T2D is typically diagnosed based on diabetes patients simply not meeting the diagnosis criteria for other forms of the disease (6). Due to this heterogeneity, researchers have suggested that there may be multiple subtypes of the disease.
  • 5. 2 Identifying such subtypes would improve diagnostic tools, and currently research is underway to identify these subtypes. Recent research has focused upon identifying T2D subtypes based on different physical characteristics of patients. Faerch et. al grouped T2D patients based on their fasting insulin and two hour glucose serum concentrations and found that these groups of patients had differing trajectories of multiple phenotypic measurements such as beta-cell functioning and risk of cardiovascular disease (7). More recently, Bapat et al. identified an increased level of Treg cells (cells involved in immune responses and inflammation) in mice with age-associated T2D as compared to healthy mice and mice with obesity-associated T2D. They also found that blocking the growth of these Treg cells in mice prevented age-associated insulin resistance, suggesting a major etiological difference between T2D patients (8). In addition to these studies, recent research has used mathematical tools for grouping patients based on multiple characteristics. One common method, called k-means clustering, has been used to classify different subtypes of Parkinson’s disease as well as to group individuals based on their metabolic characteristics for targeted nutritional advice (9,10). The use of clustering to subgroup T2D patients is very limited. One study published in Science Translational Medicine used topological analysis to create subgroups of T2D patients based on 73 clinical features. This process resulted in three subgroups of T2D patients with differing genetic associations between the groups (11). Despite this study, and other studies that focus on the genetics of T2D, only 15% of the heritability of T2D has been explained, while around 80% of the heritability of T1D has been accounted for (6). Even with these promising results on subtyping and the genetics of T2D, a comprehensive picture is still lacking that includes both the genetics and the etiology of T2D.
  • 6. 3 The objective of our study is to cluster patients who continue on to develop T2D into distinct subtypes and to test whether genetic risk factors differ between the two subtypes. This analysis will allow us to identify genes that play a role in T2D development, but were previously unidentified due to their effect on only one subgroup of T2D patients. This research is clinically useful because it allows us to further understand the heterogeneity of T2D and thus to create individualized treatments (4). In addition, the identification of genetic risk factors will allow us to learn more about the etiology of T2D by performing research on the molecular functions of any genes identified. Lastly, the ability to cluster patients based on phenotypic characteristics before T2D development allows us to more accurately estimate disease risk by allowing us to consider physical characteristics and genetics jointly for pre-diabetic patients. Currently, genetic risk scores (GRS), or models that use genetic data as inputs to estimate disease risk, are used to estimate patient risk for T2D. Some of these models also incorporate physical characteristics (12). Our analysis allows us to more accurately describe how the physical environment and genetics interact to lead to T2D development, by creating subtypes of patients based on physical characteristics and by determining whether some genes cause increased risk of T2D for a certain subtype. Methods Studies We used data on 13,459 GWAS study participants obtained from the Framingham Heart Study (FRAM), the MESA SHARe Study (MESA), and the Atherosclerosis Risk in Communities study
  • 7. 4 (ARIC). The first study, FRAM was conducted to identify risk factors for cardiovascular disease. The data used in our study was from the offspring cohort which consisted of men and women ages 30 to 62 years living in Framingham, Massachusetts. The phenotypic data used in our study was from visit four of the FRAM Offspring cohort (13). The second study, MESA, was a prospective cohort study with the purpose of investigating cardiovascular disease. The genotyped cohort used in our study consisted of men and women ages 45 to 84 years from six different communities across the United States (14). The last study, ARIC is another prospective study with the aim of identifying the causes and clinical outcomes of atherosclerosis. The cohort component used from this study consisted of men and women ages 45 to 64 years from four communities across the United States (15). The data on self-declared white participants from all three of these studies was compiled to create our cohort. All individuals with prevalent T2D at the initiation of data collection were excluded from the study, and subjects were divided into cases and controls. Cases consisted of those individuals who developed T2D over the course of follow-up, while controls were subjects who did not develop T2D over the course of follow-up. Among the cases we excluded individuals on cholesterol-lowering or hypertension medication along with anyone with missing values for the phenotypic variables of interest. The final cases dataset consisted of 178 FRAM participants, 109 MESA participants, and 545 ARIC participants, for a total of 832 individuals. The control dataset consisted of 2,602 FRAM participants, 2,119 MESA participants, and 7,345 ARIC participants, for a total of 12,066 individuals.
  • 8. 5 Phenotypes The phenotypes chosen for clustering included the important metabolic and anthropometric measurements that were available in all three studies. These variables included sex, body-mass index (BMI), waist-to-hip ratio (WHR), triglycerides (TG), high-density lipoprotein (HDL), fasting glucose (FG), fasting insulin (FI), total cholesterol (TC), systolic blood pressure (SBP), and diastolic blood pressure (DBP). These phenotypes are common measurements used in risk scores for predicting T2D development (12). Further glucose, total cholesterol, and triglyceride levels have all been determined to be important indicators of metabolic health (10). The units of measurement for TG, HDL, FI, and FG differed between the studies. To control for any variation between studies, all phenotypic variables were scaled in the cohort dataset before clustering analysis. Genotypes 600 SNPs were selected for analysis along with 22 genetic risk scores (GRS). GRS were calculated for the phenotypes of T2D, FI, FG, two hour glucose (THG), proinsulin (PRO), SNPs relating to HbA1c (HBA), low levels of adiponectin (ADPN), BMI, WHR, TC, low-density lipoprotein (LDL), HDL, adiposity (FAT), C-reactive protein (CRP), serum urate (URATE), blood pressure (BP), insulin resistance (IR), beta-cell function (BC), WHR adjusted BMI, and triglycerides (TG). GRS risk scores were calculated as weighted averages of alleles that have been previously identified as risk alleles for the corresponding phenotypes.
  • 9. 6 Clustering Method K-means clustering was used to create clusters of T2D patients from our cases cohort. The k- means clustering method is a partitioning method where participants are grouped based on a pre-specified number of clusters. In the initial stage, cluster membership is determined by Euclidian distance from randomly chosen points. Then the mean of the clusters formed are calculated and the shorter Euclidian distance from the cluster mean is used to determine new clusters. This process is reiterated for a specified number of times or until stability (9). The variables selected for the clustering analysis are listed in the phenotype section of this paper, and all were measured before the patient was diagnosed with T2D. All phenotypic variables were standardized, so one variable did not carry more weight than another in the analysis based on the unit of measurement. Optimal cluster number was determined using the 2.0-10 version of the cascadeKM function from the package Vegan in R. The calinski method argument was used, as determined most appropriate by Milligan et al. (16). An optimal cluster number of two was determined by this method. For the clustering portion of analysis the k-means function in R was used on our dataset of T2D cases only. The default algorithm of Hartigan and Wong (1979) was used, and the argument of 25 repetitions was implemented in order to avoid local optimal solutions. To determine whether the clusters obtained were significantly different from one another a t- test between means of the two clusters for all the phenotypic variables of interest was performed.
  • 10. 7 Statistical Analyses To test whether genetics has differing effects on T2D development between the two clusters a Cox proportional hazards model was used to estimate the effects of genetics and cluster membership upon the risk of T2D development over time. Two Cox proportional hazards models were used for this analysis. Primary Model The first model was estimated on the cases only dataset. In this Cox proportional hazards model, the outcome of hazard rate of T2D development was regressed upon cluster membership, where cluster membership was included as an ‘as.factor’ variable, and an interaction term between cluster membership and a genetic variable (SNP or GRS). This interaction term was the coefficient of interest in our analysis as it conveys a difference in genetic effects between the two datasets. The main effects of age, sex, and the genetic variable were included in this model as control variables. Secondary Model coxph(Surv(time to development, diabetes incidence)~SNP (or GRS) + Age + Sex + as.factor(cluster) + as.factor(cluster)*SNP(or GRS), data = both clusters )
  • 11. 8 The second analysis consisted of running a Cox proportional hazards model on two separate datasets. The first dataset consisted of members of the first cluster and all controls. The second dataset consisted of members of the second cluster and all controls. A genetic variable (SNP or GRS) was used as an input variable into the model, and the hazard rate of T2D development was used as the output. Age and sex were controlled for. This model was estimated for both datasets and the difference between their coefficients for the genetic variables was evaluated to indicate the differing effects of SNPs and GRSs between the two clusters. These models were run for all SNP and GRS of interest. Results A summary of phenotypic characteristics of the cases are shown in Table 1. The Calinski-index algorithm determined that two clusters was the optimal cluster number for our cases dataset. After performing k-means clustering in R, we obtained two clusters with significant differences between cluster means for nine out of the eleven clustering variables as determined by a t-test between cluster means (Table 2). The first cluster (C1) consists of mostly women with a higher WHR, lower HDL, and higher FI than cluster 2 (C2) which consisted of mostly men with lower WHR, higher HDL, and lower FI. The differences between the age and total cholesterol phenotypes were insignificant between the clusters. Summary statistics of the phenotypic characteristics of each cluster may be found in Table 3. coxph(Surv(time to development, diabetes incidence)~ SNP(or GRS) +Age + Sex, data=cluster 1 or 2)
  • 12. 9 After clustering was performed, we fit our primary model to the cases only dataset. This model consists of a Cox proportional hazards model with cluster membership coded for as an ‘as.factor’ variable. The most significant results of this analysis are listed in Table 4 (SNP) and Table 5 (GRS). This model did not generate any statistically significant coefficients for the interaction between cluster membership and genotype characteristic (SNP or GRS). The top five most significant SNPs were rs1553318, rs731839, rs6882076, rs2294239, and rs2652834. These SNPS relate to TG, TG, LDL, WHR, and HDL respectively. Furthermore, the most significant GRS scores identified were FAT and TG. Our secondary Cox proportional hazards model was run on the clustering datasets separately, to see what effects certain genetic characteristics had on T2D development. The result of interest in this model is the coefficient corresponding to the genetic characteristic in each cluster model and how they differ between the two clusters. These results are summarized in Table 4 for the top 20 most significant SNPs identified by our primary model and in Table 5 for all the GRS scores tested. The genes that the 20 most significant SNPs correspond to are listed in Table 6. Discussion The objective of our study was to identify subtypes of individuals who go on to develop T2D and to determine whether the role of specific genetic factors differs between the two groups. Our cluster analysis resulted in two groups of phenotypically distinct patients. C1 was generally characterized by women with higher WHR, lower HDL, and higher FI, as compared to C2 which consisted mostly of men with lower WHR, higher HDL, and lower FI. Although there were no
  • 13. 10 statistically significant findings on differing genetic effects between the two groups, we did identify some associations of interest that may be explored in future research. Both the most significant SNPs and the most significant GRS scores relate to adiposity, suggesting a differing interaction between the phenotypes that characterize each cluster and adiposity-related genotypes. Although our two clusters were distinct from one another, they do not reflect the disease subtypes that have been found in recent research. One recent study identified that there is a physiological difference between patients that have age-associated onset of T2D versus obesity-associated onset of the disease. They found that altering Treg cells in mice prevented age-associated onset of T2D while obesity-associated onset of T2D had no association with Treg cells (8). Our clusters did not reflect these different subtypes of T2D. C1 had a slightly higher age on average (55.06 vs. 54.86 years) but the difference is not significant. It is true that many measurements of metabolic health differed between the two groups, but there is not a clear distinction between late age of onset and obesity-related T2D. Another study, more similar to our analysis, looked at subtyping T2D patients based on multiple phenotypic characteristics using a topological approach. This study had more power to identify subtypes than our study due to their wealth of phenotypic data (73 variables included in clustering) and large sample size of 11,210 T2D patients. This analysis yielded three subtypes. The first subtype was characterized by young, overweight patients, the second was characterized by low-weight patients, while the third group of patients had high SBP, serum chloride, and troponin I levels. These subtypes also do not reflect the subtypes discovered in our analysis or the analysis of other researchers, and the detailed differences between their three subtypes included
  • 14. 11 metabolic measurements that were not available in our data, such as white blood cell count and serum albumin (11). Current research is focusing on disease subtyping of T2D because the disease is so heterogeneous, but our study, along with others does not create a clear consensus on distinct subtypes. In addition to the unique subtypes delineated by our analysis, some interesting genetic trends were identified that reflect some current T2D research. The two most significant SNPs identified in our analysis are both related to TG levels. The first SNP, rs1553318, is associated with the HAVCR1 gene. This SNP increased the risk of T2D development in both clusters, but more so in C2. The second SNP, rs731839, is near the PEPD gene. This SNP was associated with increased T2D risk in C1 and decreased risk in C2. It is of interest that both SNPs are associated with TG because the pre-diabetic state is characterized by elevated TG levels (a characteristic of dyslipidemia). Furthermore, TG levels can be used to predict T2D risk, and one study found that looking at the change in TG levels over time helped predict T2D risk in men (17). In addition to the effects of TG on T2D risk, some research suggests that there is an association between a certain variant of the PEPD gene and T2D development. A recent GWAS study on Chinese Hans found that higher levels of n-3 fatty acids help mitigate the increased risk of T2D development caused by the PEPD gene (18). This corresponds to our results where PEPD had a protective effect against T2D development in C2, since C2 is characterized by lower TG levels ( associated with high n-3 fatty acid consumption) (19). The third most significant SNP in our analysis, rs6882076, is near the TIMD4 gene which affects LDL levels. This SNP increased the risk of T2D development in C1 while decreasing risk in C2.
  • 15. 12 Although LDL is not a major characteristic used in calculating T2D risk (12) because LDL levels do not differ significantly between diabetics and non-diabetics, research has found that LDL particles in T2D patients are typically smaller than those in their non-diabetic peers (20).The results of our analysis suggest that there may be some association between the TIMD4 gene and the environment that affects T2D development. There are far reaching implications for T2D research that incorporates both disease subtype and genetic risk. First, by identifying distinct subtypes of patients, one may create more accurate models of disease risk that incorporate both physical characteristics and genetics. The interaction between genes and the environment is complex, and it is not fully detailed in a mathematical analysis. By identifying characteristics of subgroups, researchers can gain a better understanding of what traits may be involved in disease pathways and focus their research in these areas. This may lead to a greater understanding of the disease itself and targeted diagnosis and treatment based on disease subtype. The goal is to find distinct subtypes of the disease that may be clearly defined and then to identify the metabolic pathways involved. Our study had a number of limitations that prevented us from determining whether the patients were clustered into truly distinct subtypes and which made it challenging to find statistically significant results. First, we were not able to test whether the clusters could be recreated in another dataset to confirm our findings. Further, our cohort was small for a genetic analysis (only 832 T2D cases) because we removed all people on cholesterol lowering medication and hypertension medication. Lastly, our cohort included only whites, so the results may not be generalizable to other populations. Despite these weaknesses, our study has
  • 16. 13 multiple strengths including the inclusion of multiple studies which provided a wide range of phenotypic and genetic data. In addition, the use of k-means clustering is not a common method in the T2D literature, but its use has found successful subtypes for Parkinson’s disease (9) and is therefore a promising method for disease subtyping. Future research should focus upon replicating these subtypes in another cohort and including more variables and subjects in the clustering analysis. Once clustering analyses produce replicable results, thereby identifying distinct subtypes of T2D, animal models may be used to further understand the disease etiology. There is still much to explore regarding how genetics and the environment interact to influence the development of this disease.
  • 17. 14 References 1. National Diabetes Statistics Report, 2014. Centers for Disease Control and Prevention: National Center for Chronic Disease Prevention and Health Promotion: Division of Diabetes Translation. 2014;1-11. 2. King H, Aubert R, Herman W. Global Burden of Diabetes, 1995-2025. Diabetes Care. 1998;21:14141-1431. 3. Zimmet P, Albertit KGMM, Shaw J. Global and societal implications of the diabetes epidemic. Nature. 2001;414(6865):782-787. 4. Gale EAM. Is type 2 diabetes a category error? The Lancet. 2013;381(9881):1956-1957. 5. Tuomi T, Santoro N, Caprio S, Cai M, Weng J, Groop L. The many faces of diabetes: a disease with increasing heterogeneity. The Lancet. 2014;383(9922):1084-1094. 6. Groop L, Pociot F. Genetics of diabetes – are we missing the genes or the disease? Molecular and Cellular Endocrinology. 2014;382(1):726-739. 7. Faerch K, Witte DR, Tabak AG, Perreault L, Herder C, et al. Trajectories of cardiometabolic risk factors before diagnosis of three subtypes of type 2 diabetes: a post-hoc analysis of the longitudinal Whiehall II cohort study. The Lancet Diabetes and Endocrinology. 2013;1(1):43-51. 8. Bapat SP, Suh JM, Fang S, Liu S, Zhang Y, et al. Depletion of fat-resident Treg cells prevent age-associated insulin resistance. Nature. 2015;528(7580):137-141. 9. Van Rooden SM, Heiser WJ, Kok JN, Verbaan D, van Hilten JJ, Marinus J. The identification of Parkinson's disease subtypes using cluster analysis: a systematic review. Movement Disorders. 2010;25(8):969–978.
  • 18. 15 10. O’Donovan CB, Walsh MC, Nugent AP, McNulty B, Walton J, et al. Use of metabotyping for the delivery of personalized nutrition. Molecular Nutrition and Food Research. 2014;59(3):377-385. 11. Li L, Cheng W, Glicksberg BS, Gottesman O, Tamler R, et al. Identification of type 2 diabetes subgroups through topological analysis of patient similarity. Science Translational Medicine. 2015;7(311): 311ra174-3. 12. Noble D, Mathur R, Dent T, Meads C, Greenhalgh T. Risk models and scores for type 2 diabetes: systematic review. BMJ. 2011;343:d7163. 13. Framingham Heart Study. NHBI. 2015. https://www.framinghamheartstudy.org/index.php. 14. About MESA. MESA Coordinating Center: University of Washington. 2015. http://www.mesa-nhlbi.org/aboutMESA.aspx. 15. Atherosclerosis Risk in Communities Study. Collaborating Studies Coordinating Center. Department of Biostatistics Gillings School of Global Public Health. North Carolina Chapel Hill. 2015. https://www2.cscc.unc.edu/aric/desc. 16. Milligan GW, Cooper MC. An examination of procedures for determining the number of clusters in a data set. Psychometrika. 1985;50(2):159-179. 17. Tirosh A, Shai I, Bitzur R, Kochba I, Tekes-Manova D, Israeli E. Changes in triglyceride levels over time and risk of type 2 diabetes in young men. Diabetes Care. 2008;31(10):2032+.
  • 19. 16 18. Zheng JS, Huang T, Li K, Chen Y, Xie H, et al. Modulation of the Association between the PEPD variant and the risk of type 2 diabetes by n-3 fatty acids in Chinese Hans. Jounral of Nutrigenetics and Nutrigenomics. 2015;8(1):36-43. 19. Harris WS, Bulchandani D. Why do omega-3 fatty acids lower serum triglycerides? Current Opinion in Lipidology. 2006;17(4):387-393. 20. Nesto RW. LDL cholesterol lowering in type 2 diabetes: what is the optimum approach? Clinical Diabetes. 2008;26(1):8-13.
  • 20. 17 Table 1: Baseline characteristics of participants in each of three cohorts. ARIC FRAM MESA n 545 178 109 Age (yrs) 54.02 (5.51) 54.92 (8.69) 59.87 (9.81) % Female 59.82% 53.37% 52.29% BMI (kg/m2 ) 29.49 (4.76) 30.88 (5.84) 29.52 (6.40) WHR 0.97 (0.06) 0.95 (0.09) 0.93 (0.08) TG* 1.82 (1.04) 196.84 (128.23) 150.79 (95.71) HDL** 1.12 (0.34) 43.19 (12.27) 49.58 (15.41) FG**** 6.00 (0.54) 117.25 (33.65) 95.17 (13.13) TC 5.50 (1.03) 208.77 (34.96) 205.39 (36.79) SBP (mmHg) 121.07 (15.47) 132.88 (16.92) 125.95 (20.60) DBP (mmHg) 73.73 (10.23) 78.92 (9.29) 71.10 (10.40) FI*** 105.31 (60.86) 38.21 (16.28) 11.76 (8.36) *TG units: ARIC - mmol/L; FRAM - Meq/L; MESA - mg/dL **HDL units: ARIC - mmol/L; FRAM&MESA - mg/dL ***FI units: ARIC - pmol/L , FRAM - μmol/L, MESA - mU/L ****FG units: ARIC - mmol/L, FRAM - mg/dl, MESA - mg/dl Table 2: Cluster phenotype comparisons via t-test. Variable clust1mean clust1SD clust2mean clust2SD t-stat p-value WHR_pheno 0.49058065 0.6669220 -0.72419048 0.9669827 20.0258690 2.456061e-67 HDL_pheno -0.46698033 0.6136599 0.68935191 1.0576818 -18.0838850 2.524972e-56 FI_pheno 0.38237082 1.0520284 -0.56445216 0.5506614 16.9133715 5.768705e-55 Sex 0.77620968 0.4172040 0.27678571 0.4480769 16.2165669 2.746904e-50 TG_pheno 0.31562379 1.1145907 -0.46592083 0.5232078 13.5651645 1.177836e-37 BMI_pheno 0.31943235 0.9644238 -0.47154299 0.8521667 12.4497415 1.460480e-32 DBP_pheno 0.29310623 0.9775874 -0.43268063 0.8646014 11.2642577 2.324341e-27 SBP_pheno 0.24978211 0.9528888 -0.36872597 0.9509042 9.1979556 3.866903e-19 FG_pheno 0.22054251 0.9732836 -0.32556276 0.9468849 8.0709721 2.857736e-15 TC_pheno 0.05963691 0.9818105 -0.08803544 1.0184503 2.0820662 3.769834e-02 Age 55.05443548 6.8291606 54.87500000 7.7795657 0.3426994 7.319345e-01
  • 21. 18 Table 3: Characteristics of each cluster organized by study. cluster 1 Cluster 2 Study ARIC FRAM MESA ARIC FRAM MESA N 332 98 68 213 80 41 Age (yrs) 54.33 (5.62) 54.67 (8.28) 59.18 (8.31) 53.54 (5.30) 55.23 (9.20) 61.02 (11.92) % Female 79.00% 79.00% 71.00% 30.00% 22.00% 22.00% BMI (kg/m2 ) 31.00 (4.60) 32.38 (5.52) 32.15 (6.26) 27.14 (4.00) 29.04 (5.71) 25.17 (3.73) WHR 1.00 (0.04) 0.99 (0.06) 0.97 (0.06) 0.92 (0.06) 0.89 (0.08) 0.87 (0.08) TG* 2.14 (1.13) 239.53 (151.69) 181.49 (108.30) 1.34 (0.60) 144.55 (59.68) 99.88 (28.76) HDL** 0.97 (0.21) 36.49 (8.15) 42.71 (9.64) 1.35 (0.37) 51.39 (11.48) 60.98 (16.49) FG**** 6.11 (0.50) 124.18 (39.78) 99.97 (11.71) 5.84 (0.57) 108.76 (21.48) 87.20 (11.45) TC 5.56 (0.97) 209.60 (36.32) 209.65 (40.12) 5.42 (1.11) 207.75 (33.42) 198.34 (29.61) SBP (mmHg) 125.33 (15.03) 136.79 (16.07) 128.70 (17.66) 114.43 (13.73) 128.11 (16.81) 121.40 (24.28) DBP (mmHg) 76.69 (10.11) 82.21 (8.98) 73.29 (9.75) 69.11 (8.59) 74.89 (8.04) 67.48 (10.56) FI*** 127.72 (62.70) 44.60 (18.33) 15.13 (8.90) 70.37 (36.79) 30.39 (8.27) 6.18 (2.14) *TG units: ARIC - mmol/L; FRAM - Meq/L; MESA - mg/dL **HDL units: ARIC - mmol/L; FRAM&MESA - mg/dL ***FI units: ARIC - pmol/L , FRAM - μmol/L, MESA - mU/L ****FG units: ARIC - mmol/L, FRAM - mg/dl, MESA - mg/dl
  • 22. 19 Table 4: Most significant primary and secondary analysis results corresponding to SNP data. SNP Interaction coefficient for SNP and cluster membership p-value Hazard Ratio for Cluster 1 Hazard Ratio for Cluster 2 Hazard Ratio Difference rs1553318 0.3298372 0.0018933 2.48E- 03 0.0041739 1.69E-03 rs731839 -0.313511 0.0052293 -1.58E- 01 -0.055687 1.02E-01 rs6882076 0.2692967 0.0091396 1.58E- 02 -0.008271 2.41E-02 rs2294239 -0.267014 0.0093985 -8.76E- 02 -0.053721 3.39E-02 rs2652834 -0.307825 0.0115077 -2.15E- 02 0.0146121 3.61E-02 rs13139571 -0.315492 0.0115378 -1.52E- 01 -0.103202 4.84E-02 rs576674 0.3626719 0.0118196 -8.95E- 02 0.0898402 1.79E-01 rs4823006 -0.2515 0.0119255 -7.15E- 02 -0.013835 5.77E-02 rs849135 -0.236855 0.01251 -1.39E- 01 0.0084402 1.48E-01 rs6477694 0.2613947 0.0144687 5.31E- 03 0.0278352 2.25E-02 rs7998202 0.4151296 0.0148486 8.41E- 02 0.1806077 9.65E-02 rs1689800 -0.241809 0.0187326 6.38E- 02 0.0086597 5.51E-02 rs6804842 0.2314893 0.0218175 9.61E- 02 0.0206176 7.55E-02 rs7739232 -0.457728 0.0244218 8.93E- 02 0.0068656 8.25E-02 rs2779116 0.2566808 0.0257419 -6.03E- 02 -0.097148 3.69E-02 rs6795735 -0.225945 0.0266387 2.60E- 02 -0.022721 4.88E-02 rs2954022 0.231793 0.0267414 -4.37E- 02 0.0294779 7.32E-02 rs2078267 -0.228196 0.027513 -4.82E- 02 0.0031743 5.14E-02 rs2954029 0.2308958 0.0275867 -4.88E- 02 0.0217629 7.05E-02 rs12328675 0.3312844 0.0291462 -1.17E- -0.01858 9.84E-02
  • 23. 20 01 rs4865796 -0.241173 0.0300234 3.18E- 02 -0.051541 8.34E-02 rs11694172 -0.256426 0.0301398 3.00E- 02 -0.057769 8.77E-02 rs8182584 -0.2283 0.0331252 -1.00E- 01 -0.0663 3.38E-02 rs2290547 0.3144762 0.0451986 -1.89E- 01 0.1023083 2.91E-01 rs10929925 -0.216843 0.0457031 -3.29E- 02 -0.039378 6.48E-03 rs17367504 -0.281034 0.0471991 1.96E- 02 -0.021738 4.13E-02 rs7225700 -0.20684 0.0485637 1.79E- 02 -0.078474 9.64E-02 rs3780181 0.3922958 0.0497388 -4.78E- 02 0.0059285 5.38E-02
  • 24. 21 Table 5: Primary and Secondary analysis results corresponding to GRS data. GRS Interaction coefficient for GRS and cluster membership p-value Hazard Ratio for Cluster 1 Hazard Ratio for Cluster 2 Hazard Ratio Difference FAT_GRS -1.035413 0.0464675 -0.125982 0.1108843 0.236866 TG_GRS -0.007944 0.0638599 -0.001277 -0.014616 0.013339 ADPN_GRS 1.7136247 0.0812667 1.0190954 0.9404897 0.078606 CRP_GRS 0.5235282 0.0934355 -0.138709 0.129723 0.268432 BMI_GRS 0.2146506 0.1011368 0.2438031 0.0369369 0.206866 PRO_GRS 1.4593472 0.1190019 0.5452933 -0.725566 1.270859 IR_GRS 0.0855827 0.1244277 0.0664784 0.0578434 0.008635 HDL_GRS -0.028909 0.142099 0.0128152 -0.032373 0.045188 URATE_GRS 0.3532343 0.1545706 -0.062828 0.139879 0.202707 HBA_GRS 1.3121146 0.1706568 0.4518716 1.6238976 1.172026 TG_GRS40 -0.431003 0.1836546 0.2381145 -0.864953 1.103068 BP_GRS 0.0507751 0.1887276 -0.032825 -0.02229 0.010535 FI_GRS 2.186906 0.1951746 2.8075312 2.6641247 0.143407 T2D_GRS 0.0767174 0.2964663 0.2875567 0.3214717 0.033915 LDL_GRS -0.008485 0.3233295 -0.006835 -0.010248 0.003413 TC_GRS -0.005176 0.4849864 -0.006943 -0.011592 0.004649 BC_GRS -0.027769 0.4905962 0.1140514 0.0997041 0.014347 WHR_GRS -0.648085 0.4958957 1.4135402 -0.021876 1.435416 FG_GRS 0.398858 0.5610801 1.7399193 2.3933806 0.653461 BMI96_GRS -0.163293 0.717723 0.5384569 0.209882 0.328575 WHRadjBMI48_GRS -0.212589 0.7464577 0.8287973 0.0777437 0.751054 THG_GRS 0.115274 0.796558 0.174948 0.0680792 0.106869
  • 25. 22 Table 6: Phenotypes and genes relating to the most significant SNPs in the primary analysis. SNP Associated Trait Effect Risk Allele Gene (ID) Chromosome rs1553318 TG 2.63 HAVCR1 5:157052312 rs731839 TG 0.022 PEPD 19:33408159 rs6882076 LDL 1.67 C TIMD4 5:156963286 rs2294239 WHR 0.025 ZNRF3 22:29053489 rs2652834 HDL 0.39 LACTB 15:63104668 rs13139571 BP 0.321259 GUCY1A3, LOC105377506 4:155724361 rs576674 FG 0.016697 G KL, ~36 kb upstream 13:32980164 rs4823006 WHR 0.023 A ZNRF3 22:29055683 rs849135 T2D 1.11 G JAZF1 7:28156794 rs6477694 BMI C C9orf4 9:109170062 rs7998202 HBA 0.031 G ATP11AUN 13:112677554 rs1689800 HDL 0.47 G GLUL, ZNF648 1:182199750 rs6804842 BMI G LOC101927874 3:25064946 rs7739232 HIPadjBMI A Locus: KLHL31 6:53675537 rs2779116 HBA 0.024 T SPTA1 1:158615625 rs6795735 T2D 1.08 C ADAMTS9-AS2 3:64719689 rs2954022 TC 2.3 C LOC105375745 8:125470379 rs2078267 URATE 0.073 C SLC22A11 11:64566642 rs2954029 TG 5.64 A LOC105375745 8:125478730 rs12328675 HDL 0.68 T COBLL1 2:164684290
  • 26. 23 rs4865796 FI 0.015358 A ARL15 5:53976834 rs11694172 TC 0.028 G FAM117B 2:202667581 rs8182584 T2D 1.04 T PEPD 19:33418804 rs2290547 HDL -0.03 A SETD2 3:47019693 rs10929925 HIP C SOX11 2:6015425 rs17367504 BP 0.9030779 A MTHFR 1:11802721 rs7225700 LDL 0.87 C LOC102724508 17:47314438 rs3780181 TC -0.044 G VLDLR 9:2640759