PHARMACOGENOMIC DATA MINING
with Hierarchical Clustering Algorithms
Ohene Z. Frank
CSC 576 Data Warehousing and Mining
Final Report
Frank | PGX Data Mining 1
PHARMACOGENOMIC DATA MINING WITH HIERARCHICAL CLUSTERING ALGORITHMS
Designer drugs, individualized drugs and personalized medicine are a few of the
buzzwords proliferating on the biotech information superhighway; they are
widely used by pharmaceutical scientists, clinical scientists, researchers and medical
humanitarians when referring to pharmacogenomics. Malorye Branca of Bio-IT
World stated, “One of the most seductive lures of the genomic revolution is the
promise of personalized medicine” [4]. Pharmacogenomics is the study of how one’s
genetic makeup affects the body’s response to drugs, and hence sits at the
intersection of genetics, pharmacodynamics and pharmacokinetics. Pharmacogenetics
is widely used synonymously with pharmacogenomics. Conceptually, the two terms
are interchangeable, but from a purist view, pharmacogenomics is the technology
whereas pharmacogenetics is the science. Genaissance Pharmaceuticals defined
pharmacogenomics as the application of genome science (genomics) to the study
of human variability in drug response.
So, what’s the real tumult? In the United States, there are at least 100,000 deaths
annually due to adverse reactions (side effects) to prescription drugs. Moreover,
millions of people are being treated with drugs that are ineffective or have very little
pharmacological effect: beta-blockers given to reduce blood pressure are
ineffective in one-third of patients, and many antidepressants are ineffective in
half of the people who take them [1].
The culpability for the lack of efficacy and the intolerance of many drugs lies mainly
with our genes, which help to determine the way in which our body reacts to, absorbs,
distributes, metabolizes and excretes drugs. Small genetic variations between
people (known as polymorphisms) can alter the behavior of proteins that carry a
drug to its target cells or tissues, neutralize the enzymes that activate a drug or aid in
its excretion, or alter the structure of the receptor to which a drug is
supposed to bind [1]. Variation in immune-system genes can also influence how
particular drugs are tolerated. These slight genetic variations mean that the dose at
which a drug will work may vary hugely from person to person; hence,
one-size-fits-all drug development and prescribing can lead to life-threatening
adverse reactions or, in some cases, death.
Looking forward, the genomics revolution has given us the tools to identify
people who don't fit the standard prescribing mold. Genomics is the use of
high-throughput molecular biology technologies to study large numbers of genes and
gene products simultaneously in whole cells, whole tissues, or whole organisms [2].
The genome is all of the genetic material in a cell or an organism. According to the
U.S. Department of Energy, the genome is an organism’s complete set of DNA. In
the human genome, DNA is arranged into 24 distinct chromosomes, which are
physically separate molecules that range in length from about 50 million to 250
million base pairs [3]. Each chromosome is a single, very long molecule of
double-helical DNA (as illustrated in Figure 1).
Figure 1¹: Illustration of a chromosome replicating its DNA before a cell divides.
Single nucleotide polymorphisms (SNPs) are single-letter variations in the genetic
code that are scattered throughout the genome. Most SNPs are benign, with
absolutely no effect on gene structure or expression; however, a subset of these
variations provides crucial links to disease-causing genes, either because they
directly alter a gene's activity or aid in pinpointing the location of a disease-related
gene [1].
¹ Figure courtesy of Genaissance Pharmaceuticals, Inc.
The abundance of SNPs and the ease with which they can be identified make them
ideal biomarkers for clinical studies. SNPs are also found in genes for drug-metabolizing
enzymes, influencing individuals' ability to process a drug properly.
Many companies have compiled large collections of SNPs with the intention of
developing diagnostic and prognostic tests, as well as to guide the development of
a new generation of drugs that would target genetically determined subsets of
patients [1]. All in all, this type of genomic technology, which aims to identify the
best possible medications for individuals while maximizing efficacy and minimizing
toxicity, is known as pharmacogenomics.
Given the gravity and promise of pharmacogenomics, several genomics companies
are manufacturing DNA microarrays to identify common SNPs that influence the
activity of various enzymes. Ultimately, these chips could help to
prevent life-threatening reactions to drugs, identify appropriate drug doses, and
guide the choice of drug combinations (or concomitant medications) for
patients with complex conditions.
To realize this promise at a faster pace, one can apply data mining
techniques, in particular hierarchical clustering algorithms, to a clinical data
warehouse that contains both clinical trials data and genomic data (anonymized
genotyping and microarray data).
The data mining technique most widely utilized for the analysis of gene expression
data is hierarchical clustering. This type of clustering algorithm has the advantage
of being relatively simple, and the result can be easily visualized. Hierarchical
clustering is an agglomerative approach in which single expression profiles are
joined to form groups, which are further joined until the process has been carried to
completion, forming a single hierarchical tree [5].
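The agglomerative procedure described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not the
implementation used in the cited work: the function name agglomerate, the gene
names and the expression values are all made up, Euclidean distance is assumed,
and the default linkage (min) corresponds to single linkage.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def agglomerate(profiles, linkage=min):
    """Join the two closest clusters repeatedly until a single tree remains.

    profiles: dict mapping a (hypothetical) gene name to its expression vector.
    linkage:  reduces all pairwise member distances to one cluster distance;
              min gives single linkage, max gives complete linkage.
    Returns the tree as nested tuples plus the distance at each merge.
    """
    # Each cluster is (tree, member_vectors); start with one cluster per profile.
    clusters = [(name, [vec]) for name, vec in profiles.items()]
    heights = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest linkage distance.
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            d = linkage(dist(a, b) for a in clusters[i][1] for b in clusters[j][1])
            if best is None or d < best[0]:
                best = (d, i, j)
        d, i, j = best
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        # Replace the two joined clusters with their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
        heights.append(d)
    return clusters[0][0], heights


# Made-up log2 expression ratios for four hypothetical genes.
profiles = {
    "geneA": (1.0, 1.2, 0.9),
    "geneB": (1.1, 1.1, 1.0),
    "geneC": (-0.8, -1.0, -1.1),
    "geneD": (-1.0, -0.9, -1.2),
}
tree, heights = agglomerate(profiles)
print(tree)     # nested tuples describing the hierarchical tree
print(heights)  # distance at which each join happened
```

The pairwise search is repeated from scratch at every step, which keeps the
sketch short but is far slower than what a dedicated clustering library would do
on real microarray data.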
There are six main hierarchical clustering algorithms (single-linkage, complete-
linkage, average-linkage, weighted pair-group average, within-groups and Ward’s
method) that can be applied to gene expression profiling (microarray) data analysis.
These clustering algorithms differ in how distances are
calculated between the growing clusters and the remaining members (including
other clusters) in the data set [5].
Single-linkage Clustering: This method is also referred to as the minimum, or
nearest-neighbor method. The distance between two clusters, x and y, is
calculated as the minimum distance between a member of cluster x and a
member of cluster y. This method tends to produce “loose” clusters, because
clusters can be joined if any two members are close together. It often results in
the sequential addition of single samples to an existing cluster, which in turn
produces trees with many long, single-addition branches representing clusters
that have grown by accumulation.
Complete-linkage Clustering: This method is also referred to as the maximum
or furthest-neighbor method. The distance between two clusters is calculated
as the greatest distance between members of the relevant clusters. This
method tends to produce very compact clusters of elements, and the clusters
are often very similar in size.
Average-linkage Clustering: This method is also referred to as the unweighted
pair-group method using arithmetic averages (UPGMA). The average distance is
calculated from the distance between each point in one cluster and all points in
another cluster. The two clusters with the lowest average distance are joined
together to form a new cluster.
Weighted Pair-group Average: This method is identical to average-linkage
clustering (as described above), except that the size of the respective clusters
is used as a weight in the computations. This method should be used when
the cluster sizes are suspected to be greatly uneven.
Within-groups Clustering: This method is also similar to average-linkage
clustering, except that the clusters are merged and a cluster average is used for
further calculations instead of the individual cluster elements. This method
tends to produce tighter clusters than average-linkage clustering.
Ward's Method: In this method, cluster membership is determined by calculating
the total sum of squared deviations from the mean of a cluster. Clusters are
joined so that each join produces the smallest possible increase in the sum of
squared errors.
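To make the differences between these criteria concrete, the sketch below
computes the single-linkage, complete-linkage and average-linkage distances
between two small clusters, together with the increase in the sum of squared
errors that Ward's method would compare. It is a minimal illustration, not a
reference implementation: the function names and the sample points are
assumptions, Euclidean distance is assumed throughout, and the weighted
pair-group and within-groups variants are left out.

```python
from math import dist
from statistics import fmean


def single_linkage(x, y):
    """Minimum distance between a member of x and a member of y."""
    return min(dist(a, b) for a in x for b in y)


def complete_linkage(x, y):
    """Greatest distance between a member of x and a member of y."""
    return max(dist(a, b) for a in x for b in y)


def average_linkage(x, y):
    """Mean of all pairwise distances between members of x and y."""
    return fmean(dist(a, b) for a in x for b in y)


def centroid(cluster):
    """Component-wise mean of the vectors in a cluster."""
    return tuple(fmean(col) for col in zip(*cluster))


def sum_of_squares(cluster):
    """Sum of squared deviations of the members from the cluster mean."""
    c = centroid(cluster)
    return sum(dist(v, c) ** 2 for v in cluster)


def ward_increase(x, y):
    """Increase in the total sum of squared errors if x and y were joined."""
    return sum_of_squares(x + y) - sum_of_squares(x) - sum_of_squares(y)


# Two small made-up clusters of two-dimensional points.
x = [(0.0, 0.0), (0.0, 1.0)]
y = [(3.0, 0.0), (4.0, 0.0)]
for name, fn in [("single", single_linkage), ("complete", complete_linkage),
                 ("average", average_linkage), ("ward", ward_increase)]:
    print(f"{name:8s} {fn(x, y):.3f}")
```

For any pair of clusters the single-linkage distance is never larger than the
average-linkage distance, which in turn is never larger than the
complete-linkage distance; this is one way to see why single linkage joins
clusters more readily and complete linkage keeps them compact.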
Figure 3²: Hierarchical Clustering Demonstration
Figure 3 shows gene expression data that were subjected to average-linkage,
complete-linkage and single-linkage hierarchical clustering using a
Euclidean distance metric, with gene-expression families (A–J) color coded
for comparison. Genes that are up-regulated appear in red, and those that are
down-regulated appear in green, with the relative log2(ratio) reflected by the
intensity of the color [5].
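The quantities behind such a figure can be illustrated with a short Python
sketch. The intensities and gene profiles below are made up, the function names
are hypothetical, and showing unchanged genes in black is an assumption based on
common heat-map practice rather than something stated in the figure.

```python
from math import dist, log2


def log_ratio(sample, reference):
    """log2 of the sample/reference intensity ratio for one measurement."""
    return log2(sample / reference)


def color(value):
    """Red for up-regulated, green for down-regulated, black for unchanged."""
    if value > 0:
        return "red"
    if value < 0:
        return "green"
    return "black"  # assumed convention for a ratio of exactly 1


# Made-up (sample, reference) intensities for two hypothetical genes on three arrays.
gene1 = [log_ratio(s, r) for s, r in [(800, 200), (400, 400), (100, 400)]]
gene2 = [log_ratio(s, r) for s, r in [(400, 200), (400, 400), (200, 400)]]

print(gene1)                      # [2.0, 0.0, -2.0]
print([color(v) for v in gene1])  # ['red', 'black', 'green']
print(dist(gene1, gene2))         # Euclidean distance between the two profiles
```

Working with log2 ratios makes a fourfold increase (+2) and a fourfold decrease
(-2) symmetric around zero, which is why the color intensity can be read the
same way in both directions.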
² Courtesy of Nature Reviews, Nature Publishing Group
The aim and allure of pharmacogenomic data mining is to discover knowledge
from a clinical genomic data warehouse (comprising both genomic and clinical
data), in order to identify and prescribe the most effective and least toxic drug for
an individual based on the person’s genetic makeup and the targeted disease.
References
[1] Abbott, A., Nature 425, 760 - 762 (23 October 2003).
[2] Genaissance Pharmaceuticals, Inc., Online Glossary (2004).
[3] US Department of Energy, Human Genome Information Project,
Pharmacogenomics (2004).
[4] Branca, M., The New, New Pharmacogenomics, Bio-IT World (Sept. 9, 2002).
[5] Quackenbush, J., Nature Reviews Genetics 2, 418-427 (2001).
[6] Brown, M., Essentials of Medical Genomics, 163-198 (2003).
[7] Hollinger, M.A., Introduction to Pharmacology 2, 288-290 (2003).

Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter Lessons
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
 
➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men 🔝Thrissur🔝 Escor...
➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men  🔝Thrissur🔝   Escor...➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men  🔝Thrissur🔝   Escor...
➥🔝 7737669865 🔝▻ Thrissur Call-girls in Women Seeking Men 🔝Thrissur🔝 Escor...
 
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night StandCall Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Attibele ☎ 7737669865 🥵 Book Your One night Stand
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men  🔝Mathura🔝   Escorts...
➥🔝 7737669865 🔝▻ Mathura Call-girls in Women Seeking Men 🔝Mathura🔝 Escorts...
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
