Optimal drug prediction from personal genomics profiles

Optimal Drug Prediction From
Personal Genomics Proﬁles
Name: Aarathi Anil
Class:S7 IT
Guide : Asst.Professor Simi MS

11-Sep-2017 2
Optimal drug prediction from personal
genomics profiles
1.Problem Definition
• Cancer is a group of diseases involving abnormal cell growth with the
potential to spread to other parts of the body.
• Cancer patients often show heterogeneous drug responses such that
only a small subset of patients is sensitive to a given anticancer drug.
 Although subtyping analysis has identiﬁed patient subgroups sharing
common biomarkers, there is no effective method to predict the drug
response of individual patients precisely and reliably.

11-Sep-2017 3Optimal drug prediction from personal
genomics profiles
2.Topic Relevance &Application
• The proposed drug prediction algorithm can be used to improve greatly
the reliability of finding optimal drugs for individual patients .
• It will, thus form a key component in the precision medicine
infrastructure for oncology care .
• It is a contribution to medical centre and useful for doctors to treat
cancer patients in an effective manner.
• It will also impact the cancer clinical regimen profoundly.

genomics profiles
Topic Relevance &Application(Contd..)
• Computational analysis is the bottleneck in associating the complex
genomic proﬁles reliably with heterogeneous drug responses.
• It also helps to achieve better therapeutic outcomes by selecting all
the optimal drugs among all anticancer drugs based on
their(patient’s) genomic profiles.

genomics profiles
3.Existing System
• No effective method to predict the drug response of individual
patients precisely and reliably.
• The International Cancer Genome Consortium was formed to
coordinate the generation and management of large-scale genomic
datasets from over 50 cancer types or subtypes collected from all
over the world.
• These large-scale and comprehensive genomic proﬁling datasets
found invaluable in identifying predictive biomarkers of an
individual’s drug response.

11-Sep-2017 6
genomics profiles
Existing System(Contd..)
• Though subtyping analysis of cancer patients has identified patient
subgroups sharing common biomarkers, the genomic instability
patterns among the subtypes of a cancer are complex and not easily
distinguishable.
• Thus, these genomic patterns often failed to offer insight into drug
sensitivity and to identify reliable predictive biomarkers of individual
subtypes.
• There also exist other algorithms to ﬁnd differentially expressed genes
between sensitive groups and resistant groups.

11-Sep-2017 7
genomics profiles
• They ﬁrst tested if t-test could be used to ﬁnd drug signatures.
• They did Shapiro–Wilk test on the expression values of genes across
cell lines and found that the expression value of most of the genes
(over 10,000) did not form normal distribution.
• Thus, t-test and other similar methods were not suitable there.
• Drug response of a patient was also predicted by mining high
throughput genomics data .

11-Sep-2017 8
genomics profiles
• Molecular analysis showed that cancers from different organs may
share similar features, while cancers from the same organ were often
quite distinct.
• As evidenced by the observation that cancer patients with the same
cancer type showed heterogeneous responses to the same therapy.
• It was believed that complex and distinct genomic instability is
responsible for the diversity in drug response.
• Many approaches were developed to reposition or re-purpose
known drugs to new indications based on assumptions that drugs
sharing similar chemical structures would have same targets.

genomics profiles
4.Proposed System
• The Proposed system is a drug prediction algorithm that can
be used to improve greatly the reliability of ﬁnding optimal drugs
for individual patients and will, thus, form a key component in the
precision medicine infrastructure for oncology care.
• For this, 624 cancer cell lines (viewed as individual patients) and
their responses to 75 drugs were obtained from Genomics of Drug
Sensitivity in Cancer (GDSC) .

genomics profiles
Proposed System(Contd..)
• The results showed that the proposed prediction algorithm could
signiﬁcantly improve the chance of ﬁnding sensitive drugs for
individual patients.

11-Sep-2017 11
genomics profiles
 DATA ANDMETHODOLOGY
(1)Data
• SMILES : Simpliﬁed molecular-input line-entry system
• Only drugs that have both SMILES and IC50 information, and cell
lines that have both IC50 and gene expression proﬁle data were
used in their analysis.
• IC50 data of 75 drugs on 624 cell lines in GDSC and 24 drugs on
504 cell lines in CCLE were obtained.

11-Sep-2017
12
genomics profiles
(2)Algorithm Overview
• Let ,
 C the cell line
 D set of known drugs in GDSC
 G the drug-related gene signature consisting of
differentially expressed genes between the drug-
speciﬁc resistant and sensitive cell lines.

11-Sep-2017 13
genomics profiles
Where ,
*
*
*

11-Sep-2017 14
genomics profiles
(3) IC50 Value Normalization
As suggested by GDSC,
 Cell lines with IC50 values greater than the maximum
concentration of a drug are likely to be resistant to that drug.
 Cell lines with IC50 values smaller than maximum concentration
are likely to be sensitive to that drug.
.

11-Sep-2017 15
genomics profiles
METHOD
 For each drug-cell line pair,
1) Calculate similarity scores for pair wise drugs based on their chemical
structures.
2) Then select most similar drugs to the given one.
3) Calculate a gene signature based on the IC50 values of the selected
drugs and gene expression proﬁles of the cell lines.
4) Calculate ﬂexible similarity score between cell lines based on the
gene signature.

11-Sep-2017 16
genomics profiles
5) Finally, assign a drug-cell line effect score by combining drug
similarity scores and cell line similarity scores.
.
• Using this method, the top candidate drugs that have the sensitivity for
the given cell line were predicted.
• Under IC50 Value Normalization ,
divide the raw IC50 value by the maximum concentration of
and then log2-transform it as normalized IC50 value.
For drug cell line pairs that lack IC50 values from GDSC,
set their normalized IC50 to 0 (for IC50=0).

11-Sep-2017
17Optimal drug prediction from personal
genomics profiles
 Cell lines with a normalized IC50 value greater than 0 are
considered to be resistant cell lines to the drug (for IC50>0).
 Cell lines with normalized IC50 value smaller than 0 were
considered to be sensitive cell lines to the drug (for IC50<0).
Since the range of IC50 values of sensitive cell lines differs from
the range of resistant cells, normalized IC50 value was further
scaled to the range of [−1,1] for the following analysis.

11-Sep-2017
genomics profiles
(4) Drug–DrugSimilarity
Given two drugs,


• The “rcdk” R package is used to retrieve their ﬁnger print information
and from the SMILES data.
• The similarity between the two drugs , ,is calculated
using Tanimoto coefﬁcient .

11-Sep-2017
genomics profiles
The tanimoto coefficient is given by the equation:
Where,
 c: the number of finger prints that appear in both
and .
 a: the number of finger prints that only appear in
 b: the number of finger prints that only appear in
 and : finger prints of and respectively.

11-Sep-2017
genomics profiles
(5) Patient–Patient Genomics Similarity
For a given drug , to calculate cell line similarity :
1)calculate the drug–drug similarities, ,between T
and all other drugs
2)rank in descending order of similarity score .
3)Top drugs are selected as .
4)Find a gene signature for .
5)Calculate cell line similarity score based on this signature.

11-Sep-2017
genomics profiles
6) For each drug , separate cell lines into two groups
according to their normalized IC50 values.
If > 0 , is assigned to group 1- Resistant group .
If < 0 , is assigned to group 2- Sensitive group .
7) For each gene we calculate its fold change between the mean
expression levels of the two groups, given by:

11-Sep-2017
genomics profiles
 RESULTS
• Two new validation strategies to reduce the bias invalidation and for
selecting optimal values for and . were conducted.
1. To use twofold cross validation for modeling GDSC dataset
2. To use GDSC dataset as the training set and CCLE data set as
the independent testing set.

11-Sep-2017
genomics profiles
1)To use twofold crossvalidation for modeling GDSC dataset
• Randomly divide the cell lines into two subgroups, with equal size
and each time use one as training set and the other as testing set.
• The testing results are then averaged for each
• The drug responses on the testing set were estimated using drug
treated IC50 values on the training set.
• Top ten prioritized drugs for each testing cell line were selected.
• All the drugs were ranked in increasing order based on their
normalized IC50 values on the given cell line.

11-Sep-2017
genomics profiles
• The enrichment score of the ten drugs in the ordered drug list was
calculated ie;
• (enrichment score) which ranges from −1 to 1, and would be
close to1 if the selected drugs were found at the top of the ranked
drug list and vice versa.
• The prediction power of single drug (rd =1) decreases when more
similar cell lines being added (noise sensitive),
• The prediction power with multiple similar drugs increases with more
similar cell lines (noise robust).

11-Sep-2017
genomics profiles
2) To useGDSC dataset asthe training set and CCLE data set as the independent
testing set
• The CCLE datasets were also collected for the validation. There
were 11 drugs in both GDSC and CCLE database.
• GDSC data was used as the training set, and CCLE database as the
testing dataset.
• Then predict the drug responses on 480 cell lines in CCLE for 11
drugs.

11-Sep-2017
genomics profiles
• p values were calculated for each and
• Set threshold as p<0.05
The p value was calculated as :
• Small p value means that the prediction can discover the more
sensitive drugs to the given cell line, compared with the random
selection.

11-Sep-2017
genomics profiles
• The results are shown in Table I:

11-Sep-2017
genomics profiles
• The p values are smaller than 0.05 in almost all the cases, which
means the top 10 drug-cell line pairs predicted by their model had a
much lower IC50 value compared with random selection, and thus,
the cell lines are more sensitive to these drugs.
In summary, the proposed prediction algorithm can improve
the effective drug selection signiﬁcantly, and the evaluation
results indicate the possibility of selecting the optimal drugs
among all anticancer drugs based on their genomic proﬁles .

11-Sep-2017
genomics profiles
5.CONCLUSION
• Patients with the same type of cancer often respond differently to the
same drug treatment, and only a small subset of patients has
profound sensitivity to the given drugs.
• The complex and distinct genomic instability of individual cancer
patients is believed to be responsible for that diversity in drug
response.
• Their research ﬁnding suggested the potential of selecting optimal
drugs among anticancer drugs for individual patients based on their
genomic proﬁles, which will impact the cancer clinical regimen
profoundly.

11-Sep-2017
genomics profiles
6.REFERENCE
• Discovery of drug mode of action and drug repositioning from
transcriptional responses.
• Drug repositioning for personalized medicine

11-Sep-2017 31
genomics profiles
THANK YOU…….!!!!!

Optimal drug prediction from personal genomics profiles

Recommended

Recommended

More Related Content

What's hot

What's hot (19)

Similar to Optimal drug prediction from personal genomics profiles

Similar to Optimal drug prediction from personal genomics profiles (20)

Recently uploaded

Recently uploaded (20)

Optimal drug prediction from personal genomics profiles