The document describes a proposed system for optimally predicting drugs for cancer patients based on their personal genomics profiles. The system uses drug sensitivity data from cancer cell lines and gene expression profiles to calculate similarity scores between drugs and between patients. These scores are combined to assign a drug-patient effect score, predicting the most sensitive drugs for individual patients. Validation tests on independent datasets showed the system significantly improves effective drug selection over random chance. The system could help select optimal personalized therapies for cancer patients based on their genomic information.
Optimal drug prediction from personal genomics profiles
1. Optimal Drug Prediction From
Personal Genomics Profiles
Name: Aarathi Anil
Class:S7 IT
Guide : Asst.Professor Simi MS
2. 11-Sep-2017 2
Optimal drug prediction from personal
genomics profiles
1.Problem Definition
• Cancer is a group of diseases involving abnormal cell growth with the
potential to spread to other parts of the body.
• Cancer patients often show heterogeneous drug responses such that
only a small subset of patients is sensitive to a given anticancer drug.
Although subtyping analysis has identified patient subgroups sharing
common biomarkers, there is no effective method to predict the drug
response of individual patients precisely and reliably.
3. 11-Sep-2017 3Optimal drug prediction from personal
genomics profiles
2.Topic Relevance &Application
• The proposed drug prediction algorithm can be used to improve greatly
the reliability of finding optimal drugs for individual patients .
• It will, thus form a key component in the precision medicine
infrastructure for oncology care .
• It is a contribution to medical centre and useful for doctors to treat
cancer patients in an effective manner.
• It will also impact the cancer clinical regimen profoundly.
4. 11-Sep-2017 4Optimal drug prediction from personal
genomics profiles
Topic Relevance &Application(Contd..)
• Computational analysis is the bottleneck in associating the complex
genomic profiles reliably with heterogeneous drug responses.
• It also helps to achieve better therapeutic outcomes by selecting all
the optimal drugs among all anticancer drugs based on
their(patient’s) genomic profiles.
5. 11-Sep-2017 5Optimal drug prediction from personal
genomics profiles
3.Existing System
• No effective method to predict the drug response of individual
patients precisely and reliably.
• The International Cancer Genome Consortium was formed to
coordinate the generation and management of large-scale genomic
datasets from over 50 cancer types or subtypes collected from all
over the world.
• These large-scale and comprehensive genomic profiling datasets
found invaluable in identifying predictive biomarkers of an
individual’s drug response.
6. 11-Sep-2017 6
Optimal drug prediction from personal
genomics profiles
Existing System(Contd..)
• Though subtyping analysis of cancer patients has identified patient
subgroups sharing common biomarkers, the genomic instability
patterns among the subtypes of a cancer are complex and not easily
distinguishable.
• Thus, these genomic patterns often failed to offer insight into drug
sensitivity and to identify reliable predictive biomarkers of individual
subtypes.
• There also exist other algorithms to find differentially expressed genes
between sensitive groups and resistant groups.
7. 11-Sep-2017 7
Optimal drug prediction from personal
genomics profiles
Existing System(Contd..)
• They first tested if t-test could be used to find drug signatures.
• They did Shapiro–Wilk test on the expression values of genes across
cell lines and found that the expression value of most of the genes
(over 10,000) did not form normal distribution.
• Thus, t-test and other similar methods were not suitable there.
• Drug response of a patient was also predicted by mining high
throughput genomics data .
8. 11-Sep-2017 8
Optimal drug prediction from personal
genomics profiles
Existing System(Contd..)
• Molecular analysis showed that cancers from different organs may
share similar features, while cancers from the same organ were often
quite distinct.
• As evidenced by the observation that cancer patients with the same
cancer type showed heterogeneous responses to the same therapy.
• It was believed that complex and distinct genomic instability is
responsible for the diversity in drug response.
• Many approaches were developed to reposition or re-purpose
known drugs to new indications based on assumptions that drugs
sharing similar chemical structures would have same targets.
9. 11-Sep-2017 9Optimal drug prediction from personal
genomics profiles
4.Proposed System
• The Proposed system is a drug prediction algorithm that can
be used to improve greatly the reliability of finding optimal drugs
for individual patients and will, thus, form a key component in the
precision medicine infrastructure for oncology care.
• For this, 624 cancer cell lines (viewed as individual patients) and
their responses to 75 drugs were obtained from Genomics of Drug
Sensitivity in Cancer (GDSC) .
10. 11-Sep-2017 10Optimal drug prediction from personal
genomics profiles
Proposed System(Contd..)
• The results showed that the proposed prediction algorithm could
significantly improve the chance of finding sensitive drugs for
individual patients.
11. 11-Sep-2017 11
Optimal drug prediction from personal
genomics profiles
Proposed System(Contd..)
DATA ANDMETHODOLOGY
(1)Data
• SMILES : Simplified molecular-input line-entry system
• Only drugs that have both SMILES and IC50 information, and cell
lines that have both IC50 and gene expression profile data were
used in their analysis.
• IC50 data of 75 drugs on 624 cell lines in GDSC and 24 drugs on
504 cell lines in CCLE were obtained.
12. 11-Sep-2017
12
Optimal drug prediction from personal
genomics profiles
Proposed System(Contd..)
(2)Algorithm Overview
• Let ,
C the cell line
D set of known drugs in GDSC
G the drug-related gene signature consisting of
differentially expressed genes between the drug-
specific resistant and sensitive cell lines.
13. 11-Sep-2017 13
Optimal drug prediction from personal
genomics profiles
Proposed System(Contd..)
Where ,
*
*
*
14. 11-Sep-2017 14
Optimal drug prediction from personal
genomics profiles
Proposed System(Contd..)
(3) IC50 Value Normalization
As suggested by GDSC,
Cell lines with IC50 values greater than the maximum
concentration of a drug are likely to be resistant to that drug.
Cell lines with IC50 values smaller than maximum concentration
are likely to be sensitive to that drug.
.
15. 11-Sep-2017 15
Optimal drug prediction from personal
genomics profiles
Proposed System(Contd..)
METHOD
For each drug-cell line pair,
1) Calculate similarity scores for pair wise drugs based on their chemical
structures.
2) Then select most similar drugs to the given one.
3) Calculate a gene signature based on the IC50 values of the selected
drugs and gene expression profiles of the cell lines.
4) Calculate flexible similarity score between cell lines based on the
gene signature.
16. 11-Sep-2017 16
Optimal drug prediction from personal
genomics profiles
Proposed System(Contd..)
5) Finally, assign a drug-cell line effect score by combining drug
similarity scores and cell line similarity scores.
.
• Using this method, the top candidate drugs that have the sensitivity for
the given cell line were predicted.
• Under IC50 Value Normalization ,
divide the raw IC50 value by the maximum concentration of
and then log2-transform it as normalized IC50 value.
For drug cell line pairs that lack IC50 values from GDSC,
set their normalized IC50 to 0 (for IC50=0).
17. 11-Sep-2017
17Optimal drug prediction from personal
genomics profiles
Proposed System(Contd..)
Cell lines with a normalized IC50 value greater than 0 are
considered to be resistant cell lines to the drug (for IC50>0).
Cell lines with normalized IC50 value smaller than 0 were
considered to be sensitive cell lines to the drug (for IC50<0).
Since the range of IC50 values of sensitive cell lines differs from
the range of resistant cells, normalized IC50 value was further
scaled to the range of [−1,1] for the following analysis.
18. 11-Sep-2017
18Optimal drug prediction from personal
genomics profiles
Proposed System(Contd..)
(4) Drug–DrugSimilarity
Given two drugs,
• The “rcdk” R package is used to retrieve their finger print information
and from the SMILES data.
• The similarity between the two drugs , ,is calculated
using Tanimoto coefficient .
19. 11-Sep-2017
19Optimal drug prediction from personal
genomics profiles
Proposed System(Contd..)
The tanimoto coefficient is given by the equation:
Where,
c: the number of finger prints that appear in both
and .
a: the number of finger prints that only appear in
b: the number of finger prints that only appear in
and : finger prints of and respectively.
20. 11-Sep-2017
20Optimal drug prediction from personal
genomics profiles
Proposed System(Contd..)
(5) Patient–Patient Genomics Similarity
For a given drug , to calculate cell line similarity :
1)calculate the drug–drug similarities, ,between T
and all other drugs
2)rank in descending order of similarity score .
3)Top drugs are selected as .
4)Find a gene signature for .
5)Calculate cell line similarity score based on this signature.
21. 11-Sep-2017
21Optimal drug prediction from personal
genomics profiles
Proposed System(Contd..)
6) For each drug , separate cell lines into two groups
according to their normalized IC50 values.
If > 0 , is assigned to group 1- Resistant group .
If < 0 , is assigned to group 2- Sensitive group .
7) For each gene we calculate its fold change between the mean
expression levels of the two groups, given by:
22. 11-Sep-2017
22Optimal drug prediction from personal
genomics profiles
Proposed System(Contd..)
RESULTS
• Two new validation strategies to reduce the bias invalidation and for
selecting optimal values for and . were conducted.
1. To use twofold cross validation for modeling GDSC dataset
2. To use GDSC dataset as the training set and CCLE data set as
the independent testing set.
23. 11-Sep-2017
23Optimal drug prediction from personal
genomics profiles
Proposed System(Contd..)
1)To use twofold crossvalidation for modeling GDSC dataset
• Randomly divide the cell lines into two subgroups, with equal size
and each time use one as training set and the other as testing set.
• The testing results are then averaged for each
• The drug responses on the testing set were estimated using drug
treated IC50 values on the training set.
• Top ten prioritized drugs for each testing cell line were selected.
• All the drugs were ranked in increasing order based on their
normalized IC50 values on the given cell line.
24. 11-Sep-2017
24Optimal drug prediction from personal
genomics profiles
Proposed System(Contd..)
• The enrichment score of the ten drugs in the ordered drug list was
calculated ie;
• (enrichment score) which ranges from −1 to 1, and would be
close to1 if the selected drugs were found at the top of the ranked
drug list and vice versa.
• The prediction power of single drug (rd =1) decreases when more
similar cell lines being added (noise sensitive),
• The prediction power with multiple similar drugs increases with more
similar cell lines (noise robust).
25. 11-Sep-2017
25Optimal drug prediction from personal
genomics profiles
Proposed System(Contd..)
2) To useGDSC dataset asthe training set and CCLE data set as the independent
testing set
• The CCLE datasets were also collected for the validation. There
were 11 drugs in both GDSC and CCLE database.
• GDSC data was used as the training set, and CCLE database as the
testing dataset.
• Then predict the drug responses on 480 cell lines in CCLE for 11
drugs.
26. 11-Sep-2017
26Optimal drug prediction from personal
genomics profiles
Proposed System(Contd..)
• p values were calculated for each and
• Set threshold as p<0.05
The p value was calculated as :
• Small p value means that the prediction can discover the more
sensitive drugs to the given cell line, compared with the random
selection.
28. 11-Sep-2017
28Optimal drug prediction from personal
genomics profiles
Proposed System(Contd..)
• The p values are smaller than 0.05 in almost all the cases, which
means the top 10 drug-cell line pairs predicted by their model had a
much lower IC50 value compared with random selection, and thus,
the cell lines are more sensitive to these drugs.
In summary, the proposed prediction algorithm can improve
the effective drug selection significantly, and the evaluation
results indicate the possibility of selecting the optimal drugs
among all anticancer drugs based on their genomic profiles .
29. 11-Sep-2017
29Optimal drug prediction from personal
genomics profiles
5.CONCLUSION
• Patients with the same type of cancer often respond differently to the
same drug treatment, and only a small subset of patients has
profound sensitivity to the given drugs.
• The complex and distinct genomic instability of individual cancer
patients is believed to be responsible for that diversity in drug
response.
• Their research finding suggested the potential of selecting optimal
drugs among anticancer drugs for individual patients based on their
genomic profiles, which will impact the cancer clinical regimen
profoundly.
30. 11-Sep-2017
30Optimal drug prediction from personal
genomics profiles
6.REFERENCE
• Discovery of drug mode of action and drug repositioning from
transcriptional responses.
• Drug repositioning for personalized medicine