Identifying and Repurposing Novel Drug Candidates
for Treating Leukemia Using Drug, Protein and
Disease Interaction Networks
Rashell Garretson1, Rut Thakkar2 , Zack East2 , Bin Peng3
Dr. Jake Chen4 and Dr. Walter Jessen5
1Department of Biology, Purdue School of Science, IUPUI; 2Neuroscience Program, Purdue School of Science, IUPUI;
3Department of Computer and Information Science, Purdue School of Science, IUPUI, 4Indiana University Center for Systems
Biology and Personalized Medicine, IUPUI; 5Informatics, Covance, Greenfield, IN
Introduction
Taking a drug from discovery to market takes an average of twelve years.
To reduce the time and cost of new drug development, data mining can be
used to identify currently available drugs and their associated data, and
to prioritize candidates that can be repurposed to treat other diseases.
This study focuses on three subtypes of leukemia: myelomonocytic
leukemia, acute megakaryoblastic leukemia and B-cell prolymphocytic
leukemia, which lack sufficient treatments and have a poor prognosis.
The data mining process begins by gathering information on FDA approved
drugs and drugs in clinical trials to treat these subtypes of leukemia. A
network is then generated by curating information on drug, protein, and
disease interactions. Other diseases are then analyzed through disease to
disease interactions to compile a list of diseases that are closely related to
our leukemia subtypes of interest. Drugs used for these closely related
diseases are then compared with drugs used for leukemia based on their
protein targets, interactions and structure to identify the drugs most
likely to be effective in treating our leukemia subtypes. Repurposing
drugs based on structure, protein interactions, and target similarity can
save substantial time and resources by using drugs that are already on
the market in a novel way, with the ultimate goal of saving lives.
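The comparison of drugs by protein targets described above can be illustrated with a minimal sketch. The poster does not state which similarity measure is used; Jaccard similarity is assumed here purely for illustration, and all drug names and target sets below are hypothetical placeholders rather than curated data.

```python
from __future__ import annotations


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two target sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_candidates(
    reference_targets: dict[str, set[str]],
    candidate_targets: dict[str, set[str]],
) -> list[tuple[str, float]]:
    """Rank candidate (X category) drugs by their best target overlap
    with any reference (D category) drug, highest score first."""
    scored = []
    for name, targets in candidate_targets.items():
        best = max(
            (jaccard(targets, ref) for ref in reference_targets.values()),
            default=0.0,
        )
        scored.append((name, best))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical target sets, for illustration only.
d_drugs = {"drug_d_a": {"CASP3", "CASP8", "TP53"}}
x_drugs = {
    "drug_x_a": {"CASP3", "CASP8"},
    "drug_x_b": {"CYP3A4"},
}
ranking = rank_candidates(d_drugs, x_drugs)
```

A real ranking would also need to weigh structure and interaction data, which this sketch leaves out.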
Methods
Defining Subtypes of Interest
• Subtypes were chosen by reviewing articles about the prognosis,
5-year survival rate, and currently available treatments. Subtypes with a
poor prognosis, a low survival rate, and few effective treatments were
prioritized.
• Myelomonocytic leukemia and acute megakaryoblastic leukemia are
subtypes of acute myeloid leukemia (AML) while B-cell
prolymphocytic leukemia is a subtype of chronic lymphocytic
leukemia (CLL). We used the broader AML and CLL categories in our
drug, disease, and protein interactions to gather more general information.
The more specific subtype information will be added later to find
drugs that target our subtypes of interest.
Disease to Drug
Drugs are separated into categories using two criteria:
• A D category drug is a drug being used for the specific disease of
interest while an X category drug is a drug currently being used for a
related disease.
• A level 1 drug is one that is currently FDA approved. A level 2 drug is
a drug that is currently in clinical trial. A level 3 drug is a drug whose
clinical trial has been terminated, withdrawn or suspended, or, in
this study, any drug in a clinical trial that has not been updated since
2010.
• D1 and X1: Using cancer.gov and the Leukemia and Lymphoma
Society website, information was collected about drugs currently on the
market to treat the chosen subtypes or related diseases.
• D2, D3, X2, and X3: Each subtype and related disease was entered into
clinicaltrials.gov, and all information on clinical trials was
downloaded and sorted. Trials that were listed as terminated,
withdrawn, or suspended, or that had not been updated in the last 5
years, were labeled as category 3. The rest were considered
category 2. All the drugs from each trial were separated, filtered, and
listed.
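The categorization rules above can be sketched as a small function. This is a minimal illustration under stated assumptions: the function names, the `today` parameter, and the lowercase status strings are hypothetical and do not reflect the actual clinicaltrials.gov export format.

```python
from __future__ import annotations

from datetime import date

# Trial statuses that place a drug in level 3 (assumed spelling).
INACTIVE_STATUSES = {"terminated", "withdrawn", "suspended"}


def drug_level(
    fda_approved: bool,
    trial_status: str | None,
    last_updated: date | None,
    today: date,
    stale_years: int = 5,
) -> int:
    """Return 1 (FDA approved), 2 (active trial) or 3 (inactive or stale trial)."""
    if fda_approved:
        return 1
    if trial_status is None or last_updated is None:
        raise ValueError("a non-approved drug needs trial status and update date")
    if trial_status.lower() in INACTIVE_STATUSES:
        return 3
    # Stale if the last update is more than `stale_years` years old.
    # Tuples are compared to avoid building an invalid date (e.g. Feb 29).
    expiry = (last_updated.year + stale_years, last_updated.month, last_updated.day)
    if expiry < (today.year, today.month, today.day):
        return 3
    return 2


def category(is_disease_of_interest: bool, level: int) -> str:
    """Combine the D/X letter with the level, e.g. 'D1' or 'X3'."""
    return ("D" if is_disease_of_interest else "X") + str(level)
```

For example, `category(False, drug_level(False, "Recruiting", date(2013, 1, 1), date(2015, 7, 1)))` yields `"X2"` for a drug in an active trial for a related disease.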
Disease to Protein
• A preliminary set of mutated genes associated with AML and CLL was
identified by reviewing articles on PubMed as well as OMIM.
• Effector genes were identified using the GEO database, which reports
genes that are up- or down-regulated in a disease.
Drug to Protein
• The D1 drug information collected from the disease to drug curation
was evaluated using DrugBank and STITCH, which provided information
about protein interactions and targets for each drug.
Protein to Protein
• Using the Disease to Protein interactions, the key proteins connected
with the subtypes of interest were evaluated using the STRING and
HAPPI databases. These interactions were used to create networks
using Cytoscape.
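As a rough illustration of how curated protein pairs could be handed to Cytoscape, the sketch below writes them in the simple interaction format (SIF), where each line holds a source node, an interaction type, and a target node. The edge list is a hypothetical placeholder, not data retrieved from STRING or HAPPI, and the `pp` interaction label is an assumed choice.

```python
def to_sif(edges, interaction="pp"):
    """Convert undirected protein pairs to tab-delimited SIF lines.

    Self-loops and duplicate pairs (in either order) are dropped.
    """
    seen = set()
    lines = []
    for a, b in edges:
        key = tuple(sorted((a, b)))
        if a == b or key in seen:
            continue
        seen.add(key)
        lines.append(f"{key[0]}\t{interaction}\t{key[1]}")
    return lines


# Hypothetical placeholder edges, for illustration only.
example_edges = [
    ("PROT_B", "PROT_A"),
    ("PROT_A", "PROT_B"),  # duplicate of the first pair
    ("PROT_A", "PROT_C"),
    ("PROT_C", "PROT_C"),  # self-loop, skipped
]
sif_lines = to_sif(example_edges)
```

The resulting lines can be written to a `.sif` file and imported as a network.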
Disease to Disease
• The CMBI and DiseaseConnect databases were used to acquire a list of all
the diseases associated with AML and CLL.
• The list was then analyzed to obtain the top diseases that are similar to
both leukemia types.
Conclusion & Future Studies
Current Status of Research
References
• The UniProt Consortium. UniProt: a hub for
protein information. Nucleic Acids Res. 43:
D204-D212 (2015). http://www.uniprot.org
• Jensen LJ, Kuhn M, Stark M, Chaffron S,
Creevey C, Muller J, Doerks T, Julien P, Roth
A, Simonovic M, Bork P, von Mering C.
STRING 8--a global view on proteins and their
functional interactions in 630 organisms.
Nucleic Acids Res. 2009 Jan;37(Database
issue):D412-6. doi: 10.1093/nar/gkn760. Epub
2008 Oct 21. http://string-db.org
• Kuhn M, Szklarczyk D, Pletscher-Frankild S,
Blicher TH, von Mering C, Jensen LJ, Bork P.
STITCH 4: integration of protein-chemical
interactions with user data. Nucleic Acids Res.
2014 Jan;42(Database issue):D401-7. doi:
10.1093/nar/gkt1207. Epub 2013 Nov 28.
http://stitch.embl.de
• Chen JY, Mamidipalli S, Huan T. HAPPI: an
online database of comprehensive human
annotated and predicted protein interactions.
BMC Genomics. 2009 Jul 7;10 Suppl 1:S16.
doi: 10.1186/1471-2164-10-S1-S16.
http://discovery.informatics.iupui.edu/HAPPI/
• Liu CC, Tseng YT, Li W, Wu CY, Mayzus I,
Rzhetsky A, Sun F, Waterman M, Chen JJ,
Chaudhary PM, Loscalzo J, Crandall E, Zhou
XJ. DiseaseConnect: a comprehensive web
server for mechanism-based disease-disease
connections. Nucleic Acids Res. 2014
Jul;42(Web Server issue):W137-46. doi:
10.1093/nar/gku412. Epub 2014 Jun 3.
http://disease-connect.org
• Law V, Knox C, Djoumbou Y, Jewison T,
Guo AC, Liu Y, Maciejewski A, Arndt D,
Wilson M, Neveu V, Tang A, Gabriel G, Ly C,
Adamjee S, Dame ZT, Han B, Zhou Y,
Wishart DS. DrugBank 4.0: shedding new
light on drug metabolism. Nucleic Acids Res.
2014 Jan 1;42(1):D1091-7. http://www.drugbank.ca
• Bolton E, Wang Y, Thiessen PA, Bryant SH.
PubChem: Integrated Platform of Small
Molecules and Biological Activities. Chapter
12 IN Wheeler RA and Spellmeyer DC, eds.
Annual Reports in Computational Chemistry,
Volume 4. Oxford, UK: Elsevier, 2008, pp.
217-241. doi:10.1016/S1574-1400(08)00012-1.
https://pubchem.ncbi.nlm.nih.gov
• Shannon P, Markiel A, Ozier O, Baliga NS,
Wang JT, Ramage D, Amin N, Schwikowski
B, Ideker T. Cytoscape: a software
environment for integrated models of
biomolecular interaction networks. Genome
Research 2003 Nov; 13(11):2498-504.
http://cytoscape.org
• Edgar R, Domrachev M, Lash AE. Gene
Expression Omnibus: NCBI gene expression
and hybridization array data repository.
Nucleic Acids Res. 2002 Jan 1;30(1):207-10.
http://www.ncbi.nlm.nih.gov/geo/
[Bar chart: Number of Drugs Per Category. x-axis: Category of Drugs (D1, D2, D3, X1, X2, X3); y-axis: Number of Drugs (0 to 600).]
AML and CLL Protein to Protein
Interaction Network
Top Proteins Targeted by D1 Drugs
AML Drug Target     Number of Drugs     CLL Drug Target     Number of Drugs
P42574 (CASP3)      8                   P42574 (CASP3)      6
P08684 (CYP3A4)     8                   P55211 (CASP9)      5
P33527 (ABCC1)      7                   Q14790 (CASP8)      4
P08183 (ABCB1)      6                   P09874 (PARP1)      4
Q14790 (CASP8)      6                   P33527 (ABCC1)      4
P04637 (TP53)       5                   P08183 (ABCB1)      4
Q9UNQ0 (ABCG2)      5                   P08684 (CYP3A4)     4
Q92887 (ABCC2)      4                   P55210 (CASP7)      3
P55211 (CASP9)      4                   Q9UNQ0 (ABCG2)      3
Q16678 (CYP1B1)     4                   P20815 (CYP3A5)     3
Table 2: Top proteins identified as targets in DrugBank and STITCH
from drug to protein interactions. The ten proteins listed are those
targeted by the highest number of D1 drugs for each
subtype.
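The ranking behind a table like this can be sketched as a simple tally of how many drugs list each protein as a target. The function name and the drug and protein identifiers below are hypothetical placeholders; the real input would be the DrugBank and STITCH results for the D1 drugs.

```python
from collections import Counter


def top_targets(drug_targets, n=10):
    """Return the n proteins targeted by the most drugs as (protein, count) pairs.

    Each drug counts at most once per protein; ties are broken alphabetically.
    """
    counts = Counter()
    for targets in drug_targets.values():
        counts.update(set(targets))
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Hypothetical drug-to-target lists, for illustration only.
example = {
    "drug_a": ["P1", "P2"],
    "drug_b": ["P1", "P1"],  # repeated target is counted once
    "drug_c": ["P1", "P2", "P3"],
}
ranked = top_targets(example, n=2)
```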
Table 3: Top related diseases from CMBI. Diseases that appeared in
the related disease lists for both CLL and AML and had a CMBI score
greater than 0.3 were chosen as the top related diseases. These diseases
were used to identify X category drugs as candidates for repurposing.
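The selection rule in this caption can be sketched as a set intersection followed by a threshold filter. The caption does not say whether the 0.3 cutoff applies to one score or to both; the sketch assumes both scores must exceed it, and the disease names and scores are hypothetical placeholders.

```python
def shared_related(aml_scores, cll_scores, threshold=0.3):
    """Return diseases present in both lists whose AML and CLL scores
    both exceed the threshold (assumed interpretation), sorted by name."""
    shared = aml_scores.keys() & cll_scores.keys()
    return sorted(
        d for d in shared
        if aml_scores[d] > threshold and cll_scores[d] > threshold
    )


# Hypothetical scores, for illustration only.
aml = {"disease_a": 0.41, "disease_b": 0.35, "disease_c": 0.25}
cll = {"disease_a": 0.33, "disease_c": 0.45, "disease_d": 0.50}
selected = shared_related(aml, cll)
```

Relaxing the rule to "either score above the threshold" only requires changing `and` to `or`.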
Our team has developed a website that allows us to import the data
collected during the data mining process into a database, where it can be
analyzed and visualized. The data will be stored in a PostgreSQL
database, and Elasticsearch will be used for fuzzy searching,
which will allow us to narrow down our searches to find key
information. The website can also show the relationships among
drug, disease and protein interactions, which we can use to create models
and networks. The next steps are to improve the website's functionality
so that we can import all of our collected data on drugs, diseases, and
proteins. We will then use the website to gather information about the
interactions among the imported data to help us create a model
for identifying which of the drug candidates we found are
best suited for repurposing.
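To give a sense of what fuzzy searching over drug names involves, the sketch below uses Python's standard-library `difflib` as a stand-in. It is not the Elasticsearch setup described above and makes no claim about how the website is implemented; the function name and cutoff are assumptions, and the drug names are taken from the poster's own tables.

```python
import difflib


def fuzzy_lookup(query, names, cutoff=0.6, limit=5):
    """Return up to `limit` names that approximately match the query,
    best match first, ignoring letter case."""
    by_lower = {name.lower(): name for name in names}
    hits = difflib.get_close_matches(
        query.lower(), list(by_lower), n=limit, cutoff=cutoff
    )
    return [by_lower[h] for h in hits]


drug_names = ["Methotrexate", "Imatinib", "Etoposide", "Mercaptopurine", "Asparaginase"]
matches = fuzzy_lookup("methotrexat", drug_names)
```

A misspelled or truncated query such as `"methotrexat"` still resolves to the intended drug, which is the behavior a fuzzy search is meant to provide.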
Top Associated Diseases
Disease                               AML/CLL CMBI Score     Disease                               AML/CLL CMBI Score
Acute Lymphoblastic Leukemia          0.3682/0.3979          Chronic Myeloid Leukemia              0.3999/0.4131
Mixed lineage leukemia                0.2942/0.3031          T-cell acute lymphocytic leukemia     0.4212/0.3074
Non-bruton agammaglobulinemia         0.335/0.4215           Hemophagocytic lymphohistiocytosis    0.3328/0.3775
Mycosis Fungoides                     0.3083/0.3194          B-cell lymphoma                       0.3129/0.2957
Non-hodgkin lymphoma                  0.3092/0.3293          Hodgkin lymphoma                      0.3092/0.3424
Burkitt's lymphoma                    0.3092/0.3622          Werner Syndrome                       0.2948/0.303
Figures 1 and 2: Proteins from disease to protein interaction were
combined with additional interacting proteins using STRING and HAPPI
databases. Cytoscape was used to create a network of connections between
proteins associated with each subtype.
Figure 3: Number of drugs found using clinicaltrials.gov, cancer.gov,
the Leukemia and Lymphoma Society, and articles found on PubMed for
AML and CLL as well as the top associated diseases. D1 = FDA approved
drugs for AML or CLL. D2 = drugs currently in clinical trial for AML or
CLL. D3 = drugs with clinical trial information that has not been updated
in the last five years or clinical trials that have been suspended,
terminated, or withdrawn. X category drugs follow the same rules as D
category drugs, but they are being used for the top associated diseases.
Table 1: Top proteins identified as targets in DrugBank and STITCH
from drug to protein interactions. The ten proteins listed are those
targeted by the highest number of D1 drugs for each
subtype.
Popularity of D Category Drugs by PubMed Search
Name of drug      Number of PubMed articles found     Name of drug                     Number of PubMed articles found
Cytoxan           6967                                Etoposide                        3421
Methotrexate      6668                                Mercaptopurine                   2860
Imatinib          6442                                Tetradecanoylphorbol acetate     2759
Aminopterin       5268                                Asparaginase                     2701
Anthracycline     4175                                Cytosar                          2508

More Related Content

What's hot

Research proposal sjtu
Research proposal sjtuResearch proposal sjtu
Research proposal sjtuAqsa Qambrani
 
Cancer Moonshot, Data sharing and the Genomic Data Commons
Cancer Moonshot, Data sharing and the Genomic Data CommonsCancer Moonshot, Data sharing and the Genomic Data Commons
Cancer Moonshot, Data sharing and the Genomic Data CommonsWarren Kibbe
 
National Cancer Data Ecosystem and Data Sharing
National Cancer Data Ecosystem and Data SharingNational Cancer Data Ecosystem and Data Sharing
National Cancer Data Ecosystem and Data SharingWarren Kibbe
 
A Vision for a Cancer Research Knowledge System
A Vision for a Cancer Research Knowledge SystemA Vision for a Cancer Research Knowledge System
A Vision for a Cancer Research Knowledge SystemWarren Kibbe
 
Drug Safety: Fluoroquinolone
Drug Safety: FluoroquinoloneDrug Safety: Fluoroquinolone
Drug Safety: FluoroquinoloneBhargav Darji
 
Cell centered database for immunology and cancer research feb252016
Cell centered database for immunology and cancer research feb252016Cell centered database for immunology and cancer research feb252016
Cell centered database for immunology and cancer research feb252016Ann-Marie Roche
 
Biomarkers brain regions
Biomarkers brain regionsBiomarkers brain regions
Biomarkers brain regionsAnn-Marie Roche
 
NCI Cancer Genomics, Open Science and PMI: FAIR
NCI Cancer Genomics, Open Science and PMI: FAIR NCI Cancer Genomics, Open Science and PMI: FAIR
NCI Cancer Genomics, Open Science and PMI: FAIR Warren Kibbe
 
SuperComputing 16 HPC Matters Panel on Precision Medicine
SuperComputing 16 HPC Matters Panel on Precision MedicineSuperComputing 16 HPC Matters Panel on Precision Medicine
SuperComputing 16 HPC Matters Panel on Precision MedicineWarren Kibbe
 
Interrogating differences in expression of targeted gene sets to predict brea...
Interrogating differences in expression of targeted gene sets to predict brea...Interrogating differences in expression of targeted gene sets to predict brea...
Interrogating differences in expression of targeted gene sets to predict brea...Enrique Moreno Gonzalez
 
Day 2 Big Data panel at the NIH BD2K All Hands 2016 meeting
Day 2 Big Data panel at the NIH BD2K All Hands 2016 meetingDay 2 Big Data panel at the NIH BD2K All Hands 2016 meeting
Day 2 Big Data panel at the NIH BD2K All Hands 2016 meetingWarren Kibbe
 
C-Change Cancer Big Data, NCI Genomic Data Commons, Cloud Pilots
C-Change Cancer Big Data, NCI Genomic Data Commons, Cloud PilotsC-Change Cancer Big Data, NCI Genomic Data Commons, Cloud Pilots
C-Change Cancer Big Data, NCI Genomic Data Commons, Cloud PilotsWarren Kibbe
 
NCI Cancer Imaging Program - Cancer Research Data Ecosystem
NCI Cancer Imaging Program - Cancer Research Data EcosystemNCI Cancer Imaging Program - Cancer Research Data Ecosystem
NCI Cancer Imaging Program - Cancer Research Data EcosystemWarren Kibbe
 
MINING OF IMPORTANT INFORMATIVE GENES AND CLASSIFIER CONSTRUCTION FOR CANCER ...
MINING OF IMPORTANT INFORMATIVE GENES AND CLASSIFIER CONSTRUCTION FOR CANCER ...MINING OF IMPORTANT INFORMATIVE GENES AND CLASSIFIER CONSTRUCTION FOR CANCER ...
MINING OF IMPORTANT INFORMATIVE GENES AND CLASSIFIER CONSTRUCTION FOR CANCER ...ijsc
 
Piloting a Comprehensive Knowledge Base for Pharmacovigilance Using Standardi...
Piloting a Comprehensive Knowledge Base for Pharmacovigilance Using Standardi...Piloting a Comprehensive Knowledge Base for Pharmacovigilance Using Standardi...
Piloting a Comprehensive Knowledge Base for Pharmacovigilance Using Standardi...Richard Boyce, PhD
 
Chapter 1. Introduction
Chapter 1. IntroductionChapter 1. Introduction
Chapter 1. Introductionbutest
 
Overcoming obstacles to repurposing for neurodegenerative disease
Overcoming obstacles to repurposing for neurodegenerative diseaseOvercoming obstacles to repurposing for neurodegenerative disease
Overcoming obstacles to repurposing for neurodegenerative diseaseLona Vincent
 
Unravelling the molecular linkage of co morbid
Unravelling the molecular linkage of co morbidUnravelling the molecular linkage of co morbid
Unravelling the molecular linkage of co morbideSAT Publishing House
 

What's hot (20)

Research proposal sjtu
Research proposal sjtuResearch proposal sjtu
Research proposal sjtu
 
Cancer Moonshot, Data sharing and the Genomic Data Commons
Cancer Moonshot, Data sharing and the Genomic Data CommonsCancer Moonshot, Data sharing and the Genomic Data Commons
Cancer Moonshot, Data sharing and the Genomic Data Commons
 
National Cancer Data Ecosystem and Data Sharing
National Cancer Data Ecosystem and Data SharingNational Cancer Data Ecosystem and Data Sharing
National Cancer Data Ecosystem and Data Sharing
 
A Vision for a Cancer Research Knowledge System
A Vision for a Cancer Research Knowledge SystemA Vision for a Cancer Research Knowledge System
A Vision for a Cancer Research Knowledge System
 
Qiu_CV_Feb12_2017
Qiu_CV_Feb12_2017Qiu_CV_Feb12_2017
Qiu_CV_Feb12_2017
 
Drug Safety: Fluoroquinolone
Drug Safety: FluoroquinoloneDrug Safety: Fluoroquinolone
Drug Safety: Fluoroquinolone
 
Cell centered database for immunology and cancer research feb252016
Cell centered database for immunology and cancer research feb252016Cell centered database for immunology and cancer research feb252016
Cell centered database for immunology and cancer research feb252016
 
Biomarkers brain regions
Biomarkers brain regionsBiomarkers brain regions
Biomarkers brain regions
 
NETWORK OF DISEASES AND ITS ENDOWMENT TOWARDS DISEASE
NETWORK OF DISEASES AND ITS ENDOWMENT TOWARDS DISEASE NETWORK OF DISEASES AND ITS ENDOWMENT TOWARDS DISEASE
NETWORK OF DISEASES AND ITS ENDOWMENT TOWARDS DISEASE
 
NCI Cancer Genomics, Open Science and PMI: FAIR
NCI Cancer Genomics, Open Science and PMI: FAIR NCI Cancer Genomics, Open Science and PMI: FAIR
NCI Cancer Genomics, Open Science and PMI: FAIR
 
SuperComputing 16 HPC Matters Panel on Precision Medicine
SuperComputing 16 HPC Matters Panel on Precision MedicineSuperComputing 16 HPC Matters Panel on Precision Medicine
SuperComputing 16 HPC Matters Panel on Precision Medicine
 
Interrogating differences in expression of targeted gene sets to predict brea...
Interrogating differences in expression of targeted gene sets to predict brea...Interrogating differences in expression of targeted gene sets to predict brea...
Interrogating differences in expression of targeted gene sets to predict brea...
 
Day 2 Big Data panel at the NIH BD2K All Hands 2016 meeting
Day 2 Big Data panel at the NIH BD2K All Hands 2016 meetingDay 2 Big Data panel at the NIH BD2K All Hands 2016 meeting
Day 2 Big Data panel at the NIH BD2K All Hands 2016 meeting
 
C-Change Cancer Big Data, NCI Genomic Data Commons, Cloud Pilots
C-Change Cancer Big Data, NCI Genomic Data Commons, Cloud PilotsC-Change Cancer Big Data, NCI Genomic Data Commons, Cloud Pilots
C-Change Cancer Big Data, NCI Genomic Data Commons, Cloud Pilots
 
NCI Cancer Imaging Program - Cancer Research Data Ecosystem
NCI Cancer Imaging Program - Cancer Research Data EcosystemNCI Cancer Imaging Program - Cancer Research Data Ecosystem
NCI Cancer Imaging Program - Cancer Research Data Ecosystem
 
MINING OF IMPORTANT INFORMATIVE GENES AND CLASSIFIER CONSTRUCTION FOR CANCER ...
MINING OF IMPORTANT INFORMATIVE GENES AND CLASSIFIER CONSTRUCTION FOR CANCER ...MINING OF IMPORTANT INFORMATIVE GENES AND CLASSIFIER CONSTRUCTION FOR CANCER ...
MINING OF IMPORTANT INFORMATIVE GENES AND CLASSIFIER CONSTRUCTION FOR CANCER ...
 
Piloting a Comprehensive Knowledge Base for Pharmacovigilance Using Standardi...
Piloting a Comprehensive Knowledge Base for Pharmacovigilance Using Standardi...Piloting a Comprehensive Knowledge Base for Pharmacovigilance Using Standardi...
Piloting a Comprehensive Knowledge Base for Pharmacovigilance Using Standardi...
 
Chapter 1. Introduction
Chapter 1. IntroductionChapter 1. Introduction
Chapter 1. Introduction
 
Overcoming obstacles to repurposing for neurodegenerative disease
Overcoming obstacles to repurposing for neurodegenerative diseaseOvercoming obstacles to repurposing for neurodegenerative disease
Overcoming obstacles to repurposing for neurodegenerative disease
 
Unravelling the molecular linkage of co morbid
Unravelling the molecular linkage of co morbidUnravelling the molecular linkage of co morbid
Unravelling the molecular linkage of co morbid
 

Viewers also liked

Audience questionnaire
Audience questionnaireAudience questionnaire
Audience questionnairePuttH
 
Strengths Finder Top 5
Strengths Finder Top 5Strengths Finder Top 5
Strengths Finder Top 5Roy Remick
 
Cuadro de flechas
Cuadro de flechas Cuadro de flechas
Cuadro de flechas mayorlupo
 
Above and Beyond Honors_01
Above and Beyond Honors_01Above and Beyond Honors_01
Above and Beyond Honors_01Joe Acosta
 
70 Million Safe Man Hours Without LTI
70 Million Safe Man Hours Without LTI70 Million Safe Man Hours Without LTI
70 Million Safe Man Hours Without LTISyed Fazal Ali
 
Plots in neemrana behror,nh8 8
Plots in neemrana behror,nh8 8Plots in neemrana behror,nh8 8
Plots in neemrana behror,nh8 8Baburaj Patel
 
Digital Strategy for FujiFilm
Digital Strategy for FujiFilmDigital Strategy for FujiFilm
Digital Strategy for FujiFilmAdam Lam
 
Knowsy iPad Game Case Study
Knowsy iPad Game Case StudyKnowsy iPad Game Case Study
Knowsy iPad Game Case StudyLane Goldstone
 
Limites trigonometricos
Limites trigonometricosLimites trigonometricos
Limites trigonometricosEl Profe Sami
 
Seguridad en inflado y reparación de llantas
Seguridad en inflado y reparación de llantasSeguridad en inflado y reparación de llantas
Seguridad en inflado y reparación de llantashugomanrique1966
 

Viewers also liked (13)

Audience questionnaire
Audience questionnaireAudience questionnaire
Audience questionnaire
 
Rg 922
Rg 922Rg 922
Rg 922
 
Fuga de amoniaco afecta a comunidades barreñas
Fuga de amoniaco afecta a comunidades barreñasFuga de amoniaco afecta a comunidades barreñas
Fuga de amoniaco afecta a comunidades barreñas
 
Strengths Finder Top 5
Strengths Finder Top 5Strengths Finder Top 5
Strengths Finder Top 5
 
Cuadro de flechas
Cuadro de flechas Cuadro de flechas
Cuadro de flechas
 
Above and Beyond Honors_01
Above and Beyond Honors_01Above and Beyond Honors_01
Above and Beyond Honors_01
 
70 Million Safe Man Hours Without LTI
70 Million Safe Man Hours Without LTI70 Million Safe Man Hours Without LTI
70 Million Safe Man Hours Without LTI
 
Plots in neemrana behror,nh8 8
Plots in neemrana behror,nh8 8Plots in neemrana behror,nh8 8
Plots in neemrana behror,nh8 8
 
"Forte" - poema sobre epilepsia
"Forte"   -  poema sobre epilepsia"Forte"   -  poema sobre epilepsia
"Forte" - poema sobre epilepsia
 
Digital Strategy for FujiFilm
Digital Strategy for FujiFilmDigital Strategy for FujiFilm
Digital Strategy for FujiFilm
 
Knowsy iPad Game Case Study
Knowsy iPad Game Case StudyKnowsy iPad Game Case Study
Knowsy iPad Game Case Study
 
Limites trigonometricos
Limites trigonometricosLimites trigonometricos
Limites trigonometricos
 
Seguridad en inflado y reparación de llantas
Seguridad en inflado y reparación de llantasSeguridad en inflado y reparación de llantas
Seguridad en inflado y reparación de llantas
 

Similar to MURI Summer

Data Mining and Big Data Analytics in Pharma
Data Mining and Big Data Analytics in Pharma Data Mining and Big Data Analytics in Pharma
Data Mining and Big Data Analytics in Pharma Ankur Khanna
 
Big Data Analytics in the Health Domain
Big Data Analytics in the Health DomainBig Data Analytics in the Health Domain
Big Data Analytics in the Health DomainBigData_Europe
 
Biomedicine & Pharmacotherapy
Biomedicine & PharmacotherapyBiomedicine & Pharmacotherapy
Biomedicine & PharmacotherapyTrustlife
 
dkNET Webinar: Illuminating The Druggable Genome With Pharos 10/23/2020
dkNET Webinar: Illuminating The Druggable Genome With Pharos 10/23/2020dkNET Webinar: Illuminating The Druggable Genome With Pharos 10/23/2020
dkNET Webinar: Illuminating The Druggable Genome With Pharos 10/23/2020dkNET
 
Role of bioinformatics of drug designing
Role of bioinformatics of drug designingRole of bioinformatics of drug designing
Role of bioinformatics of drug designingDr NEETHU ASOKAN
 
Very brief overview of AI in drug discovery
Very brief overview of AI in drug discoveryVery brief overview of AI in drug discovery
Very brief overview of AI in drug discoveryDr. Gerry Higgins
 
Role of bioinformatics in drug designing
Role of bioinformatics in drug designingRole of bioinformatics in drug designing
Role of bioinformatics in drug designingW Roseybala Devi
 
Translational Genomics towards Personalized medicine - Medhavi Vashisth.ppt
Translational Genomics towards Personalized medicine - Medhavi Vashisth.pptTranslational Genomics towards Personalized medicine - Medhavi Vashisth.ppt
Translational Genomics towards Personalized medicine - Medhavi Vashisth.pptMedhavi27
 
NPC-PD2 PPP collab-PLoS 2015
NPC-PD2 PPP collab-PLoS 2015NPC-PD2 PPP collab-PLoS 2015
NPC-PD2 PPP collab-PLoS 2015Sitta Sittampalam
 
The evolving promise of genomic medicine ibm white paper
The evolving promise of genomic medicine ibm white paperThe evolving promise of genomic medicine ibm white paper
The evolving promise of genomic medicine ibm white paperPietro Leo
 
The evolving promise of genomic medicine
The evolving promise of genomic medicineThe evolving promise of genomic medicine
The evolving promise of genomic medicineBart de Witte
 
Toward a reliable and interoperable public repository for natural product-dru...
Toward a reliable and interoperable public repository for natural product-dru...Toward a reliable and interoperable public repository for natural product-dru...
Toward a reliable and interoperable public repository for natural product-dru...Richard Boyce, PhD
 
IMPACT STUDY_12NOV2015
IMPACT STUDY_12NOV2015IMPACT STUDY_12NOV2015
IMPACT STUDY_12NOV2015Kim Kersten
 
Bioinformatics in the Clinical Pipeline: Contribution in Genomic Medicine
Bioinformatics in the Clinical Pipeline: Contribution in Genomic MedicineBioinformatics in the Clinical Pipeline: Contribution in Genomic Medicine
Bioinformatics in the Clinical Pipeline: Contribution in Genomic Medicineiosrjce
 
iOMICS Clinical & Omnia
iOMICS Clinical & OmniaiOMICS Clinical & Omnia
iOMICS Clinical & OmniaInterpretOmics
 
Exploiting drug targets for Immuno-Oncology drug discovery
Exploiting drug targets for Immuno-Oncology drug discoveryExploiting drug targets for Immuno-Oncology drug discovery
Exploiting drug targets for Immuno-Oncology drug discoveryVikram Rao
 

Similar to MURI Summer (20)

Data Mining and Big Data Analytics in Pharma
Data Mining and Big Data Analytics in Pharma Data Mining and Big Data Analytics in Pharma
Data Mining and Big Data Analytics in Pharma
 
Big Data Analytics in the Health Domain
Big Data Analytics in the Health DomainBig Data Analytics in the Health Domain
Big Data Analytics in the Health Domain
 
Meta analysis of molecular property patterns and filtering of public datasets...
Meta analysis of molecular property patterns and filtering of public datasets...Meta analysis of molecular property patterns and filtering of public datasets...
Meta analysis of molecular property patterns and filtering of public datasets...
 
Biomedicine & Pharmacotherapy
Biomedicine & PharmacotherapyBiomedicine & Pharmacotherapy
Biomedicine & Pharmacotherapy
 
Marsh pers strat-mednov2014
Marsh pers strat-mednov2014Marsh pers strat-mednov2014
Marsh pers strat-mednov2014
 
dkNET Webinar: Illuminating The Druggable Genome With Pharos 10/23/2020
dkNET Webinar: Illuminating The Druggable Genome With Pharos 10/23/2020dkNET Webinar: Illuminating The Druggable Genome With Pharos 10/23/2020
dkNET Webinar: Illuminating The Druggable Genome With Pharos 10/23/2020
 
Role of bioinformatics of drug designing
Role of bioinformatics of drug designingRole of bioinformatics of drug designing
Role of bioinformatics of drug designing
 
Very brief overview of AI in drug discovery
Very brief overview of AI in drug discoveryVery brief overview of AI in drug discovery
Very brief overview of AI in drug discovery
 
Role of bioinformatics in drug designing
Role of bioinformatics in drug designingRole of bioinformatics in drug designing
Role of bioinformatics in drug designing
 
Translational Genomics towards Personalized medicine - Medhavi Vashisth.ppt
Translational Genomics towards Personalized medicine - Medhavi Vashisth.pptTranslational Genomics towards Personalized medicine - Medhavi Vashisth.ppt
Translational Genomics towards Personalized medicine - Medhavi Vashisth.ppt
 
NPC-PD2 PPP collab-PLoS 2015
NPC-PD2 PPP collab-PLoS 2015NPC-PD2 PPP collab-PLoS 2015
NPC-PD2 PPP collab-PLoS 2015
 
Ibm
IbmIbm
Ibm
 
The evolving promise of genomic medicine ibm white paper
The evolving promise of genomic medicine ibm white paperThe evolving promise of genomic medicine ibm white paper
The evolving promise of genomic medicine ibm white paper
 
The evolving promise of genomic medicine
The evolving promise of genomic medicineThe evolving promise of genomic medicine
The evolving promise of genomic medicine
 
Toward a reliable and interoperable public repository for natural product-dru...
Toward a reliable and interoperable public repository for natural product-dru...Toward a reliable and interoperable public repository for natural product-dru...
Toward a reliable and interoperable public repository for natural product-dru...
 
Biomarkers & Clinical Research
Biomarkers & Clinical ResearchBiomarkers & Clinical Research
Biomarkers & Clinical Research
 
IMPACT STUDY_12NOV2015
IMPACT STUDY_12NOV2015IMPACT STUDY_12NOV2015
IMPACT STUDY_12NOV2015
 
Bioinformatics in the Clinical Pipeline: Contribution in Genomic Medicine
Bioinformatics in the Clinical Pipeline: Contribution in Genomic MedicineBioinformatics in the Clinical Pipeline: Contribution in Genomic Medicine
Bioinformatics in the Clinical Pipeline: Contribution in Genomic Medicine
 
iOMICS Clinical & Omnia
iOMICS Clinical & OmniaiOMICS Clinical & Omnia
iOMICS Clinical & Omnia
 
Exploiting drug targets for Immuno-Oncology drug discovery
Exploiting drug targets for Immuno-Oncology drug discoveryExploiting drug targets for Immuno-Oncology drug discovery
Exploiting drug targets for Immuno-Oncology drug discovery
 

MURI Summer

  • 1. Identifying and Repurposing Novel Drug Candidates for Treating Leukemia Using Drug, Protein and Disease Interaction Networks Rashell Garretson1, Rut Thakkar2 , Zack East2 , Bin Peng3 Dr. Jake Chen4 and Dr. Walter Jessen5 1Department of Biology, Purdue School of Science, IUPUI; 2Neuroscience Program, Purdue School of Science, IUPUI; 3Department of Computer and Information Science, Purdue School of Science, IUPUI, 4Indiana University Center for Systems Biology and Personalized Medicine, IUPUI; 5Informatics, Covance, Greenfield, IN Introduction Taking a drug from discovery to market takes an average of twelve years. To minimize the time and costs of new drug development, data mining can be utilized to identify currently available drugs and other associated data, and prioritize candidates that can be repurposed to treat other diseases. This study focuses on three subtypes of leukemia: myelomonocytic leukemia, acute megakaryoblastic leukemia and B-cell prolymphocytic leukemia as they lack sufficient treatment along with having a poor prognosis. The data mining process is initiated by gathering information regarding FDA approved drugs and drugs in clinical trials to treat these subtypes of leukemia. A complex network is then generated through the curation of information on drug, protein, and disease interactions. A host of other diseases are then analyzed through disease to disease interactions to compile a list of diseases that are closely related to our leukemia subtypes of interest. Drugs used for these closely related diseases are then contrasted with drugs used for leukemia based on their protein targets, interactions and structure to identify drugs that would most likely be effective in treating our leukemia subtypes. 
Repurposing drugs based on structure, protein interactions, and target similarity can be beneficial in saving immense time and resources by utilizing drugs that are already available on the market in a novel way with the ultimate goal of saving lives. Methods Defining Subtypes of Interest • Subtypes were chosen by reviewing articles about the prognosis, 5- year survival rate, and currently available treatments. Subtypes with a poor prognosis, a low survival rate, and few effective treatments were prioritize. • Myelomonocytic leukemia and acute megakaryoblastic leukemia are subtypes of acute myeloid leukemia (AML) while B-cell prolymphocytic leukemia is a subtype of chronic lymphoblastic leukemia (CLL). We used AML and CLL subcategories in our drug, disease, and protein interactions to gather more general information. The more specific category information will be added later to find drugs to target our subtypes of interest. Disease to Drug Drugs are separated into categories using two criteria: • A D category drug is a drug being used for the specific disease of interest while an X category drug is a drug currently being used for a related disease. • A level 1 drug is one that is curretly FDA approved. A level 2 drug is a drug that is currently in clinical trial. A level 3 drug is a drug that has been terminated, withdrawn or suspended in clinical trial, or in this study any drug in a clinical trial that has not be updated since 2010. • D1 and X1: Using cancer.gov and the Leukemia and Lymphoma society website, information about drugs which are currently on the market to treat the chosen subtypes or related diseases was collected. • D2, D3, X2, and X3: Using clinicaltrials.gov each subtype and related disease was inputed and all information on clinical trials was downloaded and sorted. Trials that were listed as terminated, withdrawn, or suspended or that had not been updated in the last 5 years were labeled as a category 3. The rest were considered a category 2. 
All the drugs from each trial were separated, filtered, and listed.
Disease to Protein
• Preliminary mutated genes associated with AML and CLL were identified by reviewing articles on PubMed and entries in OMIM.
• Effector genes were identified using the GEO database, which lists the genes that are up- or down-regulated in a disease.
Drug to Protein
• The D1 drug information collected during the disease-to-drug curation was evaluated using DrugBank and STITCH, which provided the protein interactions and targets for each drug.
Protein to Protein
• Using the disease-to-protein interactions, the key proteins connected with the subtypes of interest were evaluated using the STRING and HAPPI databases. These interactions were used to create networks in Cytoscape.
Disease to Disease
• The CMBI and DiseaseConnect databases were used to acquire a list of all the diseases associated with AML and CLL.
• The list was then analyzed to obtain the top diseases that are similar to both leukemia subtypes.
References
• The UniProt Consortium. UniProt: a hub for protein information. Nucleic Acids Res. 43: D204-D212 (2015). http://www.uniprot.org
• Jensen LJ, Kuhn M, Stark M, Chaffron S, Creevey C, Muller J, Doerks T, Julien P, Roth A, Simonovic M, Bork P, von Mering C. STRING 8--a global view on proteins and their functional interactions in 630 organisms. Nucleic Acids Res. 2009 Jan;37(Database issue):D412-6. doi: 10.1093/nar/gkn760. http://string-db.org
• Kuhn M, Szklarczyk D, Pletscher-Frankild S, Blicher TH, von Mering C, Jensen LJ, Bork P. STITCH 4: integration of protein-chemical interactions with user data. Nucleic Acids Res. 2014 Jan;42(Database issue):D401-7. doi: 10.1093/nar/gkt1207. http://stitch.embl.de
• Chen JY, Mamidipalli S, Huan T. HAPPI: an online database of comprehensive human annotated and predicted protein interactions. BMC Genomics. 2009 Jul 7;10 Suppl 1:S16. doi: 10.1186/1471-2164-10-S1-S16. http://discovery.informatics.iupui.edu/HAPPI/
• Liu CC, Tseng YT, Li W, Wu CY, Mayzus I, Rzhetsky A, Sun F, Waterman M, Chen JJ, Chaudhary PM, Loscalzo J, Crandall E, Zhou XJ. DiseaseConnect: a comprehensive web server for mechanism-based disease-disease connections. Nucleic Acids Res. 2014 Jul;42(Web Server issue):W137-46. doi: 10.1093/nar/gku412. http://disease-connect.org
• Law V, Knox C, Djoumbou Y, Jewison T, Guo AC, Liu Y, Maciejewski A, Arndt D, Wilson M, Neveu V, Tang A, Gabriel G, Ly C, Adamjee S, Dame ZT, Han B, Zhou Y, Wishart DS. DrugBank 4.0: shedding new light on drug metabolism. Nucleic Acids Res. 2014 Jan 1;42(1):D1091-7. http://www.drugbank.ca
• Bolton E, Wang Y, Thiessen PA, Bryant SH. PubChem: Integrated Platform of Small Molecules and Biological Activities. Chapter 12 in Wheeler RA and Spellmeyer DC, eds. Annual Reports in Computational Chemistry, Volume 4. Oxford, UK: Elsevier, 2008, pp. 217-241. doi: 10.1016/S1574-1400(08)00012-1. https://pubchem.ncbi.nlm.nih.gov
• Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003 Nov;13(11):2498-504. http://cytoscape.org
• Edgar R, Domrachev M, Lash AE. Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res. 2002 Jan 1;30(1):207-10. http://www.ncbi.nlm.nih.gov/geo/
[Bar chart: Number of Drugs Per Category. x-axis: Category of Drugs (D1, D2, D3, X1, X2, X3); y-axis: Number of Drugs (0 to 600)]
[Network diagrams: AML and CLL Protein to Protein Interaction Network]
Top Proteins Targeted by D1 Drugs
AML Drug Targets    Number of Drugs    CLL Drug Targets    Number of Drugs
P42574 (CASP3)      8                  P42574 (CASP3)      6
P08684 (CYP3A4)     8                  P55211 (CASP9)      5
P33527 (ABCC1)      7                  Q14790 (CASP8)      4
P08183 (ABCB1)      6                  P09874 (PARP1)      4
Q14790 (CASP8)      6                  P33527 (ABCC1)      4
P04637 (TP53)       5                  P08183 (ABCB1)      4
Q9UNQ0 (ABCG2)      5                  P08684 (CYP3A4)     4
Q92887 (ABCC2)      4                  P55210 (CASP7)      3
P55211 (CASP9)      4                  Q9UNQ0 (ABCG2)      3
Q16678 (CYP1B1)     4                  P20815 (CYP3A5)     3
Table 2: Top proteins identified as targets in DrugBank and STITCH from drug to protein interaction. The ten proteins listed were those targeted by the highest number of D1 drugs for each subtype.
Table 3: Top related diseases from CMBI. Diseases that appeared in the related disease lists for both CLL and AML and had a CMBI score greater than 0.3 were chosen as the top related diseases. These diseases were used to identify X category drugs as candidates for repurposing.
Current Status of Research
Our team has developed a website that allows us to import data into a database, which we can use to analyze and visualize the data collected during the data mining process. The data are stored in a PostgreSQL database, and Elasticsearch provides fuzzy searching, which allows us to narrow our searches to find key information. The website can also show the relationships among drug, disease and protein interactions, which we can use to create models and networks.
Conclusion & Future Studies
The next steps are to improve the website’s functionality so that we can import all of the data collected on drugs, diseases, and proteins. We will then use the website to gather information about the interactions among the imported data to help us create a model that identifies which of the drug candidates we found are best suited for repurposing.
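The fuzzy searching mentioned above can be illustrated with a small stand-in. This sketch uses Python's standard-library difflib rather than Elasticsearch, so it only shows the idea of tolerating misspelled drug names and says nothing about how the website itself is implemented. The function name fuzzy_lookup, the 0.8 cutoff, and the misspelled queries are assumptions made for illustration; the drug names come from the PubMed popularity table.

```python
import difflib

# Drug names taken from the poster's PubMed popularity table.
DRUG_NAMES = [
    "Cytoxan", "Methotrexate", "Imatinib", "Aminopterin",
    "Etoposide", "Mercaptopurine", "Asparaginase", "Cytosar",
]


def fuzzy_lookup(query, names=DRUG_NAMES, cutoff=0.8):
    """Return the names whose spelling is close to the query, best match first.

    Hypothetical helper: the cutoff of 0.8 is an assumed similarity
    threshold, not a value taken from the poster.
    """
    # Compare in lower case so that capitalization does not affect the score.
    by_lower = {name.lower(): name for name in names}
    hits = difflib.get_close_matches(
        query.lower(), list(by_lower), n=3, cutoff=cutoff
    )
    return [by_lower[hit] for hit in hits]


if __name__ == "__main__":
    for query in ["imatanib", "Methotrexat", "cytoxin"]:
        print(query, "->", fuzzy_lookup(query))
```

A search engine would typically score matches by edit distance against an index rather than by scanning a list, but the effect for a user is similar: a slightly misspelled drug name still resolves to the curated entry.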
Top Associated Diseases
Disease                          AML/CLL CMBI Score    Disease                               AML/CLL CMBI Score
Acute Lymphoblastic Leukemia     0.3682/0.3979         Chronic Myeloid Leukemia              0.3999/0.4131
Mixed lineage leukemia           0.2942/0.3031         T-cell acute lymphocytic leukemia     0.4212/0.3074
Non-bruton agammaglobulinemia    0.335/0.4215          Hemophagocytic lymphohistiocytosis    0.3328/0.3775
Mycosis Fungoides                0.3083/0.3194         B-cell lymphoma                       0.3129/0.2957
Non-hodgkin lymphoma             0.3092/0.3293         Hodgkin lymphoma                      0.3092/0.3424
Burkitt's lymphoma               0.3092/0.3622         Werner Syndrome                       0.2948/0.303
Figures 1 and 2: Proteins from disease to protein interaction were combined with additional interacting proteins using the STRING and HAPPI databases. Cytoscape was used to create a network of connections between proteins associated with each subtype.
Figure 3: Number of drugs found using clinicaltrials.gov, cancer.gov, the Leukemia and Lymphoma Society, and articles found on PubMed for AML and CLL as well as the top associated diseases. D1 = FDA-approved drugs for AML or CLL. D2 = drugs currently in clinical trial for AML or CLL. D3 = drugs with clinical trial information that has not been updated in the last five years, or clinical trials that have been suspended, terminated, or withdrawn. X category drugs follow the same rules as D category drugs, but they are being used for the top associated diseases.
Table 1: Top proteins identified as targets in DrugBank and STITCH from drug to protein interaction. The ten proteins listed were those targeted by the highest number of D1 drugs for each subtype.
Popularity of D Category Drugs by PubMed Search
Name of drug     Number of PubMed articles found    Name of drug                    Number of PubMed articles found
Cytoxan          6967                               Etoposide                       3421
Methotrexate     6668                               Mercaptopurine                  2860
Imatinib         6442                               Tetradecanoylphorbol acetate    2759
Aminopterin      5268                               Asparaginase                    2701
Anthracycline    4175                               Cytosar                         2508
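The category and level rules stated in the Figure 3 caption can be sketched as a small function. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function name categorize, its parameters, and the default reference date are hypothetical, and the five-year window is approximated as 5 x 365 days.

```python
from datetime import date, timedelta

# Trial statuses that the poster assigns to level 3.
STOPPED_STATUSES = {"terminated", "withdrawn", "suspended"}

# Assumed approximation of "not updated in the last five years".
STALE_AFTER = timedelta(days=5 * 365)


def categorize(for_target_disease, fda_approved, trial_status=None,
               last_updated=None, today=date(2015, 1, 1)):
    """Return a label such as "D1" or "X3" for one drug.

    for_target_disease: True for AML or CLL (D), False for a related disease (X).
    fda_approved:       True if the drug is currently FDA approved (level 1).
    trial_status:       status string of the drug's clinical trial, if any.
    last_updated:       date the trial record was last updated, if known.
    today:              hypothetical reference date for the five-year window.
    """
    prefix = "D" if for_target_disease else "X"
    if fda_approved:
        return prefix + "1"
    stopped = (trial_status or "").lower() in STOPPED_STATUSES
    stale = last_updated is not None and (today - last_updated) > STALE_AFTER
    # Level 3: stopped or stale trials; level 2: every other trial.
    return prefix + ("3" if stopped or stale else "2")


if __name__ == "__main__":
    print(categorize(True, True))
    print(categorize(True, False, "Recruiting", date(2014, 6, 1)))
    print(categorize(False, False, "Completed", date(2008, 3, 1)))
```

Passing the reference date as a parameter keeps the labeling reproducible: rerunning the curation in a later year would otherwise silently move trials from level 2 to level 3.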