This study aims to identify drug candidates that can be repurposed to treat three subtypes of leukemia by analyzing drug, protein, and disease interaction networks. The researchers gathered data on FDA-approved drugs and drugs in clinical trials for leukemia and related diseases. They then constructed networks showing interactions between drugs, proteins, and diseases. The top related diseases to leukemia were identified, and their associated drugs were considered candidates for repurposing. The researchers developed a website to import and analyze the collected data to identify the most suitable drug candidates based on the interaction networks.
Contribution of genome-wide association studies to scientific research: a pra... Mutiple Sclerosis
Vito A. G. Ricigliano, Renato Umeton, Lorenzo Germinario, Eleonora Alma, Martina Briani, Noemi Di Segni, Dalma Montesanti, Giorgia Pierelli, Fabiana Cancrini, Cristiano Lomonaco, Francesca Grassi, Gabriella Palmieri, and Marco Salvetti
Struan Frederick Airth Grant, Editor
The factual value of genome-wide association studies (GWAS) for the understanding of multifactorial diseases is a matter of intense debate. Practical consequences for the development of more effective therapies do not seem to be around the corner. Here we propose a pragmatic and objective evaluation of how much new biology is arising from these studies, with particular attention to the information that can help prioritize therapeutic targets. We chose multiple sclerosis (MS) as a paradigm disease and assumed that, in pre-GWAS candidate-gene studies, the knowledge behind the choice of each gene reflected the understanding of the disease prior to the advent of GWAS. Importantly, this knowledge was based mainly on non-genetic, phenotypic grounds. We performed single-gene and pathway-oriented comparisons of old and new knowledge in MS by confronting an unbiased list of candidate genes in pre-GWAS association studies with those genes exceeding the genome-wide significance threshold in GWAS published from 2007 on. At the single gene level, the majority (94 out of 125) of GWAS-discovered variants had never been contemplated as plausible candidates in pre-GWAS association studies. The 31 genes that were present in both pre- and post-GWAS lists may be of particular interest in that they represent disease-associated variants whose pathogenetic relevance is supported at the phenotypic level (i.e. the phenotypic information that steered their selection as candidate genes in pre-GWAS association studies). As such they represent attractive therapeutic targets. Interestingly, our analysis shows that some of these variants are targets of pharmacologically active compounds, including drugs that are already registered for human use. Compared with the above single-gene analysis, at the pathway level GWAS results appear more coherent with previous knowledge, reinforcing some of the current views on MS pathogenesis and related therapeutic research. 
This study presents a pragmatic approach that helps interpret and exploit GWAS knowledge.
Neglected and rare diseases traditionally have not been the focus of large pharmaceutical company research as biotech and academia have primarily been involved in drug discovery efforts for such diseases. This area certainly represents a new opportunity as the pharmaceutical industry investigates new markets. One approach to speed up drug discovery is to examine new uses for existing approved drugs; this is termed drug repositioning or drug repurposing and has become increasingly popular in recent years. Analysis of the literature reveals that using high-throughput screening there have been many examples of FDA approved drugs found to be active against additional targets that can be used to therapeutic advantage for repositioning for other diseases. To date there are far fewer such examples where in silico approaches have allowed for the derivation of new uses. It is suggested that with current technologies and databases of chemical compounds (drugs) and related data, as well as close integration with in vitro screening data, improved opportunities for drug repurposing will emerge. In this publication a review of the literature will highlight several proof of principle examples from areas such as finding new inhibitors for drug transporters with 3D pharmacophores and uncovering molecules active against Mycobacterium tuberculosis (Mtb) using Bayesian models of compound libraries. Research into neglected or rare/orphan diseases can likely benefit from in silico drug repositioning approaches and accelerate drug discovery for these diseases.
Discuss AI, machine learning, and the hype cycle
Discuss the knowledge-based classification of proteins
Discuss applications of AI/ML to drug discovery
A slide series to learn and appreciate the importance and the potential of Personalized/Individualized Genomic Medicine. It briefly goes through the idea of biotechnology and the advancements we have made in biology and technology. A series of applications for genomic medicine is then explored, not failing to mention the challenges we have to overcome as well, for the next medical revolution.
A case for personalized medicine is presented.
Cancer Moonshot, Data sharing and the Genomic Data Commons Warren Kibbe
Gave the inaugural Informatics Grand Rounds at City of Hope on September 8th. NIH Commons, Genomic Data Commons, NCI Cloud Pilots, Cancer Moonshot and rationale for changing incentives around data sharing all discussed.
National Cancer Data Ecosystem and Data Sharing Warren Kibbe
Grand Rounds at the Siteman Cancer Center at Washington University. Highlighting the Genomic Data Commons and the National Cancer Data Ecosystem defined by the Cancer Moonshot Blue Ribbon Panel
Cell centered database for immunology and cancer research feb252016 Ann-Marie Roche
Determining the cellular mechanisms of diseases is a crucial requirement for understanding the causes and progression of diseases, predicting outcomes, and developing new treatments. Often relevant information, e.g. what cells are involved in a disease or what effects does a drug have on cells, is scattered across many papers and journals, which makes it difficult for researchers to be sure they have a complete picture. Using Elsevier’s automated text mining technology, we have created a new cell-centered database consisting of 850 000 facts captured from more than 24 million PubMed abstracts and 3.5 million full text articles for use in Pathway Studio. This database focused primarily on cellular aspects of immunology and immuno-oncology can be used to summarize and visualize published research, and to analyze experimental data.
Learn how to use Pathway Studio to explore biomarkers and brain regions. With the addition of highly sophisticated visualization tools, users can interactively explore the vast number of connections created to help unravel disease biology. In addition, an innovative new taxonomy based on brain region identifications will be presented. Together, these innovations can be applied to rapidly increase the knowledge of diseases based on published findings.
Disease network analysis is an emerging approach that studies and diagnoses disease from a network perspective. A network is a set of interconnected entities. Disease networks reveal concealed connections among apparently independent biomedical entities such as physiological processes, signaling receptors, and the genetic code, and they offer an intuitive and powerful way to study or diagnose a disease. Such networks also make it possible to take existing drugs and their mechanisms and discover new uses for them. For example, colchicine is used to treat gout and, after repurposing, is also used for Mediterranean fever. Gout is a form of arthritis that causes joint pain, and Mediterranean fever is also accompanied by joint pain, so colchicine can be used for both. In repurposing drugs, we first analyse the change in symptoms and identify the target organ, and accordingly produce a drug that is compatible with the pharmacokinetics of the body. The growing availability of transcriptomic, proteomic, and metabolomic data sources helps in the classification of disease. Some networks are also referred to as complex networks, which can be described as collections of linked nodes.
NCI Cancer Genomics, Open Science and PMI: FAIR Warren Kibbe
Talk given to the NLM Fellows on July 8, 2016. Touches on Cancer Genomics, Open Science and PMI: FAIR in NCI genomics thinking and projects. Includes discussion of the Genomic Data Commons (GDC), Cancer Data Ecosystem, Data sharing, and the NCI cancer clinical trials open API.
Interrogating differences in expression of targeted gene sets to predict brea... Enrique Moreno Gonzalez
Genomics provides opportunities to develop precise tests for diagnostics, therapy selection and monitoring. From analyses of our studies and those of published results, 32 candidate genes were identified, whose expression appears related to clinical outcome of breast cancer. Expression of these genes was validated by qPCR and correlated with clinical follow-up to identify a gene subset for development of a prognostic test.
Day 2 Big Data panel at the NIH BD2K All Hands 2016 meeting Warren Kibbe
Big data in oncology and implications for open data, open science, rapid innovation, data reuse, reproducibility and data sharing. Cancer Moonshot, Precision Medicine Initiative (PMI), the Genomic Data Commons, NCI Cloud Pilots, NCI-DOE Pilots, and the Cancer Research Data Ecosystem.
NCI Cancer Imaging Program - Cancer Research Data Ecosystem Warren Kibbe
Given to the NCI Cancer Imaging Program monthly telecon on January 9th, 2017. NCI Genomic Data Commons, Beau Biden Cancer Moonshot Blue Ribbon Panel, Cancer Research Data Ecosystem and the role of imaging in precision medicine
MINING OF IMPORTANT INFORMATIVE GENES AND CLASSIFIER CONSTRUCTION FOR CANCER ... ijsc
Microarray is a useful technique for measuring the expression of thousands of genes or more simultaneously. One of the challenges in classifying cancer from high-dimensional gene expression data is to select a minimal number of relevant genes that maximizes classification accuracy. Because of the distinct characteristics inherent to specific cancerous gene expression profiles, developing flexible and robust gene identification methods is fundamental. Many gene selection methods, along with their corresponding classifiers, have been proposed. In the proposed method, a single gene with high class-discrimination capability is selected, and classification rules are generated for cancer based on gene expression profiles. The method first computes the importance factor of each gene in the experimental cancer dataset by counting the number of linguistic terms (defined in terms of different discrete quantities) with high class-discrimination capability according to their degree of dependency on the classes. The initial important genes are then selected by high importance factor and form the initial reduct. Next, the traditional k-means clustering algorithm is applied to each selected gene of the initial reduct, and the misclassification error of each individual gene is computed. The final reduct is formed by selecting the most important genes, that is, those with the lowest misclassification errors. A classifier is then constructed from decision rules induced by the selected important (single) genes from the training dataset to classify cancerous and non-cancerous samples of the experimental test dataset. The proposed method is tested on four publicly available cancer gene expression test datasets. In most cases, accurate classification outcomes are obtained using only important (single) genes that are highly correlated with the pathogenesis of cancer. To demonstrate the robustness of the proposed method, its outcomes (correctly classified instances) are also compared with those of some existing well-known classifiers.
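The per-gene clustering step described in this abstract can be illustrated with a minimal sketch. It assumes two classes, a plain one-dimensional 2-means with centers initialised at the minimum and maximum value, and made-up expression values. The function names (`two_means_1d`, `misclassification_error`, `rank_genes`) and the toy data are hypothetical and are not taken from the paper, and the importance-factor and rule-induction steps are not shown.

```python
def two_means_1d(values, iterations=100):
    """Cluster one gene's expression values into two groups (1-D k-means, k=2)."""
    lo, hi = min(values), max(values)  # assumed initialisation: extreme values
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assign each sample to the nearer of the two centers.
        labels = [0 if abs(v - lo) <= abs(v - hi) else 1 for v in values]
        group0 = [v for v, lab in zip(values, labels) if lab == 0]
        group1 = [v for v, lab in zip(values, labels) if lab == 1]
        new_lo = sum(group0) / len(group0) if group0 else lo
        new_hi = sum(group1) / len(group1) if group1 else hi
        if (new_lo, new_hi) == (lo, hi):
            break  # centers stopped moving
        lo, hi = new_lo, new_hi
    return labels


def misclassification_error(cluster_labels, true_labels):
    """Fraction of samples whose cluster disagrees with the known class.

    Cluster ids are arbitrary, so the better of the two possible
    cluster-to-class mappings is used.
    """
    n = len(true_labels)
    mismatches = sum(c != t for c, t in zip(cluster_labels, true_labels))
    return min(mismatches, n - mismatches) / n


def rank_genes(expression, true_labels):
    """Return (gene, error) pairs sorted from lowest to highest error."""
    errors = {
        gene: misclassification_error(two_means_1d(values), true_labels)
        for gene, values in expression.items()
    }
    return sorted(errors.items(), key=lambda item: (item[1], item[0]))


# Made-up data: six samples, the first three non-cancerous (0), the rest cancerous (1).
classes = [0, 0, 0, 1, 1, 1]
toy_expression = {
    "gene_a": [1.0, 1.2, 0.9, 5.0, 5.1, 4.8],  # separates the classes cleanly
    "gene_b": [1.0, 5.0, 1.1, 4.9, 1.2, 5.2],  # unrelated to the classes
}
ranking = rank_genes(toy_expression, classes)
```

In this toy setting, genes at the top of `ranking` would be the ones kept for the final reduct; how many to keep is a threshold the abstract does not specify.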
Piloting a Comprehensive Knowledge Base for Pharmacovigilance Using Standardi... Richard Boyce, PhD
A presentation of a new adverse drug event evidence base (Laertes - http://goo.gl/nZSqVw) within a standard framework for clinical research (OHDSI - www.ohdsi.org) made at the American Medical Informatics Association Joint Summits on Translational Research on 3/26/2015
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
Neglected infectious diseases such as tuberculosis (TB) and malaria kill millions of people annually and the oral drugs used are subject to resistance requiring the urgent development of new therapeutics. Several groups, including pharmaceutical companies, have made large sets of antimalarial screening hit compounds and the associated bioassay data available for the community to learn from and potentially optimize. We have examined both intrinsic and predicted molecular properties across these datasets and compared them with large libraries of compounds screened against Mycobacterium tuberculosis in order to identify any obvious patterns, trends or relationships. One set of antimalarial hits provided by GlaxoSmithKline appears less optimal for lead optimization compared with two other sets of screening hits we examined. Active compounds against both diseases were identified to have larger molecular weight (~350–400) and logP values of ~4.0, values that are, in general, distinct from the less active compounds. The antimalarial hits were also filtered with computational rules to identify potentially undesirable substructures. We were surprised that approximately 75–85% of these compounds failed one of the sets of filters that we applied during this work. The level of filter failure was much higher than for FDA approved drugs or a subset of antimalarial drugs. Both antimalarial and antituberculosis drug discovery should likely use simple available approaches to ensure that the hits derived from large scale screening are worth optimizing and do not clearly represent reactive compounds with a higher probability of toxicity in vivo.
Talk delivered at Warwick Biomedical Engineering Seminar series 27 November 2014. Develops a theme emerging from a review in 2010:
J Watkins, A Marsh, P C Taylor, D R J Singer
Therapeutic Delivery, 2010, 1, 651-665
"Continued adherence to a single-drug single-target paradigm will limit the ability of chemists to contribute to advances in personalized medicine, whether they be in discovery or delivery"
dkNET Webinar: Illuminating The Druggable Genome With Pharos 10/23/2020 dkNET
Abstract
Pharos (https://pharos.nih.gov/) is an integrated web-based informatics platform for the analysis of data aggregated by the Illuminating the Druggable Genome (IDG) Knowledge Management Center, an NIH Common Fund initiative. The current version of Pharos (as of October 2019) spans 20,244 proteins in the human proteome, 19,880 disease and phenotype associations, and 226,829 ChEMBL compounds. This resource not only collates and analyzes data from over 60 high-quality resources to generate these types, but also uses text indexing to find less apparent connections between targets, and has recently begun to collaborate with institutions that generate data and resources. Proteins are ranked according to a knowledge-based classification system, which can help researchers to identify less studied “dark” targets that could be potentially further illuminated. This is an important process for both drug discovery and target validation, as more knowledge can accelerate target identification, and previously understudied proteins can serve as novel targets in drug discovery. In this webinar, Dr. Tudor Oprea will introduce how to use Pharos to find targets of interest for drug discovery.
The top 3 key questions that Pharos can answer:
1. What are the novel drug targets that may play a role in a specific disease?
2. What are the diseases that are related directly or indirectly to a drug target?
3. Find researchers that are related directly or indirectly to a drug target.
Presenter: Tudor Oprea, MD, PhD, Professor of Medicine, Chief of Translational Informatics Division & Internal Medicine, University of New Mexico
dkNET Webinar Information: https://dknet.org/about/webinar
Translational Genomics towards Personalized medicine - Medhavi Vashisth.ppt Medhavi27
Every individual is unique, and so is his/her body's affinity and reaction towards diseases and their treatment methods. The science of personalized medicine takes into account the biology of one individual at a time and relates it to established databases for devising or optimizing suitable treatment strategies.
Toward a reliable and interoperable public repository for natural product-dru... Richard Boyce, PhD
A poster presented at the 2017 Annual Symposium of the American Medical Informatics Association (AMIA 2017). November 04- 08, 2017. Washington, DC. USA
4th International Conference on Biomarkers & Clinical Research, will be organized around the theme "Impact of Biomarker Developments in Health Diagnostics and Clinical Research."
Bioinformatics in the Clinical Pipeline: Contribution in Genomic Medicine iosrjce
In this review we focus on the new challenges in applying the methodology of modern biology to medical science. Human health and the cure of disease are primary concerns today, and bioinformatics (in silico) tools have changed the concept of treating patients and highlighted the need for genomic medicine. Treatments with new modes of action in clinical practice are a major concern in medical science. From a global perspective, the scientific role of constructing new ideas to improve health care and treat existing diseases is a challenging task. Awareness is therefore needed to accelerate the storage of clinical datasets for scientific use in designing genomic drugs. This new outline will drive the medical community to explore public data and create a cognitive approach to using technology in a cost-effective way.
Exploiting drug targets for Immuno-Oncology drug discovery Vikram Rao
IO innovators are increasingly pursuing the CD molecules as targets for new therapies, for example in developing antibodies for immune modulation and cytotoxicity.
This requires a deep understanding of the gene’s role in cancer biology.
Find out how we can help you with understanding the relationship between CD genes and checkpoint inhibitors... exploring different approaches to patient response biomarkers, and prioritise novel drug targets.
MURI Summer
1. Identifying and Repurposing Novel Drug Candidates
for Treating Leukemia Using Drug, Protein and
Disease Interaction Networks
Rashell Garretson1, Rut Thakkar2, Zack East2, Bin Peng3,
Dr. Jake Chen4 and Dr. Walter Jessen5
1Department of Biology, Purdue School of Science, IUPUI; 2Neuroscience Program, Purdue School of Science, IUPUI;
3Department of Computer and Information Science, Purdue School of Science, IUPUI; 4Indiana University Center for Systems
Biology and Personalized Medicine, IUPUI; 5Informatics, Covance, Greenfield, IN
Introduction
Taking a drug from discovery to market takes an average of twelve years.
To minimize the time and costs of new drug development, data mining
can be utilized to identify currently available drugs and other associated
data, and prioritize candidates that can be repurposed to treat other
diseases. This study focuses on three subtypes of leukemia:
myelomonocytic leukemia, acute megakaryoblastic leukemia and B-cell
prolymphocytic leukemia, which lack sufficient treatment options and
have a poor prognosis. The data mining process is initiated by gathering
information regarding FDA approved drugs and drugs in clinical trials to
treat these subtypes of leukemia. A complex network is then generated
through the curation of information on drug, protein, and disease
interactions. A host of other diseases are then analyzed through disease to
disease interactions to compile a list of diseases that are closely related to
our leukemia subtypes of interest. Drugs used for these closely related
diseases are then compared with drugs used for leukemia based on their
protein targets, interactions and structure to identify drugs that would
most likely be effective in treating our leukemia subtypes. Repurposing
drugs based on structure, protein interactions, and target similarity can
save considerable time and resources by using drugs that
are already on the market in a novel way, with the ultimate goal
of saving lives.
Methods
Defining Subtypes of Interest
• Subtypes were chosen by reviewing articles about the prognosis, 5-
year survival rate, and currently available treatments. Subtypes with a
poor prognosis, a low survival rate, and few effective treatments were
prioritized.
• Myelomonocytic leukemia and acute megakaryoblastic leukemia are
subtypes of acute myeloid leukemia (AML) while B-cell
prolymphocytic leukemia is a subtype of chronic lymphocytic
leukemia (CLL). We used the AML and CLL categories in our drug,
disease, and protein interactions to gather more general information.
The more specific subtype information will be added later to find
drugs to target our subtypes of interest.
Disease to Drug
Drugs are separated into categories using two criteria:
• A D category drug is a drug being used for the specific disease of
interest while an X category drug is a drug currently being used for a
related disease.
• A level 1 drug is one that is currently FDA approved. A level 2 drug is
a drug that is currently in clinical trial. A level 3 drug is a drug whose
clinical trial has been terminated, withdrawn or suspended or, in
this study, any drug in a clinical trial that has not been updated since
2010.
• D1 and X1: Using cancer.gov and the Leukemia and Lymphoma
Society website, information about drugs that are currently on the
market to treat the chosen subtypes or related diseases was collected.
• D2, D3, X2, and X3: Each subtype and related disease was entered
into clinicaltrials.gov, and all information on clinical trials was
downloaded and sorted. Trials that were listed as terminated,
withdrawn, or suspended, or that had not been updated in the last 5
years, were labeled as level 3. The rest were considered
level 2. All the drugs from each trial were separated, filtered, and
listed.
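The two criteria above can be sketched in Python as follows. This is a minimal illustration, not the code used in the study: the function names, the status strings, the mid-2015 reference date, and the staleness window expressed as 5 x 365 days are all assumptions made for the example.

```python
from datetime import date

# Trial statuses that place a drug in level 3 (taken from the rules above).
INACTIVE_STATUSES = {"terminated", "withdrawn", "suspended"}


def drug_level(fda_approved, trial_status=None, last_updated=None,
               today=date(2015, 7, 1), stale_after_years=5):
    """Return 1, 2, or 3 following the level definitions above.

    `today` and `stale_after_years` are assumed values for illustration.
    """
    if fda_approved:
        return 1
    if trial_status is not None and trial_status.lower() in INACTIVE_STATUSES:
        return 3
    if last_updated is not None:
        # Approximate "not updated in the last N years" in days.
        if (today - last_updated).days > stale_after_years * 365:
            return 3
    return 2


def drug_category(for_target_disease, **level_args):
    """Combine the D/X prefix with the level, e.g. 'D1' or 'X3'."""
    prefix = "D" if for_target_disease else "X"
    return f"{prefix}{drug_level(**level_args)}"
```

For example, `drug_category(False, fda_approved=False, trial_status="Terminated")` labels a drug from a terminated trial for a related disease as `X3`.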
Disease to Protein
• Preliminary mutated genes associated with AML and CLL were found
by reviewing articles on PubMed as well as OMIM.
• Effector genes were identified using the GEO database, which lists
the up- and down-regulated genes in a disease.
Drug to Protein
• The D1 drug information collected from the disease to drug curation
was evaluated using DrugBank and STITCH, which provide information
about protein interactions and targets for each drug.
Protein to Protein
• Using the Disease to Protein interactions, the key proteins connected
with the subtypes of interest were evaluated using the STRING and
HAPPI databases. These interactions were used to create networks
using Cytoscape.
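As a rough sketch of how interaction lists from two sources could be combined before loading them into Cytoscape, the Python below merges edge lists into one undirected, de-duplicated set and renders it in Cytoscape's simple interaction format (SIF). The function names, the `pp` interaction label, and the placeholder protein names are assumptions for illustration; they are not data or code from the study.

```python
def merge_interactions(*edge_lists):
    """Union several edge lists into one undirected, de-duplicated edge list."""
    merged = set()
    for edges in edge_lists:
        for a, b in edges:
            if a != b:  # ignore self-loops
                # Sort each pair so (A, B) and (B, A) count as the same edge.
                merged.add(tuple(sorted((a, b))))
    return sorted(merged)


def to_sif(edges, interaction="pp"):
    """Render edges as SIF lines: source, interaction type, target."""
    return "\n".join(f"{a}\t{interaction}\t{b}" for a, b in edges)


# Placeholder edges only; these are NOT interactions reported by the study.
string_edges = [("PROT_B", "PROT_A"), ("PROT_A", "PROT_C")]
happi_edges = [("PROT_A", "PROT_B"), ("PROT_D", "PROT_A")]

edges = merge_interactions(string_edges, happi_edges)
sif_text = to_sif(edges)
```

The resulting text can be saved with a `.sif` extension and imported as a network; an edge reported by both sources appears only once.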
Disease to Disease
• The CMBI and DiseaseConnect databases were used to acquire a list of all
the diseases associated with AML and CLL.
• The list was then analyzed to obtain the top diseases that are similar to
both leukemia subtypes.
Conclusion & Future Studies
Current Status of Research
References
• The UniProt Consortium. UniProt: a hub for
protein information. Nucleic Acids Res. 43:
D204-D212 (2015). http://www.uniprot.org
• Jensen LJ, Kuhn M, Stark M, Chaffron S,
Creevey C, Muller J, Doerks T, Julien P, Roth
A, Simonovic M, Bork P, von Mering C.
STRING 8--a global view on proteins and their
functional interactions in 630 organisms.
Nucleic Acids Res. 2009 Jan;37(Database
issue):D412-6. doi: 10.1093/nar/gkn760. Epub
2008 Oct 21. http://string-db.org
• Kuhn M, Szklarczyk D, Pletscher-Frankild S,
Blicher TH, von Mering C, Jensen LJ, Bork P.
STITCH 4: integration of protein-chemical
interactions with user data. Nucleic Acids Res.
2014 Jan;42(Database issue):D401-7. doi:
10.1093/nar/gkt1207. Epub 2013 Nov 28.
http://stitch.embl.de
• Chen JY, Mamidipalli S, Huan T. HAPPI: an
online database of comprehensive human
annotated and predicted protein interactions.
BMC Genomics. 2009 Jul 7;10 Suppl 1:S16.
doi: 10.1186/1471-2164-10-S1-S16.
http://discovery.informatics.iupui.edu/HAPPI/
• Liu CC, Tseng YT, Li W, Wu CY, Mayzus I,
Rzhetsky A, Sun F, Waterman M, Chen JJ,
Chaudhary PM, Loscalzo J, Crandall E, Zhou
XJ. DiseaseConnect: a comprehensive web
server for mechanism-based disease-disease
connections. Nucleic Acids Res. 2014 Jul;42(Web Server
issue):W137-46. doi: 10.1093/nar/gku412.
Epub 2014 Jun 3. http://disease-connect.org
• Law V, Knox C, Djoumbou Y,
Jewison T, Guo AC, Liu Y, Maciejewski A,
Arndt D, Wilson M, Neveu V, Tang A, Gabriel
G, Ly C, Adamjee S, Dame ZT, Han B, Zhou
Y, Wishart DS. DrugBank 4.0: shedding new light on drug
metabolism. Nucleic Acids Res. 2014 Jan
1;42(1):D1091-7. http://www.drugbank.ca
• Bolton E, Wang Y, Thiessen PA, Bryant SH.
PubChem: Integrated Platform of Small
Molecules and Biological Activities. Chapter
12 IN Wheeler RA and Spellmeyer DC, eds.
Annual Reports in Computational Chemistry,
Volume 4. Oxford, UK: Elsevier, 2008, pp.
217-241. doi:10.1016/S1574-1400(08)00012-1.
https://pubchem.ncbi.nlm.nih.gov
• Shannon P, Markiel A, Ozier O, Baliga NS,
Wang JT, Ramage D, Amin N, Schwikowski
B, Ideker T. Cytoscape: a software
environment for integrated models of
biomolecular interaction networks. Genome
Research 2003 Nov; 13(11):2498-504.
http://cytoscape.org
• Edgar R, Domrachev M, Lash AE. Gene
Expression Omnibus: NCBI gene expression
and hybridization array data repository.
Nucleic Acids Res. 2002 Jan 1;30(1):207-10.
http://www.ncbi.nlm.nih.gov/geo/
[Bar chart: "Number of Drugs Per Category". x-axis: Category of Drugs (D1, D2, D3, X1, X2, X3); y-axis: Number of Drugs (0 to 600).]
AML and CLL Protein to Protein
Interaction Network
Top Proteins Targeted by D1 Drugs
AML Drug Targets    Number of Drugs    CLL Drug Targets    Number of Drugs
P42574 (CASP3)      8                  P42574 (CASP3)      6
P08684 (CYP3A4)     8                  P55211 (CASP9)      5
P33527 (ABCC1)      7                  Q14790 (CASP8)      4
P08183 (ABCB1)      6                  P09874 (PARP1)      4
Q14790 (CASP8)      6                  P33527 (ABCC1)      4
P04637 (TP53)       5                  P08183 (ABCB1)      4
Q9UNQ0 (ABCG2)      5                  P08684 (CYP3A4)     4
Q92887 (ABCC2)      4                  P55210 (CASP7)      3
P55211 (CASP9)      4                  Q9UNQ0 (ABCG2)      3
Q16678 (CYP1B1)     4                  P20815 (CYP3A5)     3
Table 2: Top proteins identified as targets in DrugBank and STITCH
from the drug to protein interactions. The top ten proteins listed were the
proteins that were targeted by the highest number of D1 drugs for each
subtype.
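The ranking described in this caption amounts to counting, for each protein, how many drugs list it as a target. A minimal Python sketch follows, assuming a hypothetical mapping from drug name to its target identifiers; the drug and target names are placeholders, not study data.

```python
from collections import Counter


def top_targets(drug_targets, n=10):
    """Rank targets by the number of distinct drugs that hit them."""
    counts = Counter()
    for targets in drug_targets.values():
        # Use a set so a target listed twice for one drug is counted once.
        counts.update(set(targets))
    # Sort by descending count, then by name so ties are ordered deterministically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:n]


# Placeholder input for illustration only.
example = {
    "drug_a": ["TARGET_1", "TARGET_2"],
    "drug_b": ["TARGET_1"],
    "drug_c": ["TARGET_1", "TARGET_2", "TARGET_2"],
}
ranking = top_targets(example)
```

Breaking ties alphabetically is a choice made here for reproducibility; the poster does not say how ties were ordered.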
Table 3: Top related diseases from CMBI. Diseases that appeared in
the related disease lists for both CLL and AML and had a CMBI score
greater than 0.3 were chosen as the top related diseases. These diseases
were used to identify X category drugs as candidates for repurposing.
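The selection rule in this caption can be sketched as an intersection followed by a threshold filter. The caption does not say whether the 0.3 cutoff applies to the AML score, the CLL score, or both; the sketch below assumes a disease qualifies when at least one of its two scores exceeds the threshold. The function name and the two "Hypothetical" entries are invented for illustration, while the other two score pairs are copied from the Top Associated Diseases table.

```python
def top_related_diseases(aml_scores, cll_scores, threshold=0.3):
    """Diseases present in both lists whose higher score exceeds the threshold.

    Using the higher of the two scores is an assumption, not a stated rule.
    """
    shared = aml_scores.keys() & cll_scores.keys()
    return sorted(
        name for name in shared
        if max(aml_scores[name], cll_scores[name]) > threshold
    )


aml = {
    "Mycosis Fungoides": 0.3083,
    "Werner Syndrome": 0.2948,
    "Hypothetical disease A": 0.12,  # in both lists, but below threshold
    "Hypothetical disease B": 0.45,  # missing from the CLL list
}
cll = {
    "Mycosis Fungoides": 0.3194,
    "Werner Syndrome": 0.303,
    "Hypothetical disease A": 0.20,
}
selected = top_related_diseases(aml, cll)
```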
Our team has developed a website that allows us to import data into a
database, which we can use to analyze and visualize the data collected
during the data mining process. The data will be stored in a PostgreSQL
database, and the site will use Elasticsearch for fuzzy searching,
which will allow us to narrow down our searches to find key
information. The website can also show the relationships among the
drug, disease and protein interactions, which we can use to create models
and networks. The next steps are to improve the website's functionality
so that we can import all the data collected on drugs, diseases, and
proteins. We will then use the website to gather information about the
interactions within the imported data to help us create a model
that identifies which of the drug candidates we found are
best suited for repurposing.
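To illustrate what fuzzy searching over drug names means in practice, the sketch below uses Python's standard-library `difflib` as a stand-in. It is not the website's Elasticsearch implementation and makes no claim about how that search is configured; the function name and the similarity cutoff are assumptions. The drug names are taken from the popularity table in this poster.

```python
import difflib


def fuzzy_lookup(query, names, n=3, cutoff=0.6):
    """Return up to n names that approximately match the query, ignoring case."""
    by_lower = {name.lower(): name for name in names}
    hits = difflib.get_close_matches(
        query.lower(), list(by_lower), n=n, cutoff=cutoff
    )
    # Map the lowercase hits back to their original spelling.
    return [by_lower[hit] for hit in hits]


drug_names = ["Imatinib", "Methotrexate", "Etoposide", "Asparaginase"]
matches = fuzzy_lookup("imatnib", drug_names)  # misspelled query
```

A misspelled query such as `imatnib` still resolves to `Imatinib`, which is the behavior a fuzzy search is meant to provide when drug names are entered inconsistently across sources.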
Top Associated Diseases
Disease                               AML/CLL CMBI Score
Acute Lymphoblastic Leukemia          0.3682/0.3979
Mixed lineage leukemia                0.2942/0.3031
Non-Bruton agammaglobulinemia         0.335/0.4215
Mycosis Fungoides                     0.3083/0.3194
Non-Hodgkin lymphoma                  0.3092/0.3293
Burkitt's lymphoma                    0.3092/0.3622
Chronic Myeloid Leukemia              0.3999/0.4131
T-cell acute lymphocytic leukemia     0.4212/0.3074
Hemophagocytic lymphohistiocytosis    0.3328/0.3775
B-cell lymphoma                       0.3129/0.2957
Hodgkin lymphoma                      0.3092/0.3424
Werner Syndrome                       0.2948/0.303
Figures 1 and 2: Proteins from disease to protein interaction were
combined with additional interacting proteins using STRING and HAPPI
databases. Cytoscape was used to create a network of connections between
proteins associated with each subtype.
Figure 3: Number of drugs found using clinicaltrials.gov, cancer.gov,
the Leukemia and Lymphoma Society, and articles found on PubMed for
AML and CLL as well as the top associated diseases. D1 = FDA approved
drugs for AML or CLL. D2 = drugs currently in clinical trial for AML or
CLL. D3 = drugs with clinical trial information that has not been updated
in the last five years or clinical trials that have been suspended,
terminated, or withdrawn. X category drugs follow the same rules as D
category drugs, but they are being used for the top associated diseases.
Table 1: Top proteins identified as targets in DrugBank and STITCH
from the drug to protein interactions. The top ten proteins listed were the
proteins that were targeted by the highest number of D1 drugs for each
subtype.
Popularity of D Category Drugs by PubMed Search
Name of drug                    Number of PubMed articles found
Cytoxan                         6967
Methotrexate                    6668
Imatinib                        6442
Aminopterin                     5268
Anthracycline                   4175
Etoposide                       3421
Mercaptopurine                  2860
Tetradecanoylphorbol acetate    2759
Asparaginase                    2701
Cytosar                         2508