Maria Shkrob, PhD, Project Manager, Elsevier Professional Services
m.shkrob@elsevier.com
March 8, 2016
Mobilizing informational resources
for rare diseases
When every piece matters
| 2
• A rare genetic disease
• Permanently excessive level of
insulin in the blood
• Develops within the first few days of life
Symptoms include floppiness, shakiness, poor
feeding, seizures, fits and convulsions.
• If not caught quickly, it can lead to brain
injury or even death.
• In the most severe cases the only viable
treatment is the removal of the
pancreas, consigning the patient to a
lifetime of diabetes.
Congenital hyperinsulinism
Findacure is a UK
charity that is building the rare
disease community to raise
awareness, drive research and
develop treatments.
Elsevier is partnering
with Findacure scientists to help
identify and evaluate treatments
for this devastating disease.
| 3
Congenital hyperinsulinism library
In support of Findacure’s mission of education and knowledge sharing:
• Collection of papers focused on CHI sorted by disease and study type
• Access to all Elsevier’s ScienceDirect full-text publications covering CHI
| 4
Why do we need literature?
PLACES
PEOPLE
GENES
DRUGS
INTERACTIONS
PROPERTIES
| 5
The power of processed content
PEOPLE
GENES
DRUGS
INTERACTIONS
PROPERTIES
DATA NORMALIZATION DATABASES TOOLS
PLACES
| 7
Research landscape analysis: connecting patients,
researchers and institutions
KEY AUTHORS (bar chart)
Stanley, C.A.
Hussain, K.
De Lonlay, P.
Rahier, J.
Ellard, S.
Flanagan, S.E.
Shyng, S.L.
Nihoul-Fekete, C.
Bellanne-Chantelot, C.
Robert, J.J.
Brunelle, F.
KEY INSTITUTIONS (bar chart)
The Children's Hospital of Philadelphia
UCL Institute of Child Health
Hopital Necker Enfants Malades
University of Pennsylvania, School of…
UCL
Universite Paris Descartes
University of Pennsylvania
Cliniques Universitaires Saint-Luc,…
University of Exeter
Oregon Health and Science University
KEY PATENTS (bar chart of patent assignees)
Ajinomoto CO., INC.
Arkray, INC.
Korea Research Institute…
ViviaBiotech, S.L.
Bassa, Babu V.
Commisariat a l'Energie…
Glaser, Benjamin
Kowa CO., LTD.
Kyowa Hakko Kogyo…
• Most prolific authors and institutions,
based on full-text searching for terms
and synonyms
• Patent assignee names from Reaxys
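A minimal sketch of how "most prolific authors" could be counted from a full-text search on a term and its synonyms. The synonym list, the record layout and the author names are illustrative assumptions, not the actual search setup behind the charts.

```python
import re
from collections import Counter

# Illustrative synonym list (assumption); a real search would use a curated thesaurus.
SYNONYMS = ["congenital hyperinsulinism", "hyperinsulinaemic hypoglycaemia"]
PATTERN = re.compile("|".join(re.escape(s) for s in SYNONYMS), re.IGNORECASE)

# Hypothetical records: full text plus author list for each paper.
records = [
    {"authors": ["Author A", "Author B"], "text": "Congenital hyperinsulinism in neonates."},
    {"authors": ["Author A"], "text": "A cohort with hyperinsulinaemic hypoglycaemia."},
    {"authors": ["Author C"], "text": "An unrelated paper about type 2 diabetes."},
]

def prolific_authors(recs):
    """Count, per author, the papers whose full text mentions any synonym."""
    counts = Counter()
    for rec in recs:
        if PATTERN.search(rec["text"]):
            counts.update(set(rec["authors"]))  # each author counted once per paper
    return counts

counts = prolific_authors(records)
for author, n_papers in counts.most_common():
    print(author, n_papers)
```

The same counting works for institutions if each record carries affiliations instead of author names.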
| 8
Research landscape analysis: collaboration
• Network of people and organizations collaborating in the CHI space, based on
co-authorship
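A co-authorship network of this kind can be sketched by counting how often two names appear on the same paper. The author lists below are hypothetical placeholders, and the function name is introduced only for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical author lists, one per paper (not real publication data).
papers = [
    ["Author A", "Author B", "Author C"],
    ["Author A", "Author B"],
    ["Author C", "Author D"],
]

def coauthor_edges(author_lists):
    """Count how many papers each unordered pair of authors shares."""
    edges = Counter()
    for authors in author_lists:
        # sorted(set(...)) removes duplicates and gives every pair one canonical order
        for pair in combinations(sorted(set(authors)), 2):
            edges[pair] += 1
    return edges

edges = coauthor_edges(papers)
for (a, b), weight in edges.most_common():
    print(f"{a} -- {b}: {weight} shared paper(s)")
```

The edge weights can then be passed to any graph layout tool; the choice of tool is left open here.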
| 9
High-level summary of full-text publications
Tag cloud of titles and sentences discussing hyperinsulinism:
• Provides a very high-level summary of a group of publications
• Gives an overview of the terms and words used when discussing the
disease
Sized by inverse document frequency (IDF),
colored by term frequency (TF)
Sized by relevance, colored by trend
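As a rough illustration of the two quantities behind the first tag cloud, the sketch below computes a corpus-wide term frequency and an inverse document frequency for a few made-up sentences. The whitespace tokenization and the plain log(N/df) form of IDF are assumptions; the tool behind the slides may weight terms differently.

```python
import math
from collections import Counter

# Made-up example sentences (assumption), standing in for titles and sentences.
docs = [
    "insulin secretion in congenital hyperinsulinism",
    "diazoxide treatment of hyperinsulinism",
    "insulin and glucose levels in neonates",
]

def term_stats(documents):
    """Return (term frequency over the corpus, inverse document frequency)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    tf = Counter(tok for toks in tokenized for tok in toks)
    # document frequency: number of documents containing the term at least once
    df = Counter(tok for toks in tokenized for tok in set(toks))
    idf = {term: math.log(n_docs / df[term]) for term in df}
    return tf, idf

tf, idf = term_stats(docs)
for term in sorted(idf, key=idf.get, reverse=True):
    print(f"{term}: size(IDF)={idf[term]:.2f}, color(TF)={tf[term]}")
```

Rare terms get a large IDF and so a large font, while the color channel shows how often the term occurs overall.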
| 10
Finding mechanisms and targets: text mining
• Text mining of 25M abstracts, 3.5M Elsevier and non-Elsevier full texts
• Normalization of concept names
• Normalization of different ways of saying the same thing
• Makes text compatible with other sources of information
(Figure label: negative effect)
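The two kinds of normalization can be pictured as dictionary lookups: one maps name variants to a single concept, the other maps different verbs to one relation type. The tables below are small illustrative assumptions, not the vocabularies used by the text-mining engine, and the relation labels are only example values.

```python
# Illustrative synonym table (assumption): surface forms -> canonical concept name.
CONCEPT_SYNONYMS = {
    "abcc8": "ABCC8",
    "sur1": "ABCC8",
    "sulfonylurea receptor 1": "ABCC8",
    "kcnj11": "KCNJ11",
    "kir6.2": "KCNJ11",
}

# Illustrative relation table (assumption): verbs -> normalized relation type.
RELATION_SYNONYMS = {
    "inhibits": "negative effect",
    "suppresses": "negative effect",
    "activates": "positive effect",
    "stimulates": "positive effect",
}

def _key(text):
    """Lowercase and collapse whitespace so lookups ignore case and spacing."""
    return " ".join(text.lower().split())

def normalize_concept(mention):
    """Return the canonical concept name, or None if the mention is unknown."""
    return CONCEPT_SYNONYMS.get(_key(mention))

def normalize_relation(verb):
    """Return the normalized relation type, or None if the verb is unknown."""
    return RELATION_SYNONYMS.get(_key(verb))

print(normalize_concept("Sulfonylurea  receptor 1"))
print(normalize_relation("Suppresses"))
```

Once every mention resolves to the same identifier, mined statements can be joined with databases that use those identifiers.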
| 11
Quick summary of what is known about CHI
• Text mining of 25M abstracts, 3.5M Elsevier and non-Elsevier full texts
• Identified proteins, small molecules, clinical parameters, diseases, and
biological functions associated with CHI
| 12
Building and refining the disease model
• Summary of the literature findings: CHI mutations
in the context of insulin secretion
• Generate hypotheses using:
  • 6.2M literature-extracted findings
  • Functional annotations (e.g. Gene Ontology)
  • >1800 pre-built pathways modeling disease and
    normal states
| 13
From pathways to treatments:
PipelinePilot implementation combines data sources
Automated analysis combines bioassay data with pathway data
Step 1: Find all targets that could be used to affect the disease state
Step 2: Query for each target to find the activities for each
compound that are >6 log units
Step 3: Collate data by compound to summarize the
targets/activities related to disease that the
compound hits
• Compute geometric mean of activities for ranking
• Rank by number of targets and geometric mean of
activities against targets
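The collation and ranking in Steps 2 and 3 can be sketched in a few lines. This is not the PipelinePilot protocol itself: the rows, compound and target names are hypothetical, and sorting in descending order on both keys is an assumption about how the ranking is applied.

```python
from collections import defaultdict
from statistics import geometric_mean

# Hypothetical (compound, target, activity) rows; activities are in "log units".
activities = [
    ("compound_1", "target_A", 7.5),
    ("compound_1", "target_B", 6.4),
    ("compound_2", "target_A", 8.1),
    ("compound_3", "target_C", 5.2),  # below the cutoff, dropped in Step 2
]
disease_targets = {"target_A", "target_B", "target_C"}  # output of Step 1

def rank_compounds(rows, targets, cutoff=6.0):
    """Collate activities per compound and rank by target count, then geometric mean."""
    by_compound = defaultdict(dict)
    for compound, target, activity in rows:
        # Step 2: keep only activities above the cutoff against disease targets
        if target in targets and activity > cutoff:
            best = by_compound[compound].get(target, activity)
            by_compound[compound][target] = max(best, activity)
    # Step 3: one summary row per compound
    summary = [
        (compound, len(hits), geometric_mean(list(hits.values())))
        for compound, hits in by_compound.items()
    ]
    # Descending sort on (number of targets, geometric mean) is an assumption
    return sorted(summary, key=lambda row: (row[1], row[2]), reverse=True)

ranked = rank_compounds(activities, disease_targets)
for compound, n_targets, gmean in ranked:
    print(f"{compound}: {n_targets} target(s), geometric mean {gmean:.2f}")
```

A compound that hits very many targets would rank first under this ordering, so the target count is worth inspecting by hand as well as sorting on it.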
| 14
Automated analysis combines bioassay data with pathway data
From pathways to treatments:
• 88 Targets related to
hyperinsulinism with ≥3
literature references
• Full PathwayStudio
relationship information
• PathwayStudio also has all
compounds suggested as
treatments
Step 1: Find all targets that could be used to affect the disease state
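The "≥3 literature references" criterion amounts to a simple evidence filter. The sketch below assumes each target comes with a list of reference identifiers; the data layout and the names are hypothetical and do not reflect how PathwayStudio exposes its relationships.

```python
# Hypothetical mapping: target name -> IDs of papers linking it to the disease.
target_references = {
    "target_A": ["ref1", "ref2", "ref3", "ref4"],
    "target_B": ["ref5", "ref5", "ref6"],  # duplicate ID, only 2 distinct references
    "target_C": ["ref7", "ref8", "ref9"],
}

def select_targets(references_by_target, min_references=3):
    """Return targets supported by at least `min_references` distinct references."""
    return sorted(
        target
        for target, refs in references_by_target.items()
        if len(set(refs)) >= min_references
    )

selected = select_targets(target_references)
print(selected)
```

Counting distinct references rather than raw mentions keeps a single heavily quoted paper from passing the threshold on its own.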
| 15
Automated analysis combines bioassay data with pathway data
From pathways to treatments:
Step 1: Find all targets that could be used to affect the disease state
Step 2: Query for each target to find compounds that have high
affinity for them (>6 log units)
Targets based on
text mining
Approved
compounds
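The slides do not define "log units". One common reading, assumed here, is a negative base-10 logarithm of a molar activity value (such as pIC50 or pKi), in which case a cutoff of 6 corresponds to 1 µM. The helper names below are introduced only for illustration.

```python
import math

def p_activity(value_nm):
    """Convert an activity in nanomolar (IC50, Ki, ...) to -log10 of the molar value."""
    if value_nm <= 0:
        raise ValueError("activity must be positive")
    # 1 nM = 1e-9 M, so -log10(value_nm * 1e-9) = 9 - log10(value_nm)
    return 9.0 - math.log10(value_nm)

def is_high_affinity(value_nm, cutoff=6.0):
    """True when the activity exceeds the cutoff in log units (assumed p-scale)."""
    return p_activity(value_nm) > cutoff

for nm in (10, 100, 10000):
    print(f"{nm} nM -> {p_activity(nm):.1f} log units, high affinity: {is_high_affinity(nm)}")
```

Under this assumption, a 100 nM binder scores 7 and passes, while a 10 µM binder scores 5 and is filtered out.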
| 16
Automated analysis combines bioassay data with pathway data
From pathways to treatments:
Figure labels:
• Mean of activities among these targets
• Targets and activities for each compound
• Drug-likeness metrics for sorting/classification
• All compounds that
were observed to bind
to targets in pathway
• Sorted by number of
active targets.
Too many targets may
suggest lack of specificity.
Step 1: Find all targets that could be used to affect the disease state
Step 2: Query for each target to find compounds that have high
affinity for them (>6 log units)
Step 3: Collate data by compound to summarize the
targets/activities related to disease that the
compound hits
• Compute geometric mean of activities for ranking
• Rank by number of targets and geometric mean of
activities against targets
| 17
Approved compounds that may treat hyperinsulinism
• Each binds to one or
more targets related to
the disease
• Can easily be obtained
and tested in preclinical
studies
• List includes a
compound known to
treat hyperinsulinism,
sirolimus
| 18
From pathways to treatments:
PipelinePilot implementation output
Input:
“Congenital hyperinsulinism”
Output:
• Table of target information from PathwayStudio
• Table of compounds with targets, activities, and druglike parameters for each
compound
• SD file of compounds that may be efficacious, with clinical status (if any)
• Authors, Affiliations, Collaboration map
• List of papers
| 19
From pathways to treatments:
PipelinePilot implementation summary
Power of combining pathway data with experimentally verified binding data
• Not just theoretical pathways: testable hypotheses
Results in testable ideas
• Many compounds are already approved drugs and
can be tested in in vivo experiments
Concepts can be extended to find novel compounds
• Use modeling tools to extract common frameworks
• Structure-activity relationship (SAR) analysis to optimize activity for
the new indication
• Compare with compounds suggested as treatments
found by text mining
| 20
Findacure: empowering patient groups and facilitating
treatment development
Parents:
• Learn more about the disease
• Find doctors and medical centers
Doctors:
• Learn more about the disease
• Explore case studies
• Collaborate
Researchers:
• Testable ideas for repurposing of generic drugs
• Knowledge base to support research into the disease
mechanisms
• Collaborate
Evidence to support 10 drug repurposing trials
| 21
Summary
• Figured out what is really needed
• Went through all the content, resources and tools we have in our
possession
• This is possible because the information is normalized
• Once the output of interest is decided: automated answer generation
Provide disease name and get:
  • KOLs and institutes
  • List of targets with supporting information
  • Sorted list of approved drugs with supporting information
| 22
Findacure / Elsevier collaboration
Dr Rick Thompson
Findacure
Dr Nicolas Sireau
Findacure
Dr Matthew Clark
Elsevier
Dr Maria Shkrob
Elsevier
Thank you
Welcome to our
booth
#306
