SlideShare a Scribd company logo
1 of 46
Practical semantics in the pharmaceutical
industry - the Open PHACTS project
Antony Williams
On behalf of the Open PHACTS Team
(and with a focus on Chemistry!)
Fundamental issue:
There is a LOT of science online!
Chaotic, varying quality and very valuable!
Scientists want to find information quickly and
easily
Often they just “can‟t get there” (or don‟t even
know where “there” is)
And you have to manage it all (or not)
Pre-competitive Informatics:
Pharma are all accessing, processing, storing & re-processing external research data
Literature
PubChem
Genbank
Patents
Databases
Downloads
Data Integration Data Analysis
Firewalled Databases
Repeat @
each
company
x
Lowering industry firewalls: pre-competitive informatics in drug discovery
Nature Reviews Drug Discovery (2009) 8, 701-708 doi:10.1038/nrd2944
The Project
Innovative Medicines Initiative
• EC funded public-private
partnership for
pharmaceutical research
• Focus on key problems
– Efficacy, Safety, Educati
on &
Training, Knowledge
Management
The Open PHACTS Project
• Create a semantic integration hub (“Open
Pharmacological Space”)…
• Delivering services to support on-going drug
discovery programs in pharma and public domain
• Not just another project; Leading academics in
semantics, pharmacology and informatics, driven
by solid industry business requirements
• 23 academic partners, 8 pharmaceutical
companies, 3 biotechs INITIALLY
• Work split into clusters:
• Technical Build
• Scientific Drive
• Community & Sustainability
Major Work Streams
Build: OPS service layer and resource integration
Drive: Development of exemplar work packages & Applications
Sustain: Community engagement and long-term sustainability
Assertion & Meta Data Mgmt
Transform / Translate
Integrator
OPS Service Layer
Corpus 1
„Consumer‟
Firewall
Supplier
Firewall
Db 2
Db 3
Db 4
Corpus 5
Std Public
Vocabularies
Target
Dossier
Compound
Dossier
Pharmacological
Networks
Business
Rules
Work Stream 1: Open Pharmacological
Space (OPS)Service Layer
Standardised softwarelayerto allow public
DD resource integration
− Define standards and construct OPS service layer
− Develop interface (API) for data access, integration
and analysis
− Develop secure access models
Existing Drug Discovery (DD)Resource
Integration
Work Stream 2: Exemplar Drug Discovery Informatics tools
Develop exemplar services to test OPS Service Layer
Target Dossier (Data Integration)
Pharmacological Network Navigator (Data Visualisation)
Compound Dossier (Data Analysis)
ChEMBL DrugBank
Gene
Ontology
Wikipathways
UniProt
ChemSpider
UMLS
ConceptWiki
ChEBI
TrialTrove
GVKBio
GeneGo
TR Integrity
“Find me compounds
that inhibit targets in
NFkB pathway assayed
in only functional assays
with a potency <1 μM”
“What is the
selectivity profile of
known p38 inhibitors?”
“Let me compare
MW, logP and PSA
for known
oxidoreductase
inhibitors”
Number sum Nr of 1 Question
15 12 9 All oxidoreductase inhibitors active <100nM in both human and mouse
18 14 8
Given compound X, what is its predicted secondary pharmacology? What are the on and
off,target safety concerns for a compound? What is the evidence and how reliable is that
evidence (journal impact factor, KOL) for findings associated with a compound?
24 13 8
Given a target find me all actives against that target. Find/predict polypharmacology of actives.
Determine ADMET profile of actives.
32 13 8 For a given interaction profile, give me compounds similar to it.
37 13 8
The current Factor Xa lead series is characterised by substructure X. Retrieve all bioactivity data
in serine protease assays for molecules that contain substructure X.
38 13 8
Retrieve all experimental and clinical data for a given list of compounds defined by their chemical
structure (with options to match stereochemistry or not).
41 13 8
A project is considering Protein Kinase C Alpha (PRKCA) as a target. What are all the
compounds known to modulate the target directly? What are the compounds that may modulate
the target directly? i.e. return all cmpds active in assays where the resolution is at least at the
level of the target family (i.e. PKC) both from structured assay databases and the literature.
44 13 8 Give me all active compounds on a given target with the relevant assay data
46 13 8
Give me the compound(s) which hit most specifically the multiple targets in a given pathway
(disease)
59 14 8 Identify all known protein-protein interaction inhibitors
Business Question Driven Approach
Open PHACTS Scientific Services
Platform Explorer
Standards
Apps
API
“Provenance Everywhere”
Nanopub
Db
VoID
Data Cache
(Virtuoso Triple Store)
Semantic Workflow Engine
Linked Data API (RDF/XML, TTL, JSON)
Domain
Specific
Services
Identity
Resolution
Service
Chemistry
Registration
Normalisation
& Q/C
Identifier
Management
Service
Indexing
CorePlatform
P12374
EC2.43.4
CS4532
“Adenosine
receptor 2a”
VoID
Db
Nanopub
Db
VoID
Db
VoID
Nanopub
VoID
Public Content Commercial
Public
Ontologies
User
Annotations
Apps
RDF/VoID
RDF (Resource Description Framework)
VoID (Vocabulary of Interlinked Datasets)
– Metadata describing the RDF
– Describes how Datasets are linked using Linksets
• skos:exactMatch (Simple Knowledge Organisation System)
E.g. To link compounds in OPS with compounds in ChEBI.
• skos:closeMatch
E.g. To link Stereo Insensitive Parents to their Children within OPS.
• skos:relatedMatch
E.g. To link Parent compounds that contain others as Fragments.
• dul:expresses (DOLCE+DnS Ultralite) – describes what links the Datasets. We
use Cheminf to express the links E.g.
http://semanticscience.org/resource/CHEMINF_000059 represents an
InChIKey.
– Recommendations on how to create the VoID have been specified by Manchester
here: http://www.cs.man.ac.uk/~graya/ops/2012/ED-datadesc/
Chemistry
Registration
Normalisation
& Q/C
Chemistry Registration
• Old chemistry registration system uses standard
ChemSpider deposition system: includes low-
level structure validation and manual curation
service by RSC staff.
• New Registration System
• Utilizes ChemSpider Validation and
Standardization platform including collapsing
tautomers
• Utilizes FDA rule set as basis for
standardizations
• Generate Open PHACTS identifier (OPS ID)
STANDARD_TYPE UNIT_COUNT
---------------- -------
AC50 7
Activity 421
EC50 39
IC50 46
ID50 42
Ki 23
Log IC50 4
Log Ki 7
Potency 11
log IC50 0
STANDARD_TYPE STANDARD_UNITS COUNT(*)
------------------ ------------------ --------
IC50 nM 829448
IC50 ug.mL-1 41000
IC50 38521
IC50 ug/ml 2038
IC50 ug ml-1 509
IC50 mg kg-1 295
IC50 molar ratio 178
IC50 ug 117
IC50 % 113
IC50 uM well-1 52
~ 100 units
>5000 types
Implemented using the Quantities, Dimension, Units, Types
Ontology (http://www.qudt.org/)
Quantitative Data Challenges
Content Changes Regularly! POINT IN TIME
Source Initial Records Triples Properties
ChEMBL 1,149,792
~1,091,462 cmpds
~8845 targets
146,079,194 17 cmpds
13 targets
DrugBank 19,628
~14,000 drugs
~5000 targets
517,584 74
UniProt 536,789 156,569,764 78
ENZYME 6,187 73,838 2
ChEBI 35,584 905,189 2
GO/GOA 38,137 24,574,774 42
ChemSpider/ACD 1,194,437 161,336,857 22 ACD, 4 CS
ConceptWiki 2,828,966 3,739,884 1
Data Cache
(Virtuoso Triple Store)
Semantic Workflow Engine
Infrastructure
Hardware (development)
- 2 x Intel Xeon E5-2640 
- 384 GB
DDR3 1333MHz RAM
- 1.5 TB
SSD 
- 3TB 7200rpm
Triple Store
- Virtuoso 7 column store
- Shown to scale to > 100 billion
triples
Network
- AMX-IS
- Extensive memcache
Antony Williams vs Identifiers
Passport ID
Dad, Tony, others
SSN
Green Card
License
5 email addresses
ChemConnector
(blog, Twitter
account, Facebook, Fri
endfeed)
OpenID, ORCID
….
P12047
X31045
GB:29384
Let a Mapping Service take the strain….
PubChemDrugbankChemSpider
Imatinib
Mesylate
What Is Gleevec?
Strict Relaxed
Analysing Browsing
Dynamic Equality
LinkSet#1 {
chemspider:gleevec hasParent imatinib ...
drugbank:gleevec exactMatch imatinib ...
}
chemspider:gleevec drugbank:gleevec
ChemSpider Validation & Standardization Platform
Quality Assurance
Chemistry Validation and Standardization
Platform (CVSP)
at cvsp.chemspider.com
• Validation
• Standardization
• Parent generation
RDF Export
Data
CTAB
REGID1
DataSource
Synonym1
Synonym2
XRef1
etc
Deposited
SDF record Standardized
entity
OPS_ID1
Parents
Charge Parent (OPS_ID7)
Isotope Parent (OPS_ID5)
Stereo Parent (OPS_ID4)
Tautomer Parent
(OPS_ID6)
Super Parent (OPS_ID8)
Fragment (OPS_ID3)
Fragment (OPS_ID2)
For each Compound (CSID) parent generation is
attempted: “Tautomerism in large databases”, Sitzmann
and others, J.Comput Aided Mol Des (2010)
Parent Description
Charge-
Unsensitive
An attempt is made to neutralize ionized
acids and bases. Envisioned to be an
ongoing improvement while new cases
appear.
Isotope-
Unsensitive
Isotopes replaced by common weight
Stereo-Unsensitive Stereo is stripped
Tautomer-
Unsensitive
Tautomer canonicalization is attempting to
generate a “reasonable” tautomer
Super-Unsensitive This parent is all of the above
OPS1
DrugBank ID DB07241
OPS5OPS4
OPS3
OPS2
OPS6
ops:OPS1 skos:exactMatch
<http://www4.wiwiss.fu-
berlin.de/drugbank/resource/drugs/DB07241> .
ops:OPS2 skos:relatedMatch ops:OPS1 .
ops:OPS3 skos:relatedMatch ops:OPS1 .
ops:OPS3 skos:closeMatch ops:OPS4 .
ops:OPS3 skos:closeMatch ops:OPS5 .
ops:OPS4 skos:closeMatch ops:OPS6 .
ops:OPS5 skos:closeMatch ops:OPS6 .
A Precompetitive Knowledge Framework
Integration
Pharma Needs
Inputs
Sustainabilit
y
Stability
Security
Management
/ Governance
Data Mining
Services/Alg
orithms
Mapping &
Populating
Architecture
Interfaces
& Services
Content
Structured &
Unstructured
Vocabularies
& Identifiers
(URIs)
Community
KD
Innovation
The Ecosystem is ….
API
Approach
Community
Industry
Academia
Data
Provider
Software
Provider
Kick-Starting Sustainability
Collaboration
Grants
Industry
Open PHACTS APIUsers
Apps
API
explorer.openphacts.org
Example applications
Advanced analytics
ChemBioNavigator Navigating at the interface of chemical and
biological data with sorting and plotting options
TargetDossier Interconnecting Open PHACTS with multiple
target centric services. Exploring target
similarity using diverse criteria
PharmaTrek Interactive Polypharmacology space of
experimental annotations
UTOPIA Semantic enrichment of scientific PDFs
Predictions
GARFIELD Prediction of target pharmacology based on the
Similar Ensemble Approach
eTOX connector Automatic extraction of data for building
predictive toxicology models in eTOX project
Front-endframework to visualizebiologicaldata
Target dossier (CNIO)
The Open PHACTS community ecosystem
Becoming part of the Open PHACTS Foundation
Members
membership offers early access to platform updates and releases
the opportunity to steer research and development directions
receive technical support
work with the ecosystem of developers and semantic data
integrators around Open PHACTS
tiered membership
familiar business and governance model
A UK-based not-for-profit
member owned company
What are the problems with licensing we had to address?
– To make data and software generated by the project usable/ reusable
– Multiplicity of unclear or non-standard licenses on original data sources
• „Public‟ can mean use but not redistribute, use in commercial environment,
• Legal position on use and reuse extremely unclear
• Different issues than just linking to data
– Legal status of integrated collections of the above, and of derived
knowledge?
– Appropriate software license selection
– Legal clarity for EFPIA and end users
– Approaches for commercial data integration, EFPIA in-house data
AIM: enable maximum possible dissemination and usability of integrated
data and architecture with approaches that will be applicable in other data
integration projects
Licensing Challenges
Chose John Wilbanks as consultant
A framework built around STANDARD well-understood
Creative Commons licences – and how they interoperate
Deal with the problems by:
Interoperable licences
Appropriate terms
Declare expectations to users and data publishers
One size won„t fit all requirements
Data Licensing Solution
Open PHACTS Project Partners
Pfizer Limited – Coordinator
Universität Wien – Managing entity
Technical University of Denmark
University of Hamburg, Center for Bioinformatics
BioSolveIT GmBH
Consorci Mar Parc de Salut de Barcelona
Leiden University Medical Centre
Royal Society of Chemistry
Vrije Universiteit Amsterdam
Spanish National Cancer Research Centre
University of Manchester
Maastricht University
Aqnowledge
University of Santiago de Compostela
Rheinische Friedrich-Wilhelms-Universität Bonn
AstraZeneca
GlaxoSmithKline
Esteve
Novartis
Merck Serono
H. Lundbeck A/S
Eli Lilly
Netherlands Bioinformatics Centre
Swiss Institute of Bioinformatics
ConnectedDiscovery
EMBL-European Bioinformatics Institute
Janssen
OpenLink
pmu@openphacts.org

More Related Content

What's hot

IUPHAR/BPS Guide to Pharmacology: concise mapping of chemistry, data, and tar...
IUPHAR/BPS Guide to Pharmacology: concise mapping of chemistry, data, and tar...IUPHAR/BPS Guide to Pharmacology: concise mapping of chemistry, data, and tar...
IUPHAR/BPS Guide to Pharmacology: concise mapping of chemistry, data, and tar...Chris Southan
 
Big data supporting drug discovery - cautionary tales from the world of chemi...
Big data supporting drug discovery - cautionary tales from the world of chemi...Big data supporting drug discovery - cautionary tales from the world of chemi...
Big data supporting drug discovery - cautionary tales from the world of chemi...Valery Tkachenko
 
Transparency in the Data Supply Chain
Transparency in the Data Supply ChainTransparency in the Data Supply Chain
Transparency in the Data Supply ChainPaul Groth
 
Digging out Structures for Repurposing: Non-competitive Intelligence ...
Digging out Structures for Repurposing: Non-competitive Intelligence        ...Digging out Structures for Repurposing: Non-competitive Intelligence        ...
Digging out Structures for Repurposing: Non-competitive Intelligence ...Chris Southan
 
Open PHACTS for BDE SC1.1
Open PHACTS for BDE SC1.1Open PHACTS for BDE SC1.1
Open PHACTS for BDE SC1.1BigData_Europe
 
Open PHACTS (Sept 2013) EBI Industry Programme
Open PHACTS (Sept 2013) EBI Industry ProgrammeOpen PHACTS (Sept 2013) EBI Industry Programme
Open PHACTS (Sept 2013) EBI Industry ProgrammeSciBite Limited
 
Lankade data Vinnova webbinarium
Lankade data Vinnova webbinarium Lankade data Vinnova webbinarium
Lankade data Vinnova webbinarium Kerstin Forsberg
 
Open sourcedays2013 claus stie kallesoe
Open sourcedays2013   claus stie kallesoeOpen sourcedays2013   claus stie kallesoe
Open sourcedays2013 claus stie kallesoeClaus Stie Kallesøe
 
Exploring Chemical and Biological Knowledge Spaces with PubChem
Exploring Chemical and Biological Knowledge Spaces with PubChemExploring Chemical and Biological Knowledge Spaces with PubChem
Exploring Chemical and Biological Knowledge Spaces with PubChemPaul Thiessen
 
BOUNCER: A Privacy-aware Query Processing Over Federations of RDF Datasets
BOUNCER: A Privacy-aware Query Processing Over Federations of RDF DatasetsBOUNCER: A Privacy-aware Query Processing Over Federations of RDF Datasets
BOUNCER: A Privacy-aware Query Processing Over Federations of RDF DatasetsKemele M. Endris
 
Patent Knowledge Mining for Drug Repurposing
Patent Knowledge Mining for Drug RepurposingPatent Knowledge Mining for Drug Repurposing
Patent Knowledge Mining for Drug RepurposingHermann Mucke
 
BDE SC1 Workshop 3 - Open PHACTS Pilot (Kiera McNeice)
BDE SC1 Workshop 3 - Open PHACTS Pilot (Kiera McNeice)BDE SC1 Workshop 3 - Open PHACTS Pilot (Kiera McNeice)
BDE SC1 Workshop 3 - Open PHACTS Pilot (Kiera McNeice)BigData_Europe
 
Capturing BIA-10-2474 and related FAAH inhibitor data
Capturing BIA-10-2474 and related FAAH inhibitor dataCapturing BIA-10-2474 and related FAAH inhibitor data
Capturing BIA-10-2474 and related FAAH inhibitor dataChris Southan
 

What's hot (16)

IUPHAR/BPS Guide to Pharmacology: concise mapping of chemistry, data, and tar...
IUPHAR/BPS Guide to Pharmacology: concise mapping of chemistry, data, and tar...IUPHAR/BPS Guide to Pharmacology: concise mapping of chemistry, data, and tar...
IUPHAR/BPS Guide to Pharmacology: concise mapping of chemistry, data, and tar...
 
Big data supporting drug discovery - cautionary tales from the world of chemi...
Big data supporting drug discovery - cautionary tales from the world of chemi...Big data supporting drug discovery - cautionary tales from the world of chemi...
Big data supporting drug discovery - cautionary tales from the world of chemi...
 
Transparency in the Data Supply Chain
Transparency in the Data Supply ChainTransparency in the Data Supply Chain
Transparency in the Data Supply Chain
 
Digging out Structures for Repurposing: Non-competitive Intelligence ...
Digging out Structures for Repurposing: Non-competitive Intelligence        ...Digging out Structures for Repurposing: Non-competitive Intelligence        ...
Digging out Structures for Repurposing: Non-competitive Intelligence ...
 
Open PHACTS for BDE SC1.1
Open PHACTS for BDE SC1.1Open PHACTS for BDE SC1.1
Open PHACTS for BDE SC1.1
 
Assay Development and Drug Repurposing Core
Assay Development and Drug Repurposing CoreAssay Development and Drug Repurposing Core
Assay Development and Drug Repurposing Core
 
Open PHACTS (Sept 2013) EBI Industry Programme
Open PHACTS (Sept 2013) EBI Industry ProgrammeOpen PHACTS (Sept 2013) EBI Industry Programme
Open PHACTS (Sept 2013) EBI Industry Programme
 
Lankade data Vinnova webbinarium
Lankade data Vinnova webbinarium Lankade data Vinnova webbinarium
Lankade data Vinnova webbinarium
 
Open sourcedays2013 claus stie kallesoe
Open sourcedays2013   claus stie kallesoeOpen sourcedays2013   claus stie kallesoe
Open sourcedays2013 claus stie kallesoe
 
Exploring Chemical and Biological Knowledge Spaces with PubChem
Exploring Chemical and Biological Knowledge Spaces with PubChemExploring Chemical and Biological Knowledge Spaces with PubChem
Exploring Chemical and Biological Knowledge Spaces with PubChem
 
BOUNCER: A Privacy-aware Query Processing Over Federations of RDF Datasets
BOUNCER: A Privacy-aware Query Processing Over Federations of RDF DatasetsBOUNCER: A Privacy-aware Query Processing Over Federations of RDF Datasets
BOUNCER: A Privacy-aware Query Processing Over Federations of RDF Datasets
 
Patent Knowledge Mining for Drug Repurposing
Patent Knowledge Mining for Drug RepurposingPatent Knowledge Mining for Drug Repurposing
Patent Knowledge Mining for Drug Repurposing
 
BDE SC1 Workshop 3 - Open PHACTS Pilot (Kiera McNeice)
BDE SC1 Workshop 3 - Open PHACTS Pilot (Kiera McNeice)BDE SC1 Workshop 3 - Open PHACTS Pilot (Kiera McNeice)
BDE SC1 Workshop 3 - Open PHACTS Pilot (Kiera McNeice)
 
Chemistry Online and The vision and challenges associated with building the c...
Chemistry Online and The vision and challenges associated with building the c...Chemistry Online and The vision and challenges associated with building the c...
Chemistry Online and The vision and challenges associated with building the c...
 
Capturing BIA-10-2474 and related FAAH inhibitor data
Capturing BIA-10-2474 and related FAAH inhibitor dataCapturing BIA-10-2474 and related FAAH inhibitor data
Capturing BIA-10-2474 and related FAAH inhibitor data
 
Article
ArticleArticle
Article
 

Similar to Practical semantics in the pharmaceutical industry - the Open PHACTS project

Opening up pharmacological space, the OPEN PHACTs api
Opening up pharmacological space, the OPEN PHACTs apiOpening up pharmacological space, the OPEN PHACTs api
Opening up pharmacological space, the OPEN PHACTs apiChris Evelo
 
Mashing Up Drug Discovery
Mashing Up Drug DiscoveryMashing Up Drug Discovery
Mashing Up Drug DiscoverySciBite Limited
 
2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHAC...
2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHAC...2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHAC...
2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHAC...open_phacts
 
Data Integration vs Transparency: Tackling the tension
Data Integration vs Transparency: Tackling the tensionData Integration vs Transparency: Tackling the tension
Data Integration vs Transparency: Tackling the tensionPaul Groth
 
Implementing chemistry platform for OpenPHACTS
Implementing chemistry platform for OpenPHACTSImplementing chemistry platform for OpenPHACTS
Implementing chemistry platform for OpenPHACTSValery Tkachenko
 
Semantic Web for Health Care and Biomedical Informatics
Semantic Web for Health Care and Biomedical InformaticsSemantic Web for Health Care and Biomedical Informatics
Semantic Web for Health Care and Biomedical InformaticsAmit Sheth
 
Data analysis & integration challenges in genomics
Data analysis & integration challenges in genomicsData analysis & integration challenges in genomics
Data analysis & integration challenges in genomicsmikaelhuss
 
Jack Tuszynski Accelerating Chemotherapy Drug Discovery with Analytics and Hi...
Jack Tuszynski Accelerating Chemotherapy Drug Discovery with Analytics and Hi...Jack Tuszynski Accelerating Chemotherapy Drug Discovery with Analytics and Hi...
Jack Tuszynski Accelerating Chemotherapy Drug Discovery with Analytics and Hi...Kim Solez ,
 
Open PHACTS MIOSS may 2016
Open PHACTS MIOSS may 2016Open PHACTS MIOSS may 2016
Open PHACTS MIOSS may 2016open_phacts
 
2011-12-02 Open PHACTS at STM Innovation
2011-12-02 Open PHACTS at STM Innovation2011-12-02 Open PHACTS at STM Innovation
2011-12-02 Open PHACTS at STM Innovationopen_phacts
 
Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009Ian Foster
 
Current advances to bridge the usability-expressivity gap in biomedical seman...
Current advances to bridge the usability-expressivity gap in biomedical seman...Current advances to bridge the usability-expressivity gap in biomedical seman...
Current advances to bridge the usability-expressivity gap in biomedical seman...Maulik Kamdar
 
BioAssay Express: Creating and exploiting assay metadata
BioAssay Express: Creating and exploiting assay metadataBioAssay Express: Creating and exploiting assay metadata
BioAssay Express: Creating and exploiting assay metadataPhilip Cheung
 
Tag.bio aws public jun 08 2021
Tag.bio aws public jun 08 2021 Tag.bio aws public jun 08 2021
Tag.bio aws public jun 08 2021 Sanjay Padhi, Ph.D
 
Driving Deep Semantics in Middleware and Networks: What, why and how?
Driving Deep Semantics in Middleware and Networks: What, why and how?Driving Deep Semantics in Middleware and Networks: What, why and how?
Driving Deep Semantics in Middleware and Networks: What, why and how?Amit Sheth
 
Promiscuous patterns and perils in PubChem and the MLSCN
Promiscuous patterns and perils in PubChem and the MLSCNPromiscuous patterns and perils in PubChem and the MLSCN
Promiscuous patterns and perils in PubChem and the MLSCNJeremy Yang
 
Big Data in Pharma - Overview and Use Cases
Big Data in Pharma - Overview and Use CasesBig Data in Pharma - Overview and Use Cases
Big Data in Pharma - Overview and Use CasesJosef Scheiber
 
Stratergies for the intergration of information (IPI_ConfEX)
Stratergies for the intergration of information (IPI_ConfEX)Stratergies for the intergration of information (IPI_ConfEX)
Stratergies for the intergration of information (IPI_ConfEX)Ben Gardner
 

Similar to Practical semantics in the pharmaceutical industry - the Open PHACTS project (20)

Opening up pharmacological space, the OPEN PHACTs api
Opening up pharmacological space, the OPEN PHACTs apiOpening up pharmacological space, the OPEN PHACTs api
Opening up pharmacological space, the OPEN PHACTs api
 
Mashing Up Drug Discovery
Mashing Up Drug DiscoveryMashing Up Drug Discovery
Mashing Up Drug Discovery
 
2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHAC...
2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHAC...2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHAC...
2012-10-08 Practical Semantics In The Pharmaceutical Industry - The Open PHAC...
 
Data Integration vs Transparency: Tackling the tension
Data Integration vs Transparency: Tackling the tensionData Integration vs Transparency: Tackling the tension
Data Integration vs Transparency: Tackling the tension
 
Online Resources to Support Open Drug Discovery Systems
Online Resources to Support Open Drug Discovery SystemsOnline Resources to Support Open Drug Discovery Systems
Online Resources to Support Open Drug Discovery Systems
 
Implementing chemistry platform for OpenPHACTS
Implementing chemistry platform for OpenPHACTSImplementing chemistry platform for OpenPHACTS
Implementing chemistry platform for OpenPHACTS
 
Semantic Web for Health Care and Biomedical Informatics
Semantic Web for Health Care and Biomedical InformaticsSemantic Web for Health Care and Biomedical Informatics
Semantic Web for Health Care and Biomedical Informatics
 
Data analysis & integration challenges in genomics
Data analysis & integration challenges in genomicsData analysis & integration challenges in genomics
Data analysis & integration challenges in genomics
 
Semantic (Web) Technologies for Translational Research in Life Sciences
Semantic (Web) Technologies for Translational Research in Life SciencesSemantic (Web) Technologies for Translational Research in Life Sciences
Semantic (Web) Technologies for Translational Research in Life Sciences
 
Jack Tuszynski Accelerating Chemotherapy Drug Discovery with Analytics and Hi...
Jack Tuszynski Accelerating Chemotherapy Drug Discovery with Analytics and Hi...Jack Tuszynski Accelerating Chemotherapy Drug Discovery with Analytics and Hi...
Jack Tuszynski Accelerating Chemotherapy Drug Discovery with Analytics and Hi...
 
Open PHACTS MIOSS may 2016
Open PHACTS MIOSS may 2016Open PHACTS MIOSS may 2016
Open PHACTS MIOSS may 2016
 
2011-12-02 Open PHACTS at STM Innovation
2011-12-02 Open PHACTS at STM Innovation2011-12-02 Open PHACTS at STM Innovation
2011-12-02 Open PHACTS at STM Innovation
 
Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009
 
Current advances to bridge the usability-expressivity gap in biomedical seman...
Current advances to bridge the usability-expressivity gap in biomedical seman...Current advances to bridge the usability-expressivity gap in biomedical seman...
Current advances to bridge the usability-expressivity gap in biomedical seman...
 
BioAssay Express: Creating and exploiting assay metadata
BioAssay Express: Creating and exploiting assay metadataBioAssay Express: Creating and exploiting assay metadata
BioAssay Express: Creating and exploiting assay metadata
 
Tag.bio aws public jun 08 2021
Tag.bio aws public jun 08 2021 Tag.bio aws public jun 08 2021
Tag.bio aws public jun 08 2021
 
Driving Deep Semantics in Middleware and Networks: What, why and how?
Driving Deep Semantics in Middleware and Networks: What, why and how?Driving Deep Semantics in Middleware and Networks: What, why and how?
Driving Deep Semantics in Middleware and Networks: What, why and how?
 
Promiscuous patterns and perils in PubChem and the MLSCN
Promiscuous patterns and perils in PubChem and the MLSCNPromiscuous patterns and perils in PubChem and the MLSCN
Promiscuous patterns and perils in PubChem and the MLSCN
 
Big Data in Pharma - Overview and Use Cases
Big Data in Pharma - Overview and Use CasesBig Data in Pharma - Overview and Use Cases
Big Data in Pharma - Overview and Use Cases
 
Stratergies for the intergration of information (IPI_ConfEX)
Stratergies for the intergration of information (IPI_ConfEX)Stratergies for the intergration of information (IPI_ConfEX)
Stratergies for the intergration of information (IPI_ConfEX)
 

Recently uploaded

Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDGMarianaLemus7
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 

Recently uploaded (20)

The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 

Practical semantics in the pharmaceutical industry - the Open PHACTS project

  • 1. Practical semantics in the pharmaceutical industry - the Open PHACTS project Antony Williams On behalf of the Open PHACTS Team (and with a focus on Chemistry!)
  • 2. Fundamental issue: There is a LOT of science online! Chaotic, varying quality and very valuable! Scientists want to find information quickly and easily Often they just “can‟t get there” (or don‟t even know where “there” is) And you have to manage it all (or not)
  • 3. Pre-competitive Informatics: Pharma are all accessing, processing, storing & re-processing external research data Literature PubChem Genbank Patents Databases Downloads Data Integration Data Analysis Firewalled Databases Repeat @ each company x Lowering industry firewalls: pre-competitive informatics in drug discovery Nature Reviews Drug Discovery (2009) 8, 701-708 doi:10.1038/nrd2944
  • 4. The Project Innovative Medicines Initiative • EC funded public-private partnership for pharmaceutical research • Focus on key problems – Efficacy, Safety, Educati on & Training, Knowledge Management The Open PHACTS Project • Create a semantic integration hub (“Open Pharmacological Space”)… • Delivering services to support on-going drug discovery programs in pharma and public domain • Not just another project; Leading academics in semantics, pharmacology and informatics, driven by solid industry business requirements • 23 academic partners, 8 pharmaceutical companies, 3 biotechs INITIALLY • Work split into clusters: • Technical Build • Scientific Drive • Community & Sustainability
  • 5. Major Work Streams Build: OPS service layer and resource integration Drive: Development of exemplar work packages & Applications Sustain: Community engagement and long-term sustainability Assertion & Meta Data Mgmt Transform / Translate Integrator OPS Service Layer Corpus 1 „Consumer‟ Firewall Supplier Firewall Db 2 Db 3 Db 4 Corpus 5 Std Public Vocabularies Target Dossier Compound Dossier Pharmacological Networks Business Rules Work Stream 1: Open Pharmacological Space (OPS)Service Layer Standardised softwarelayerto allow public DD resource integration − Define standards and construct OPS service layer − Develop interface (API) for data access, integration and analysis − Develop secure access models Existing Drug Discovery (DD)Resource Integration Work Stream 2: Exemplar Drug Discovery Informatics tools Develop exemplar services to test OPS Service Layer Target Dossier (Data Integration) Pharmacological Network Navigator (Data Visualisation) Compound Dossier (Data Analysis)
  • 6. ChEMBL DrugBank Gene Ontology Wikipathways UniProt ChemSpider UMLS ConceptWiki ChEBI TrialTrove GVKBio GeneGo TR Integrity “Find me compounds that inhibit targets in NFkB pathway assayed in only functional assays with a potency <1 μM” “What is the selectivity profile of known p38 inhibitors?” “Let me compare MW, logP and PSA for known oxidoreductase inhibitors”
  • 7. Number sum Nr of 1 Question 15 12 9 All oxidoreductase inhibitors active <100nM in both human and mouse 18 14 8 Given compound X, what is its predicted secondary pharmacology? What are the on and off,target safety concerns for a compound? What is the evidence and how reliable is that evidence (journal impact factor, KOL) for findings associated with a compound? 24 13 8 Given a target find me all actives against that target. Find/predict polypharmacology of actives. Determine ADMET profile of actives. 32 13 8 For a given interaction profile, give me compounds similar to it. 37 13 8 The current Factor Xa lead series is characterised by substructure X. Retrieve all bioactivity data in serine protease assays for molecules that contain substructure X. 38 13 8 Retrieve all experimental and clinical data for a given list of compounds defined by their chemical structure (with options to match stereochemistry or not). 41 13 8 A project is considering Protein Kinase C Alpha (PRKCA) as a target. What are all the compounds known to modulate the target directly? What are the compounds that may modulate the target directly? i.e. return all cmpds active in assays where the resolution is at least at the level of the target family (i.e. PKC) both from structured assay databases and the literature. 44 13 8 Give me all active compounds on a given target with the relevant assay data 46 13 8 Give me the compound(s) which hit most specifically the multiple targets in a given pathway (disease) 59 14 8 Identify all known protein-protein interaction inhibitors Business Question Driven Approach
  • 8.
  • 9. Open PHACTS Scientific Services Platform Explorer Standards Apps API “Provenance Everywhere”
  • 10. Nanopub Db VoID Data Cache (Virtuoso Triple Store) Semantic Workflow Engine Linked Data API (RDF/XML, TTL, JSON) Domain Specific Services Identity Resolution Service Chemistry Registration Normalisation & Q/C Identifier Management Service Indexing CorePlatform P12374 EC2.43.4 CS4532 “Adenosine receptor 2a” VoID Db Nanopub Db VoID Db VoID Nanopub VoID Public Content Commercial Public Ontologies User Annotations Apps
  • 11.
  • 12. RDF/VoID RDF (Resource Description Framework) VoID (Vocabulary of Interlinked Datasets) – Metadata describing the RDF – Describes how Datasets are linked using Linksets • skos:exactMatch (Simple Knowledge Organisation System) E.g. To link compounds in OPS with compounds in ChEBI. • skos:closeMatch E.g. To link Stereo Insensitive Parents to their Children within OPS. • skos:relatedMatch E.g. To link Parent compounds that contain others as Fragments. • dul:expresses (DOLCE+DnS Ultralite) – describes what links the Datasets. We use Cheminf to express the links E.g. http://semanticscience.org/resource/CHEMINF_000059 represents an InChIKey. – Recommendations on how to create the VoID have been specified by Manchester here: http://www.cs.man.ac.uk/~graya/ops/2012/ED-datadesc/
  • 13. Chemistry Registration Normalisation & Q/C Chemistry Registration • Old chemistry registration system uses standard ChemSpider deposition system: includes low- level structure validation and manual curation service by RSC staff. • New Registration System • Utilizes ChemSpider Validation and Standardization platform including collapsing tautomers • Utilizes FDA rule set as basis for standardizations • Generate Open PHACTS identifier (OPS ID)
  • 14. STANDARD_TYPE UNIT_COUNT ---------------- ------- AC50 7 Activity 421 EC50 39 IC50 46 ID50 42 Ki 23 Log IC50 4 Log Ki 7 Potency 11 log IC50 0 STANDARD_TYPE STANDARD_UNITS COUNT(*) ------------------ ------------------ -------- IC50 nM 829448 IC50 ug.mL-1 41000 IC50 38521 IC50 ug/ml 2038 IC50 ug ml-1 509 IC50 mg kg-1 295 IC50 molar ratio 178 IC50 ug 117 IC50 % 113 IC50 uM well-1 52 ~ 100 units >5000 types Implemented using the Quantities, Dimension, Units, Types Ontology (http://www.qudt.org/) Quantitative Data Challenges
  • 15. Content Changes Regularly! POINT IN TIME Source Initial Records Triples Properties ChEMBL 1,149,792 ~1,091,462 cmpds ~8845 targets 146,079,194 17 cmpds 13 targets DrugBank 19,628 ~14,000 drugs ~5000 targets 517,584 74 UniProt 536,789 156,569,764 78 ENZYME 6,187 73,838 2 ChEBI 35,584 905,189 2 GO/GOA 38,137 24,574,774 42 ChemSpider/ACD 1,194,437 161,336,857 22 ACD, 4 CS ConceptWiki 2,828,966 3,739,884 1
  • 16. Data Cache (Virtuoso Triple Store) Semantic Workflow Engine Infrastructure Hardware (development) - 2 x Intel Xeon E5-2640 
- 384 GB DDR3 1333MHz RAM
- 1.5 TB SSD 
- 3TB 7200rpm Triple Store - Virtuoso 7 column store - Shown to scale to > 100 billion triples Network - AMX-IS - Extensive memcache
  • 17. Antony Williams vs Identifiers Passport ID Dad, Tony, others SSN Green Card License 5 email addresses ChemConnector (blog, Twitter account, Facebook, Fri endfeed) OpenID, ORCID ….
  • 18.
  • 19.
  • 20.
  • 21.
  • 22. P12047 X31045 GB:29384 Let a Mapping Service take the strain….
  • 24. Strict Relaxed Analysing Browsing Dynamic Equality LinkSet#1 { chemspider:gleevec hasParent imatinib ... drugbank:gleevec exactMatch imatinib ... } chemspider:gleevec drugbank:gleevec
  • 25. ChemSpider Validation & Standardization Platform Quality Assurance
  • 26. Chemistry Validation and Standardization Platform (CVSP) at cvsp.chemspider.com • Validation • Standardization • Parent generation RDF Export Data
  • 27. CTAB REGID1 DataSource Synonym1 Synonym2 XRef1 etc Deposited SDF record Standardized entity OPS_ID1 Parents Charge Parent (OPS_ID7) Isotope Parent (OPS_ID5) Stereo Parent (OPS_ID4) Tautomer Parent (OPS_ID6) Super Parent (OPS_ID8) Fragment (OPS_ID3) Fragment (OPS_ID2)
  • 28. For each Compound (CSID) parent generation is attempted: “Tautomerism in large databases”, Sitzmann and others, J.Comput Aided Mol Des (2010) Parent Description Charge- Unsensitive An attempt is made to neutralize ionized acids and bases. Envisioned to be an ongoing improvement while new cases appear. Isotope- Unsensitive Isotopes replaced by common weight Stereo-Unsensitive Stereo is stripped Tautomer- Unsensitive Tautomer canonicalization is attempting to generate a “reasonable” tautomer Super-Unsensitive This parent is all of the above
  • 29. OPS1 DrugBank ID DB07241 OPS5OPS4 OPS3 OPS2 OPS6 ops:OPS1 skos:exactMatch <http://www4.wiwiss.fu- berlin.de/drugbank/resource/drugs/DB07241> . ops:OPS2 skos:relatedMatch ops:OPS1 . ops:OPS3 skos:relatedMatch ops:OPS1 . ops:OPS3 skos:closeMatch ops:OPS4 . ops:OPS3 skos:closeMatch ops:OPS5 . ops:OPS4 skos:closeMatch ops:OPS6 . ops:OPS5 skos:closeMatch ops:OPS6 .
  • 30.
  • 31. A Precompetitive Knowledge Framework Integration Pharma Needs Inputs Sustainabilit y Stability Security Management / Governance Data Mining Services/Alg orithms Mapping & Populating Architecture Interfaces & Services Content Structured & Unstructured Vocabularies & Identifiers (URIs) Community KD Innovation
  • 32. The Ecosystem is …. API Approach Community Industry Academia Data Provider Software Provider
  • 35. Example applications Advanced analytics ChemBioNavigator Navigating at the interface of chemical and biological data with sorting and plotting options TargetDossier Interconnecting Open PHACTS with multiple target centric services. Exploring target similarity using diverse criteria PharmaTrek Interactive Polypharmacology space of experimental annotations UTOPIA Semantic enrichment of scientific PDFs Predictions GARFIELD Prediction of target pharmacology based on the Similar Ensemble Approach eTOX connector Automatic extraction of data for building predictive toxicology models in eTOX project
  • 37.
  • 38.
  • 39.
  • 40.
  • 41.
  • 42. The Open PHACTS community ecosystem
  • 43. Becoming part of the Open PHACTS Foundation Members membership offers early access to platform updates and releases the opportunity to steer research and development directions receive technical support work with the ecosystem of developers and semantic data integrators around Open PHACTS tiered membership familiar business and governance model A UK-based not-for-profit member owned company
  • 44. What are the problems with licensing we had to address? – To make data and software generated by the project usable/ reusable – Multiplicity of unclear or non-standard licenses on original data sources • „Public‟ can mean use but not redistribute, use in commercial environment, • Legal position on use and reuse extremely unclear • Different issues than just linking to data – Legal status of integrated collections of the above, and of derived knowledge? – Appropriate software license selection – Legal clarity for EFPIA and end users – Approaches for commercial data integration, EFPIA in-house data AIM: enable maximum possible dissemination and usability of integrated data and architecture with approaches that will be applicable in other data integration projects Licensing Challenges
  • 45. Chose John Wilbanks as consultant A framework built around STANDARD well-understood Creative Commons licences – and how they interoperate Deal with the problems by: Interoperable licences Appropriate terms Declare expectations to users and data publishers One size won„t fit all requirements Data Licensing Solution
  • 46. Open PHACTS Project Partners Pfizer Limited – Coordinator Universität Wien – Managing entity Technical University of Denmark University of Hamburg, Center for Bioinformatics BioSolveIT GmBH Consorci Mar Parc de Salut de Barcelona Leiden University Medical Centre Royal Society of Chemistry Vrije Universiteit Amsterdam Spanish National Cancer Research Centre University of Manchester Maastricht University Aqnowledge University of Santiago de Compostela Rheinische Friedrich-Wilhelms-Universität Bonn AstraZeneca GlaxoSmithKline Esteve Novartis Merck Serono H. Lundbeck A/S Eli Lilly Netherlands Bioinformatics Centre Swiss Institute of Bioinformatics ConnectedDiscovery EMBL-European Bioinformatics Institute Janssen OpenLink pmu@openphacts.org

Editor's Notes

  1. Work Stream 1: Open Pharmacological Space (OPS) Service LayerStandardised software layer to allow public DD resource integrationDefine standards and construct OPS service layerDevelop interface (API) for data access, integration and analysisDevelop secure access modelsExisting Drug Discovery (DD) Resource IntegrationModify existing public DD resources to operate with OPS service layerIMI funds cost of modificationsConsider partnership with international resourcesWS2: Development of exemplar work packagesDevelop exemplar services to test OPS Service Layer proof of concept of the functionality developed in work stream 1services will depend on the expertise of the consortiumSome possible exemplar services:Target Dossier (Data Integration)Integrate target info from diverse sourcese.g. bioactive molecules, druggability, orthology, drug expression signatures Pharmacological Network Navigator (Data Visualisation)Profile “chemical space” of screening librariesVisualisation to assist lead hoppingCompound Dossier (Data Analysis)Integrate compound informationAlgorithms to predict drug action
  2. Mx/psa, how calculated who did it?Mash up. With your data too,- top layer join together but need them allcommerical
  3. 10Can go get everythingOPS not a repo of the world, specific sources
  4. Statistics to be added