Open PHACTS MIOSS May 2016

Open PHACTS: a sustainable platform for drug discovery using linked data

Published in: Science

  1. 1. Open PHACTS – experience of sustainability MIOSS 2016 nick@openphactsfoundation.org Openphacts.org
  2. 2. Open PHACTS Mission: Integrate Multiple Research Biomedical Data Resources Into A Single Open & Free Access Point …and make it sustainable in the long term
  3. 3. WHAT WE THOUGHT … or how we thought the world was
  4. 4. How do pharma companies use public data? Diagram labels: Literature; PubChem; GenBank; Patents; Databases; Downloads; Data Analysis; Data Integration; Firewalled Databases
  5. 5. How do pharma companies use public data? Pfizer AZ Roche n
  6. 6. P12047 X31045 GB:29384 Andy Law's Third Law “The number of unique identifiers assigned to an individual is never less than the number of Institutions involved in the study” http://bioinformatics.roslin.ac.uk/lawslaws/
  7. 7. WHAT WE DID
  8. 8. Data sources: ChEMBL, DrugBank, Gene Ontology, WikiPathways, UniProt, ChemSpider, UMLS, ConceptWiki, ChEBI, TrialTrove, GVKBio, GeneGo, TR Integrity, DisGeNET, neXtProt, ChEMBL Target Class, ENZYME, FDA adverse events, SureChEMBL
      Example questions:
      • “Find me compounds that inhibit targets in NFkB pathway assayed in only functional assays with a potency <1 μM”
      • “What is the selectivity profile of known p38 inhibitors?”
      • “Let me compare MW, logP and PSA for known oxidoreductase inhibitors”
  9. 9. @gray_alasdair Big Data Integration
  10. 10. Architecture diagram (Core Platform). Components: Data Cache (Virtuoso Triple Store); Semantic Workflow Engine; Linked Data API (RDF/XML, TTL, JSON); Domain Specific Services; Identity Resolution Service; Chemistry Registration, Normalisation & Q/C; Identifier Management Service; Indexing. Example identifiers: P12374, EC2.43.4, CS4532, “Adenosine receptor 2a”. Data sources (labelled Db, VoID, Nanopub): Public Content; Commercial; Public Ontologies; User Annotations. Apps.
  11. 11. Mappings: Raw Mappings (Raw) 25,087,328
  12. 12. “CUTTING THE GORDIAN KNOT”: Licensing Challenges
      What are the problems with licensing we had to address?
      – To make the data and software generated by the project usable and reusable
      – Multiplicity of unclear or non-standard licenses on original data sources
        • ‘Public’ can mean use but not redistribute, use in commercial environment
        • Legal position on use and reuse extremely unclear
        • Different issues than just linking to data
      – What is the legal status of integrated collections of the above, and of derived knowledge?
      – Appropriate software license selection
      – Legal clarity for EFPIA and end users
      – Approaches for commercial data integration, EFPIA in-house data
      AIM: to enable maximum possible dissemination and usability of the integrated data and architecture generated by the project, with approaches that will be applicable in other data integration projects
  13. 13. Example Data Licenses
      Dataset              Downloaded    Version      Licence          Triples
      Bio Assay Ontology   -             -            CC-By            10,360
      CALOHA               8 Apr 2015    2014-01-22   CC-By-ND         14,552
      ChEBI                4 Mar 2015    125          CC-By-SA         1,012,056
      ChEMBL               18 Feb 2015   20.0         CC-By-SA         445,732,880
      ConceptWiki          12 Dec 2013   -            CC-By-SA         4,331,760
      DisGeNET             31 Mar 2015   2.1.0        ODbL             15,011,136
      Disease Ontology     -             2015-05-21   CC-By            188,062
      DrugBank             19 Feb 2015   4.1          Non-commercial   4,028,767
      ENZYME               -             2015_11      CC-By-ND         61,467
      FDA Adverse Events   9 Jul 2012    -            CC0              13,557,070
  14. 14. Handling private data securely in the cloud
  15. 15. WHERE DID THAT GET US?
  16. 16. DELIVERY UPDATE
      • Regular data updates as the core data refreshes
      • API updates aligned to new business questions and changes
      • Workstreams to add further new data – see later
      • New release 2.1, May 2016 – SureChEMBL and Pathways update
      • Further updates planned for summer 2016
  17. 17. Usage >500 million queries
  18. 18. Open PHACTS Evolution - Platform. Diagram labels: Public Data; Private Data; Commercial Data; VM • Security Audited Hosted platform • Platform sustainability
  19. 19. Open PHACTS Expanding EcoSystem Further Apps Explorer
  20. 20. Workshops Researchathons Further planned
  21. 21. WHAT IT FEELS LIKE NOW … or how it really is
  22. 22. Sustaining Impact “Software is free like puppies are free - they both need money for maintenance” …and more resource for future development
  23. 23. Kick-Starting Sustainability Collaboration Grants Industry Open PHACTS API Users Apps API
  24. 24. Open PHACTS Foundation Routes to Access
      Columns: Access Route; Open API services; Unlimited API services; Unlimited API, RDF and Link sets; Open PHACTS Virtual Machine
      • Full OPF Member: ✓ ✓ ✓ ✓
      • Licensor/Reseller*: ✓ ✓ ✓
      • Licensor (Own Use): ✓ ✓ ✓
      • High volume API Licensor: ✓ ✓
      • Open Access API Consumer: ✓
      • Open Data Non-commercial: ✓ **
      * 3rd parties must have own agreement with OPF
      ** talk to us for collaborative proposals – non-commercial use
  25. 25. Open PHACTS Foundation engaged in the following projects ……
  26. 26. Come and collaborate New projects Improve our code and services Open Innovation projects Webinars New ideas for data services and workflows
  27. 27. info@openphactsfoundation.org @Open_PHACTS Open PHACTS Practical Semantics
      Acknowledgements: GlaxoSmithKline – Coordinator; Universität Wien – Managing entity; Technical University of Denmark; University of Hamburg, Center for Bioinformatics; BioSolveIT GmbH; Consorci Mar Parc de Salut de Barcelona; Leiden University Medical Centre; Royal Society of Chemistry; Vrije Universiteit Amsterdam; Novartis; Merck Serono; H. Lundbeck A/S; Eli Lilly; Netherlands Bioinformatics Centre; Swiss Institute of Bioinformatics; ConnectedDiscovery; EMBL-European Bioinformatics Institute; Janssen; Esteve; Almirall; OpenLink; Scibite; The Open PHACTS Foundation; Spanish National Cancer Research Centre; University of Manchester; Maastricht University; Aqnowledge; University of Santiago de Compostela; Rheinische Friedrich-Wilhelms-Universität Bonn; AstraZeneca; Pfizer
  28. 28. Questions/Discussions Moving from Project to Foundation – What are the expectations? How best to show the value of platform/API? What would you expect from sustainability? What can we do differently? How to best get contributors involved? Thanks
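
The licence table on slide 13 lends itself to a quick tally of how many triples fall under each licence, which is one way to see the scale of the licensing question raised on slide 12. The following is a minimal sketch in Python and is not part of the original deck: the rows are copied from slide 13, and the names `ROWS` and `triples_by_licence` are hypothetical, introduced only for illustration.

```python
from collections import defaultdict

# (dataset, licence, triples) rows copied from the "Example Data Licenses" slide.
# The variable and function names are illustrative assumptions, not Open PHACTS code.
ROWS = [
    ("Bio Assay Ontology", "CC-By", 10_360),
    ("CALOHA", "CC-By-ND", 14_552),
    ("ChEBI", "CC-By-SA", 1_012_056),
    ("ChEMBL", "CC-By-SA", 445_732_880),
    ("ConceptWiki", "CC-By-SA", 4_331_760),
    ("DisGeNET", "ODbL", 15_011_136),
    ("Disease Ontology", "CC-By", 188_062),
    ("DrugBank", "Non-commercial", 4_028_767),
    ("ENZYME", "CC-By-ND", 61_467),
    ("FDA Adverse Events", "CC0", 13_557_070),
]


def triples_by_licence(rows):
    """Sum the triple counts of all datasets that share a licence."""
    totals = defaultdict(int)
    for _dataset, licence, triples in rows:
        totals[licence] += triples
    return dict(totals)


if __name__ == "__main__":
    # Print licences from largest to smallest share of triples.
    totals = triples_by_licence(ROWS)
    for licence, count in sorted(totals.items(), key=lambda kv: -kv[1]):
        print(f"{licence:15s} {count:>12,d}")
```

On these ten rows, the three CC-By-SA sources (ChEBI, ChEMBL, ConceptWiki) account for 451,076,696 of the 483,948,110 triples listed, so share-alike terms cover most of the example data by volume.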
