Pistoia Alliance: SESL Pilot for a Biomedical Brokering Service
Upcoming SlideShare
Loading in...5
×
 

Pistoia Alliance: SESL Pilot for a Biomedical Brokering Service

on

  • 382 views

8th Pharma Tech IT congress 16sep10

8th Pharma Tech IT congress 16sep10

Statistics

Views

Total Views
382
Views on SlideShare
382
Embed Views
0

Actions

Likes
0
Downloads
2
Comments
0

0 Embeds 0

No embeds

Accessibility

Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

    Pistoia Alliance: SESL Pilot for a Biomedical Brokering Service Pistoia Alliance: SESL Pilot for a Biomedical Brokering Service Presentation Transcript

    • The Pistoia SESL Project An Emerging Vehicle for SESL Pilot Pistoia Alliance Collaboration: TheBiomedical Brokering Service for a Pistoia Alliance Ian Harrow, Wendy Filsell, Dietrich Rebholz Schuhmann8th PharmaTech IT Congress http://pistoiaalliance.org16th Sept 2010
    • Presentation Outline• Pistoia Background• SESL: Biomedical Knowledge Brokering• The Knowledge Service Framework• The Pilot• Minimal configuration to test a brokering service• SESL user interface Mock Up• Project Timelines• Summary• Acknowledgements
    • Pistoia Background – How it all started 2007 2008 2009 2010 Now Informal Met in Create Pistoia as Not Official 7 / 10 top Pharma as members meeting Pistoia for profit company Launch 33 members Stanhope Gate Domains Established Pistoia Lhasa Curzon Informal Collaborations Collaboration/project meetingPistoia Description HistoryThe primary purpose of the Pistoia Alliance is to Initial Meeting with GSK, AZ,streamline non-competitive elements of the life Pfizer and Novartis – outlinedscience workflow by the specification of common similar challenges andstandards, business terms, relationships and frustrations in the Informatics sector of DiscoveryprocessesPistoia Goals The advent of Web Services and Web 2.0 allow for decoupling of proprietary data from technology• to allow this framework to encompass/support most pre-competitive work between the Publicly available structural and biological DBs allow organisations for a non-IP related analysis and as a scientific test suite.• to support life science workflow prior to submission Sponsorship from R&D IS heads within Life Science• to work with other Standards organisations industry
    • Pistoia Domains Pistoia Domains group areas of interest, scope and deliver projects Pistoia Domain – high level collectionPistoia Groups – as of Working Groups with common themes External Groupsdefined in byelaws Domain Allows governance across outside of Steering a domain using Working Pistoia Board of Groups Group chairs and Technical Committee reps Directors Could: • join Pistoia Working The main project delivery • influence Pistoia Working mechanism in Pistoia. All Officers Groups members Groups standards will be • influence through (Operational delivered by WGs other standards Team) groups and activities Provide expertise for WGs • Collaborate on and running Pistoia standards’ feasibility Technical Pistoia Define: studies •Requirements • Collaborate through Committee Members •Technical Standards non-Pistoia •Service Standards Standards initiatives
    • Pistoia Domains Pistoia Domains focus on business workflows /supply chains Enabling Knowledge and Information Services Vocabulary Visualisation Application Integration Workflow Others Biology Chemistry Translational Data Data Data Services Services Services
    • SESL: Biomedical Knowledge Brokering• Challenge: – No single system for retrieving gene to disease relationships contained in both published & biological database content – Need a ‘push model’ for biomedical knowledge access: the current model requires the consumer to search 1000’s of content sources• Opportunity: Pilot Project with key stakeholders – Pilot a ‘push model’ for biomedical knowledge brokering – Engage multiple consumers, content providers and a single, public group to develop the necessary infrastructure to explore the standards required for the model to work in production• History: – May 2008: Common Disease Knowledge Environment (CDKE) IMI call drafted – Sep 2008: postponed call publication – Jan 2009: x-pharma meeting in London on how to progress CDKE – Apr 2009: CDKE presented at SESL workshop – Oct 2009: SESL Pilot meeting (funders) – Jan 2010: Pilot launch
    • The Knowledge Service Framework Multiple Consumers‘Consumer’ Disease Dossier KnowledgeFirewall Applications Service Layer Std Public CommonOpen Assertion & Meta Data Mgmt Vocabularies ServiceStds Transform / Translate Business Broker Integrator RulesSupplierFirewall Content Suppliers Db 2 Effort required Db 4 to fit DBs to Corpus 1 service layer Db 3 Corpus 5 7
    • A Production Service vision...ConsumerSide Exemplar Disease Dossier Application License Service Layer Std Public Service Layer Std Public Service Layer Std Public Service Layer Std Public Vocabularies Vocabularies Vocabularies Vocabularies Assertion & Meta Data Mgmt Assertion & Meta Data Mgmt Assertion & Meta Data Mgmt Assertion & Meta Data Mgmt Transform / Translate Business Transform / Translate Business Transform / Translate Business Transform / Translate Business Rules Rules Rules Rules Integrator Integrator Integrator Integrator Broker Org #1 Broker Org #2 Broker Org #3 Internal Broker LicenseCorpus 1 Db 3 Corpus 5 Db 7 Corpus 9 Db 11 Corpus 13 Db 15 Corpus 4 Corpus 8 Corpus 12 Corpus 16 Db 2 Db 6 Db 10 Db 14SupplierSide
    • The technology is coming here
    • The Pilot• Deliverables: – Publication of standards & recommendations for service implementation – Pilot implementation of service for a single disease (assertions from pre-defined document sets & databases) – Establish ways of working pre-competitively across industry/vendor/academia – Dialogue and assessment of cost / value, with key content suppliers in moving to such a push model for content (viability of moving to production)• Status: – AZ, Pfizer, GSK, Roche, Unilever, EBI, NPG, OUP, Elsevier & RSC – 12 month project, £200K direct funding (+ PM & Architecture support) – Contract between Pistoia & EBI signed 20th January 2010 for 1 year• Scope: – Development of an assertion database in combination with a user interface and associated web services for one disease/indication/phenotype of broad interest: Type II Diabetes – Assertional content derived from 3 structured data sources and limited Journal content (co-occurrence & statistical derivation from full text) – Assertional evidence for filtering and drill down to primary data. – Limited vocabulary development for area of focus: Type II Diabetes
    • Minimal configuration to test a Brokering Service Interface User Interface Layer at consumer org’nCondition: Service Layer Assertion & Meta Data Mgmt Std Public Vocabularies Service Layer Assertion & Meta Data Mgmt Std Public Vocabularies Brokering serviceIdentical structure.Different content Transform / Translate Query Transform / Translate Query Layer templates templateswhich can overlap. Triple store 1 Triple store 2 at EBI Broker #1 Broker #2 Primary source Elsevier RSC Layer corpus corpus NPG corpus OUP corpus at provider org’n EBI Swissprot EBI Ensembl EBI Array EBI Swissprot database gene Express database database
    • SESL user interface mock-up: Query Gene: abc Relationship: Any Disease: Diabetes Constraint: Species: Any Tissue: Any
    • SESL user interface mock-up: Results Gene R’ship Disease Species Evidence 1 abc1 Co-occurs Diabetes Mus Paper UID:1234 2 abc1 Up-Reg Diabetes Homo ArrayExpress: XXX 3 abc2 Co-occurs Diabetes Homo Paper UID:1344 4 abc13 Co-occurs Diabetes Mus Paper UID:1314 5 abc7 Mutation Diabetes Rattus OMIMI: XXX 6 abc1 Co-occurs Diabetes Mus Paper UID:45643 7 abc1 Co-occurs Diabetes Homo Paper UID:2143 8 abc1 Co-occurs Diabetes Mus Paper UID:1204
    • Diabetes genes in SESL prototype Primary data sources SESL projectGene symbols UNIPROT Involvement in OMIM in ArrayExpress Atlas? SESL? SESL co-occurence SESL OMIM SESL Arrayexpress diabetes type 2? diabetes? in Full Text diabetes? Atlas? literature?ABCC8 Y - NIDDM Y Y Y Y Y NADIPOQ Y - NIDDM Y Y Y Y N NALMS1 Y - NIDDM N Y Y Y N NBLK Y - MODY11 Y Y Y Y N NCAPN10 Y - NIDDM Y Y Y Y Y NCEL Y - MODY8 Y Y Y Y Y NENPP1 Y - NIDDM Y Y Y Y Y NGCK Y - MODY2 Y Y Y Y Y NHNF1A Y - MODY3 Y Y Y Y Y NHNF1B Y - MODY5 Y Y Y Y Y NHNF4A Y - MODY1 Y Y Y Y Y NINPPL1 Y - NIDDM N Y NINS Y - MODY10 Y Y Y Y N NIRS1 Y - NIDDM Y Y Y Y N NKCNJ11 Y - NIDDM Y Y NKLF11 Y - MODY7 Y Y Y Y Y NLIPC Y - NIDDM Y Y Y Y Y NMAPK8IP1 Y - NIDDM Y Y Y Y Y NMCF2L2 Y - NIDDM N Y NNEUROD1 Y - MODY6 Y N Y Y Y NPAX4 Y - MODY9 Y Y Y Y Y NPDX1 Y - MODY4 Y Y Y Y N NPPARG Y - NIDDM Y Y Y Y Y NPPP1R3A Y - NIDDM Y Y Y Y N YSLC30A8 Y - NIDDM Y Y Y N Y NTCF7L2 Y - NIDDM Y Y Y Y Y NUCP1 Y - NIDDM N Y Y Y N NWFS1 Y - NIDDM Y Y Y Y Y N
    • Timelines: Development PhaseTask/Deliverable Phase Type Jan-10 Feb-10 Mar-10 Apr-10 May-10 Jun-10 Jul-10 Aug-10 Sep-10 Oct-10 Nov-10 Dec-10 Jan-11 Feb-11 Month 0 Month 1 Month 2 Month 3 Month 4 Month 5 Month 6 Month 7 Month 8 Month 9 Month 10 Month 11 Month 12Finalised Technical Specification Deliverable  ^document (Month 4) 1Build vocabularies within scope Development Task 2 RDF data export from UniProt Development Task 3and Ensembl RDF data export of Array Express Development Task 4 Extract literature assertions for Development Task 6T2DB from publishers’ content Develop RDF triple store schema Development Task 7and demonstrator Develop query definitions Development Task 8 Establish API services for remote Development Task 9access Develop simple user interface for Development Task 10demonstrator (based on mock-up) Write documentation that Development Task 11defines the standard frameworkAccess to early prototype Deliverabledemonstrator and report(Month 7 & 8) 2&3 ^^ Final prototype demonstrator, Deliverablerecommendations post-pilot, 4&5 ^ ^^report (Month 11 & 12) andpublic launch
    • Timelines: Testing and Communication PhaseTask/Deliverable Phase Type Jan-10 Feb-10 Mar-10 Apr-10 May-10 Jun-10 Jul-10 Aug-10 Sep-10 Oct-10 Nov-10 Dec-10 Jan-11 Feb-11 Month 0 Month 1 Month 2 Month 3 Month 4 Month 5 Month 6 Month 7 Month 8 Month 9 Month 10 Month 11 Month 12Tests of the demonstrator (full Testing and Task 12private and limited public communicationinstance)Deploy publc demonstrator Testing and Task 13 communicationWrite publication for standard Testing and Task 14definition communicationDevelop recommendations for Testing and Task 15post-pilot project communicationFinal prototype demonstrator, Deliverablerecommendations post-pilot 4&5 ^ ^and report (Month 11 & 12)Public release of limited Deliverabledemonstrator (Month 13) 6 ^
    • Summary of Accomplishments• Significant progress to towards realising the technical goal of knowledge brokering – Can a push model work? A hyperstandard?• A unique consortium from three cultures: industry, publishers and academia – Working together – sharing costs and risks• Business opportunities and concerns – For data providers and consumers?
    • AcknowledgementsIndustry Publishers EMBL-EBIMike Westaway Claire Bird – OUP Christoph GrabmuellerIan Stott Richard O’Bierne – OUP Silvestras KavaliauskasPeter Woollard Jabe Wilson – Elsevier Roderigo LopezNigel Wilkinson Bradley Allen – Elsevier Dominic ClarkCatherine Marshall Colin Batchelor – RSC Jo McEntyreIan Dix Richard Kidd – RSC Janet ThorntonNick Lynch David Hoole – NPG Cath BrooksbankMichael Braxenthaler Alf Eaton - NGPAshley George