Stratergies for the intergration of information (IPI_ConfEX)

Loading...

Flash Player 9 (or above) is needed to view presentations.
We have detected that you do not have it on your computer. To install it, go here.

0 comments

Post a comment

    Post a comment
    Embed Video
    Edit your comment Cancel

    Favorites, Groups & Events

    Stratergies for the intergration of information (IPI_ConfEX) - Presentation Transcript

    1. Approaches to Information Integration Ben Gardner Therapeutic Area Scientific Information Services IPI-ConFex (March 2007)
    2. Background
      • Therapeutic Area Scientific Information Services
        • Bringing Information Scientist and Information Technology expertise together
      • Formal organisational recognition of the importance of data
      • Access data across the whole of Pfizer (R, D & C)
      • Create synergies between IS and IT
      • Encourage innovation in data space
      • Leverage internal & external data
      BioMed Information Scientists Chemistry Data Experts Biology Data Experts Chemistry Information Scientists
    3. Overview
      • Integration of internal data
        • Across sites and lines
      • Integration of external data
        • Public domain, vendor supplied and internal
      • Web2.0 and information integration
    4. Utilise a Data Warehousing Strategy
      • Research Information Factory (RIF)
      • Single global repository for all research data
      • Data queried via Data viewers i.e. RGate
      RIF
    5. Research Information Factory – Pro’s
      • Pro’s
      • Easy solution for global data access and sharing
      • Incorporates business rules/Global consistency
      • Scalability – All in one place
      • More than project database – trend analysis
      RIF
    6. Research Information Factory – Con’s
      • Con’s
      • Difficult to incorporate legacy data
      • Performance issues
      • Scalability – interdependencies /complexity
      RIF
    7. What next?
      • Greater integration of internal & external data
      • Novel data models
      • Integrating data across the R,D & C continuum
      • Public domain & Vendor content
      • eMedical records and Clinical Trials
    8. BUT what about integrating external data
      • Internal data
        • Clinical trails
        • Commercial
      • Public domain – Pub Med
      • Vendor content
      • Patent literature
      • Electronic patient records
      RIF How? Defined and controlled data standards and ontology's Foreign/bespoke data standards and ontology's
    9. How can this be made easier?
      • Content suppliers can adopt industry wide standards
        • Common keys/ontology's
          • Single disease ontology
          • Single definition of molecular targets
          • Smiles strings & consistent definition of stereochemistry, tautomerism, etc
      • More vendors could enable simple data integration by
        • Provide content via web services
        • Allow direct querying of database
        • Provide access to the data base schema
      Competitive edge should not come from who owns the data but who can utilise it best
    10. Integrating External Information
      • What is PharmaMatrix?
      • A pre-indexed mine of information systematically relating indications to targets through evidence
      • A tool for reframing target-indication ontology's
      • An architecture to support idea generation and serendipity
      PharmaMatrix slides kindly supplied by Jerry Lanfear
    11. How many nets can we cast…? ~6000 diseases recognized by Medicine ~25K genes in human genome Indication 1 Indication 2 Indication N Indication 3 Diseases / Indications Target 1 Target 2 Target 3 Target N Targets
    12. How many nets can we cast…? Target 1 Target 2 Target 3 Target N Targets Indication 1 Indication 2 Indication N Indication 3 Diseases / Indications Informatics allows us to systematically mine for evidence against all hypotheses: All drug targets x All diseases
    13. How many nets can we cast…? Target 1 Target 2 Target 3 Target N Targets Indication 1 Indication 2 Indication N Indication 3 Diseases / Indications All diseases for a target
    14. How many nets can we cast…? Target 1 Target 2 Target 3 Target N Targets Indication 1 Indication 2 Indication N Indication 3 Diseases / Indications All targets for a disease
    15. What is the Matrix ? 12.8 Million Medline Abstracts 4 Billion words 4 Million Intersects (20 Synonyms per entity) Targets Diseases Evidence 2000 curated diseases 2000 curated targets
    16. “ Show me all the diseases associated with PDE5 from scientific literature ” PDE5 has 40 related terms (PDE5 or phosphodiesterase 5 or phosphodiesterase V or phosphodiesterase (PDE) 5 or phosphodiesterase (PDE) V or pde V or PDE-5 or PDE V or PDE 5 or phosphodiesterase-5 or phosphodiesterase 5A or HSPDE5A or PDE5A or phosphodiesterase-5A or PDE(5) or PDE(5A) or phosphodiesterase (PDE) 5A or PDE 5A or PDE-5A or UK-092,480 or viagra or sildenafil or IBMX or 3-isobutyl-1-methylxanthine or zaprinast or tadalafil or vardenafil or SKF-96231 or YC-1 or DMPPO or UK-83405 or Sch-51866 or UK-343664 or WIN-65579 or GF-248 or T-1032 or SR-265579 or KF-31327 or OPC-35564) There’s ~ 6000 curated diseases Here’s just one: (ASTHMA or asthmatic or Acute severe asthma or Asthmaticus or Excercise induced Asthma or mild intermittent asthma or mild persistant Asthma or moderate persisitant Asthma or Severe persistant Asthma or chronic persistant Asthma or extrinsic Asthma or intrinsic Asthma or aspirin-sensitive asthmatics or Aspirin induced Asthma or occupational asthma or Atopy or allergic asthma) There are about 17 million articles in Medline 17 million abstracts from 32000 separate journals. 4 billion words X X Systematic Searching for Evidence PharmaMatrix - US Patent No . US2005060305 17 th March 2003 Hopkins el al.
    17. And here’s a matrix! – All Targets for Asthma Simple abstract co-occurrence Big numbers! Filter for precedence Reduce the workload
    18. PharmaMatrix – Pro’s & Con’s
      • Pro’s
      • Optimised for hypothesis generation
      • Simplified information mining
      • Flexible data model
      • N-dimensional matrix – wider than just targets x disease
      • Con’s
      • Result sets can be very large
      • Data stewardship overhead
      • External data format can change – no control
      • Limited coverage
    19. From data to information to knowledge
      • Integration projects like the RIF/RGate & PharmaMatrix turn data in to information
      • How do you turn information into knowledge?
      • Searches now return too much information for one person to analyse
      • Need simple tools to share learning’s
      • Can the tools of Web2.0 help?
      Data Information Knowledge Searches Analysis Sharing
    20. What is Web2.0?
      • Web 2.0 refers to second generation of Web-based services that emphasize online collaboration and sharing among users.
      • These services include
        • Social networking sites
          • http:// del.icio.us / ; http:// www.flickr.com / ; http:// www.myspace.com / ;
        • Wikis
          • http:// www.wikipedia.org /
        • Communication tools - Blogs & RSS
        • Folksonomies – Tagging
      • Key to the success of these services is their simplicity to use
    21. Wiki
      • A wiki is a website that allows the visitors themselves to easily add, remove, and otherwise edit and change available content.
      • This ease of interaction and operation makes a wiki an effective tool for mass collaborative authoring .
    22. Easy of use is central
    23.  
    24.  
    25.  
    26.  
    27.  
    28.  
    29. Blog
      • A blog is a user-generated website where entries are made in journal style and displayed in a reverse chronological order.
      • Blogs often provide commentary or news on a particular subject, such as food, politics, or local news. The ability for readers to leave comments in an interactive format is an important part of many blogs.
      • Blog on a Target, Patent, Disease, Synthetic pathway, Business process……
      • Or even this conference
    30. Pharma-Bio-Med blog
    31. Tag’s
      • A tag is a (relevant) keyword or term associated with or assigned to a piece of information (like picture, article, or video clip), thus describing the item and enabling keyword-based classification of information it is applied to.
      • http:// www.connotea.org / - free online reference manager service
    32. Using tag clouds
    33. Summary
      • RIF – Enterprise level integration
        • Enables global sharing, leverage scale
        • Requires clean, curated data
      • PharmaMatrix – Internal/External data integration
        • Need to define industry wide standards
          • Common keys/ontology's
        • Need vendors to develop new pricing models
      • Web2.0 – An opportunity?
        • Social networking to enable information to knowledge
        • Easy, quick, proven and here
          • Wiki’s, Blog’s, Tagging
    34. Acknowledgements

    + bengardner135bengardner135, 3 years ago

    custom

    1512 views, 0 favs, 0 embeds more stats

    Presentation given at the 2007 IPI-ConfEX meeting.

    More info about this document

    © All Rights Reserved

    Go to text version

    • Total Views 1512
      • 1512 on SlideShare
      • 0 from embeds
    • Comments 0
    • Favorites 0
    • Downloads 0
    Most viewed embeds

    more

    All embeds

    less

    Flagged as inappropriate Flag as inappropriate
    Flag as inappropriate

    Select your reason for flagging this presentation as inappropriate. If needed, use the feedback form to let us know more details.

    Cancel
    File a copyright complaint
    Having problems? Go to our helpdesk?

    Categories