Triples And Access

Presentation given at the inaugural meeting of the Concept Web Alliance, 8 May 2009

Published in: Technology, Education

  1. Triples & Access. Jan Velterop
  2. “There is something fascinating about science. One gets such wholesale returns of conjecture out of such a trifling investment of fact.” Mark Twain, Life on the Mississippi
  3. Oh yeah? We have far too few returns in terms of usable knowledge out of such an overwhelming investment of fact! A lot of fact is deeply hidden!
  4. Current Knowledge Transfer. A metaphor (is Greek for ‘truck’ after all). Needle transport.
  5. Information overload? Too much knowledge? Stop acquiring it? Just filtering it? Or organisation underload? Lack of conceptual structure? Unprecedented opportunity?
  6. Information overload? Too much knowledge? Stop acquiring it? Just filtering it? Or organisation underload? Lack of conceptual structure? Unprecedented opportunity!
  7. Another metaphor: What is the use of water?
  8. H2O: Drink (take in)
  9. What is the use of information?
  10. Age to Know: Read (take in)
  11. Publish articles
  12. Stretching the water metaphor: It’s already raining – we must build the ark
  13. The ‘animals’ to come on board:
  14. Slide by Carl Lagoze (Cornell) – from this presentation: http://journal.webscience.org/112/3/orechem.pdf
  15. Stretching the metaphor further: If you need water, rain is free
  16. But if you want quality control and convenience:
  17. Triple model: < Source concept > (node 1, unique ID), < Relations (edge) >, < Target Concept > (node 2, unique ID). Attributes: class, date, value, owner, condition, DOI. Diagram labels: All Triples; Smart Triples; Curated; Observational (Co-occ); Inferred; Remove Ambiguity and Redundancy; Knowledge Space.
  18. Triple model: < Source concept > (node 1, unique ID), < Relations (edge) >, < Target Concept > (node 2, unique ID). Attributes: class, date, value, author, condition, DOI.
      Triple types:
      • <Type F1> Database facts (multiple attributes)
      • <Type F2> Community Annotations
      • <Type C1> Co-occurrence sentence (abstracts, e.g. PubMed)
      • <Type C2> Co-occurrence Full Text (publisher, e.g. Springer)
      • <Type A1> Concept Profile Match
      • <Type A3> Co-expression (gene expression databases)
      • <Type A4> Modelling hypothesis (e.g. Plectix, InWeb)
      Combination labels: F+ C+ A+; C+ A+; A+.
      Diagram labels: Multiple Triples; T-Cell Development; Graph Building (e.g. WikiPathways); Unique to 101668678; Cancer Promoting Genes; Interleukin-7; Unique to Springer; Unique to Plectix.
  19. Unique to 101668678
  20. (Same diagram as slide 18.)
  21. Triple model: < Source concept > (node 1, unique ID), < Relations (edge) >, < Target Concept > (node 2, unique ID). Attributes: class, date, value, owner, condition, etc. Triples; Smart Triples. In these areas significant value is added to the triples: Curated (Remove Ambiguity and Redundancy); Observational (Remove Ambiguity and Redundancy); Inferred; constructed (Remove Ambiguity and Redundancy). Knowledge Space.
  22. The ‘trustmark’ CWA™: Triple ‘model’; Best practice; Interoperability; Et cetera
  23. Download Concept Web Alliance certified triples. Includes edges from:
      • PubMed (400,000,000 sentences, 5,000,000,000 concept co-occurrences) (from public data)
      • Protein databases (UniProt, IntAct, PDB, HPRD – 75,000 human curated PPIs) (from public data)
      • Gene co-expression databases (GEO, Express… – 25 square genes) (from public data)
      • STRING edges (200,000 gene-gene edges) (from semi public data)
      • InWeb edges (240,000 unique edges from 17 species) (from proprietary data)
      • Reactome edges (240,000 unique edges from 17 species) (from proprietary data)
      • Chemspider edges (25,000,000 chemicals) (from semi public data)
      • Wiki edges (WikEdge = WikiPathways, WikiProfessionals, Omegawiki, Wikigene)
      • Plectix edges (5,000 extra edges, PPI modeling) (from proprietary data)
      • Private expression data (3000 extra edges, by Merck) (from proprietary data)
      • Et cetera
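
The triple model on the slides (source concept, relation, target concept, plus attributes such as date, value, author, condition and DOI) could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the Concept Web Alliance's actual format. The class and function names are hypothetical, and so are the concept identifiers in the usage note. Mapping the slide's "class" attribute onto the triple type is an assumption. Reading a label such as "F+ C+ A+" as "supported by fact, co-occurrence and association-type evidence" is likewise an assumption, and grouping identical assertions is just one simple way to picture "remove redundancy".

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable, List, Optional, Tuple


class TripleType(Enum):
    """Type codes as listed on the slides (F = facts, C = co-occurrence, A = other)."""
    F1 = "Database facts (multiple attributes)"
    F2 = "Community annotations"
    C1 = "Co-occurrence in a sentence (abstracts)"
    C2 = "Co-occurrence in full text"
    A1 = "Concept profile match"
    A3 = "Co-expression"
    A4 = "Modelling hypothesis"


@dataclass(frozen=True)
class Triple:
    source_id: str            # node 1, unique ID
    relation: str             # the edge
    target_id: str            # node 2, unique ID
    triple_type: TripleType   # assumption: stands in for the slide's "class"
    date: Optional[str] = None
    value: Optional[float] = None
    author: Optional[str] = None
    condition: Optional[str] = None
    doi: Optional[str] = None


Key = Tuple[str, str, str]


def merge_redundant(triples: Iterable[Triple]) -> Dict[Key, List[Triple]]:
    """Group triples that assert the same (source, relation, target)."""
    merged: Dict[Key, List[Triple]] = {}
    for t in triples:
        key = (t.source_id, t.relation, t.target_id)
        merged.setdefault(key, []).append(t)
    return merged


def support_label(evidence: Iterable[Triple]) -> str:
    """Build a label such as 'F+ C+ A+' from the evidence type families present."""
    families = {t.triple_type.name[0] for t in evidence}
    return " ".join(f"{family}+" for family in "FCA" if family in families)
```

As a usage example, three triples with types F1, C1 and A3 that all state `("concept:A", "interacts_with", "concept:B")` would collapse into one entry of `merge_redundant(...)`, and `support_label` would describe that entry by the evidence families backing it.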
