A little semantics … can go a long way! What is the Semantic Web, and how can it be used to accelerate translational research and biological discovery? Helena F. Deus
Data is what you find on the Web
Data, data everywhere: Sequences, Microarrays, Electrophoresis, Crystallography, In vitro experiments
What pathways is my protein involved in?
Building bridges If you could have only 3 apps to do all your work, which ones would they be?
Building bridges Statistics
Building bridges: species, cc5, subtype
Biological Knowledge Continuum: Metabolome, Medical Records, Microarrays, Proteome, Microbiome, Genome, Sequences, Protein Gels
Enabling Translational Research
Re-Using Data in Biology: Hypothesis Generation
From: ~20 000 genes, ~100 interesting genes/proteins, ~10 interesting pathways, ~5 genes/proteins testable in the lab
Using: High-throughput technologies, Literature, Browse databases, Computational statistics
“I like to call it low-input, high-throughput, no-output biology.”
Writing the story ??
Computers can make life easier! Statistics
A Little Semantics mecA Strain1 hasGene “resistance to met” causes mecA Strain1 Sample1 origin pneumon disease Sample1
Principle #1: Use URLs to name things. Principle #2: Organize data in triples. A Little Semantics http://mecA http://Strain1 hasGene “resistance to met” causes mecA http://Strain1 Sample1 origin pneumon disease Sample1
A Little Semantics http://mecA http://Strain1 hasGene “resistance to met” causes http://mecA http://Strain1 Sample1 origin pneumon disease Sample1
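The two principles can be sketched in a few lines of Python. This is only an illustration: the node and edge names are taken from the slide, but the direction of each edge is an assumed reading of the diagram, and the helper functions `objects` and `follow` are hypothetical names introduced here.

```python
# Triples are (subject, predicate, object). Names come from the slide;
# the direction of each edge is an assumed reading of the diagram.
triples = {
    ("http://Strain1", "hasGene", "http://mecA"),
    ("http://mecA", "causes", "resistance to met"),
    ("Sample1", "origin", "http://Strain1"),
    ("Sample1", "disease", "pneumon"),
}


def objects(subject, predicate):
    """Return every object linked to `subject` through `predicate`."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def follow(start, *predicates):
    """Walk a chain of predicates starting from one node."""
    frontier = {start}
    for predicate in predicates:
        frontier = {o for node in frontier for o in objects(node, predicate)}
    return frontier


# From a sample to the phenotype of the gene carried by its strain.
phenotypes = follow("Sample1", "origin", "hasGene", "causes")
print(phenotypes)
```

Because every statement has the same three-part shape, a question that spans samples, strains and genes becomes a simple walk along edges, with no table joins to design in advance.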
... a lot of knowledge networking! epidermal growth factor receptor rea:Membrane nci:has_description rea:keyword CCCCGGCGCAGCGCGGCCGCAGCAGCCTCCGCCCCCCGCACGGTGTGAGCGCCCGACGCGGCCGAGGCGG … nih:sequence rea:Receptor nih:EGFR nih:EGFR rea:keyword nih:organism rea:keyword Homo sapiens rea:Transferase nih:interacts nih:EGF nih:organism Reactome NCBI
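A rough sketch of why shared names matter: when two sources use the same identifier for a thing, their statements can be merged without any mapping step. The prefixed names below come from the slide; which statement belongs to which source, and the pairing of each edge with its nodes, are assumptions made for illustration, and `describe` is a hypothetical helper. The truncated sequence is left out.

```python
from collections import defaultdict

# Statements assumed to come from Reactome (rea: prefix on the slide).
reactome = [
    ("nih:EGFR", "rea:keyword", "rea:Membrane"),
    ("nih:EGFR", "rea:keyword", "rea:Receptor"),
    ("nih:EGFR", "rea:keyword", "rea:Transferase"),
]

# Statements assumed to come from NCBI (nih: and nci: prefixes on the slide).
ncbi = [
    ("nih:EGFR", "nci:has_description", "epidermal growth factor receptor"),
    ("nih:EGFR", "nih:organism", "Homo sapiens"),
    ("nih:EGFR", "nih:interacts", "nih:EGF"),
    ("nih:EGF", "nih:organism", "Homo sapiens"),
]


def describe(subject, *sources):
    """Collect everything any source says about `subject`, grouped by predicate."""
    merged = defaultdict(set)
    for source in sources:
        for s, p, o in source:
            if s == subject:
                merged[p].add(o)
    return dict(merged)


egfr = describe("nih:EGFR", reactome, ncbi)
for predicate, values in sorted(egfr.items()):
    print(predicate, sorted(values))
```

The merge works only because both lists call the receptor `nih:EGFR`; with different local names, a separate identifier mapping would be needed first.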
Linked Data Cloud – the Story so Far Src: http://linkeddata.org/
How to make use of that data? Which Staphylococcus strains belong to clonal complex 5 and were collected in Portugal? And when were they collected? Keywords: Staphylococcus, Clonal Complex 5, Date of Collection, Portugal
How to make use of that data? Which Staphylococcus strains belong to clonal complex 5 and were collected in Portugal?
?Strain :hasClonalComplex 5 ; :hasSpecies Staphylococcus ; :hasOrigin Portugal .
And when were those isolates collected?
?Sample :hasIsolate ?Strain ; :wasCollected ?Date .
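How such patterns get answered can be sketched with a tiny matcher: each pattern is compared against every triple, and variables (names starting with `?`) must take the same value wherever they recur. This is a toy stand-in for a real query engine, not the system behind the slides; the data, strain and sample names, and dates are invented for illustration.

```python
# Hypothetical toy data: strain names, sample names and dates are made up.
triples = [
    ("Strain1", ":hasClonalComplex", 5),
    ("Strain1", ":hasSpecies", "Staphylococcus"),
    ("Strain1", ":hasOrigin", "Portugal"),
    ("Strain2", ":hasClonalComplex", 8),
    ("Strain2", ":hasSpecies", "Staphylococcus"),
    ("Strain2", ":hasOrigin", "Portugal"),
    ("Strain3", ":hasClonalComplex", 5),
    ("Strain3", ":hasSpecies", "Staphylococcus"),
    ("Strain3", ":hasOrigin", "Spain"),
    ("Sample1", ":hasIsolate", "Strain1"),
    ("Sample1", ":wasCollected", "2009-03-14"),
    ("Sample2", ":hasIsolate", "Strain3"),
    ("Sample2", ":wasCollected", "2008-11-02"),
]


def match(patterns, data, binding=None):
    """Yield every variable binding that satisfies all patterns."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in data:
        candidate = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if isinstance(term, str) and term.startswith("?"):
                # A variable: bind it, or check it against its earlier binding.
                if candidate.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                # A constant: it must match exactly.
                ok = False
                break
        if ok:
            yield from match(rest, data, candidate)


query = [
    ("?Strain", ":hasClonalComplex", 5),
    ("?Strain", ":hasSpecies", "Staphylococcus"),
    ("?Strain", ":hasOrigin", "Portugal"),
    ("?Sample", ":hasIsolate", "?Strain"),
    ("?Sample", ":wasCollected", "?Date"),
]

results = [(b["?Strain"], b["?Date"]) for b in match(query, triples)]
print(results)
```

With this toy data only `Strain1` satisfies all three strain patterns, so the query returns a single strain with its collection date.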
Linking Diseases Src: Kwang-Il Goh et al., The human disease network, PNAS 2007, 104 (21)
Genetic Landscape Source: Science 22 January 2010: Vol. 327 no. 5964 pp. 425-431
How about the statistics?
Plugging data to the Web of the Future
A year in the life of a semantic database: Measuring the re-engineering of ontologies
[Figure: statements per rule; sessions, rules and users over time (days); growth snapshots at Day 5, Day 17, Day 25 and Day 152]
Exploring TCGA via S3DB
2001: The Semantic Web. A web where computers, not just humans, can read and write