Itqb talkslideshfd deritemplate

itqb slides

Usage Rights

© All Rights Reserved

  • We live in a world of data. The internet has turned access to data from something limited into something superabundant.
  • Moreover, not all data are alike. In biology in particular, the explosion of data is coupled with the heterogeneity of the data.
  • Because of this heterogeneity on the one hand, and the abundance of data widely scattered across the web on the other, it is increasingly difficult to find the data that matter. Most existing databases require that we know very well not only our own data but also the interface of the databases... This process is so time-consuming that it ends up not making sense.
  • The problem today is that experimental data, such as gene expression results or sequences, are being deposited in proprietary databases, which often do not share their models and are therefore difficult to interoperate. The current state of affairs is that data is brought into these databases, but researchers from different fields are kept out, making collaboration difficult. To create an environment where scientists from different areas can interconnect and share knowledge and ideas, we need to create a knowledge continuum. The knowledge continuum in biology can be used by multiple communities at the same time to provide an answer to complicated questions such as cancer. The semantic web technologies are seen as the ideal starting point for the creation of social machines where knowledge and data can be shared, because they rely entirely on the lessons learned from using the Web.
  • Messages: (1) Finding the mathematics of biology: patterns and interrelatedness of biological entities. (2) Biological data in computational formats: automating data analysis and annotation is a dream that has not yet been achieved. (3) Technologies that could help make such a dream a reality: transform the WWW into a computational platform where read and write operations are supported and boundaries between knowledge systems are erased.
  • What if computers could do that for us?
  • Unlinked data would look something like this: the NIH would have some information about the EGFR gene, and when you go to Reactome, some more information about it can be found. What linked data does is eliminate the boundaries between the systems and enable the joining of the data through its identifiers.
  • Links to our origins
  • Simplified views of the complexity in the cell
  • The TCGA model in S3DB was indeed at the root of several studies that use the integrative capabilities of S3DB to bring together data that would otherwise require significant amounts of time to parse and aggregate.
  • In 2001, Tim Berners-Lee, who was also the inventor of the World Wide Web, planted the seeds for a new solution. He called it the Semantic Web. The primary goal of the Semantic Web was to create a space where data would be linked in such a way that not only humans, but also machines, could read and interact with it. Ultimately, these machines or agents would become the main way of interaction between people and the data on the web. Instead of browsing the web, users could ask these agents to collect the information necessary to answer a question or schedule an appointment.

Itqb talkslideshfd deritemplate Presentation Transcript

  • A little semantics … can go a long way!
    What is the Semantic Web and how can it be used to accelerate translational research and biological discovery
    Helena F. Deus
  • Data is what you find on the Web
  • Data, data everywhere
    Sequences
    Microarrays
    Electrophoresis
    Crystallography
    In vitro experiments
  • What pathways is my protein involved in?
  • Building bridges
    If you could have only 3 apps to do all your work, which ones would they be?
  • Building bridges
    Statistics
  • Building bridges
    species
    cc5
    subtype
  • Biological Knowledge Continuum
    Metabolome
    Knowledge Continuum
    Medical Records
    Microarrays
    Proteome
    Microbiome
    Genome
    Sequences
    Protein Gels
  • Enabling Translational Research
  • Re-Using Data in Biology
    ~20 000 genes
    ~100 interesting genes/proteins
    ~ 10 interesting pathways
    ~5 genes/proteins testable in the lab
    High-throughput technologies
    Literature
    Browse databases
    Computational statistics
    Hypothesis Generation
    “I like to call it low-input, high-throughput, no-output biology.” 
  • Writing the story
    ??
  • !!
  • Computers can make life easier!
    Statistics
  • A Little Semantics
    mecA
    Strain1
    hasGene
    “resistance to met”
    causes
    mecA
    Strain1
    Sample1
    origin
    pneumon
    disease
    Sample1
  • Principle #1
    Use URLs to name things
    Principle #2
    Organize data in Triples
    A Little Semantics
    http://mecA
    http://Strain1
    hasGene
    “resistance to met”
    causes
    mecA
    http://Strain1
    Sample1
    origin
    pneumon
    disease
    Sample1
  • A Little Semantics
    http://mecA
    http://Strain1
    hasGene
    “resistance to met”
    causes
    http://mecA
    http://Strain1
    Sample1
    origin
    pneumon
    disease
    Sample1
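
The two principles on these slides (name things with URLs, organize data in triples) can be sketched in a few lines of Python. This is only an illustration: the base URL `http://example.org/` is a hypothetical placeholder, the helper names `uri` and `objects_of` are made up for this sketch, and the direction of each arrow is an assumed reading of the flattened diagram. The literals are copied as they appear on the slide.

```python
# Hypothetical base URL; the slides only show shortened names such as http://mecA.
BASE = "http://example.org/"


def uri(name):
    """Principle #1: give every thing a URL-style name."""
    return BASE + name


# Principle #2: every statement is a (subject, predicate, object) triple.
# The arrow directions below are an assumption about the slide's diagram.
triples = [
    (uri("Strain1"), uri("hasGene"), uri("mecA")),
    (uri("mecA"), uri("causes"), "resistance to met"),
    (uri("Sample1"), uri("origin"), uri("Strain1")),
    (uri("Sample1"), uri("disease"), "pneumon"),
]


def objects_of(subject, predicate):
    """Return every object stated for a given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


genes = objects_of(uri("Strain1"), uri("hasGene"))
effects = objects_of(uri("mecA"), uri("causes"))
```

Because the gene is named by the same URL in both statements, following `hasGene` and then `causes` connects the strain to the resistance without any shared table schema.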
  • ... a lot of knowledge networking!
    epidermal growth factor receptor
    rea:Membrane
    nci:has_description
    rea:keyword
    CCCCGGCGCAGCGCGGCCGCAGCAGCCTCCGCCCCCCGCACGGTGTGAGCGCCCGACGCGGCCGAGGCGG …
    nih:sequence
    rea:Receptor
    nih:EGFR
    nih:EGFR
    rea:keyword
    nih:organism
    rea:keyword
    Homo sapiens
    rea:Transferase
    nih:interacts
    nih:EGF
    nih:organism
    Reactome
    NCBI
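
A minimal sketch of how two sources can be joined through a shared identifier, using the labels from this slide. Which statements come from NCBI and which from Reactome is an assumed reading of the diagram, the helper `describe` is hypothetical, and the truncated sequence is left out rather than completed.

```python
# Statements assumed to come from NCBI (a reading of the slide, not confirmed).
ncbi = {
    ("nih:EGFR", "nci:has_description", "epidermal growth factor receptor"),
    ("nih:EGFR", "nih:organism", "Homo sapiens"),
    ("nih:EGFR", "nih:interacts", "nih:EGF"),
    ("nih:EGF", "nih:organism", "Homo sapiens"),
}

# Statements assumed to come from Reactome.
reactome = {
    ("nih:EGFR", "rea:keyword", "rea:Membrane"),
    ("nih:EGFR", "rea:keyword", "rea:Receptor"),
    ("nih:EGFR", "rea:keyword", "rea:Transferase"),
}


def describe(graph, subject):
    """Collect everything a graph says about one subject, grouped by predicate."""
    description = {}
    for s, p, o in graph:
        if s == subject:
            description.setdefault(p, []).append(o)
    return {p: sorted(objs) for p, objs in description.items()}


# Both sources use the same identifier for the gene, so merging
# them is just a set union of their triples.
linked = ncbi | reactome
egfr = describe(linked, "nih:EGFR")
```

Neither source alone holds both the organism and the Reactome keywords; the merged graph does, and no mapping step was needed beyond agreeing on the identifier `nih:EGFR`.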
  • Linked Data Cloud – the Story so Far
    Src: http://linkeddata.org/
  • How to make use of that data?
    What are the microbial Staphylococcus strains, belonging to clonal complex 5 and collected in Portugal? And when were they collected?
    Staphylococcus
    Clonal Complex 5
    Date of
    Collection
    Portugal
  • How to make use of that data?
    What are the microbial Staphylococcus strains, belonging to clonal complex 5 and collected in Portugal?
    ?Strain :hasClonalComplex 5 ;
    :hasSpecies Staphylococcus ;
    :hasOrigin Portugal .
    And when were those isolates collected?
    ?Sample :hasIsolate ?Strain ;
    :wasCollected ?Date .
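
The query patterns above can be read as templates that are matched against stored triples, with `?Strain`, `?Sample` and `?Date` as variables. Below is a minimal sketch of that idea in plain Python. It is not a SPARQL engine, the function `match` is hypothetical, and the strains, samples and dates are invented purely for illustration.

```python
def match(patterns, triples, binding=None):
    """Yield variable bindings that satisfy every pattern.

    A term starting with "?" is a variable; anything else must match exactly.
    """
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if isinstance(term, str) and term.startswith("?"):
                # Bind the variable, or check it agrees with an earlier binding.
                if candidate.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match(rest, triples, candidate)


# Invented example data (not from the presentation).
triples = [
    ("Strain1", ":hasClonalComplex", 5),
    ("Strain1", ":hasSpecies", "Staphylococcus"),
    ("Strain1", ":hasOrigin", "Portugal"),
    ("Strain2", ":hasClonalComplex", 5),
    ("Strain2", ":hasSpecies", "Staphylococcus"),
    ("Strain2", ":hasOrigin", "Spain"),
    ("Sample1", ":hasIsolate", "Strain1"),
    ("Sample1", ":wasCollected", "2009-03-14"),
    ("Sample2", ":hasIsolate", "Strain2"),
    ("Sample2", ":wasCollected", "2010-06-01"),
]

query = [
    ("?Strain", ":hasClonalComplex", 5),
    ("?Strain", ":hasSpecies", "Staphylococcus"),
    ("?Strain", ":hasOrigin", "Portugal"),
    ("?Sample", ":hasIsolate", "?Strain"),
    ("?Sample", ":wasCollected", "?Date"),
]

results = list(match(query, triples))
```

With this invented data, only `Strain1` satisfies all three strain patterns, so the sample patterns return the collection date of `Sample1` alone.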
  • Linking genomes
  • Linking Diseases
    Src: Kwang-Il Goh et al., The human disease network, PNAS 2007, 104 (21)
  • Genetic Landscape
    Source: Science 22 January 2010: Vol. 327 no. 5964 pp. 425-431 
  • How about the statistics?
  • Plugging data to the Web of the Future
  • A year in the life of a semantic database
    Measuring the re-engineering of ontologies
    [Figure: statements per rule, rules, sessions and users plotted against time (days); labeled phases: Seeding, Calibration, Growth, Maturation; labeled days: 5, 17, 25, 152, 365]
  • Exploring TCGA via S3DB
  • 2001: The Semantic Web
    Semantic Web
    A web where computers, not just humans, can read and write