Date: 01/08/2014
Semantic Web 101:
Benefits for geologists
Daniel Garijo
Ontology Engineering Group,
Departamento de Inteligencia Artificial.
Universidad Politécnica de Madrid
What is the Semantic Web?
•An extension of the Web built on World Wide Web
Consortium (W3C) standards
•Generally, a set of techniques for:
•Representing knowledge
•Improving data sharing
•Improving data access
•Linking distributed resources
•How?
•RDF, vocabularies, ontologies and standards
•Linked Data
RDF: The Resource Description Framework
• W3C recommendation
• Useful to represent metadata and describe any type of
information in a machine-accessible way.
• Resources are described in terms of properties and property
values using RDF statements
• Statements are represented as triples, consisting of a subject,
predicate, and object [S,P,O]
[Figure: an RDF statement, drawn as a Subject linked to an Object by a property]
© Slide adapted from “RDF and RDF Schema”, Raúl García et al.
RDF: Example
[Figure: RDF graph of two papers and their author]
•Resources: http://example.org/paper1, http://example.org/paper2, http://example.org/Tikoff
•Properties: hasTitle, hasAuthor, hasName
•Literals: “Crustal-scale, en echelon…”, “Preexisting fractures and the formation of an iconic American landscape …”, “Basil Tikoff”
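The graph above can be sketched in a few lines of plain Python, with each statement stored as a (subject, predicate, object) tuple. This is only an illustration, not an RDF library: placing the properties under the http://example.org/ namespace, and the pairing of each title with a paper, are assumptions read off the slide layout.

```python
# Minimal sketch: RDF-style statements as (subject, predicate, object) tuples.
EX = "http://example.org/"  # namespace used on the slide

# Assumption: the title-to-paper pairing is inferred from the slide layout,
# and the property URIs (EX + "hasTitle", ...) are illustrative.
triples = [
    (EX + "paper1", EX + "hasTitle", "Crustal-scale, en echelon…"),
    (EX + "paper1", EX + "hasAuthor", EX + "Tikoff"),
    (EX + "paper2", EX + "hasTitle",
     "Preexisting fractures and the formation of an iconic American landscape …"),
    (EX + "paper2", EX + "hasAuthor", EX + "Tikoff"),
    (EX + "Tikoff", EX + "hasName", "Basil Tikoff"),
]


def match(s=None, p=None, o=None):
    """Return every triple that fits the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def papers_by(author_name):
    """Follow hasName, then hasAuthor, to find the papers of a named author."""
    authors = [s for s, _, _ in match(p=EX + "hasName", o=author_name)]
    return sorted(
        paper
        for author in authors
        for paper, _, _ in match(p=EX + "hasAuthor", o=author)
    )


tikoff_papers = papers_by("Basil Tikoff")
```

Because both papers point at the same author resource, one lookup by name reaches both of them; that shared node is what makes the data a graph rather than two separate records.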
Vocabularies and Ontologies
•Vocabulary:
•Defines the concepts and relationships used to describe and
represent an area of concern.
•Used to classify the terms that can be used in a particular
application, characterize possible relationships, and define
possible constraints on using those terms.
•Ontology:
•A more complex, and possibly quite formal, collection of terms.
http://www.w3.org/standards/semanticweb/ontology
Heterogeneity vs standardization
Image from: http://www.cs.vu.nl/~frankh/spool/ISWC2011Keynote/Slide32.JPG
Freedom of design
Guided design
(agreed vocabularies + extensions)
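As a rough illustration of guided design, the sketch below maps the field names of two imaginary sample repositories onto one agreed set of terms, so that a single query works on both. Every repository, field name, and record here is hypothetical and only serves the example.

```python
# Hypothetical mappings from repository-specific field names to agreed terms.
REPO_A_MAPPING = {"sample_id": "identifier", "rock": "rockType",
                  "lat": "latitude", "lon": "longitude"}
REPO_B_MAPPING = {"ID": "identifier", "lithology": "rockType",
                  "Latitude": "latitude", "Longitude": "longitude"}


def to_shared(record, mapping):
    """Rename known fields to the agreed terms; unmapped fields are kept
    unchanged, playing the role of repository-specific extensions."""
    return {mapping.get(key, key): value for key, value in record.items()}


def query(records, **conditions):
    """Return the records whose fields equal every given condition."""
    return [r for r in records
            if all(r.get(k) == v for k, v in conditions.items())]


# Made-up records, one per repository.
repo_a = [{"sample_id": "A-1", "rock": "granite", "lat": 37.7, "lon": -119.6}]
repo_b = [{"ID": "B-9", "lithology": "granite", "Latitude": 37.8,
           "Longitude": -119.5, "collector": "unknown"}]

harmonized = ([to_shared(r, REPO_A_MAPPING) for r in repo_a]
              + [to_shared(r, REPO_B_MAPPING) for r in repo_b])

# The same query now covers both repositories.
granites = query(harmonized, rockType="granite")
```

The `collector` field has no agreed term, so it survives as an extension instead of being dropped.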
Linked Data
1.Use URIs as names for things.
2.Use HTTP URIs so that people can look up those names.
3.When someone looks up a URI, provide useful information.
4.Include links to other URIs.
“Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”
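The four principles can be checked mechanically to some extent. The sketch below uses only the Python standard library and sends nothing over the network: it tests whether a name is an HTTP URI, prepares (without sending) a lookup request, and lists the links a resource makes to other URIs. Asking for Turtle through an `Accept` header is an assumption about how a server might publish its data, and the example triples are hypothetical.

```python
from urllib.parse import urlparse
from urllib.request import Request


def is_http_uri(name):
    """Principles 1 and 2: the name is a URI that can be looked up over HTTP."""
    parts = urlparse(name)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def build_lookup_request(uri, media_type="text/turtle"):
    """Principle 3: prepare a lookup that asks for machine-readable data.
    The request object is only built here, never sent."""
    if not is_http_uri(uri):
        raise ValueError("not an HTTP URI: " + uri)
    return Request(uri, headers={"Accept": media_type})


def outgoing_links(triples, subject):
    """Principle 4: objects of the subject that are themselves HTTP URIs."""
    return [o for s, _, o in triples if s == subject and is_http_uri(o)]


# Hypothetical description of one paper.
description = [
    ("http://example.org/paper1", "http://example.org/hasAuthor",
     "http://example.org/Tikoff"),
    ("http://example.org/paper1", "http://example.org/hasTitle",
     "Crustal-scale, en echelon…"),
]

request = build_lookup_request("http://example.org/paper1")
links = outgoing_links(description, "http://example.org/paper1")
```

Passing `request` to `urllib.request.urlopen` would perform the actual lookup; whether useful information comes back depends entirely on the server.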
Challenges for geologists
How can this help YOU?
Some of the challenges I have discovered so far…
•No standard way to process, store and archive the metadata related to samples
•Not straightforward to find the relation between samples and scientific papers
•Repository redundancy: difficult to know if samples are duplicated
•Repository heterogeneity: difficult to establish links between data repositories
•Difficult to query a repository: the same query is not valid for several repositories.
•Which license do I add to my data? How do I attach it?
•Accessing data: sharing mappings from different authors is often done by
contacting the author directly.
•Trust in observations: you have to rely on the scientist who did them
•Map integration of heterogeneous observations
•How reproducible are the methods applied to the data in the analyses for the
paper?
•….
Some Helpful Standards
+ Linked Data
Semantic Sensor Network Ontology (SSN)
•Ontology for describing observations
•Provenance of the observation (who,
where, how)
•Other metadata like sensing method
PROV-O
•Vocabulary for provenance
•Tracking the resources and
activities that influenced a result
•Credit
•Attribution
•Responsibility
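To make the credit and attribution idea concrete, here is a small sketch that records provenance with four PROV-O properties (`wasGeneratedBy`, `used`, `wasAssociatedWith`, `wasAttributedTo`) and then walks the chain to find everyone who should be credited for a result. The map, sample, activity, and geologists under http://example.org/ are hypothetical, and the traversal is one simple reading of "influenced a result", not a rule defined by PROV-O.

```python
PROV = "http://www.w3.org/ns/prov#"
EX = "http://example.org/"  # hypothetical namespace for the example resources

# Hypothetical provenance: a map produced by a campaign that used a sample.
triples = [
    (EX + "map1", PROV + "wasGeneratedBy", EX + "mappingCampaign"),
    (EX + "mappingCampaign", PROV + "used", EX + "sample42"),
    (EX + "mappingCampaign", PROV + "wasAssociatedWith", EX + "geologistA"),
    (EX + "map1", PROV + "wasAttributedTo", EX + "geologistA"),
    (EX + "sample42", PROV + "wasAttributedTo", EX + "geologistB"),
]


def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is recorded."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def credited_agents(entity, seen=None):
    """Agents attributed to the entity, associated with the activity that
    generated it, or credited for anything that activity used."""
    seen = set() if seen is None else seen
    if entity in seen:  # guard against cycles in the provenance graph
        return set()
    seen.add(entity)
    agents = set(objects(entity, PROV + "wasAttributedTo"))
    for activity in objects(entity, PROV + "wasGeneratedBy"):
        agents.update(objects(activity, PROV + "wasAssociatedWith"))
        for source in objects(activity, PROV + "used"):
            agents |= credited_agents(source, seen)
    return agents


credit = credited_agents(EX + "map1")
```

Here the collector of the sample is credited for the map even though only the sample, not the map, is attributed to them.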
Exposing scientific methods
Typical Published Article:
•Text: Narrative of method, software packages used
•Data: Key datasets and figures/plots
Reproducible Article (Weaver, GenePattern GRRD, etc.):
•Text: Narrative of method, software packages used
•Workflow: Workflow/scripts describing dataflow, codes, and parameters
•Data: Key datasets and figures/plots
Exposing scientific methods: Research Objects
Aggregation of resources that bundles together the contents
of a research work.
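A Research Object can be pictured as a named bundle that lists what it aggregates. The sketch below is a toy model under stated assumptions: the class, the role labels ("text", "workflow", "data"), and the example URIs are hypothetical, and reusing the `aggregates` property from the OAI-ORE vocabulary is a choice made for this illustration.

```python
from dataclasses import dataclass, field

# Property from the OAI-ORE vocabulary, used here to express aggregation.
ORE_AGGREGATES = "http://www.openarchives.org/ore/terms/aggregates"


@dataclass
class ResearchObject:
    """Toy bundle: one URI for the whole work plus the resources it aggregates."""
    uri: str
    resources: dict = field(default_factory=dict)  # resource URI -> role label

    def aggregate(self, resource_uri, role):
        """Add a resource to the bundle under a role such as 'text' or 'data'."""
        self.resources[resource_uri] = role

    def to_triples(self):
        """One (bundle, aggregates, resource) statement per bundled resource."""
        return [(self.uri, ORE_AGGREGATES, r) for r in sorted(self.resources)]

    def missing_roles(self, required=("text", "workflow", "data")):
        """Roles a reproducible article would need that are not bundled yet."""
        present = set(self.resources.values())
        return [role for role in required if role not in present]


# Hypothetical bundle containing a paper and a dataset, but no workflow.
ro = ResearchObject("http://example.org/ro/1")
ro.aggregate("http://example.org/paper1", "text")
ro.aggregate("http://example.org/data/samples.csv", "data")

statements = ro.to_triples()
gaps = ro.missing_roles()
```

The `missing_roles` check shows how a bundle makes gaps visible: this one has text and data, so only the workflow is reported as missing.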
Conclusions
The Semantic Web (SW) can help to:
•Make your research (paper) data accessible (Linked Data)
•Facilitate data sharing and consumption (standards + Linked Data)
•Enable proper credit/citation (Provenance)
•Ease metadata collection (Standards)
•Facilitate reproducibility (Workflows and Research Objects)
References
Useful links
•SSN: http://www.w3.org/2005/Incubator/ssn/XGR-ssn-20110628/
(observation module)
•PROV: http://www.w3.org/TR/prov-o/
•Workflows and provenance: http://www.opmw.org/model/OPMW/
•Research Objects: http://www.researchobject.org/
•Which license do I attach to my data?
http://creativecommons.org/choose/
•Data repositories: http://figshare.com/, http://zenodo.org/