Vocabularies and Linked Open Data
 

  • If resources are marked up with semantically defined and machine-readable concepts, they can be linked and mashed up precisely, as we have seen in the example from the BBC. In this example we start with an AGRIS record on hazardous waste, which is indexed with AGROVOC. Already now we can easily link to material indexed with Eurovoc, here an example from EUR-Lex. If the UNBIS thesaurus were restructured into a concept scheme and published as LOD, related UN documents could be attached automatically by the machine.
  • How does this work: a resource is connected with each concept URI on the web. Concepts in the three vocabularies that carry the same literal are connected with an owl:sameAs or skos:exactMatch relationship. As we are speaking about thesauri and not ontologies, we purposely kept the relation to be chosen vague. The concepts could be matched with owl:sameAs, or the terms could be matched with skos:exactMatch. A lot of discussion on this is ongoing.
  • Note: we identified outlinks to RAMEAU and GEMET, and they have taken them as inlinks to their own thesauri.
  • - All links are checked by a domain expert.
  • Once a content provider (icon: person thinking) has decided to publish a bibliographic database as Linked Open Data… (red arrow) 1. What kinds of entities and relationships are involved in bibliographic resource description? Defining a conceptual model gives an overall picture of the entities and relationships involved in bibliographic description and establishes a common understanding of the data models involved. LODE-BD proposes a simple conceptual model based on three entities: resource, agent and thema. (blue arrow) 2. What properties should be considered for publishing meaningful and useful LOD-ready bibliographic data? In the Linked Data context any content provider can expose anything contained in its local database. In the case of bibliographic data, however, standardized types of information should be considered in order to maximize the impact of exposing, sharing, and connecting data. LODE-BD has identified nine groups of common properties for describing bibliographic resources: about two dozen properties used for describing a bibliographic resource, plus two additional sets of properties for describing relations between bibliographic resources or between agents. They form the backbone of LODE-BD and the basis of the decision trees (the next slide).
  • (orange arrow) 3. What metadata standards should be used for preparing LOD-ready metadata? LODE-BD has selected a number of well-accepted and widely used metadata vocabularies, such as dc, dcterms, bibo and agmes, and uses their metadata terms in the recommendations. New metadata standards can be added to the list in the future, depending on the needs of the Linked Open Data community. (green arrow) 4. What metadata terms are appropriate for any given property when publishing LOD-ready metadata based on a local database? Metadata terms from the DCMES (dc:) and DCMI Metadata Terms (dcterms:) namespaces are the foundation of the LODE-BD Recommendations, while metadata terms from other namespaces are added when additional needs have to be satisfied. LODE-BD has prepared a crosswalk table that includes all metadata terms used in the Recommendations.
  • This part of the LODE-BD report aims to assist in the metadata term selection process to be carried out by any bibliographic data provider. LODE-BD uses flowcharts to present individualized decision trees for the properties included in each of the nine groups (refer to the previous chapter). Starting from the property that describes a resource instance, each flowchart presents decision points and gives a step-by-step solution to a given problem of metadata encoding. These flowcharts are designed to help data providers select the strategy that fits their situation, while all strategies work towards the goal of data exchange and reuse. At the end of each flowchart there are alternative sets of metadata terms for selection. Each chart is followed by text-based explanations corresponding to the flowchart, with notes, steps, and examples in tables wherever necessary.
  • One of the groundbreaking enterprises in this area is Thomson Reuters' “Open Calais”. This is a web service that provides semantic markup for any unstructured text that you feed into their service. The service is free of charge. Why? I will show you later.
  • My team, in collaboration with the Indian Institute of Technology in Kanpur, is developing a similar service for our subject area.
  • We have here a text from 1964, without a bibliographic record at hand, about a plant protection issue.
  • Open Calais is very good in those areas in which they have their own elaborated concept scheme against which the texts are analyzed: “Places”, “Persons”, “Business Processes”, “Industry Terms”, but it is weak in the specific topic analysis, what they call “social tags”.
  • AgroTagger still lacks many of the sophisticated features of “Open Calais”, but is much, much better in the subject analysis of the text.
  • The main integration works through common semantics. Core of agINFRA technology is a LOD store of shared encoded knowledge organization systems and automatic markup to link structured and unstructured data sources through these shared knowledge organization systems. Sharing within the R.I.N.G.: partners register their services, no technical limitation. LOD wrapper for all participating institutions: for all registered services a “triplification wrapper” will be set up. The triplifier works with “agConcepts and agIdentities” to create linked data. Steadily growing LOD ecosystem: the agINFRA LOD ecosystem offers web services for the WWW.

Vocabularies and Linked Open Data: Presentation Transcript

  • Vocabularies and Linked Open Data
    Dr. Johannes Keizer
    Office of Knowledge Exchange, Research and Extension
    Food and Agriculture Organization of the UN
    Talk at Library of Congress, 2011-05-18
  • We will promote research for food and agriculture, including research to adapt to, and mitigate climate change, and access to research results and technologies at national, regional and international levels.
    We will reinvigorate national research systems and will share information and best practices.
    We will improve access to knowledge.
    World Food Summit 2009
  • Information Infrastructure for Agricultural Research and Innovation
  • Vocabularies and
    Linked Open Data
  • http://aims.fao.org/aos/agrovoc/c_7825
  • http://eurovoc.europa.eu/218754
    http://aims.fao.org/aos/agrovoc/c_7825
  • http://eurovoc.europa.eu/218754
    http://agclass.nal.usda.gov/nalt/2011.xml#1780
    http://aims.fao.org/aos/agrovoc/c_7825
  • Linking data through common URIs
    TOXIC SUBSTANCES
    http://www.agnic.org/search/CAT85822953
    UNBIS
    AGROVOC
    NALT
    http://aims.fao.org/aos/agrovoc/c_7825
    http://agclass.nal.usda.gov/nalt/2011.xml#1780
    http://eurovoc.europa.eu/218754
    Eurovoc
    http://agris.fao.org/agris-search/search/display.do?f=1996/TR/TR96001.xml;TR9600026
    http://unbisnet.un.org:8080/ipac20/ipac.jsp?session=128F308557F34.283092&profile=bib&uri=full=3100001~!685149~!1&ri=1&aspect=subtab124&menu=search&source=~!horizon
    http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:2010:202:0011:0015:EN:PDF
    http://aims.fao.org/aos/agrovoc/c_12332 owl:sameAs http://eurovoc.europa.eu/219871
    skos:exactMatch UNBIS: Toxic Substances
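
The idea of this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the owl:sameAs pair is the one shown on the slide, while the document URIs under example.org and their indexing are hypothetical, and the mapping link is treated as symmetric.

```python
from collections import defaultdict

# Mapping link taken from the slide (AGROVOC owl:sameAs Eurovoc).
MAPPINGS = [
    ("http://aims.fao.org/aos/agrovoc/c_12332", "http://eurovoc.europa.eu/219871"),
]

# Hypothetical index: document URI -> concept URI the document is indexed with.
INDEX = {
    "http://example.org/docs/agris-record": "http://aims.fao.org/aos/agrovoc/c_12332",
    "http://example.org/docs/eurlex-directive": "http://eurovoc.europa.eu/219871",
    "http://example.org/docs/unrelated": "http://example.org/concepts/other",
}


def equivalent_concepts(start, mappings):
    """Return every concept reachable from `start` via symmetric mapping links."""
    neighbours = defaultdict(set)
    for a, b in mappings:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, todo = {start}, [start]
    while todo:
        current = todo.pop()
        for nxt in neighbours[current] - seen:
            seen.add(nxt)
            todo.append(nxt)
    return seen


def documents_about(concept, mappings, index):
    """Return documents indexed with `concept` or any concept mapped to it."""
    concepts = equivalent_concepts(concept, mappings)
    return sorted(doc for doc, c in index.items() if c in concepts)
```

Asking for documents about the AGROVOC concept then also returns the document indexed with the mapped Eurovoc concept, without anyone having linked the two documents by hand.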
  • If all institutions that publish about toxic wastes would:
    • - Index their publications with URIs from AGROVOC, GEMET, NALT, LCSH or EUROVOC
    • (many do – low-hanging fruit!)
    • - Publish their metadata as LOD
    • (quite easy to do; bibliographic data map well to RDF)
    Then
    everyone who knows how to write SPARQL queries could get all these publications in one shot for a new website on toxic wastes
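
As a rough sketch of what such a "one shot" query might look like, the following Python helper assembles a SPARQL query string. The three concept URIs are the ones shown on the earlier slides; the use of dcterms:subject as the indexing property is an assumption about how the data providers publish their metadata, and no endpoint is contacted.

```python
def build_subject_query(concept_uris):
    """Build a SPARQL SELECT for documents whose dcterms:subject is any given concept."""
    values = " ".join(f"<{uri}>" for uri in concept_uris)
    return (
        "PREFIX dcterms: <http://purl.org/dc/terms/>\n"
        "SELECT DISTINCT ?doc ?concept WHERE {\n"
        f"  VALUES ?concept {{ {values} }}\n"
        "  ?doc dcterms:subject ?concept .\n"
        "}"
    )


# Concept URIs as they appear on the slides (AGROVOC, Eurovoc, NALT).
TOXIC_CONCEPTS = [
    "http://aims.fao.org/aos/agrovoc/c_7825",
    "http://eurovoc.europa.eu/218754",
    "http://agclass.nal.usda.gov/nalt/2011.xml#1780",
]

QUERY = build_subject_query(TOXIC_CONCEPTS)
```

The resulting string could be sent to any SPARQL 1.1 endpoint that holds the published metadata; which endpoint that would be is left open here.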
  • Vocabularies and LOD
    Simply publishing your data as RDF does not link them to other data sets 
    Creating these links by hand is interesting in individual cases, but unrealistic for mass processing
    Linking 2 standard vocabularies can link 200 datasets which use these standard vocabularies
  • …just out of the pipeline
    -----Original Message-----
    From: Antoine Isaac [mailto:aisaac@few.vu.nl]
    Sent: Thursday, May 12, 2011 7:19 PM
    To: UDC Summary
    Cc: Anibaldi, Stefano (OEKC); Dan Brickley
    Subject: Re: AGRIS Journals and UDC URIs/ checking
    Aida, Stefano,
    …..
    Of course the first hints re. URIs is to keep it short. www.udcc.org/udcclass_631.1/50900 seems a bit long.
    Then it might be interesting to use "class" somewhere, if you're going to release entities with a different type one day.
    On the most difficult issue, class numbers vs. DB identifiers. Probably you will have to create both, if you want to intercept these cases where concepts have changed class number.
    …………
  • AGROVOC
  • AGROVOC
    A multilingual agricultural vocabulary organized as a concept scheme in 20 languages
    Covers agriculture, forestry, fisheries and related themes (food security, land use, environment, etc.)
    Organized in sub-vocabularies, e.g. chemicals, fisheries terms, scientific/common names of organisms
    Maintained by a global community (e.g. librarians, terminologists, information managers) using VocBench
  • AGROVOC - Statistics
  • AGROVOC - Restructuring
    Goal: Transform AGROVOC from a traditional thesaurus into a concept scheme with distinction between conceptual level and terminological level
    Overall revision done by FAO in collaboration with KSI (Knowledge Sharing and Innovation) team at ICRISAT, Hyderabad, India
    Top concepts reduced from 918 to 25
    Around 85,000 term relations revised
    Non-hierarchical relationships refined by semantic relations
    Ca. 4,000 non-preferred terms changed to preferred terms
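
The distinction between the conceptual level and the terminological level can be pictured with a small data structure. This is a minimal sketch, not the AGROVOC data model: the class names, the example.org URI and the labels are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """Terminological level: one label in one language."""
    text: str
    lang: str
    preferred: bool = False


@dataclass
class Concept:
    """Conceptual level: a language-independent unit identified by a URI."""
    uri: str
    terms: list = field(default_factory=list)
    broader: list = field(default_factory=list)  # URIs of broader concepts

    def pref_label(self, lang):
        """Return the preferred term in `lang`, or None if there is none."""
        for term in self.terms:
            if term.lang == lang and term.preferred:
                return term.text
        return None

    def alt_labels(self, lang):
        """Return all non-preferred terms in `lang`."""
        return [t.text for t in self.terms if t.lang == lang and not t.preferred]


# Hypothetical concept with illustrative labels.
concept = Concept(
    uri="http://example.org/concepts/c_0001",
    terms=[
        Term("toxic substances", "en", preferred=True),
        Term("poisons", "en"),
        Term("substances toxiques", "fr", preferred=True),
    ],
)
```

Relations such as "broader" attach to the concept, while preferred and non-preferred status attaches to the terms, which is the separation the restructuring aims for.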
  • Top concepts
  • Relationships (examples)
  • Thesauri into the AGROVOC LOD Cloud
    • 18000 outlinks
    • 2000 inlinks
    EUROVOC
    NALT
    AGROVOC
    RAMEAU
    GEMET
    STW
    LCSH
  • AGROVOC LOD-inlinks
    Trusted Links from
    AGROVOC
  • AGROVOC Links after 3 weeks LOD
    Outlinks:
    GEMET-AGROVOC: 1,198
    RAMEAU-AGROVOC: 700
    Total Outlinks: 1,898
    Inlinks:
    AGROVOC-EUROVOC: 1,297
    AGROVOC-GEMET: 1,198
    AGROVOC-LCSH: 1,093
    AGROVOC-NAL: 13,390
    AGROVOC-STW: 1,136
    AGROVOC-RAMEAU: 700
    Total Inlinks: 18,814
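
The totals on this slide can be checked with a short calculation. The counts below are copied from the slide, with the labels kept as the slide gives them; only the variable names are introduced here.

```python
# Link counts as listed on the slide.
OUTLINKS = {"GEMET-AGROVOC": 1198, "RAMEAU-AGROVOC": 700}
INLINKS = {
    "AGROVOC-EUROVOC": 1297,
    "AGROVOC-GEMET": 1198,
    "AGROVOC-LCSH": 1093,
    "AGROVOC-NAL": 13390,
    "AGROVOC-STW": 1136,
    "AGROVOC-RAMEAU": 700,
}

total_outlinks = sum(OUTLINKS.values())  # 1898
total_inlinks = sum(INLINKS.values())    # 18814

# Share of the second group contributed by the NAL links, in whole percent.
nal_share = round(100 * INLINKS["AGROVOC-NAL"] / total_inlinks)  # 71
```

Both totals match the figures on the slide, and the NAL links alone account for roughly 71% of the larger group.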
  • Europe (it is better to use this example during the presentation): http://aims.fao.org/aos/agrovoc/c_2724
    From the top concept: Ref: http://aims.fao.org/aos/agrovoc/c_7644
    VocBench (Production): Ref: http://agrovoc.mimos.my/vocbenchv1.1i/
    VocBench (Sandbox): Ref: http://agrovoc.mimos.my/vocbenchv1.1i/
  • The VocBench
  • The VocBench
    VocBench
    concepts and entities triples
  • VocBench Features
    • Domain independent
    • Structure independent (e.g. thesauri, glossaries, etc.)
    • Supports RDF (SKOS, SKOS-XL), OWL
    • Supports collaborative editing
    • Supports editorial workflow, with user roles
    • Simple and advanced search
    • Supports data export: SKOS, Relational format (MySQL)
  • LODE - BD
  • ..what it means
    Guidelines on how to produce data that can easily be transformed into LOD
  • LODE-BD Recommendations 1.1.
    What entities and relationships?
    What properties?
  • And….
    What metadata terms?
    What metadata standards?
    dc
    dcterms
    bibo
    agls
    ags
    eprint
    marcrel
  • Decision Trees
    Subject
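
To give a feel for what a decision tree for the subject property decides, here is a deliberately simplified sketch. The rule it applies (a concept URI goes into dcterms:subject, a free-text keyword into dc:subject) is an assumption made for illustration and is not copied from the LODE-BD flowchart; the record URI is hypothetical.

```python
def subject_triples(resource_uri, subjects):
    """Map local subject values to (subject, predicate, object) triples.

    Simplified rule, assumed for illustration:
    - a value that is a concept URI is published with dcterms:subject
    - any other value is published as a literal with dc:subject
    """
    triples = []
    for value in subjects:
        if value.startswith(("http://", "https://")):
            triples.append((resource_uri, "dcterms:subject", f"<{value}>"))
        else:
            triples.append((resource_uri, "dc:subject", f'"{value}"'))
    return triples


example = subject_triples(
    "http://example.org/records/1",  # hypothetical record URI
    ["http://aims.fao.org/aos/agrovoc/c_7825", "toxic wastes"],
)
```

A real decision tree has more branches, for instance for classification codes or for vocabularies without URIs, but the principle of choosing the metadata term from the shape of the local data is the same.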
  • AgroTagger and OpenCalais
    • Identifies concepts in unstructured texts
    • Uses AGROVOC as a controlled vocabulary
    • Prototype under testing with excellent results (entire repository of ICARDA indexed)
    • Will in future produce structured RDF files that can be used to link data, like “Open Calais”
    AgroTagger
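
Concept identification against a controlled vocabulary can be illustrated with a naive label matcher. This is not how AgroTagger works internally (the slides do not describe its algorithm); it is a toy sketch with a hypothetical three-entry vocabulary and example.org URIs.

```python
import re

# Hypothetical vocabulary: lower-case label -> concept URI.
VOCABULARY = {
    "plant protection": "http://example.org/concepts/plant-protection",
    "pesticides": "http://example.org/concepts/pesticides",
    "toxic substances": "http://example.org/concepts/toxic-substances",
}


def tag_text(text, vocabulary):
    """Return the vocabulary labels found in `text`, mapped to their concept URIs."""
    # Lower-case and collapse whitespace so multi-word labels match across line breaks.
    normalized = re.sub(r"\s+", " ", text.lower())
    found = {}
    for label, uri in vocabulary.items():
        # Match whole words only, so "pesticides" does not match inside a longer word.
        if re.search(r"\b" + re.escape(label) + r"\b", normalized):
            found[label] = uri
    return found


tags = tag_text("Plant  protection in 1964 relied on new Pesticides.", VOCABULARY)
```

A production tagger would also need stemming, disambiguation and ranking; the point here is only that the output is a set of concept URIs rather than free keywords, which is what makes the result linkable.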
  • RING: Routemap to Information Nodes and Gateways
    VocBench: concepts and entities reference triples
    Cloud: storage for RDF data triples
    Tools: LOD enabled software
    LOD Generator: triplifier, concept and entity identifier
    Data Services: Web services + APIs to triple stores
    agINFRA - the elements
  • Thank You!
    http://www.ciard.net
    http://ring.ciard.net
    http://aims.fao.org
    http://agris.fao.org