AGROVOC Linked Open Data and the VocBench, potentials for the community
Dr. Johannes Keizer
Office of Knowledge Exchange, Research and Extension
Food and Agriculture Organization of the UN
Talk at the US National Agricultural Library, May 20 2011
We will promote research for food and agriculture, including research to adapt to, and mitigate climate change, and access to research results and technologies at national, regional and international levels. We will reinvigorate national research systems and will share information and best practices. We will improve access to knowledge.
World Food Summit 2009
…Sharing Information
- 1975 - AGRIS
- 1980 - AGROVOC
- 2000 - Rethinking of AGRIS
- 2002 - AOS/AGMES
- 2005 - CIARD
- 2005 - AIMS
http://aims.fao.org
Today's Talk
- What's new with AGROVOC
- AGROVOC Linked Open Data
- The VocBench
- Why mapping Vocabularies
- Automatic Indexing
- AGRIS
AGROVOC
AGROVOC
- A multilingual agricultural vocabulary, organized as a concept scheme, in 20 languages
- Covers agriculture, forestry, fisheries and related themes (food security, land use, environment, etc.)
- Organized in sub-vocabularies, e.g. chemicals, fisheries terms, scientific/common names of organisms
- Maintained by a global community (e.g. librarians, terminologists, information managers) using VocBench
AGROVOC - Statistics
AGROVOC - Restructuring
Goal: Transform AGROVOC from a traditional thesaurus into a concept scheme with a distinction between the conceptual level and the terminological level
- Overall revision done by FAO in collaboration with the KSI (Knowledge Sharing and Innovation) team at ICRISAT, Hyderabad, India
- Top concepts reduced from 918 to 25
- Around 85,000 term relations revised
- Non-hierarchical relationships refined by semantic relations
- Ca. 4,000 non-preferred terms changed to preferred terms
Top concepts
Relationships (examples)
Thesauri into the AGROVOC LOD Cloud
[Diagram: AGROVOC linked with EUROVOC, NALT, RAMEAU, GEMET, STW and LCSH; 18000 outlinks, 2000 inlinks]
AGROVOC LOD-inlinks Trusted  Links from AGROVOC
AGROVOC Links after 3 weeks LOD
Outlinks:
- GEMET-AGROVOC: 1,198
- RAMEAU-AGROVOC: 700
- Total Outlinks: 1,898
Inlinks:
- AGROVOC-EUROVOC: 1,297
- AGROVOC-GEMET: 1,198
- AGROVOC-LCSH: 1,093
- AGROVOC-NAL: 13,390
- AGROVOC-STW: 1,136
- AGROVOC-RAMEAU: 700
- Total Inlinks: 18,814
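The two totals on this slide can be re-derived from the per-vocabulary counts. A small sketch in Python, using only the figures listed above (the variable names are illustrative):

```python
# Per-vocabulary link counts as listed on the slide.
outlinks = {"GEMET-AGROVOC": 1198, "RAMEAU-AGROVOC": 700}
inlinks = {
    "AGROVOC-EUROVOC": 1297,
    "AGROVOC-GEMET": 1198,
    "AGROVOC-LCSH": 1093,
    "AGROVOC-NAL": 13390,
    "AGROVOC-STW": 1136,
    "AGROVOC-RAMEAU": 700,
}

total_outlinks = sum(outlinks.values())
total_inlinks = sum(inlinks.values())

# Fraction of all inlinks contributed by the NAL mapping alone.
nal_share = inlinks["AGROVOC-NAL"] / total_inlinks

print(total_outlinks, total_inlinks, f"{nal_share:.0%}")
```

The sums match the slide totals, and the NAL mapping accounts for well over two thirds of the inlinks.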
Get the NALT into the LOD cloud too!
Some notes from Ahsan
Issues
1. The only problem is that the online version and the 2011 SKOS version of NALT use completely different term codes. They need to settle on a stable version of their term codes.
2. Since AGROVOC is already connected with NALT, NALT is in principle in the LOD. They can put our 13,000 mapping links into their SKOS file. That will cover both outlinks and inlinks.
Necessary activities
1. Put the SKOS version in a triple store.
2. Make the URIs dereferenceable on their website (they can use the biotech Drupal module that we are going to build now).
3. Publish the URIs using the Pubby tool: http://www4.wiwiss.fu-berlin.de/pubby/
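To illustrate issue 2, mapping links could be written out as skos:exactMatch statements in both directions and loaded next to the NALT SKOS file. The sketch below is only an assumption about how this might look: the helper name `exact_match_triples` is hypothetical, N-Triples is chosen here just for simplicity, and the single concept pair reuses URIs shown elsewhere in this talk as an example, not an entry taken from the actual mapping file.

```python
SKOS_EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"

def exact_match_triples(pairs):
    """Return N-Triples lines stating skos:exactMatch in both directions."""
    lines = []
    for left, right in pairs:
        lines.append(f"<{left}> <{SKOS_EXACT_MATCH}> <{right}> .")
        lines.append(f"<{right}> <{SKOS_EXACT_MATCH}> <{left}> .")
    return lines

# Illustrative pair only (assumption): URIs as they appear in this talk.
pairs = [
    ("http://agclass.nal.usda.gov/nalt/2011.xml#1780",
     "http://aims.fao.org/aos/agrovoc/c_7825"),
]

for line in exact_match_triples(pairs):
    print(line)
```

Emitting each link in both directions is one way to cover outlinks and inlinks with a single mapping list.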
Europe: (It is better to use this example during the presentation)
http://aims.fao.org/aos/agrovoc/c_2724
From the top concept:
Ref: http://aims.fao.org/aos/agrovoc/c_7644
VocBench (Production)
Ref: http://agrovoc.mimos.my/vocbenchv1.1i/
VocBench (Sandbox)
Ref: http://agrovoc.mimos.my/vocbenchv1.1i/
The VocBench
The VocBench
[Diagram: VocBench; concepts and entities; triples]
VocBench Features
- Domain independent
- Structure independent (i.e. thesauri, glossaries, etc.)
- Supports RDF (SKOS, SKOS-XL), OWL
- Supports collaborative editing
- Supports editorial workflow, with user roles
- Simple and advanced search
- Supports data export: SKOS, relational format (MySQL)
Why linking vocabularies?
http://aims.fao.org/aos/agrovoc/c_7825
http://eurovoc.europa.eu/218754
http://aims.fao.org/aos/agrovoc/c_7825
http://eurovoc.europa.eu/218754
http://agclass.nal.usda.gov/nalt/2011.xml#1780
http://aims.fao.org/aos/agrovoc/c_7825
Linking data through common URIs
TOXIC SUBSTANCES
Concept URIs:
- AGROVOC: http://aims.fao.org/aos/agrovoc/c_7825
- NALT: http://agclass.nal.usda.gov/nalt/2011.xml#1780
- Eurovoc: http://eurovoc.europa.eu/218754
- UNBIS: Toxic Substances
Linked records:
- http://www.agnic.org/search/CAT85822953
- http://agris.fao.org/agris-search/search/display.do?f=1996/TR/TR96001.xml;TR9600026
- http://unbisnet.un.org:8080/ipac20/ipac.jsp?session=128F308557F34.283092&profile=bib&uri=full=3100001~!685149~!1&ri=1&aspect=subtab124&menu=search&source=~!horizon
- http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:2010:202:0011:0015:EN:PDF
Mapping relations: owl:sameAs, skos:exactMatch
- http://aims.fao.org/aos/agrovoc/c_12332 owl:sameAs http://eurovoc.europa.eu/219871
If all institutions that publish about toxic wastes would:
- Index their publications with URIs from AGROVOC, GEMET, NALT, LCSH or EUROVOC
(many do: low-hanging fruit!)
- Publish their metadata as LOD
(quite easy to do; bibliographic data map well to RDF)
Then everyone who knows how to write SPARQL queries could get all these publications in one shot for a new website on toxic wastes.
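The "one shot" retrieval can be sketched in a few lines. In practice this would be a SPARQL query over published LOD; the plain-Python version below only stands in for that idea. The record identifiers are hypothetical, and the concept URIs from the slides are assumed to be mapped to each other, as the preceding slides suggest.

```python
from collections import defaultdict

# Mapping links between concept URIs (assumed equivalent, per the slides).
MAPPINGS = [
    ("http://aims.fao.org/aos/agrovoc/c_7825",
     "http://eurovoc.europa.eu/218754"),
    ("http://aims.fao.org/aos/agrovoc/c_7825",
     "http://agclass.nal.usda.gov/nalt/2011.xml#1780"),
]

# Hypothetical metadata: record id -> concept URI the record is indexed with.
RECORDS = {
    "agris-record": "http://aims.fao.org/aos/agrovoc/c_7825",
    "eurlex-record": "http://eurovoc.europa.eu/218754",
    "nal-record": "http://agclass.nal.usda.gov/nalt/2011.xml#1780",
    "unrelated-record": "http://example.org/concept/other",
}

def equivalent_concepts(start, mappings):
    """Collect every concept reachable from `start` via mapping links."""
    neighbours = defaultdict(set)
    for a, b in mappings:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, stack = {start}, [start]
    while stack:
        for nxt in neighbours[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def records_about(concept, records, mappings):
    """Return ids of records indexed with `concept` or a mapped equivalent."""
    concepts = equivalent_concepts(concept, mappings)
    return sorted(rid for rid, uri in records.items() if uri in concepts)

print(records_about("http://eurovoc.europa.eu/218754", RECORDS, MAPPINGS))
# prints ['agris-record', 'eurlex-record', 'nal-record']
```

Starting from the Eurovoc URI alone, the two mapping links are enough to pull in records indexed with AGROVOC and NALT as well, which is the point of linking the vocabularies rather than the individual datasets.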
Vocabularies and LOD
- Simply publishing your data as RDF does not link them to other data sets
- Creating these links by hand is interesting in detail, but unrealistic as mass processing
- Linking 2 standard vocabularies can link 200 datasets which use these standard vocabularies
AgroTagger and OpenCalais


Editor's Notes

  • #14 Note: we identified outlinks to RAMEAU and GEMET, and they have taken them as inlinks to their own thesaurus.
  • #18 - All links are checked by a domain expert.
  • #25 - All links are checked by a domain expert.
  • #29 to #32 If resources are marked up with semantically defined and machine-readable concepts, they can be linked and mashed up precisely, as we have seen in the example from the BBC. In this example we start with an AGRIS record on hazardous waste, which is indexed with AGROVOC. Already now we can easily link to material indexed with Eurovoc, here an example from EUR-Lex. If the UNBIS thesaurus were restructured into a concept scheme and published as LOD, related UN documents could be attached automatically by the machine.
  • #33 How does this work? A resource is connected with each concept URI on the web. Concepts in the three vocabularies that carry the same literal are connected by an owl:sameAs or skos:exactMatch relationship. As we are speaking about thesauri and not ontologies, we purposely kept the choice of relation vague: the concepts could be matched with owl:sameAs, or the terms could be matched with skos:exactMatch. A lot of discussion on this is ongoing.
  • #37 One of the groundbreaking enterprises in this area is Thomson Reuters' “Open Calais”. This is a web service that provides semantic markup for any unstructured text that you feed into their service. The service is free of charge. Why? I will show you later.
  • #38 My team, in collaboration with the Indian Institute of Technology in Kanpur, is developing a similar service for our subject area.
  • #39 We have here a text from 1964, without a bibliographic record at hand, about a plant protection issue.
  • #40 Open Calais is very good in those areas in which they have their own elaborated concept scheme against which the texts are analyzed: “Places”, “Persons”, “Business Processes”, “Industry Terms”, but it is weak in the specific topic analysis, what they call “social tags”.
  • #41 AgroTagger still lacks many of the sophisticated features of “Open Calais”, but is much, much better in the subject analysis of the text.
  • #44 The chart represents the records published by the AGRIS partner institutions of the relevant country (Scielo is the exception, being a service provider itself). The citation column is the number of references to articles from scientific journals.