Multilingual information services in the area of agricultural data: the use case of AGRIS


The purpose of this presentation was to present a real problem that could be solved using the multilingual framework produced by Organic.Lingua.

Published in: Technology

  1. Multilingual Information Services in the area of agricultural data: The AGRIS use case. Fabrizio Celli – FAO of the UN – 06/02/2014
  2. OVERVIEW
  3. The setting
     • The AGRIS database is a collection of more than 7.7 million bibliographic references in the agricultural domain
     • The references are enriched with the AGROVOC thesaurus, which cataloguers use extensively to index data in agricultural information systems
     • AGRIS is an RDF-aware system, a mashup application that allows users to query the AGRIS-RDF content and interlinks all records with external sources of information
     • 7 million bibliographic records become 7 million mashup pages!
  4. Some statistics
     • 7.7 million bibliographic references
     • 190 million triples
     • ~300,000 visits/month
     • Used worldwide (accessed from more than 200 countries)
  5. How data come to AGRIS
     • Centralization: bibliographic references in the AGRIS domain (agriculture, forestry, animal husbandry, aquatic sciences and fisheries, and human nutrition)
     • Interlinking: other kinds of information related to the AGRIS domain (statistics, maps, country profiles, etc.)
  6. Data consumption
     • AGRIS consumes metadata provided by the community and publishes them as open data
     • Metadata are captured either by pulling: AGRIS harvests data from clients (e.g. aggregators or institutional repositories, using protocols such as OAI-PMH)
     • or by pushing: clients (e.g. national libraries or journal publishers) send data to AGRIS
  7. Accept any input format!
  8. The AGRIS metadata format
     • AGRIS tries to accept any input format
     • The AGRIS input module is responsible for the translation of the source input format to the AGRIS RDF
     • The translation currently requires an intermediate step, in which metadata are converted to the AGRIS AP, a metadata standard based on Dublin Core
  10. Multilingual metadata
     • 80% of AGRIS references have English content: title, abstract, etc.
     • Most of the time, when a reference comes in another language, an English translation is provided for both the abstract and the title
     • Data providers send us multilingual records, where English is usually the default
  11. <dc:title xml:lang="en">Effects of straw returned to the field on growth and …</dc:title>
      <dc:title xml:lang="Zh">砂姜黑土区秸秆还田对玉米生育及水分利用效率的影响</dc:title>
      <dc:creator>
        <ags:creatorPersonal>Shen Xueshan, Anhui Agricultural University, Hefei(China)</ags:creatorPersonal>
        <ags:creatorPersonal>Li Jincai, Anhui Agricultural University, Hefei(China)</ags:creatorPersonal>
        <ags:creatorPersonal>Qu Huijuan, Anhui Agricultural University, Hefei(China)</ags:creatorPersonal>
      </dc:creator>
      <dc:date>
        <dcterms:dateIssued>Apr. 2011</dcterms:dateIssued>
      </dc:date>
      <dc:subject>
        <ags:subjectClassification scheme="ags:ASC">F01</ags:subjectClassification>
        <ags:subjectThesaurus scheme="ags:AGROVOC">…</ags:subjectThesaurus>
      </dc:subject>
      <dc:description>
        <dcterms:abstract xml:lang="Zh">摘 要:为了在淮北砂姜黑土区推广小麦玉米秸秆全量还田技术, 采用大田定位试验,设置小麦玉米秸秆不还田、小麦玉米秸秆单季还田和小麦玉米秸秆两季还田4种秸秆还田方式,研究了小麦、玉米秸秆全量粉碎还田对机播夏玉米出苗、...</dcterms:abstract>
        <dcterms:abstract xml:lang="En">The effects of straw returned to the field which including no straw returning(CK),wheat straw returning(T1),maize straw returning(T2) and wheat and maize straw returning(T3) on emergence,growth...</dcterms:abstract>
      </dc:description>
      <dc:language scheme="ags:ISO639-1">Zh</dc:language>
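
The record on slide 11 carries its titles and abstracts in several languages, distinguished only by `xml:lang`. As a rough illustration of how a consumer might group those fields by language, here is a minimal Python sketch. It is not the AGRIS input module: the `<record>` wrapper element and the helper name `texts_by_language` are assumptions made for the example, and only the standard Dublin Core namespaces are declared. Language codes are lowercased because the sample mixes "en", "En" and "Zh".

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Shortened copy of the slide's record; the <record> wrapper is hypothetical
# and exists only to make the fragment well-formed.
RECORD = """<record xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:dcterms="http://purl.org/dc/terms/">
  <dc:title xml:lang="en">Effects of straw returned to the field on growth and …</dc:title>
  <dc:title xml:lang="Zh">砂姜黑土区秸秆还田对玉米生育及水分利用效率的影响</dc:title>
  <dc:description>
    <dcterms:abstract xml:lang="En">The effects of straw returned to the field ...</dcterms:abstract>
  </dc:description>
</record>"""


def texts_by_language(xml_text, path):
    """Group the text of all elements matching `path` by lowercased xml:lang."""
    root = ET.fromstring(xml_text)
    grouped = defaultdict(list)
    for element in root.findall(path, NS):
        # "und" (undetermined) is used when no language tag is present.
        lang = (element.get(XML_LANG) or "und").lower()
        grouped[lang].append((element.text or "").strip())
    return dict(grouped)


titles = texts_by_language(RECORD, "dc:title")
abstracts = texts_by_language(RECORD, "dc:description/dcterms:abstract")
```

With the fragment above, `titles` has one entry for "en" and one for "zh", while `abstracts` only has "en" because the Chinese abstract was left out of the shortened copy.
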
  12. What about AGROVOC?
     • AGRIS records are indexed with the AGROVOC thesaurus, the FAO multilingual vocabulary containing more than 40 000 concepts in 21 languages
     • Each record can contain one or more AGROVOC strings in a specific language
     • The translation to RDF makes it possible to assign AGROVOC URIs to AGRIS records
     • From an AGROVOC URI the user can extract a lot of information, such as the translations of AGROVOC strings into many languages
  14. The scope of this presentation
     • Multilinguality problems and needs for the AGRIS online service
  15. Two issues
     • Displaying multilingual information
     • Multilingual search
  16. Displaying multilingual information
     • AGRIS can display its content in all the languages available in the source metadata
     • For other languages, a naive translation is provided by the Google translator gadget (this step could be improved)
  19. Multilingual search
     • Currently not available
     • AGRIS records are indexed with AGROVOC keywords in a specific language
     • The translation to RDF provides AGROVOC URIs, which could be used to perform a multilingual search
     • Currently, only AGROVOC strings go to the Apache Solr index
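
To illustrate why indexing AGROVOC URIs rather than strings would help, here is a small in-memory Python sketch. It does not use Solr and is not the AGRIS implementation: the `example.org` concept URIs, the `LABEL_TO_URI` table and the record identifiers are all hypothetical. The point is only that records indexed in different languages end up under the same key once keywords are resolved to concept URIs.

```python
from collections import defaultdict

# Hypothetical label-to-concept table; real AGROVOC URIs are not reproduced here.
LABEL_TO_URI = {
    "aromatic compounds": "http://example.org/agrovoc/c_aromatic",
    "compuestos aromaticos": "http://example.org/agrovoc/c_aromatic",
    "extraction": "http://example.org/agrovoc/c_extraction",
}

# Hypothetical records, each indexed with keyword strings in one language.
RECORDS = {
    "rec-en": ["AROMATIC COMPOUNDS", "EXTRACTION"],
    "rec-es": ["Compuestos aromaticos"],
}


def build_uri_index(records):
    """Map each concept URI to the set of record ids indexed with any of its labels."""
    index = defaultdict(set)
    for record_id, keywords in records.items():
        for keyword in keywords:
            uri = LABEL_TO_URI.get(keyword.strip().lower())
            if uri is not None:  # keywords without a known concept are skipped
                index[uri].add(record_id)
    return index


def search(index, *uris):
    """Return the ids of records indexed with every one of the given concept URIs."""
    matches = [index.get(uri, set()) for uri in uris]
    return set.intersection(*matches) if matches else set()


uri_index = build_uri_index(RECORDS)
aromatic = search(uri_index, "http://example.org/agrovoc/c_aromatic")
```

Here `aromatic` contains both the English and the Spanish record, which a string-only index would not return for an English query.
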
  20. An example of the issue
     • The search: +agrovoc:(AROMATIC COMPOUNDS) +agrovoc:(EXTRACTION) returns 467 results, but they don’t include the article «Degradacion de compuestos aromaticos por microorganismos y sus aplicaciones biotecnologicas» that was indexed with «Compuestos aromaticos», in Spanish
  21. A possible need
     • It would be great for the AGRIS community if, when the user looks for «Aromatic compounds», the system also returned records indexed with «Composti aromatici», «Compuestos aromáticos», «芳香类», etc.
     • AGROVOC could help
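
One way the need on slide 21 could be met without changing the index is to expand the query with the labels of the matching concept in every language. The Python sketch below builds such a query string in the `+agrovoc:(...)` form used on slide 20. It is an assumption-laden illustration, not the AGRIS search code: the concept URI is a placeholder, the label table is hard-coded from the slide's examples, and `find_concept` and `expand_query` are hypothetical helpers.

```python
# Labels taken from the slide's examples; the concept URI is a placeholder.
LABELS_BY_CONCEPT = {
    "http://example.org/agrovoc/c_aromatic": {
        "en": "Aromatic compounds",
        "it": "Composti aromatici",
        "es": "Compuestos aromáticos",
        "zh": "芳香类",
    },
}


def find_concept(term):
    """Return the URI of the concept that has `term` as a label in any language."""
    needle = term.strip().casefold()
    for uri, labels in LABELS_BY_CONCEPT.items():
        if any(label.casefold() == needle for label in labels.values()):
            return uri
    return None


def expand_query(term, field="agrovoc"):
    """Build a query clause that ORs together all known labels of the term's concept."""
    uri = find_concept(term)
    # Fall back to the original term when no concept is known for it.
    labels = list(LABELS_BY_CONCEPT[uri].values()) if uri else [term]
    # Quote each label as a phrase and escape embedded double quotes.
    phrases = ['"{}"'.format(label.replace('"', '\\"')) for label in labels]
    return "+{}:({})".format(field, " OR ".join(phrases))


clause = expand_query("aromatic compounds")
```

Whether «Compuestos aromáticos» then matches a record indexed as «Compuestos aromaticos» (without the accent) would depend on how the index normalizes text, which this sketch does not address.
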
  22. Thank you!