The Open Data Projects of STLab:
- Cultural Heritage: Ontology and Linked
Open Data on Italian cultural institutes,
sites and events;
- FOod in Open Data (FOOD): Ontologies
and Linked Open Data on Food Quality
Certification Schemes;
- MARE: A Single Information System for
Fishery and Aquaculture Products
Marketed in the EU;
- Linked Data for Smart Cities:
the case of Catania;
- Digital Libraries;
- The Linked Open Data platform of CNR: data.cnr.it
15. Research
• Open Knowledge Extraction
– Semantic Web Machine Reading
– Semantic Sentiment Analysis
– Complex Relation Extraction and Representation
– Abstractive Summarization
• Similarity reasoning
• Robotic natural language understanding
– Integration of action schemas, linked data, and OKE
frames
– Social practices and norms
– Irony
– Modality
October 2015 STLab Open Data Projects 15
16. Development
• Open Knowledge Extraction, Reconciliation, Enrichment (Basic research)
• Design of ontology and data networks (Basic research)
• Geolinked data: shape files in RDF, use of ontology patterns (PRISMA)
• Geolinked semantic itineraries (PRISMA, HERMES)
• Address ontology patterns (S&TDL)
• Article abstract finding (S&TDL)
• Automated entity categorization (S&TDL)
• Cultural objects ontology (MIBACT)
• DOP ontology extraction and design (FOOD)
• Semantic multilingual nomenclature of fishery products (MARE)
• Smart chat bots (MARIO)
29. Context and Objectives
Objectives
• Define and make available standardization models and reference
ontologies for representing data on food quality certification schemes
– Protected Designation of Origin – in Italian DOP
– Protected Geographical Indication – in Italian IGP
• Produce the Linked Open Data (RDF datasets)
– aligned with the defined ontologies
– using data extracted from food product specifications published by the
Ministry
• Make publicly available for re-use the RDF datasets so as to facilitate
the development of applications/services
Project funded by ISTC-CNR (STLab) and Agenzia per l’Italia Digitale
(AgID), and carried out with the collaboration of the Ministry of Agricultural,
Food and Forestry Policies
35. Which data to extract
• Not all the data included in the policy
documents
• Data more oriented to consumers
– designation
– product typology
– raw material and percentage of raw material which
contributes to product composition
– production area
– characteristics of the consumer product (for products
different from wines)
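The consumer-oriented fields listed above could be held in a small record type before being converted to RDF. The Python sketch below is illustrative only: the class and field names, the `http://example.org/food/` namespace, the property names and the sample values are all assumptions, not the project's actual ontology or data.

```python
from dataclasses import dataclass, field

# Hypothetical namespace, used only for illustration.
EX = "http://example.org/food/"


@dataclass
class RawMaterial:
    name: str
    percentage: float  # share of the product composition, 0-100


@dataclass
class ProductSpec:
    designation: str       # the protected name
    scheme: str            # "DOP" or "IGP"
    typology: str          # product typology
    production_area: str
    raw_materials: list = field(default_factory=list)


def literal(value) -> str:
    """Render a value as a plain N-Triples string literal."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(product_id: str, spec: ProductSpec) -> list:
    """Serialise one product specification as a list of N-Triples lines."""
    subject = f"<{EX}product/{product_id}>"
    lines = [
        f"{subject} <{EX}designation> {literal(spec.designation)} .",
        f"{subject} <{EX}scheme> {literal(spec.scheme)} .",
        f"{subject} <{EX}typology> {literal(spec.typology)} .",
        f"{subject} <{EX}productionArea> {literal(spec.production_area)} .",
    ]
    for material in spec.raw_materials:
        label = f"{material.name} ({material.percentage}%)"
        lines.append(f"{subject} <{EX}rawMaterial> {literal(label)} .")
    return lines


# Made-up sample record, not taken from any real product specification.
sample = ProductSpec(
    designation="Example Cheese",
    scheme="DOP",
    typology="cheese",
    production_area="Example Valley",
    raw_materials=[RawMaterial("cow milk", 100.0)],
)
triples = to_ntriples("p1", sample)
```

A real pipeline would more likely model raw materials as resources of their own rather than as string literals; literals keep the sketch short.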
45. Data Complexity and Challenges
• Bring together heterogeneous data and interact with multiple
information sources
– different information content
• commercial designations, scientific names, production methods, fishing
areas and gears, marketing standards, pictograms, distribution maps, etc.
– different data sources
• commercial designations lists published by national authorities, taxonomic
systems, species factsheets, FAO classification schemes, EU Regulations,
etc.
– different data formats and gathering modalities
• from database records accessible through Web services (e.g., taxonomic
systems) to unstructured, human-readable-only text (e.g., EU Regulations)
• XML/JSON/CSV data, spreadsheets, HTML Web resources, PDF and text
documents, etc.
• Deal with existing information at local/regional, national and
international levels, combined with a multilingual perspective
– 28 EU Member States and 24 official languages
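One way to picture the integration problem is to map records from differently shaped sources onto a shared set of keys and then merge them by scientific name. The following Python sketch assumes two hypothetical sources (one CSV, one JSON); the column names, the key mappings and the fictional species are assumptions made for illustration and do not reflect the actual MARE sources.

```python
import csv
import io
import json


def records_from_csv(text: str) -> list:
    """Parse CSV text into a list of dictionaries, one per row."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def records_from_json(text: str) -> list:
    """Parse JSON text holding either one object or a list of objects."""
    data = json.loads(text)
    return data if isinstance(data, list) else [data]


def normalise(record: dict, mapping: dict) -> dict:
    """Rename source-specific keys to shared keys; unmapped keys are dropped."""
    return {
        target: str(record[source]).strip()
        for source, target in mapping.items()
        if source in record
    }


def merge_designations(records: list) -> dict:
    """Group commercial designations by scientific name and language code."""
    merged = {}
    for record in records:
        by_language = merged.setdefault(record["scientific_name"], {})
        by_language[record["lang"]] = record["designation"]
    return merged


# Hypothetical inputs with a fictional species name.
csv_text = "scientific_name,designation,lang\nExamplus fictus,example fish,en\n"
json_text = '[{"species": "Examplus fictus", "name": "pesce esempio", "language": "it"}]'

csv_map = {"scientific_name": "scientific_name", "designation": "designation", "lang": "lang"}
json_map = {"species": "scientific_name", "name": "designation", "language": "lang"}

unified = [normalise(r, csv_map) for r in records_from_csv(csv_text)]
unified += [normalise(r, json_map) for r in records_from_json(json_text)]
nomenclature = merge_designations(unified)
```

Unstructured sources such as PDF regulations would need an extraction step before they could be fed into a mapping like this.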
67. Online resources
• PRISMA ontology
• http://www.ontologydesignpatterns.org/ont/prisma/ontology.owl
• SPARQL endpoint
• http://wit.istc.cnr.it:8894/sparql
• Select <prisma-ont> and <prisma>
• Endpoint also accessible as REST web service
• PRISMA web portal demo:
• http://wit.istc.cnr.it/prisma/WebContent/home.html
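Since the endpoint is also exposed as a REST web service, it could in principle be queried over plain HTTP. The Python sketch below follows the standard SPARQL Protocol (a `query` parameter on a GET request); the helper names are hypothetical, passing `prisma` as a default graph is an assumption based on the graph names listed above, and the endpoint may no longer be online.

```python
import json
import urllib.parse
import urllib.request

# Endpoint URL as listed on the slide.
ENDPOINT = "http://wit.istc.cnr.it:8894/sparql"


def build_query_url(query: str, graph: str = "", endpoint: str = ENDPOINT) -> str:
    """Build a SPARQL Protocol GET URL for the given query."""
    params = {"query": query}
    if graph:
        # Assumption: the graph name is passed as the default graph.
        params["default-graph-uri"] = graph
    return endpoint + "?" + urllib.parse.urlencode(params)


def run_select(query: str, graph: str = "", timeout: float = 30.0) -> list:
    """Send a SELECT query and return the result bindings (needs network access)."""
    request = urllib.request.Request(
        build_query_url(query, graph),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return payload["results"]["bindings"]


sample_query = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"
sample_url = build_query_url(sample_query, graph="prisma")
```

Calling `run_select(sample_query, graph="prisma")` would return a list of binding dictionaries if the service responds with SPARQL JSON results.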
72. Context and Objectives
• Project funded by CNR and Agenzia per l’Italia Digitale (AgID), and carried
out with the collaboration of the Ministry of Education, University and
Research (MIUR)
• Objectives
• designing and implementing a Digital Library (DL) for disseminating science
and technology
• exploiting Semantic Technologies for publishing the Linked Open Data
(LOD) of the DL
• Result
• LOD dataset
• comprising ~8.4M RDF triples, modelled using an Ontology Design
Pattern (ODP) based methodology
• describing ~150K bibliographic resources
• linked to DBpedia, data.cnr, and Geonames
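Links to external datasets such as DBpedia are commonly published as `owl:sameAs` statements. The Python sketch below shows one way such links could be serialised as N-Triples; the local IRI under `http://example.org/` is hypothetical, and nothing here is claimed to be how the DL dataset itself expresses its links.

```python
# Standard OWL property IRI for identity links.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_as_triples(links: dict) -> list:
    """Turn {local IRI: [external IRIs]} into sorted owl:sameAs N-Triples lines."""
    lines = []
    for local_iri in sorted(links):
        for external_iri in sorted(links[local_iri]):
            lines.append(f"<{local_iri}> <{OWL_SAME_AS}> <{external_iri}> .")
    return lines


# Hypothetical local resource linked to a DBpedia resource.
example_links = {
    "http://example.org/dl/place/rome": ["http://dbpedia.org/resource/Rome"],
}
link_lines = same_as_triples(example_links)
```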
20/10/2015 STLab Open Data Projects - Rome Meetup 2
73. DL Data
• The DL organises heterogeneous
contents
• e.g., research products, datasets, data about
research activities, projects, researchers and
expertise, digitised content of historical and
cultural interest
• coming from different organisations
86. Ongoing work
• Integration of more data sources
• More data linking (e.g., DBLP, Semantic Publishing
LOD, ACS, etc.)
• Publishing of data.stdl.cnr.it (still some policy
issues)
• More sophisticated identity resolution (e.g.,
homonym detection)
• Integration of visual paradigms for Linked Data
exploration
88. Context and Objectives
• Joint work by STLab and the Information
Systems unit of CNR
• Objective
• providing an Open Data platform to enable
public access to the information of the CNR
organization
100. Results
• data.cnr.it
• the CNR Ontology Network and data
available as LOD
• comprising ~8.2M RDF triples
• linked to DBpedia and Geonames
• The Semantic Scout
• expert finding based on competence
• monitoring funding and evolution of
different research areas and units
• browsing and reporting capabilities
http://data.cnr.it
http://goo.gl/MFlStF
101. Ongoing work
• More data linking (e.g. DBLP)
• Automatic synchronisation with or virtual RDF
access to data sources
• More visual paradigms