This document provides methodological guidelines for publishing linked data. It outlines the main steps in the process, including identifying and analyzing data sources, designing URIs, defining licenses, modeling ontologies, transforming and cleansing data, linking datasets, and publishing and discovering the linked data. The guidelines are based on experiences applying these methods to real government datasets.
Methodological Guidelines for Publishing Linked Data
1. Methodological Guidelines for Publishing Linked Data
Boris Villazón-Terrazas, Oscar Corcho
Facultad de Informática, Universidad Politécnica de Madrid
Campus de Montegancedo sn, 28660 Boadilla del Monte, Madrid
http://www.oeg-upm.net
{bvillazon,ocorcho}@fi.upm.es
Phone: 34.91.3366605, Fax: 34.91.3524819
Slides available at: http://www.slideshare.net/boricles/
Acknowledgements: Asunción Gómez-Pérez, Luis M. Vilches, Victor Saquicela, Alexander de León, and many others that we may have omitted.
Work distributed under the license Creative Commons Attribution-Noncommercial-Share Alike 3.0
2. Main References
Wood, David (Ed) Linking Government Data - 2011
Methodological Guidelines for Publishing Government Linked Data
Boris Villazón-Terrazas, Luis M. Vilches, Oscar Corcho, Asunción Gómez-Pérez
Best Practices for Publishing Linked Data
W3C Editor’s Draft – Government Linked Data Working Group
Michael Hausenblas, Bernadette Hyland, Boris Villazón-Terrazas
https://dvcs.w3.org/hg/gld/raw-file/bcb72f87b5cc/bp/index.html
Cookbook for Open Government Linked Data
W3C Editor’s Draft – Government Linked Data Working Group
Bernadette Hyland, Boris Villazón-Terrazas, Sarven Capadisli
http://www.w3.org/2011/gld/wiki/Linked_Data_Cookbook
3. Guidelines for Publishing Linked Data
• The process of publishing Linked Data follows an iterative, incremental life cycle model.
• The guidelines are based on our experience producing Linked Data in several governmental contexts, and they have been applied in real case scenarios.
7. Specification
Identification and analysis of the data sources
We have to distinguish two cases:
• Open and publish data that government agencies have not yet opened up and published
  • This task may require contacting specific government data owners to get access to their legacy data
• Reuse and leverage data already opened up and published by government agencies
  • This task consists of looking for these data in public government catalogs
  • Open Government Data
  • datacatalogs.org
  • Open Government Catalog
8. Specification
Identification and analysis of the data sources
After we have identified and selected the government data sources:
• Search and compile all the available data and documentation about those resources
• Identify the schema of those resources, including conceptual components and their relationships
• Identify the items in the domain, i.e., things whose properties and relations are described in the data sources
9. Specification
GeoLinkedData – Identification of the data sources
• IGN (National Geographic Institute of Spain): data obtained through an agreement with the IGN; Oracle & MySQL databases
• INE (National Statistic Institute of Spain): data sources available in a public data catalog
10. Specification
GeoLinkedData – Analysis of the data sources
[Figure: sample of the source data, with columns Year, Province, and Industry Production Index]
11. Specification
URI Design
• Use meaningful URIs, instead of opaque URIs, when possible
• Separate TBox (ontology model) URIs from ABox (instance) URIs
• Base URI
http://data.gov.bo/
http://health.data.gov.bo/
• TBox URIs
http://data.gov.bo/ontology/{class|property}
• ABox URIs
http://data.gov.bo/resource/
http://data.gov.bo/resource/province/Tiraque
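The URI patterns above can be sketched as a pair of small helper functions. This is only an illustration under stated assumptions: the function names (`slugify`, `tbox_uri`, `abox_uri`) and the accent-stripping policy are hypothetical choices, not part of the guidelines; only the base URI and the path layout come from the slide.

```python
import re
import unicodedata
from urllib.parse import quote

BASE_URI = "http://data.gov.bo/"  # base URI taken from the slide


def slugify(label: str) -> str:
    """Turn a human-readable label into a URI-safe path segment."""
    # Strip accents (an illustrative policy choice, not mandated by the guidelines).
    ascii_label = (
        unicodedata.normalize("NFKD", label).encode("ascii", "ignore").decode("ascii")
    )
    # Collapse whitespace runs into single underscores.
    ascii_label = re.sub(r"\s+", "_", ascii_label.strip())
    # Percent-encode anything that is still unsafe in a path segment.
    return quote(ascii_label, safe="_-")


def tbox_uri(term: str, base: str = BASE_URI) -> str:
    """URI for an ontology term (class or property)."""
    return f"{base}ontology/{slugify(term)}"


def abox_uri(resource_type: str, name: str, base: str = BASE_URI) -> str:
    """URI for an instance, following {base}resource/{type}/{name}."""
    return f"{base}resource/{slugify(resource_type)}/{slugify(name)}"


print(tbox_uri("Province"))
print(abox_uri("province", "Tiraque"))
```

Keeping the TBox and ABox prefixes in separate functions makes it harder to mint an instance URI inside the ontology namespace by accident.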
12. Specification
GeoLinkedData - URI design
• Base URI
http://linkeddata.es/
http://geo.linkeddata.es/
• TBox URIs
http://geo.linkeddata.es/ontology/{concept|property}
http://geo.linkeddata.es/ontology/Provincia
• ABox URIs
http://geo.linkeddata.es/resource/{resource type}/{resource name}
http://geo.linkeddata.es/resource/Provincia/Madrid
13. Specification
Definition of the license
• Several possibilities
• The UK Open Government License
• Open Database License
• Public Domain Dedication and License
• Open Data Commons Attribution License
• The Creative Commons Licenses
It is also possible to reuse and apply an existing license of the government data sources.
14. Specification
GeoLinkedData - Definition of the license
• Reusing the original license of the government data sources. The IGN and INE data sources have their own license, similar to the Attribution-Share Alike 2.5 Generic License
http://creativecommons.org/licenses/by-sa/2.5/
16. Modelling
Ontology
• An ontology is an engineering artifact, which provides:
• A set of terms
• A set of explicit assumptions regarding the intended meaning of the terms.
• Almost always including concepts and their classification
• Almost always including properties between concepts
• Shared understanding of a domain of interest
• Ontologies expressed in OWL or RDF(S), both based on RDF
17. Modelling
Reuse available vocabularies
• Search for suitable vocabularies, e.g., in Linked Open Vocabularies
• Are there suitable vocabularies?
  • Yes: build the vocabulary by reusing available vocabularies
  • No: …
18. Modelling
Reuse available non-ontological resources
• Search for suitable non-ontological resources in highly reliable Web sites, domain-related sites, and government catalogs
• Are there suitable resources?
  • Yes: build the vocabulary by transforming available resources
  • No: build the vocabulary from scratch
19. Modelling
GeoLinkedData
Vocabularies and ontologies shown in the diagram:
• WGS84 Geo Positioning: an RDF vocabulary
• scv:Dimension, scv:Item, scv:Dataset
• Hydrographical phenomena (rivers, lakes, etc.)
• Vocabulary for instants, intervals, durations, etc.
• Names and international code systems for territories and groups
• Ontology for OGC Geography Markup Language
Ontology metrics:
• Classes: 33
• Object Properties: 44
• Data Properties: 318
http://neon-toolkit.org/
23. Generation
Transformation
• Take the data sources selected in the specification activity and transform them to RDF according to the vocabulary created in the modelling activity
• Some tools
  • CSV and spreadsheets: RDF extension of Google Refine, XLWrap, RDF123, NOR2O
  • RDB: D2R Server, ODEMapster, W3C RDB2RDF WG – R2RML
  • XML: GRDDL, ReDeFer
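To make the transformation step concrete, here is a minimal sketch that turns a small CSV table into N-Triples using only the Python standard library. It is not how NOR2O or the other tools listed above work internally. The column names follow the Industry Production Index table shown in these slides, but the sample rows are made up, and the observation URIs and property names (`province`, `year`, `industryProductionIndex`) are hypothetical.

```python
import csv
import io
from urllib.parse import quote

BASE = "http://geo.linkeddata.es/"
XSD = "http://www.w3.org/2001/XMLSchema#"

# Made-up sample rows, for illustration only (not real INE figures).
SAMPLE_CSV = """Province,Year,Industry Production Index
Madrid,2008,100.0
Granada,2008,95.5
"""


def segment(value: str) -> str:
    """Make a cell value safe to use as a URI path segment."""
    return quote(value.strip().replace(" ", "_"), safe="_-")


def rows_to_ntriples(csv_text: str) -> list[str]:
    """Emit three triples per CSV row: province, year, and index value."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        province = segment(row["Province"])
        year = row["Year"].strip()
        index = row["Industry Production Index"].strip()
        # Hypothetical URI pattern for one observation per (province, year).
        obs = f"<{BASE}resource/IndexObservation/{province}_{segment(year)}>"
        triples.append(
            f"{obs} <{BASE}ontology/province> <{BASE}resource/Provincia/{province}> ."
        )
        triples.append(f'{obs} <{BASE}ontology/year> "{year}"^^<{XSD}gYear> .')
        triples.append(
            f'{obs} <{BASE}ontology/industryProductionIndex> "{index}"^^<{XSD}decimal> .'
        )
    return triples


for line in rows_to_ntriples(SAMPLE_CSV):
    print(line)
```

In practice a mapping language such as R2RML keeps this column-to-property mapping declarative instead of hard-coding it.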
25. Generation
GeoLinkedData - Transformation
[Figure: the source table (Province, Year, Industry Production Index) transformed with NOR2O]
26. Generation
GeoLinkedData - Transformation
• R2O is an extensible, fully declarative language to describe mappings between relational database schemas and ontologies.
• The ODEMapster processor generates RDF instances from relational instances based on the mapping description expressed in the R2O document
www.oeg-upm.net/index.php/en/downloads/9-r2o-odempaster
27. Generation
GeoLinkedData - Transformation
• Creation of the R2O Mappings
29. Generation
GeoLinkedData - Transformation
• Tool for generating RDF from geometrical information
• The geometry could be available in GML or WKT
• The RDF generated follows our Geometry Model
http://www.oeg-upm.net/index.php/en/downloads/151-geometry2rdf
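As a rough illustration of generating RDF from WKT, the sketch below parses a two-dimensional WKT POINT and emits latitude and longitude triples. It does not reproduce geometry2rdf or the authors' Geometry Model: it assumes the W3C WGS84 Geo vocabulary (`geo:lat`, `geo:long`), handles only points, and assumes the coordinates are given as longitude then latitude. The subject URI in the example is hypothetical.

```python
import re

GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"

# Matches "POINT(x y)" with optional whitespace and signed decimal numbers.
_POINT = re.compile(
    r"^\s*POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)\s*$",
    re.IGNORECASE,
)


def wkt_point_to_ntriples(subject_uri: str, wkt: str) -> list[str]:
    """Return geo:lat / geo:long triples for a 2D WKT POINT."""
    match = _POINT.match(wkt)
    if match is None:
        raise ValueError(f"not a 2D WKT POINT: {wkt!r}")
    # Assumption: x is longitude and y is latitude.
    lon, lat = match.groups()
    return [
        f'<{subject_uri}> <{GEO}lat> "{lat}" .',
        f'<{subject_uri}> <{GEO}long> "{lon}" .',
    ]


# Hypothetical resource URI and coordinates, for illustration only.
for line in wkt_point_to_ntriples(
    "http://geo.linkeddata.es/resource/Example", "POINT(-3.7 40.4)"
):
    print(line)
```

Line and polygon geometries, such as rivers, need a richer model than a single latitude/longitude pair, which is the gap a dedicated geometry model fills.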
30. Generation
GeoLinkedData - Transformation
Oracle SDO_UTIL package
-- Serialize the geometry of each feature labelled 'Arroyo' as GML 3.1.1
SELECT TO_CHAR(SDO_UTIL.TO_GML311GEOMETRY(geometry))
AS Gml311Geometry
FROM "BCN200"."BCN200_0301L_RIO" c
WHERE c.Etiqueta='Arroyo'
32. Generation
Data Cleansing
• Find possible errors, such as those identified by Hogan et al.
  • HTTP-level issues, such as accessibility and dereferenceability, e.g., HTTP URIs return 40x/50x errors
  • reasoning issues, such as namespace without vocabulary, e.g., rss:item term invented
  • malformed/incompatible datatypes, e.g., “true” as xsd:int
• Fix the identified errors
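Two of the error classes above lend themselves to simple automated checks. The sketch below is an assumption-laden illustration, not the procedure of Hogan et al.: it validates literal lexical forms against a deliberately incomplete set of XSD datatype patterns (ignoring value ranges) and classifies HTTP status codes, without performing any network requests.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"

# Lexical-form patterns for a few XSD datatypes (incomplete; ranges not checked).
LEXICAL_PATTERNS = {
    XSD + "int": re.compile(r"[+-]?\d+"),
    XSD + "integer": re.compile(r"[+-]?\d+"),
    XSD + "boolean": re.compile(r"true|false|1|0"),
    XSD + "decimal": re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)"),
}


def is_well_formed(lexical: str, datatype: str) -> bool:
    """Check whether a literal's lexical form fits its declared datatype."""
    pattern = LEXICAL_PATTERNS.get(datatype)
    if pattern is None:
        return True  # datatype not covered by this sketch: do not flag it
    return pattern.fullmatch(lexical.strip()) is not None


def is_http_error(status: int) -> bool:
    """True for the 40x/50x responses that indicate a dereferencing problem."""
    return 400 <= status < 600


print(is_well_formed("true", XSD + "int"))      # the slide's example of a bad literal
print(is_well_formed("true", XSD + "boolean"))
print(is_http_error(404))
```

Checks like these only find the errors; deciding how to fix a flagged literal or a dead URI usually still needs a look at the source data.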
33. Generation
GeoLinkedData – Data Cleansing
• Errors
  • Some resources with the same name were mixed up. For example, Granada municipality belongs to Granada Province, and La Granada municipality belongs to Barcelona Province.
  • Autonomous communities that only have one province, e.g., Murcia Region, were missing some municipalities, but their corresponding provinces, e.g., Murcia Province, had the correct number of municipalities.
  • Some hydrographical resources were missing some parts of their geometrical information.
34. Generation
Linking
• Identify suitable data sets as linking targets: http://ckan.net
• Discover relationships between data items
  • LIMES: http://aksw.org/Projects/limes
  • Silk Framework: http://www4.wiwiss.fu-berlin.de/bizer/silk/
• Validate the relationships discovered
  • sameAs Validator: http://oegdev.dia.fi.upm.es:8080/sameAs/
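The discovery step can be illustrated with a naive label-similarity matcher. This is not how LIMES or Silk work; it is a sketch that uses `difflib.SequenceMatcher` from the standard library, an arbitrary threshold of 0.9, and a hypothetical target dataset under `http://example.org/`. The candidate links it proposes would still need the validation step above.

```python
from difflib import SequenceMatcher

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between two labels, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidate_links(source: dict, target: dict, threshold: float = 0.9) -> list:
    """Propose owl:sameAs links between resources whose labels are similar."""
    links = []
    for s_uri, s_label in source.items():
        for t_uri, t_label in target.items():
            score = similarity(s_label, t_label)
            if score >= threshold:
                links.append((s_uri, OWL_SAME_AS, t_uri, score))
    return links


source = {
    "http://geo.linkeddata.es/resource/Provincia/Madrid": "Madrid",
    "http://geo.linkeddata.es/resource/Provincia/Granada": "Granada",
}
# Hypothetical target dataset, for illustration only.
target = {
    "http://example.org/place/madrid": "Madrid",
    "http://example.org/place/la-granada": "La Granada",
}

for s, p, t, score in candidate_links(source, target):
    print(f"<{s}> <{p}> <{t}> .  # score={score:.2f}")
```

With this threshold "Granada" and "La Granada" are not linked, which shows why the threshold and the validation step matter for near-identical names.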
40. Publication
Metadata Publication
• VoID allows expressing metadata about RDF datasets
• Open Provenance Model
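A VoID description is itself a small RDF document. The sketch below assembles one in Turtle with plain string formatting; the `void:` and `dcterms:` terms are real vocabulary terms, but the dataset URI, title, and SPARQL endpoint passed in the example are hypothetical, and the function does not escape quotes inside the title.

```python
def void_description(
    dataset_uri: str,
    title: str,
    sparql_endpoint: str,
    uri_space: str,
    license_uri: str,
) -> str:
    """Build a minimal VoID description of a dataset, serialized as Turtle."""
    return "\n".join(
        [
            "@prefix void: <http://rdfs.org/ns/void#> .",
            "@prefix dcterms: <http://purl.org/dc/terms/> .",
            "",
            f"<{dataset_uri}> a void:Dataset ;",
            f'    dcterms:title "{title}" ;',
            f"    dcterms:license <{license_uri}> ;",
            f"    void:sparqlEndpoint <{sparql_endpoint}> ;",
            f'    void:uriSpace "{uri_space}" .',
        ]
    )


# Hypothetical dataset URI and endpoint; the license URI is the one cited for GeoLinkedData.
example = void_description(
    dataset_uri="http://geo.linkeddata.es/dataset",
    title="GeoLinkedData",
    sparql_endpoint="http://geo.linkeddata.es/sparql",
    uri_space="http://geo.linkeddata.es/resource/",
    license_uri="http://creativecommons.org/licenses/by-sa/2.5/",
)
print(example)
```

Stating the license and the URI space in the VoID file lets consumers find the terms of use and recognize which URIs belong to the dataset.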
41. Publication
Dataset discovery
• Register the dataset in the CKAN registry
• Generate sitemap files for your dataset by using sitemap4rdf
• Submit the sitemap location to Google and Sindice
http://www.w3.org/wiki/TaskForces/CommunityProjects/LinkingOpenData/DataSets/CKANmetainformation
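For readers unfamiliar with sitemap files, the sketch below builds a plain sitemap (sitemaps.org protocol) listing resource URIs. It is not the output format of sitemap4rdf, which may add dataset-specific extensions; it only shows the basic `urlset`/`url`/`loc` structure using the standard library.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls: list[str]) -> str:
    """Serialize a list of URLs as a sitemaps.org <urlset> document."""
    # Use the sitemap namespace as the default namespace in the output.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in urls:
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
    return ET.tostring(urlset, encoding="unicode")


print(
    build_sitemap(
        [
            "http://geo.linkeddata.es/resource/Provincia/Madrid",
            "http://geo.linkeddata.es/resource/Provincia/Granada",
        ]
    )
)
```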
42. Publication
GeoLinkedData – Dataset publication
• Access through HTML, Linked Data, and SPARQL
• Pubby 0.3, including provenance support: http://www4.wiwiss.fu-berlin.de/pubby/
• Virtuoso 6.1.0
46. Exploitation
GeoLinkedData
http://oegdev.dia.fi.upm.es/projects/map4rdf/
map4rdf:
• Google Maps viewer of RDF resources
• Resources with spatial information
• Extensible with Google plugins
• Used in other applications like Aemet, Goodrelations
[Diagram: map4rdf, SPARQL, triplestore]