Open Tourism!
The importance of enriching your online content
with semantic annotations.
#openbelgium15 #opentourism
Open Tourism!
AN OPEN KNOWLEDGE BELGIUM
WORKING GROUP
@rafke #opentourism
WHAT
…
WHAT
WHAT COULD BE?
WHAT SHOULD WE DO?
Joined forces with ‘sustainable mobile tourism
guides’, Thomas More - iMinds
Open Tourism!
Enriching your online content
with semantic annotations.
Anastasia Dimou
@natadimou #opentourism
Data model - Schemas - Vocabularies
Identify an Entity
http://dbpedia.org/resource/Belfry_of_ghent
Describe the attributes of an Entity
Belfry construction_started 1313
Belfry construction_finished 1380
Describe the relationship of an Entity with other Entities
Belfry located_in Gent
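The three steps on this slide (identify, describe attributes, describe relationships) can be sketched as subject-predicate-object triples. This is only an illustration and not a full RDF toolkit; the `http://example.org/vocab/` property URIs and the URI chosen for Gent are assumptions made for the example.

```python
# Minimal sketch: an entity, its attributes and a relationship as triples.
# The example.org vocabulary is hypothetical; a real dataset would reuse
# properties from an existing vocabulary.

BELFRY = "http://dbpedia.org/resource/Belfry_of_ghent"  # identifies the entity
GENT = "http://dbpedia.org/resource/Ghent"              # assumed URI for Gent
VOCAB = "http://example.org/vocab/"                     # hypothetical vocabulary

triples = [
    (BELFRY, VOCAB + "construction_started", 1313),   # attribute
    (BELFRY, VOCAB + "construction_finished", 1380),  # attribute
    (BELFRY, VOCAB + "located_in", GENT),             # relationship
]


def to_ntriples(triples):
    """Serialize triples as N-Triples-style lines (URIs in <>, literals quoted)."""
    lines = []
    for s, p, o in triples:
        # Strings starting with http:// are treated as URIs, anything else as a literal.
        is_uri = isinstance(o, str) and o.startswith("http://")
        obj = f"<{o}>" if is_uri else f'"{o}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


print(to_ntriples(triples))
```

Using a URI as the object of `located_in` is what turns a plain attribute into a link to another entity.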
Reasons to semantically enrich your data
• discoverability / searchability
information recognized by major search providers
• indexing
• cross-referenced data
• (structured) data interoperability / integration / reuse
• automation
Semantically Enrich your Web presence
Web of Documents
Inline markup of HTML documents
Embedded metadata within HTML documents
Raw metadata accompanying the HTML document
Web of Data
Open Data
raw data AND data as APIs
raw data with metadata (data catalogues)
Linked Open Data
Inline Markup of HTML Web pages
nested metadata within HTML Web pages content (RDFa, Microdata)
(+) Search Engine Optimization (SEO) and Indexing
(+) Single infrastructure
(-) manual annotation by editors
(-) hard to maintain
(-) requires adjusting the existing Content Management System (CMS): plugin OR custom code
Cost: one-time adjustment/development cost
+ maintenance cost (perhaps extra adjustment/development cost)
Embedded metadata within HTML Web pages
structured data islands embedded in HTML Web pages (JSON-LD)
(+) Search Engine Optimization (SEO) and Indexing
(+) Single infrastructure
(+) easily deployable
(-) manual incorporation by admins
Cost: perhaps a one-time adjustment/development cost
+ maintenance cost (perhaps extra adjustment/development cost)
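A structured data island of the kind described on this slide could be generated along the following lines. This is a hedged sketch: the `jsonld_script_tag` helper is hypothetical, and the choice of the schema.org type `TouristAttraction` and of the `name` and `sameAs` properties is an assumption made for illustration.

```python
import json

# Hypothetical description of a tourism entity using schema.org terms.
belfry = {
    "@context": "http://schema.org",
    "@type": "TouristAttraction",
    "name": "Belfry of Ghent",
    "sameAs": "http://dbpedia.org/resource/Belfry_of_ghent",
}


def jsonld_script_tag(data):
    """Wrap a dict as a JSON-LD data island that can be pasted into an HTML page."""
    body = json.dumps(data, indent=2)
    # Escape "</" so the JSON cannot close the <script> element early.
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


print(jsonld_script_tag(belfry))
```

Because the island is separate from the visible HTML, it can be produced by a template or script without touching the page content itself.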
Raw metadata complementary to the Web pages
structured data complementary to the Web pages (raw JSON-LD)
(+) easily deployable
(-) manual annotation by experts
(-) shortage of crawlers
Cost: none
Semantically Enrich your Web presence
Many Web sites are generated from structured data,
which is often stored in databases.
And in general, you have more data to share…
Open Data
raw data
(+) simple solution
(+) at no cost
data as Web APIs
(+) easily deployable
(+) publishing data in Web readable formats
(+) low cost - low maintenance cost
(-) data is reusable only by people who know it exists
(-) not synchronized with the Web site content
Open Data with metadata - Data catalogues
(data) with metadata → data catalogues
(+) catalogue integration with existing CMSs
(+) increased Searchability / Discoverability
(+) Federated structure: easily set up instances with common search
Cost: one-time cost to set up the infrastructure + maintenance cost
Integration with CMS: one-time cost + maintenance/synchronization cost
Linked Open Data
semantically annotated data
domain modeling
map raw data to their semantic representation
publishing Linked Open Data
(+) answering queries
(+) data integration / interlinking
1-time (human resources) cost for the domain modeling
1-time cost to set up the infrastructure (mapping and publishing)
maintenance cost
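The two benefits listed on this slide, answering queries and interlinking, can be illustrated with a toy in-memory set of triples. This is a sketch under stated assumptions: the `ex:` identifiers and both small datasets are made up for the example, and a real deployment would more likely expose a query endpoint than a Python list.

```python
def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Two hypothetical datasets that share the identifier "ex:gent".
site_data = [
    ("ex:belfry", "ex:located_in", "ex:gent"),
    ("ex:gravensteen", "ex:located_in", "ex:gent"),
]
other_data = [
    ("ex:gent", "ex:country", "ex:belgium"),
]

# Integration is just a union, because both datasets use the same identifier.
merged = site_data + other_data

# Query: which entities are located in Gent?
in_gent = match(merged, p="ex:located_in", o="ex:gent")
print([s for s, _, _ in in_gent])

# Interlinking: follow the shared identifier into the other dataset.
print(match(merged, s="ex:gent", p="ex:country"))
```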
Reasons to semantically enrich your data
• discoverability / searchability
information recognized by major search providers
• indexing
• cross-referenced data
• (structured) data interoperability / integration / reuse
• automation
Open Tourism!
How to add semantic annotations?
Laurens De Vocht
@Laurens_d_v #opentourism
Adding Semantic Annotations
Web Documents
  Without metadata: HTML
  With metadata: HTML with rich snippets, HTML with schema.org
Raw Data
  Without metadata: file dump, API
  With metadata: DCAT, VOID (description); CKAN (catalog)
Linked Data
  With metadata: RDF (data model); file dumps, dereferencing (retrieval); endpoint, API (queries)
Adding Metadata to Web Documents
What search engines display to us:
<div>
  <a href="http://www.example.com/events/spinaltap">Spinal Tap</a>
  <img src="spinal_tap.jpg" />
  After their highly-publicized search for a new drummer,
  Spinal Tap kicks off their latest comeback tour with a San
  Francisco show.
  When: Oct 15, 7:00PM-9:00PM
  Where: Warfield Theatre, 982 Market St, San Francisco, CA
</div>
What search engines read:
<div xmlns:v="http://rdf.data-vocabulary.org/#" typeof="v:Event">
  <a href="http://www.example.com/events/spinaltap" rel="v:url"
     property="v:summary">Spinal Tap</a>
  <img src="spinal_tap.jpg" rel="v:photo" />
  <span property="v:description">After their highly-publicized search for a new drummer,
  Spinal Tap kicks off their latest comeback tour with a San Francisco show. </span>
  When:
  <span property="v:startDate" content="2015-10-15T19:00-08:00">Oct 15, 7:00PM</span>
  <span property="v:endDate" content="2015-10-15T21:00-08:00">9:00PM</span>
  Where:
  <span rel="v:location">
    <span typeof="v:Organization">
      <span property="v:name">Warfield Theatre</span>,
      <span rel="v:address">
        <span typeof="v:Address">
          <span property="v:street-address">982 Market St</span>,
          <span property="v:locality">San Francisco</span>,
          <span property="v:region">CA</span>
        </span>
      </span>
...
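To get a feel for what a machine reads from the annotated markup above, the following sketch pulls the `property` values out of a shortened copy of the snippet using only the Python standard library. It is a deliberately simplified illustration, not an RDFa processor: the `PropertyExtractor` class is hypothetical, it ignores `rel` and `typeof`, and it assumes that elements carrying a `property` contain plain text only.

```python
from html.parser import HTMLParser


class PropertyExtractor(HTMLParser):
    """Collect (property, value) pairs from RDFa-style `property` attributes."""

    def __init__(self):
        super().__init__()
        self.pairs = []       # extracted (property, value) pairs
        self._current = None  # property whose text content is being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("property")
        if prop is None:
            return
        if "content" in attrs:
            # The machine-readable value is given explicitly.
            self.pairs.append((prop, attrs["content"]))
        else:
            # Otherwise the value is the element's text content.
            self._current = prop
            self._text = []

    def handle_data(self, data):
        if self._current is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._current is not None:
            value = " ".join("".join(self._text).split())
            self.pairs.append((self._current, value))
            self._current = None


snippet = """
<div xmlns:v="http://rdf.data-vocabulary.org/#" typeof="v:Event">
  <a href="http://www.example.com/events/spinaltap" rel="v:url"
     property="v:summary">Spinal Tap</a>
  <span property="v:startDate" content="2015-10-15T19:00-08:00">Oct 15, 7:00PM</span>
  <span property="v:locality">San Francisco</span>
</div>
"""

parser = PropertyExtractor()
parser.feed(snippet)
for prop, value in parser.pairs:
    print(prop, "=", value)
```

Note how the start date comes from the `content` attribute rather than from the human-readable text "Oct 15, 7:00PM".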
Adding Metadata to Web Documents
http://rdf.data-vocabulary.org/ is a 'lightweight' alternative to http://schema.org
schema.org:
• has a broader range of types
• has more documentation
• is broadly supported by different search engines
• has types for Event, Organization, Person, Product, Review, AggregateRating, Offer and hundreds of others
full list -> http://schema.org/docs/full.html
Schemas - Vocabularies - Schema.org
a collection of schemas to mark up HTML pages
Leveraging the CMS infrastructure
WordPress, Drupal and Joomla together cover over 50% of the websites that use a CMS.
source: builtwith.com
WordPress: Adding Rich Snippets Example
e.g. use the Google SEO Pressor plugin (https://wordpress.org/plugins/google-seo-author-snippets/)
WordPress: Adding Schema.org Example
e.g. use the Add Metadata Tags plugin (https://wordpress.org/plugins/add-meta-tags/)
Other CMSs offer plugins with similar functionality
e.g. Joomla: J4Schema (http://extensions.joomla.org/extensions/extension/site-management/seo-a-metadata/j4schema)
e.g. Drupal 7: Schema.org (https://www.drupal.org/node/1194024)
Alternative to CMS plugins
Custom Scripts
Manual Annotations
Annotation Templates
Does it really work? How to be sure?
Try with your HTML snippet or URL -> https://developers.google.com/structured-data/testing-tool/
What's next?
[Diagram: Raw Data, CMS, "?"; caption: publishing raw data directly]
Publishing Data Directly
Publish a link to the raw data (file dump) from your website to a platform like datahub.io
e.g. http://datahub.io/dataset/outbound-travel-by-new-zealanders
How can others find it?
Add metadata and register it!
catalog with metadata of the dataset
(Meta)Data Catalog
CKAN is one of the most common platforms for maintaining a data catalog.
e.g. the Flanders Open Data portal (opendataforum.info) is built on top of CKAN.
e.g. datahub.io also makes use of it.
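Registering a dataset in a catalog pays off because the catalog can then be searched by machines. As a rough sketch, assuming a portal exposes the standard CKAN Action API, a dataset search URL could be built as follows. The `ckan_search_url` helper is hypothetical, and whether a given portal actually serves this API at this path is an assumption to verify; the code only builds the URL and does not contact any server.

```python
from urllib.parse import urlencode


def ckan_search_url(portal, query, rows=10):
    """Build a CKAN Action API URL that searches a catalog for datasets."""
    params = urlencode({"q": query, "rows": rows})
    return f"{portal.rstrip('/')}/api/3/action/package_search?{params}"


# Portal URL used for illustration only.
print(ckan_search_url("http://datahub.io", "tourism"))
```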
(Meta)Data Description of Data
DCAT (http://www.w3.org/TR/vocab-dcat/)
integrates with catalogs, such as CKAN.
  :ds1 a dcat:Dataset ;
      dcat:distribution :dist1 .
  :dist1 a dcat:Download ;
      dcat:accessURL <http://example.org/dist1.csv> ;
      dcat:format [ rdfs:label "CSV" ] .
VOID (http://www.w3.org/TR/void/)
expresses additional information such as categories, links with other datasets, hierarchical relationships, etc.
  :DBpedia rdf:type void:Dataset ;
      foaf:homepage <http://dbpedia.org/> .
  :DBLP rdf:type void:Dataset ;
      foaf:homepage <http://www4.wiwiss.fu-berlin.de/dblp/all> ;
      dcterms:subject <http://dbpedia.org/resource/Computer_science> .
  :DBpedia void:subset :DBpedia2DBLP .
  :DBpedia2DBLP rdf:type void:Linkset ;
      void:target :DBpedia ;
      void:target :DBLP .
what machines see (more)
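A dataset description like the DCAT one above is usually generated rather than typed by hand. The sketch below shows one possible way to do that with plain string formatting. The `dcat_turtle` helper, the `http://example.org/` namespace and the sample values are assumptions; the sketch uses `dcat:Distribution` and `dcat:downloadURL` from the W3C DCAT Recommendation instead of the `dcat:Download` class shown on the slide, and it does not escape special characters in the title.

```python
def dcat_turtle(dataset_id, title, download_url, media_type):
    """Return a small DCAT description of one dataset with one distribution."""
    return (
        "@prefix dcat: <http://www.w3.org/ns/dcat#> .\n"
        "@prefix dct: <http://purl.org/dc/terms/> .\n"
        "@prefix : <http://example.org/> .\n"
        "\n"
        f":{dataset_id} a dcat:Dataset ;\n"
        f'    dct:title "{title}" ;\n'
        f"    dcat:distribution :{dataset_id}-dist .\n"
        "\n"
        f":{dataset_id}-dist a dcat:Distribution ;\n"
        f"    dcat:downloadURL <{download_url}> ;\n"
        f'    dcat:mediaType "{media_type}" .\n'
    )


print(dcat_turtle("ds1", "Example dataset", "http://example.org/dist1.csv", "text/csv"))
```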
How to query and link datasets?
[Diagram: Raw Data, CMS; open questions: querying, linking?]
The catalog indicates supported ways of accessing data.
API: querying
Website: dereferencing
Metadata
http://datahub.io/dataset/tourpedia
Adding Semantics and Representing Data as Linked Data
When? (use case) -> How? (tool)
CSV, HTML, JSON, XML -> http://rml.io
RDB transformation -> http://www.w3.org/TR/r2rml/
Layer on top of RDB -> http://d2rq.org/
Web documents to triples -> http://any23.apache.org/
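The idea shared by the tools listed above is a mapping from raw records to triples. The sketch below shows that idea with a hand-written mapping over CSV. It is not RML or R2RML and does not use any of those tools: the `rows_to_triples` function, the `http://example.org/` URIs and the two sample rows are all made up for illustration.

```python
import csv
import io

BASE = "http://example.org/resource/"  # hypothetical base URI for subjects
VOCAB = "http://example.org/vocab/"    # hypothetical vocabulary for predicates

# Made-up sample data standing in for a raw data dump.
csv_text = """id,name,city
belfry,Belfry of Ghent,Gent
gravensteen,Gravensteen,Gent
"""


def rows_to_triples(text, key="id"):
    """Map each CSV row to triples: one subject per row, one predicate per column."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = BASE + row[key]
        for column, value in row.items():
            # The key column becomes the subject URI; empty cells yield no triple.
            if column != key and value:
                triples.append((subject, VOCAB + column, value))
    return triples


for triple in rows_to_triples(csv_text):
    print(triple)
```

A declarative mapping language expresses the same row-to-subject and column-to-predicate rules as data instead of code, which makes them easier to maintain and share.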
[Diagram: CMS, Raw Data, Other Raw Data and Catalog, spanning the Web of Documents and the Web of Data]
Open Tourism!
Barriers and Solutions to Open Tourism Data
Panel discussion
@rafke @mindwraps #opentourism
Panel discussion
Join our discussion on http://bit.ly/opendisc
Marc @mportier
Anastasia @natadimou
Laurens @Laurens_d_v
Raf @rafke
Veronique @VeroniqueCosse
Wouter @mindwraps
join Open Tourism working group!
#opentourism
raf{dot}buyle{at}okfn{dot}be
A world where knowledge creates power
for the many, not the few.
A world where data frees us — to make
informed choices about how we live, what
we buy and who gets our vote.
A world where information and insights are
accessible — and apparent — to everyone.
This is the world we choose
#openbelgium15 #opentourism

Open Belgium 2015 - Open Tourism
