Semantic Web and Drupal 7
Stéphane Corlosquet
ESIP webinar
June, 2013
About the speaker
● Stéphane “scor” Corlosquet
● 7 years with Drupal
● Software engineer
● Drupal 7 RDF core maintainer
● Drupal Security Team member
● Co-authored The Definitive Guide to Drupal 7
● Contrib modules: RDF Extensions,
SPARQL, schema.org, WebID
● Member of the RDFa WG at W3C
The Semantic Web
The Web today
Many information silos
Image credits: www.pidgintech.com
Many isolated and disparate communities
Image credits: www.pidgintech.com
Growing amount of information
● Blogs, News, Comments
● Social platforms: Facebook, Google+
● Every day, more and more content is published
● Desktop, laptops, tablets, smartphones...
● Sensor data for weather, traffic, healthcare
● Billions of public pages
● Deep web?
What do machines see?
Challenge:
How can machines help us
search all this information?
Vision of the Semantic Web
● Transition to the Giant Global Graph
● WWW = content+links
● GGG = WWW+relationships+descriptions
● Universal medium for data, information and
knowledge exchange
Evolution of the Web
The One Machine
● All devices connected
● Personal computers
● Data servers
● Cell phones
● PDAs
● RFID tags
http://www.kk.org/thetechnium/archives/2007/11/dimensions_of_t.php
Key
● Agree on Standards
● Open Data
Linked Data
Linked Open Data Cloud
LOD cloud by domain
Distribution of triples by domain.
Source: Linking Open Data cloud statistics, by Richard Cyganiak and Anja
Jentzsch. http://lod-cloud.net/
LOD cloud providers
● BBC
● US Census
● UK Gov
● MusicBrainz
● DBpedia
● Wikidata (Wikimedia Foundation, funded by Google, etc.)
● Freebase (Google)
Rich Snippets
Google
Yahoo!
Bing
Structured Data in HTML
● Helps machines extract
relevant data from HTML
● Can make use of this data
in new ways:
– enhanced search results
– Knowledge graph
● Search engines only index HTML
Structured Data in HTML
● HTML attributes
● Syntaxes
– Microformats (@class, @rel)
– RDFa (@property, @typeof, @resource…)
– Microdata (@itemscope, @itemtype, @itemprop, …)
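As a rough illustration of how a machine can pull structured data out of these attributes, here is a minimal Python sketch that reads @typeof and @property from an RDFa fragment. The HTML fragment and the class name RDFaPropertyParser are hypothetical, and the logic is far simpler than a real RDFa processor (no prefixes, no nesting, no @resource).

```python
from html.parser import HTMLParser

# Hypothetical page fragment marked up with RDFa and schema.org terms.
HTML = """
<div vocab="http://schema.org/" typeof="Event">
  <h1 property="name">ESIP webinar</h1>
  <span property="startDate" content="2013-06">June, 2013</span>
</div>
"""


class RDFaPropertyParser(HTMLParser):
    """Collects types and (property, value) pairs from RDFa attributes."""

    def __init__(self):
        super().__init__()
        self.types = []
        self.pairs = []
        self._pending = None  # property still waiting for its text value

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "typeof" in attrs:
            self.types.append(attrs["typeof"])
        if "property" in attrs:
            if "content" in attrs:
                # @content overrides the visible text of the element.
                self.pairs.append((attrs["property"], attrs["content"]))
            else:
                self._pending = attrs["property"]

    def handle_data(self, data):
        # Use the first non-empty text node as the value of a pending property.
        if self._pending and data.strip():
            self.pairs.append((self._pending, data.strip()))
            self._pending = None


parser = RDFaPropertyParser()
parser.feed(HTML)
print(parser.types)
print(parser.pairs)
```

The human-readable text ("June, 2013") and the machine-readable value ("2013-06") live in the same element, which is the main appeal of embedding structured data in HTML.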
Schema.org
Schema.org
● Describe the type of your content (Person,
Event, Recipe, Product, Book, Movie, etc.)
– 416 types and counting
● Each type has a set of properties
– Common properties: name, description, image, url
– Specific properties depending on the type (see type page
on schema.org)
– 544 properties and counting
Credits: Dan Brickley
Schema.org
How does schema.org apply to Drupal?
● Content types
Schema.org module for Drupal
● Map your content types and fields to the
schema.org terms
http://drupal.org/project/schemaorg
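The idea of mapping content types and fields to schema.org terms can be sketched in Python as follows. This is only an illustration: the mapping layout, the field names (field_date, field_internal) and the function to_schema_org are assumptions, not the module's actual storage format or API.

```python
import json

# Hypothetical mapping of a Drupal "event" content type to schema.org terms:
# one schema.org type for the content type, one property per mapped field.
EVENT_MAPPING = {
    "type": "Event",
    "fields": {
        "title": "name",
        "body": "description",
        "field_date": "startDate",
    },
}


def to_schema_org(node, mapping):
    """Describe a node (a plain dict here) with the mapped schema.org terms."""
    doc = {"@context": "http://schema.org", "@type": mapping["type"]}
    for field, prop in mapping["fields"].items():
        if field in node:
            doc[prop] = node[field]
    return doc


# Hypothetical node; unmapped fields are simply left out of the output.
node = {
    "title": "ESIP webinar",
    "body": "Semantic Web and Drupal 7",
    "field_date": "2013-06",
    "field_internal": "not mapped",
}

result = to_schema_org(node, EVENT_MAPPING)
print(json.dumps(result, indent=2))
```

Keeping the mapping as data rather than code is what makes a mapping UI possible: site builders change the dictionary, not the rendering logic.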
Content types and Fields
Rich Snippet testing tool
● http://www.google.com/webmasters/tools/richsnippets
Schema.org module
● http://drupal.org/project/schemaorg
– UI for mapping content types and fields to schema.org
– Documentation on drupal.org
– Screencast + examples
RDF architecture
in Drupal 7
Architecture
● User driven data model
Content types and Fields
Node
Drupal 7 and RDF
● The RDF mapping API allows any vocabulary
● Default mappings on blogs, forums, comments,
etc. using FOAF, SIOC, DC, SKOS
● Drupal 7 core outputs these mappings in RDFa
● Mappings can be changed to include other
vocabularies like schema.org
Drupal 7 default RDF mappings
Drupal 7 core RDF limitations
● No schema.org out of the box
● No UI for managing the RDF mappings
● Only core fields are supported (text, file, image)
– No support for contrib fields: addressfield, fivestar
● No native support for Views or Panels
– Display suite 2.0 is OK
● Some contrib modules can help fix the above
● Drupal 8 to fix many of these issues
Drupal 7 and RDF
● Contributed modules for more features
● RDF Extensions
● Serialization formats: RDF/XML, Turtle, N-Triples
● Mapping UI
● RDF Indexer
● Expose Drupal RDF data in a SPARQL Endpoint
● SPARQL Views
● Display remote RDF data in Drupal using SPARQL
● JSON-LD
● Expose Drupal RDF data as JSON-LD (CORS-enabled)
● Features and packaging
● Build distributions / deployment workflow
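To give a feel for what one of the serialization formats listed above looks like, here is a minimal Python sketch that writes a few triples as N-Triples. The node URI (http://example.org/node/1) and the helper names are hypothetical, and the term handling is deliberately naive: any string starting with http:// or https:// is treated as an IRI and everything else as a plain literal.

```python
def ntriples_term(term):
    """Format one term: IRIs in angle brackets, anything else as a literal."""
    if term.startswith(("http://", "https://")):
        return f"<{term}>"
    escaped = (
        term.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one triple per line."""
    return "\n".join(
        f"{ntriples_term(s)} {ntriples_term(p)} {ntriples_term(o)} ."
        for s, p, o in triples
    )


# Hypothetical data about one Drupal node.
TRIPLES = [
    ("http://example.org/node/1",
     "http://purl.org/dc/terms/title",
     "Semantic Web and Drupal 7"),
    ("http://example.org/node/1",
     "http://rdfs.org/sioc/ns#num_replies",
     "3"),
]

output = to_ntriples(TRIPLES)
print(output)
```

RDF/XML and Turtle carry the same triples in different syntaxes; N-Triples is shown here only because it is the simplest to write by hand.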
Drupal and SPARQL
RDF store + SPARQL Endpoint
● Indexing
Module: https://drupal.org/project/rdf_indexer
Documentation: https://drupal.org/node/2028111
RDF store + SPARQL Endpoint
● Public endpoint available at /sparql
RDF store + SPARQL Endpoint
● Lessons learnt:
– Previous implementation didn't scale
– http://drupal.org/project/sparql is deprecated (unless you
use the SPARQL registry with other modules)
– Use https://drupal.org/project/rdf_indexer instead
– Documentation: https://drupal.org/node/2028111
RDF store + SPARQL Endpoint
● Functionalities
– RDF Indexer is built on top of Search API
– Search API offers many of the capabilities needed:
● Track entities that need to be indexed
● Workflows
● Integration with Drush (list, status, clear, index)
● Cron
● OOP – Service class for plugins
– Integration with Features (config management)
RDF store + SPARQL Endpoint
● Example: popular tags by comments
– http://openspring.net/sparql
PREFIX dc: <http://purl.org/dc/terms/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX sioc: <http://rdfs.org/sioc/ns#>
SELECT ?tag (SUM(?replies) AS ?total_replies)
WHERE {
  ?post sioc:num_replies ?replies .
  ?post dc:subject [ rdfs:label ?tag ] .
}
GROUP BY ?tag
ORDER BY DESC(?total_replies)
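A client could send this kind of query to the endpoint over plain HTTP. The Python sketch below only builds the request, following the SPARQL protocol convention of a query parameter on a GET request. The query is written here in SPARQL 1.1 form with the rdfs prefix declared. The function name build_sparql_request is hypothetical, and whether the endpoint is still reachable or returns JSON results is an assumption.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://openspring.net/sparql"

QUERY = """
PREFIX dc: <http://purl.org/dc/terms/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX sioc: <http://rdfs.org/sioc/ns#>
SELECT ?tag (SUM(?replies) AS ?total_replies)
WHERE {
  ?post sioc:num_replies ?replies .
  ?post dc:subject [ rdfs:label ?tag ] .
}
GROUP BY ?tag
ORDER BY DESC(?total_replies)
"""


def build_sparql_request(endpoint, query):
    """Build (but do not send) a GET request carrying the SPARQL query."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT, QUERY)
print(request.full_url)
```

Passing the request to urllib.request.urlopen would actually run the query; that step is left out so the sketch works without network access.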
JSON-LD
JSON-LD in Drupal
● Client side as well as server side friendly
● Browser Scripting:
– Native JavaScript format
– Can be combined with RDFa API in the DOM
● Data can be fetched from anywhere:
– Cross-Origin Resource Sharing (CORS) enabled
● Client can mash data
● http://drupal.org/project/jsonld
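The "client can mash data" point can be sketched in a few lines of Python: two JSON-LD documents from different sources describe the same resource, and the client merges them by @id. Both documents, the example.org URI and the function mash are hypothetical, and the merge is naive (no JSON-LD expansion, later values overwrite earlier ones); a real client would more likely rely on a JSON-LD processor.

```python
import json

# Hypothetical JSON-LD as it might be exposed by a Drupal site.
DRUPAL_DOC = json.loads("""
{
  "@context": {"dc": "http://purl.org/dc/terms/"},
  "@id": "http://example.org/node/1",
  "dc:title": "Semantic Web and Drupal 7"
}
""")

# Hypothetical JSON-LD about the same resource from an independent source.
OTHER_DOC = json.loads("""
{
  "@context": {"sioc": "http://rdfs.org/sioc/ns#"},
  "@id": "http://example.org/node/1",
  "sioc:num_replies": 3
}
""")


def mash(*docs):
    """Merge documents that share an @id into one description per resource."""
    merged = {}
    for doc in docs:
        subject = merged.setdefault(
            doc["@id"], {"@context": {}, "@id": doc["@id"]}
        )
        subject["@context"].update(doc.get("@context", {}))
        for key, value in doc.items():
            if key not in ("@id", "@context"):
                subject[key] = value
    return merged


combined = mash(DRUPAL_DOC, OTHER_DOC)
print(json.dumps(combined, indent=2))
```

Because both sources identify the resource with the same URI, no custom join key is needed; this is the practical benefit of shared identifiers in Linked Data.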

Domeo + Drupal
● Data mash up from independent, but related sources
How to get involved
● Drupal group:
– http://groups.drupal.org/semantic-web
● IRC on freenode: #drupal-rdf
● Report bugs or ask support questions on drupal.org
Thanks!
● Stéphane Corlosquet:
– scorlosquet@gmail.com
– @scorlosquet
– http://openspring.net/