Drupal and the semantic web - SemTechBiz 2012

  • 1. Leveraging the Semantic Web with Drupal 7. Stéphane Corlosquet, Paolo Ciccarese, MIND Informatics. SemTechBiz San Francisco 2012, June 4th, 2012
  • 2. About the speakers ● Stéphane Corlosquet – 6 years with Drupal – Drupal core maintainer (RDF) – Drupal Security Team member – Co-authored the Definitive Guide to Drupal 7 – Co-maintain RDF Extensions, SPARQL, schema.org – Member of the RDFa WG
  • 3. About the speakers ● Paolo Ciccarese, PhD – Assistant in Neurology at Mass General Hospital – Research faculty at Harvard Medical School – Author of 30+ scientific publications – Senior software and knowledge engineer – Member of W3C HCLS Interest Group – Co-chair of the W3C Open Annotation Community Group
  • 4. Tutorial outline ● Introduction to Drupal – What is it good for – Installation / Hosted Drupal ● Semantic Web and Drupal – Technology stack – Use cases, hands on session ● Domeo & Drupal
  • 5. Drupal ● Dries Buytaert - small news site in 2000 ● Open Source - 2001 ● Content Management System ● LAMP stack ● Non-developers can build sites and publish content ● Control panels instead of code http://www.flickr.com/photos/funkyah/2400889778/
  • 6. Drupal ● Open & modular architecture ● Extensible by modules ● Standards-based ● Low resource hosting ● Scalable
  • 7. Building a Drupal site http://www.flickr.com/photos/toomuchdew/3792159077/
  • 8. Building a Drupal site ● Create the content types you need: Blog, article, wiki, forum, polls, image, video, podcast, e-commerce... (be creative) http://www.flickr.com/photos/georgivar/4795856532/
  • 9. Building a Drupal site ● Enable the features you want: Comments, tags, voting/rating, location, translations, revisions, search... http://www.flickr.com/photos/skip/42288941/
  • 10. Building a Drupal site ● Set how your content is displayed
  • 11. Building a Drupal site ● Thousands of free contributed modules – Google Analytics – Wysiwyg – Captcha – Calendar – XML sitemap – Five stars – Twitter – ... http://www.flickr.com/photos/kaptainkobold/1422600992/
  • 12. The Drupal Community http://www.flickr.com/photos/x-foto/4923221504/
  • 13. The Drupal Community “It’s really the Drupal community and not so much the software that makes the Drupal project what it is. So fostering the Drupal community is actually more important than just managing the code base.” - Dries Buytaert http://webchick.net/node/80
  • 14. Who uses Drupal?
  • 15. Who uses Drupal?
  • 16. Who uses Drupal?
  • 17. Who uses Drupal?
  • 18. Who uses Drupal?
  • 19. Who uses Drupal?
  • 20. Who uses Drupal?
  • 21. Who uses Drupal?
  • 22. Who uses Drupal?
  • 23. Who uses Drupal? http://buytaert.net/tag/drupal-sites
  • 24. Try Drupal 7 ● Download and Install Drupal 7 – Grab latest release http://drupal.org/project/drupal – LAMP stack: Mac OS: http://www.mamp.info/, Acquia Stack: http://acquia.com/downloads ● Drupal Gardens: free Drupal 7 site http://www.drupalgardens.com/
  • 25. Rich Snippets
  • 26. Google
  • 27. Yahoo!
  • 28. Bing
  • 29. Why Structured Data in HTML ● Help machines extract relevant data from HTML ● Consumers such as search engines can then reuse this data (e.g. enhanced search results)
  • 30. Structured Data in HTML ● Add or alter HTML attributes ● Syntaxes – Microformats (@class, @rel) – RDFa (@property, @about, @typeof, …) – Microdata (@itemscope, @itemtype, @itemprop, …) – RDFa 1.1 & RDFa Lite
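To make "help machines extract relevant data from HTML" concrete, here is a minimal Python sketch that reads Microdata attributes with the standard library. The HTML fragment and the `ItempropExtractor` class are illustrative assumptions, and the extractor deliberately ignores nesting and most of the real Microdata processing rules.

```python
from html.parser import HTMLParser

# Hypothetical page fragment marked up with Microdata attributes.
HTML = """
<div itemscope itemtype="http://schema.org/Event">
  <h1 itemprop="name">SemTechBiz 2012</h1>
  <span itemprop="location">San Francisco</span>
</div>
"""


class ItempropExtractor(HTMLParser):
    """Collects the text of elements that carry an itemprop attribute."""

    def __init__(self):
        super().__init__()
        self.item_type = None  # value of the first @itemtype seen
        self.properties = {}   # itemprop name -> text content
        self._current = None   # itemprop whose text is being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemtype" in attrs and self.item_type is None:
            self.item_type = attrs["itemtype"]
        if "itemprop" in attrs:
            self._current = attrs["itemprop"]
            self.properties[self._current] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.properties[self._current] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current property.
        self._current = None


def extract(html):
    """Return (item type, {property: text}) for a flat Microdata item."""
    parser = ItempropExtractor()
    parser.feed(html)
    return parser.item_type, {k: v.strip() for k, v in parser.properties.items()}


item_type, props = extract(HTML)
print(item_type, props)
```

The same idea applies to RDFa by looking at @typeof and @property instead of @itemtype and @itemprop.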
  • 31. Structured Data in HTML ● Evolution and cross-syntax influence
  • 32. Schema.org
  • 33. Schema.org ● Describe the type of your content (Person, Event, Recipe, Product, Book, Movie, etc.) – 290 types and counting ● Each type has a set of properties – Common properties: name, description, image, url – Specific properties depending on the type (see type page on schema.org) – 240 properties and counting
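As a rough illustration of "a type plus a set of properties", the Python sketch below renders a schema.org type and some of its common properties as RDFa Lite markup. The `rdfa_lite` helper and the sample values are assumptions made for this example; this is not how the Drupal module generates its output.

```python
from html import escape


def rdfa_lite(schema_type, properties):
    """Render schema.org properties as a small RDFa Lite fragment."""
    lines = [f'<div vocab="http://schema.org/" typeof="{escape(schema_type)}">']
    for name, value in properties.items():
        # Each property becomes one element with a @property attribute.
        lines.append(f'  <span property="{escape(name)}">{escape(value)}</span>')
    lines.append("</div>")
    return "\n".join(lines)


# Hypothetical content: an Event described with two common properties.
markup = rdfa_lite("Event", {
    "name": "DrupalCon Munich",
    "description": "Drupal conference (placeholder description)",
})
print(markup)
```

The resulting fragment can be pasted into a rich snippet testing tool to check how a consumer reads it.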
  • 34. Credits: Dan Brickley - link.
  • 35. Schema.org
  • 36. Schema.org module for Drupal ● UI instead of code ● Map your content types and fields to the schema.org terms http://drupal.org/project/schemaorg
  • 37. Example: Event
  • 38. Rich Snippet testing tool ● http://www.google.com/webmasters/tools/richsnippets
  • 39. Examples in the wild ● Events – “force11 events”: http://goo.gl/VVhNM – DrupalCon Munich: http://goo.gl/jgMvw ● Recipes – “delicious lemon coconut squares”: http://goo.gl/ORdl1 – Apple pie with ingredients: http://goo.gl/wCO1w
  • 40. Examples in the wild ● University of Waterloo – School of Public Health and Health Systems launch: http://goo.gl/Df9hp ● Curling tournament calendar – European Curling Championships 2012: http://goo.gl/YXgXl – World Women’s Curling Championships 2013: http://goo.gl/BDNZW
  • 41. Schema.org module ● http://drupal.org/project/schemaorg – Download module (beta) – Documentation on drupal.org – Screencast + examples
  • 42. Schema.org module ● Play time! http://www.google.com/webmasters/tools/richsnippets
  • 43. Drupal 7 and RDF
  • 44. History of RDF in Drupal ● rdf.php (2000, Dries) ● FOAF, vCard (2004, walkah) ● Relationship (2005, dman) ● Semantic Search (2006, hendler) ● RDF (2007, Arto) ● OpenCalais (2008, febbraro) ● RDF CCK (2008, scor)
  • 45. Drupal 7 and RDF ● Drupal 7 core is RDFa enabled ● RDFa output by default on blogs, forums, comments, etc.
  • 46. Architecture ● User driven data model ● Content type => RDF class ● Field => RDF property ● Node => RDF resource http://en.wikipedia.org/wiki/File:Oriente_Station_Lisboa_roof.jpg
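The three mappings on the Architecture slide can be sketched in a few lines of Python. The mapping table, the node dictionary and the `node_to_triples` function are hypothetical stand-ins chosen for illustration; Drupal stores and applies its RDF mappings in PHP, not like this.

```python
# Hypothetical mapping for one content type (CURIEs chosen for illustration).
RDF_MAPPING = {
    "article": {
        "rdf_class": "sioc:Item",      # content type => RDF class
        "fields": {                    # field => RDF property
            "title": "dc:title",
            "body": "content:encoded",
        },
    }
}


def node_to_triples(node, base_url):
    """Turn one node (a dict) into (subject, predicate, object) tuples."""
    mapping = RDF_MAPPING[node["type"]]
    subject = f"{base_url}/node/{node['nid']}"  # node => RDF resource
    triples = [(subject, "rdf:type", mapping["rdf_class"])]
    for field_name, rdf_property in mapping["fields"].items():
        if field_name in node:
            triples.append((subject, rdf_property, node[field_name]))
    return triples


# example.org is a placeholder site address.
node = {"nid": 1, "type": "article", "title": "Hello", "body": "World"}
triples = node_to_triples(node, "http://example.org")
for triple in triples:
    print(triple)
```

Because the site builder defines content types and fields, changing the mapping table changes the RDF without touching the content itself.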
  • 47. Content types and Fields
  • 48. Content types and Fields
  • 49. Node
  • 50. Drupal 7 and RDF
  • 51. Drupal 7 and RDF ● Contributed modules for more features ● RDF Extensions – Serialization formats: RDF/XML, Turtle, N-Triples ● SPARQL – Expose Drupal RDF data in a SPARQL Endpoint ● SPARQL Views – Display remote RDF data in Drupal using SPARQL ● JSON-LD – Expose Drupal RDF data as JSON-LD (CORS-enabled) ● Features and packaging – Build distributions / deployment workflow
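Of the serialization formats listed, N-Triples is the simplest to show: one triple per line, full IRIs in angle brackets, literals in double quotes. The Python sketch below is a simplified writer under those assumptions (plain string literals only, no language tags or datatypes); the function names are made up for this example and it is unrelated to the RDF Extensions code.

```python
def escape_literal(text):
    """Escape the characters that N-Triples string literals require."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def to_ntriples(triples):
    """Serialize (subject, predicate, object, object_is_iri) tuples."""
    lines = []
    for subject, predicate, obj, object_is_iri in triples:
        if object_is_iri:
            rendered = f"<{obj}>"
        else:
            rendered = f'"{escape_literal(obj)}"'
        lines.append(f"<{subject}> <{predicate}> {rendered} .")
    return "\n".join(lines)


# example.org is a placeholder; the predicate is the Dublin Core terms title.
ntriples = to_ntriples([
    ("http://example.org/node/1", "http://purl.org/dc/terms/title", 'Hello "RDF"', False),
])
print(ntriples)
```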
  • 52. SPARQL Endpoint ● Indexing http://drupal.org/project/sparql
  • 53. SPARQL Endpoint ● Public endpoint available at /sparql ● http://prefix.cc/sioc,rnews.sparql
  • 54. JSON-LD in Drupal ● Client side as well as server side friendly ● Browser Scripting: – Native javascript format – RDFa API in the DOM ● Data can be fetched from anywhere: – Cross-Origin Resource Sharing (CORS) enabled ● Client can mash data ● http://drupal.org/project/jsonld
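A small Python sketch of what "JSON-LD, CORS-enabled" can look like on the wire: a JSON document with `@context`, `@id` and `@type`, served with a permissive `Access-Control-Allow-Origin` header. The function names, the context and the node data are assumptions for illustration and do not reproduce the output of the Drupal JSON-LD module.

```python
import json


def node_as_jsonld(node_url, rdf_type, properties, context):
    """Build a JSON-LD document describing a single resource."""
    document = {"@context": context, "@id": node_url, "@type": rdf_type}
    document.update(properties)
    return document


def jsonld_response(document):
    """Return (headers, body) for a response any origin may read."""
    headers = {
        "Content-Type": "application/ld+json",
        "Access-Control-Allow-Origin": "*",  # CORS: allow cross-origin reads
    }
    return headers, json.dumps(document, indent=2)


# Hypothetical node; example.org is a placeholder address.
doc = node_as_jsonld(
    "http://example.org/node/1",
    "Event",
    {"name": "SemTechBiz 2012"},
    context="http://schema.org/",
)
headers, body = jsonld_response(doc)
print(body)
```

Since the body is plain JSON, a browser script can parse it natively and combine it with data fetched from other sites.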
  • 55. JSON-LD plug
  • 56. RDFa 1.1 ● RDFa Lite ● RDFa 1.1 Full ● http://rdfa.info/play/
  • 57. Demos: rNews / SPARQL
    PREFIX dc: <http://purl.org/dc/terms/>
    PREFIX rnews: <http://iptc.org/std/rNews/2011-10-07#>
    SELECT * WHERE { ?s a rnews:Article ; dc:title ?title . }
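One way to send the query above to an endpoint exposed at /sparql is a SPARQL Protocol GET request, with the query text in the `query` parameter. The Python sketch below only builds the request and does not send it; the host `example.org` and the helper names are assumptions.

```python
from urllib.parse import urlencode
from urllib.request import Request

QUERY = """PREFIX dc: <http://purl.org/dc/terms/>
PREFIX rnews: <http://iptc.org/std/rNews/2011-10-07#>
SELECT * WHERE { ?s a rnews:Article ; dc:title ?title . }"""


def sparql_get_url(endpoint, query):
    """Encode the query as the 'query' parameter of a GET request."""
    return endpoint + "?" + urlencode({"query": query})


def build_request(endpoint, query):
    """Prepare (but do not send) a request asking for JSON results."""
    return Request(
        sparql_get_url(endpoint, query),
        headers={"Accept": "application/sparql-results+json"},
    )


# example.org stands in for the address of a real Drupal site.
url = sparql_get_url("http://example.org/sparql", QUERY)
request = build_request("http://example.org/sparql", QUERY)
print(url)
```

Passing `request` to `urllib.request.urlopen` would run the query against a live endpoint.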
  • 58. Demos ● Occupy Directory – http://directory.occupy.net/occupations – JSON-LD: http://directory.occupy.net/node/19652.jsonld ● Federated General Assembly – Drupal distribution for occupy movement – http://wiki.occupy.net/wiki/Federated_General_Assembly
  • 59. DOMEO: a web-based tool for semantic annotation of online documents
  • 60. As (biomedical) scientists… • We deal with an increasing amount of digital resources (documents, images, videos, datasets, databases…) • We commonly use annotation but… – are we really efficient? – can we leverage machine computation? – can we share it easily with our colleagues? – can we capitalize on the work of colleagues?
  • 61. Annotation Framework (Components) • Annotation Ontology (AO): OWL vocabulary for representing and sharing annotation of digital resources and their fragments – Website http:/purl.org/ home / ao/ – Paper http://www.jbiomedsem.com/content/2/S2/S4 • DOMEO client: web application for producing and sharing manual, semi-automatic and automatic annotation – Website http://annotationframework.org/ – Paper http://www.jbiomedsem.com/content/3/S1/S1
  • 62. Annotation of digital resources: Visually and effectively annotate - better, semantically annotate - any digital resource and resource fragment, while performing our regular browsing/reading activities http://antibodyregistry.org/antibody17/antibodyform.html?gui_type=advanced&ab_id=2266850 antibodyregistry.org
  • 63. Leverage text mining and community curation: Run text mining and entity recognition algorithms on scientific documents and persist the results in a standard format. Benefit from crowdsourcing by supporting curation of manual and automatic annotation.
  • 64. … and more • Efficiently search and reuse the annotation – Semantic inference • Subscribe to feeds related to topics of interest – Proteins, Cells, Authors, Papers… • Retrieve additional content (mashups) – Entrez Gene, UniProt, …
  • 65. Semantic tagging through ontologies: Semantic Tag http://purl.obolibrary.org/obo/PR_000004168 – Label ‘amyloid beta A4 protein’ – Exact synonyms ‘APP’, ‘amyloidogenic glycoprotein’, … – Related synonyms ‘A4’, ‘ABPP’ – Is a http://purl.obolibrary.org/obo/PR_000000001 (Label ‘protein’, Definition ‘An amino acid chain that…’) – Source: Protein Ontology (PRO) https://pir5.georgetown.edu/wiki/PRO
  • 66. APPs for the Semantic Resources Project, May 2010
  • 67. Zooming in: APPs for the Semantic Resources Project, May 2010
  • 68. Annotation Ontology (AO): OWL vocabulary for representing and sharing annotation of digital resources and their fragments. Not only for biomedicine! – Website http:/purl.org/ home / ao/ – Paper http://www.jbiomedsem.com/content/2/S2/S4
  • 69. A simplified view of AO. AO allows annotating: • Resources: Documents (HTML, PDF, Word, Excel), Images, Databases, Web Services... (and their fragments) • Specifying (or not) an Annotation Type: through one of the already available types (errata, highlight, qualifiers...) or the ones the users will define • With (or without) a Topic: free text, structured text, URIs, RDF entities, RDF graphs, domain ontologies… • Tracing Provenance: who created what, when, with which software, with what expectations…
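The four parts of this simplified view (resource, optional annotation type, optional topic, provenance) can be pictured as a small data structure. The Python sketch below is only a reading aid: the class and field names are invented for this example and are not the terms of the AO vocabulary, and the document URL, author and date are placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Provenance:
    created_by: str   # who created the annotation
    created_on: str   # when (ISO 8601 date)
    software: str     # with which software


@dataclass
class Annotation:
    resource: str                          # annotated document, image, ...
    provenance: Provenance
    fragment: Optional[str] = None         # optional part of the resource
    annotation_type: Optional[str] = None  # e.g. errata, highlight, qualifier
    topic: Optional[str] = None            # free text, URI, ontology term, ...


# A qualifier whose topic is the Protein Ontology term for
# 'amyloid beta A4 protein'.
annotation = Annotation(
    resource="http://example.org/paper.html",
    fragment="APP",
    annotation_type="qualifier",
    topic="http://purl.obolibrary.org/obo/PR_000004168",
    provenance=Provenance("Example Curator", "2012-06-04", "Domeo"),
)
print(annotation)
```

Type and topic default to `None` because the slide states that both can be omitted.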
  • 70. Annotating a document. AlzSWAN: http:/tinyurl.com/ / 18r
  • 71. Annotating a document fragment. Protein Ontology – PRO: http://purl.org/obo/owl/PRO
  • 72. Experiments, Workflows, HyQue triples. SWAN Ontology 2.0: http://code.google.com/p/swan-ontology/
  • 73. Annotation Ontology Network: Biotea, The Living Document Project
  • 74. Open Annotation Community Group: Annotation Ontology is going to be replaced in our applications by the Open Annotation Model developed through the W3C Open Annotation Community Group – Website http://www.w3.org/community/openannotation/ – Core Model http://www.openannotation.org/spec/core/ – Extensions http://www.openannotation.org/spec/extension/
  • 75. DOMEO: Document Metadata Organizer
  • 76. Semantic Tags or Qualifiers [1]
  • 77. Semantic Tags or Qualifiers [2]
  • 78. Semantic Tags or Qualifiers [3]
  • 79. Domeo and the NCBO Annotator annotator-service. Domeo allows automatic/manual annotation with terms coming from selected ontologies managed by the BioPortal http://www.bioontology.org/
  • 80. Running NCBO Annotator. Additional text mining services will be listed here
  • 81. NCBO Annotator Results in Domeo. List of recognized entities
  • 82. Results Curation. Customizable
  • 83. Cumulative Results Curation • One item only • All instances with the same text match • All instances independently from the text match
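The three curation scopes can be read as three ways of selecting which text mining results a curator's decision applies to. The Python sketch below encodes one plausible reading, in which "independently from the text match" means "same recognized term, whatever text was matched". The `Result` class, the scope names and the sample data are assumptions, not Domeo's implementation.

```python
from dataclasses import dataclass

PRO_APP = "http://purl.obolibrary.org/obo/PR_000004168"


@dataclass
class Result:
    text: str               # text matched in the document
    term: str               # ontology term recognized for that text
    accepted: bool = False  # set once a curator accepts the result


def accept(results, index, scope):
    """Accept results[index] and, depending on scope, similar results."""
    target = results[index]
    for position, result in enumerate(results):
        if scope == "item":
            selected = position == index
        elif scope == "same_text":
            selected = result.term == target.term and result.text == target.text
        elif scope == "any_text":
            selected = result.term == target.term
        else:
            raise ValueError(f"unknown scope: {scope}")
        if selected:
            result.accepted = True
    return results


def sample():
    """Three hypothetical results that all point at the same term."""
    return [
        Result("APP", PRO_APP),
        Result("APP", PRO_APP),
        Result("amyloid beta A4 protein", PRO_APP),
    ]


for scope in ("item", "same_text", "any_text"):
    print(scope, [r.accepted for r in accept(sample(), 0, scope)])
```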
  • 84. Serialization in AO/RDF (Share)
  • 85. UIMA, Clerezza and AO. [Diagram labels: Text Mining Results (AO RDF); Curated Text Mining Results (AO RDF); Evaluating Performance; Comparing Algorithms; Learning …; Applications; Publishing] http://www.slideshare.net/paolociccarese/domeo-and-text-mining
  • 86. Combining Disparate Sources of Data http://annotationframework.org/
  • 87. Demos ● Domeo + Drupal – Data mashup from independent but related sources
  • 88. Thanks! ● Stéphane Corlosquet: scorlosquet@gmail.com – @scorlosquet – http://openspring.net/ ● Paolo Ciccarese: paolo.ciccarese@gmail.com