Exploring the Semantic Web


Talk about Exploring the Semantic Web, and particularly Linked Data, and the Rhizomer approach. Presented August 14th 2012 at the SRI AIC Seminar Series, Menlo Park, CA

Published in: Technology, Education

  1. Exploring Semantic Web Data, and particularly Linked Data
     Roberto García, Human-Computer Interaction and Data Integration Research Group, Universitat de Lleida, Spain
     AIC Seminar Series, SRI International, Menlo Park, August 14th 2012
  2. Who
     • Associate Professor, Universitat de Lleida, Spain
     • Visiting Associate Professor, Stanford University – Stanford HCI Group
     • 12+ years of Semantic Web research
       – 1999 MSc Thesis: Knowledge Management using RDF plus reasoning (SiLRI)
       – 2006 PhD Thesis: A Semantic Web approach to DRM
       – 2006- Copyright Ontology
       – 2007- Lleida HCI Group, Semantic Web User Interfaces
  3. What is Open Data?
     • “Open data is data that can be freely used, reused and redistributed by anyone - subject only, at most, to the requirement to attribute and sharealike” (Open Knowledge Foundation)
     • Make your data OPEN
       – Available online with an open license
         • For instance, Creative Commons CC-BY
       – No more than reproduction cost
       – No matter the format
  4. Open Data Worldwide
     • 169 initiatives
       – City (40), Country, Region or State (125), Supranational (4)
  5. Welcome to Data.CA.Gov
  6. Open Data Formats
     • However, encourage formats that facilitate reuse and interoperability
       – Tim Berners-Lee 5 stars classification
  7. ★ Open Data
     • Make data available on the Web under an open license
       – Data licenses:
         • Public Domain Dedication and License (PDDL), Open Data Commons Attribution License (ODC-By) or Creative Commons Public Domain Dedication (CC0)
     • Whatever format
       – Example: PDF
     • But… data is locked up in a document
       – Hard to get data out, custom scrapers
  8. ★★ Open Data
     • Make it available as structured data
       – Example: Excel instead of image scan of a table
     • But… data still locked up
       – You depend on proprietary software
  9. ★★★ Open Data
     • Use non-proprietary formats
       – Example: CSV instead of Excel
           "Temperature forecast for Galway",
           "Day","Lowest Temperature (C)"
           "Saturday, 13 November 2010",2
           "Sunday, 14 November 2010",4
           "Monday, 15 November 2010",7
     • But… data on the Web and not data in the Web
       – What does “Galway” mean? Is it a temperature? What is the unit? Local time?...
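A minimal sketch of why three-star data is still hard to reuse, using only Python's standard `csv` module. The forecast from the slide parses without trouble, but every cell comes back as an uninterpreted string. The variable names are illustrative and not part of the talk.

```python
import csv
import io

# The CSV sample from the slide, embedded as a string.
CSV_TEXT = '''"Temperature forecast for Galway",
"Day","Lowest Temperature (C)"
"Saturday, 13 November 2010",2
"Sunday, 14 November 2010",4
"Monday, 15 November 2010",7
'''

rows = list(csv.reader(io.StringIO(CSV_TEXT)))
title, header, data = rows[0], rows[1], rows[2:]

# Parsing succeeds, yet nothing in the file says which Galway is meant,
# that "(C)" stands for degrees Celsius, or which time zone the dates use.
for day, lowest in data:
    print(repr(day), repr(lowest))
```

Any interpretation (place, unit, date format) has to be hard-coded by whoever consumes the file, which is the gap the next two stars address.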
  10. Galway (disambiguation)
      • Places
        – Ireland
          • Galway
          • County Galway
          • Galway Bay
        – Sri Lanka
          • Galways Land National Park
        – United States
          • Galway (town), New York
          • Galway (village), New York
      • Things
        – Galway (sheep), a breed of sheep that originated in Galway, Ireland
        – Galway harp, a type of harp
        – Galway Hooker, a type of sailing boat
        – Galway or Claddagh Ring, a type of wedding ring made in Galway
      • …
  11. ★★★★ Open Data
      • Use URIs to identify things, so that people can point at your stuff
        – Example: RDF1 (but also Atom, OData, JSON-LD,…)
            @prefix meteo: <> .
            @prefix galweather: <> .
            @prefix xsd: <> .
            <> meteo:forecast [
                meteo:predicted "2010-11-13T12:00:00Z"^^xsd:dateTime ;
                meteo:temperature [ meteo:celsius "2"^^xsd:decimal ]
            ] .
        – Prefix declarations: Vocabularies, Ontologies
      • But… what if we (humans or computers) don’t know what means?
      1 Resource Description Framework,
  12. ★★★★★ Linked Open Data
      • Link your data to other data to provide context (semantics, meaning)
        – Example: HTTP GET
            @prefix dbpedia: <> .
            ...
            dbpedia:Galway a <>, <>, <>;
                rdfs:label "Galway"@en;
                dbp:populationBlank "Galwegian, Tribesman"@en;
                dbp:populationTotal "75529"^^xsd:int;
                dbp:populationUrban "76778"^^xsd:int;
                dcterms:subject <>, <>, <>, <>;
                rdfs:comment "Galway or City of Galway (Cathair na Gaillimhe) is a city on the west coast of Ireland. It is located on the River Corrib between Lough Corrib and Galway Bay and is surrounded by County Galway. It is the third largest city within the state, though if the wider urban area is included then it falls into fourth place behind Limerick. The population of Galway city at the 2011 census was 75,529, rising to 76,778 across the entire urban area."@en;
                geo:lat 53.2719;
                geo:long -9.04889;
                foaf:homepage <> .
        – … and also dbp:PopulationTotal, dct:subject,…
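The "HTTP GET" step above can be sketched with Python's standard `urllib.request`. The resource URI is an assumption: the URI on the slide did not survive in this transcript, so the usual DBpedia resource pattern is used here. The helper name `build_rdf_request` is hypothetical, and the request is only built, not sent.

```python
import urllib.request

# Assumed URI: the one on the slide is missing from the transcript.
GALWAY_URI = "http://dbpedia.org/resource/Galway"

def build_rdf_request(uri: str, media_type: str = "text/turtle") -> urllib.request.Request:
    """Build (but do not send) an HTTP GET that asks for an RDF serialization."""
    return urllib.request.Request(uri, headers={"Accept": media_type}, method="GET")

request = build_rdf_request(GALWAY_URI)
print(request.get_method(), request.full_url, request.get_header("Accept"))

# To actually dereference the URI (requires network access):
#   body = urllib.request.urlopen(request).read().decode("utf-8")
```

The `Accept` header is what lets the same URI serve an HTML page to a browser and a description like the one above to a program.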
  13. Network Effect
      • ~31 billion statements
  14. Fine for computers… but people?
      • C. Warren (blogger): I’m writing about “Films I Like”. Can I reuse LinkedMDB?
      • M. Harper (developer): I’m developing a bird watching application. Can I reuse DBPedia?
  15. User Testing
      • Users’ typical questions:
        – Where do I start?
        – Where do I go now?
        – What is this data about?
        – How do I find this?
        – …
      • What do Linked Data user interfaces offer?
  16. DBPedia Scenario
      • Linked Data version of Wikipedia
        – 3.5 million things described
          • Ontology: 257 classes and 1276 properties
  17. Target Technical Users
      • DBPedia main page
  18. Semantic Query Languages
      • SPARQL:
          SELECT ?c (COUNT(?i) AS ?n)
          WHERE { ?i a ?c }
          GROUP BY ?c
          ORDER BY DESC(?n)
      • Results:
          c    n
               1668503
               632607
               571764
               462349
               363751
               355100
               340443
               296595
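To make clear what the class-count query computes, here is a plain-Python sketch of the same aggregation over an in-memory list of triples. The triples and the `ex:` names are hypothetical illustration data, not DBPedia content.

```python
from collections import Counter

RDF_TYPE = "rdf:type"  # the keyword "a" in SPARQL is shorthand for rdf:type

# Hypothetical (subject, predicate, object) triples, for illustration only.
triples = [
    ("ex:galway", RDF_TYPE, "ex:City"),
    ("ex:lleida", RDF_TYPE, "ex:City"),
    ("ex:galway", RDF_TYPE, "ex:Place"),
    ("ex:lleida", RDF_TYPE, "ex:Place"),
    ("ex:corrib", RDF_TYPE, "ex:Place"),
    ("ex:galway", "rdfs:label", "Galway"),
]

def instances_per_class(triples):
    """Return (class, instance count) pairs, most populated class first."""
    # Keep each (instance, class) pair once, then count instances per class.
    pairs = {(s, o) for s, p, o in triples if p == RDF_TYPE}
    counts = Counter(cls for _, cls in pairs)
    return counts.most_common()

print(instances_per_class(triples))
```

The output is a ranked list of classes, which is accurate but assumes the reader already knows what a class URI is. That is the usability gap the following slides address.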
  19. Text Search
      • What to type? A URI? A URI label?
      • How to take advantage of semantics?
  20. Semantic Query UIs
      • iSPARQL
  21. Proposal
      • Automatic UI generation from ontologies and dataset structure
      • Interaction Patterns for Data Analysis [Shneiderman], each supported by Information Architecture Components [Morville]:
        – Overview: Menus, Sitemaps,…
        – Zoom & Filter: Facets
        – Details: Lists, Maps, Timelines…
  22. IA Components. Menus
      • From dataset ontologies and thesaurus
        – For each class/topic: URI, label, # instances/uses, subclasses/subtopics
      • Flatten to desired # entries and subentries
        – When there is room for entries or subentries, divide the class/topic with the most instances
        – When there are too many, group those with the fewest
          • “Other” is the generic group
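The flattening rule above can be sketched as follows. This is an illustrative reading of the slide, not the Rhizomer implementation: the `Topic` type, the `flatten` function, and the sample data are all hypothetical. For simplicity, dividing a topic replaces it with its children and ignores instances that belong only to the parent.

```python
from dataclasses import dataclass, field

@dataclass
class Topic:
    label: str
    instances: int
    children: list = field(default_factory=list)

def flatten(topics, size):
    """Return at most `size` menu entries for one menu level."""
    entries = list(topics)
    # Room left: divide the splittable entry with the most instances.
    while len(entries) < size:
        splittable = [t for t in entries if t.children]
        if not splittable:
            break
        biggest = max(splittable, key=lambda t: t.instances)
        entries.remove(biggest)
        entries.extend(biggest.children)
    # Too many: group the entries with the fewest instances under "Other".
    if len(entries) > size:
        entries.sort(key=lambda t: t.instances, reverse=True)
        kept, rest = entries[:size - 1], entries[size - 1:]
        entries = kept + [Topic("Other", sum(t.instances for t in rest), rest)]
    return entries

topics = [
    Topic("Place", 100, [Topic("City", 60), Topic("River", 30), Topic("Mountain", 10)]),
    Topic("Person", 80),
    Topic("Work", 50),
]
menu = flatten(topics, 4)
print([(t.label, t.instances) for t in menu])
```

Applying the same function to each entry's children would produce the submenus.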
  23. IA Components. Menus
      • 7 menus with 10 submenus
      • Automatic Generation
  24. DEMO IA Components. Menus
      • Provide DBPedia overview…
      • …but what about 12,334 birds?
  25. IA Components. Facets
      • Pre-computed list of facets per class or topic
        – Ontologies or thesaurus + instance data
        – Facet metrics:
          • frequency, # values, most common value and its cardinality…
      • DBPedia Birds class:
        – 226 properties
          • dbo:kingdom, 100%, 3 values, 6846 (Animalia),…
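A minimal sketch of how the facet metrics listed above could be pre-computed from instance data. The input layout (a dictionary of instances, each mapping properties to value lists), the function name `facet_metrics`, and the sample birds are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def facet_metrics(instances):
    """instances: dict of instance id -> dict of property -> list of values."""
    total = len(instances)
    usage = Counter()              # number of instances that use each property
    values = defaultdict(Counter)  # per property: value -> number of uses
    for props in instances.values():
        for prop, vals in props.items():
            if not vals:
                continue
            usage[prop] += 1
            values[prop].update(vals)
    metrics = {}
    for prop, used_by in usage.items():
        top_value, top_count = values[prop].most_common(1)[0]
        metrics[prop] = {
            "frequency": used_by / total,    # share of instances with the property
            "num_values": len(values[prop]),  # number of distinct values
            "most_common": (top_value, top_count),
        }
    return metrics

# Hypothetical sample data.
birds = {
    "ex:robin": {"dbo:kingdom": ["Animalia"], "dbo:order": ["Passeriformes"]},
    "ex:eagle": {"dbo:kingdom": ["Animalia"], "dbo:order": ["Accipitriformes"]},
    "ex:wren": {"dbo:kingdom": ["Animalia"]},
}
metrics = facet_metrics(birds)
print(metrics["dbo:kingdom"])
```

Metrics like these can then be used to rank facets, for example by showing frequently used properties first.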
  26. DEMO: DBPedia
  27. DEMO: LinkedMDB
  28. Testing LinkedMDB
      • Evaluation with lay users as part of RITE1 development process
        – Iteration test with 6 users
        – LinkedMDB (Linked Data version of IMDb)
      • User Task: “Find three films where Woody Allen is director and also actor”.
      1 Rapid Iterative Testing and Evaluation
  29. Evaluation Results
      • Seemed easy but… no user completed the task without help
      • Really, just 1 issue:
        – Users started from “Actor” instead of from “Film”, and got lost from there
      • User interaction is too constrained by the underlying “explicit” data structure
      • Lack of context while browsing the graph
  30. New Features
      • Facets for all inverse properties (explicit or implicit)
        – Actor ← actor – Film:
          • Actor has facet “is actor of Film”
      • Breadcrumbs show “query” built so far
        – Click Film, then for facet “Actor” search “Woody Allen”:
          • “Showing Film has actor where actor name is Woody Allen”
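The two features above can be sketched in a few lines. The schema triples, the facet labels, and both function names are hypothetical. The sketch only illustrates deriving an implicit inverse facet for each incoming property and rendering a breadcrumb sentence.

```python
# Hypothetical schema: (domain class, property, range class).
SCHEMA = [
    ("Film", "actor", "Actor"),
    ("Film", "director", "Director"),
]

def facets_for(cls, schema):
    """Direct facets for cls plus one inverse facet per incoming property."""
    facets = []
    for domain, prop, rng in schema:
        if domain == cls:
            facets.append(f"has {prop}")
        if rng == cls:
            # Implicit inverse: the property points at cls from another class.
            facets.append(f"is {prop} of {domain}")
    return facets

def breadcrumb(cls, facet, prop, value):
    """Describe the query built so far in words."""
    return f"Showing {cls} {facet} where {prop} is {value}"

print(facets_for("Actor", SCHEMA))
print(breadcrumb("Film", "has actor", "actor name", "Woody Allen"))
```

With inverse facets in place, a user who starts from Actor still finds a route to Film, which was the stumbling block in the first evaluation.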
  31. New Features
      • What about getting from Actors to Films to restrict by director?
      • Add Actor facet “directed by”?
        – DANGER: facets explosion
          • Director → Film → Country → Continent
          • Director facet: “continents of countries where films directed”!
  32. New Features
      • Pivoting: switch from faceted view to related faceted view (keeping filters)
        – E.g.: from Actors facets move to Films facets through “is Actor of Film” facet
      • For each class facet also compute:
        – Most specific class for target instances
          • Actor “is Actor of” Film and TV Episode → Audiovisual Work
        – Pivot that facet to get:
          • Faceted view for target class… + filters so far
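A minimal sketch of the pivoting step described above, under stated assumptions: the class hierarchy, the view dictionaries, and the function names are hypothetical and do not reflect Rhizomer's actual data structures. It computes the most specific class that covers all target classes of a facet, then builds a new view that keeps a reference to the source view and its filters.

```python
# Hypothetical class hierarchy (child -> parent), for illustration only.
PARENT = {
    "Film": "Audiovisual Work",
    "TV Episode": "Audiovisual Work",
    "Audiovisual Work": "Thing",
    "Actor": "Thing",
}

def ancestors(cls):
    """Return cls followed by its superclasses, most specific first."""
    chain = [cls]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def most_specific_class(classes):
    """Most specific class covering every class in a non-empty list."""
    chains = [ancestors(c) for c in classes]
    for candidate in chains[0]:
        if all(candidate in chain for chain in chains):
            return candidate
    return None

def pivot(view, facet, target_classes):
    """Switch to the facet's target class, carrying the source view along."""
    return {
        "cls": most_specific_class(target_classes),
        "filters": [],                 # new filters are added on the target view
        "reached_via": (facet, view),  # source view, including its filters
    }

actors = {"cls": "Actor", "filters": [("name", "Woody Allen")], "reached_via": None}
works = pivot(actors, "is Actor of", ["Film", "TV Episode"])
print(works["cls"])
```

Because the target view nests the source view, un-pivoting amounts to following `reached_via` back.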
  33. DEMO
  34. Next Round Evaluation
      • Semantic Web Exploration Tools Quality in Use Model:
        – Task success, Task time, Satisfaction,…
        – UI Component Efficiency, Task Flexibility, Layout Flexibility,…
      • Task: “Films Woody Allen director and actor”
        – Task time:
                       Pre-pivot   Pivot   Reduction
            Minimum    1.05        0.89    15%
            Maximum    5.23        2.23    57%
            Mean       2.41        1.69    30%
            St. Dev.   1.49        0.57    62%
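The Reduction column follows from the two time columns as (pre-pivot − pivot) / pre-pivot. A short check of that arithmetic, using the values from the table (the variable and function names are illustrative):

```python
# (pre-pivot, pivot) task times taken from the table above.
task_times = {
    "Minimum": (1.05, 0.89),
    "Maximum": (5.23, 2.23),
    "Mean": (2.41, 1.69),
    "St. Dev.": (1.49, 0.57),
}

def reduction_percent(pre, post):
    """Relative reduction from pre to post, rounded to a whole percent."""
    return round(100 * (pre - post) / pre)

reductions = {name: reduction_percent(pre, post) for name, (pre, post) in task_times.items()}
print(reductions)
```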
  35. Summary
      • Menus
        – Dataset classes (topics) overview
      • Facets
        – Filter class using properties and values
      • Pivoting
        – Switch faceted views, carrying filters
  36. DEMO Conclusions
      • Users build queries without SPARQL or dataset structure knowledge
      • Example:
        – Who has directed more films in Oceania?
            SELECT DISTINCT ?r1
            WHERE {
              ?r1 a movie:Director .
              ?r2 movie:director ?r1 .
              ?r2 a movie:Film .
              ?r2 movie:country ?r3 .
              ?r3 movie:country_continent ?r3var0
              FILTER(str(?r3var0)="Oceania")
            }
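To show how a sequence of pivots and one facet filter can map onto a query like the one above, here is a hedged sketch of a query builder. The function `build_query` and its step format are hypothetical and not taken from Rhizomer. The filter value is inserted without escaping, which a real implementation would need to handle.

```python
def build_query(start_class, steps, filter_prop, filter_value):
    """steps: list of (property, is_inverse, target class or None)."""
    patterns = [f"?r1 a {start_class} ."]
    current = 1
    for prop, is_inverse, target_class in steps:
        nxt = current + 1
        if is_inverse:
            # Inverse facet: the new resource points back at the current one.
            patterns.append(f"?r{nxt} {prop} ?r{current} .")
        else:
            patterns.append(f"?r{current} {prop} ?r{nxt} .")
        if target_class:
            patterns.append(f"?r{nxt} a {target_class} .")
        current = nxt
    var = f"?r{current}var0"
    patterns.append(f'?r{current} {filter_prop} {var} FILTER(str({var})="{filter_value}")')
    return "SELECT DISTINCT ?r1 WHERE { " + " ".join(patterns) + " }"

query = build_query(
    "movie:Director",
    [("movie:director", True, "movie:Film"), ("movie:country", False, None)],
    "movie:country_continent",
    "Oceania",
)
print(query)
```

Each user action (pick a class, pivot through a facet, pick a facet value) contributes one or two triple patterns, so the user never has to write the query text.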
  37. Work in Progress
      • Interaction design
        – Explore the best way to make pivoting, and un-pivoting, evident for users
        – Improve “breadcrumbs”
      • Specialized facets:
        – Range dependent: histogram for numbers, calendar for dates,…
  38. Work in Progress
      • Integrate RDF2SVG
  39. Work in Progress
      • Object-Action interaction paradigm
        – Object properties determine actions
          • lat, long, point…
          • time, date, start, end…
        – Actions: pluggable Semantic Web Services
  40. DEMO Work in Progress
      • Other IA components: sitemaps
  41. DEMO Work in Progress
      • Interactively select data and configure visualizations
  42. Data Quality
  43. Assisted Edition (and Trust)
      • WebID
  44. Thanks for your attention
      Roberto García, roberto.garcia@udl.cat
      Human-Computer Interaction and Data Integration Research Group, Universitat de Lleida, Spain