Semantic Web questions we couldn't ask 10 years ago

Talk given at the SSSW 2013 Semantic Web Summerschool.
Part 1: What is "Semantic Web" (in 4 principles and 1 movie)
Part 2: What questions can we ask now that we couldn't ask 10 years ago?
Part 3: Treat Computer Science as a *science*, not just as engineering!
(this part is a short version of http://slidesha.re/SaUhS4 )

  1. 1. Frank van Harmelen: All the questions we couldn’t ask 10 years ago. Creative Commons License: allowed to share & remix, but must attribute & non-commercial
  2. 2. The bad news: you’re going to get 3 talks. (1) Where are we now? The Semantic Web in 4 principles & a movie; did we get anywhere? (2) Now what? Questions we couldn’t ask 10 years ago. (3) Methodological hobby horse: science or engineering?
  3. 3. Semantic Web: What is it?
  4. 4. a web page in English about Frank; and this page is about LarKC; and another web page about Frank; and this page is about Stefano; this page is about the Vrije Universiteit. “The Semantic Web” a.k.a. “The Web of Data”
  5. 5. http://www.youtube.com/watch?v=tBSdYi4EY3s
  6. 6. P1. Give all things a name
  7. 7. P2. Relations form a graph between things
  8. 8. P3. The names are addresses on the Web x T [<x> IsOfType <T>] different owners & locations <analgesic>
  9. 9. P1+P2+P3 = Giant Global Graph
  10. 10. P4. explicit & formal semantics • assign types to things • assign types to relations • organise types in a hierarchy • impose constraints on possible interpretations
  11. 11. Examples of “semantics”. Diagram: Frank, married-to, Lynda, Hazel (labels: lowerbound, upperbound) • Frank is male • married-to relates males to females • married-to relates 1 male to 1 female • Lynda = Hazel
  12. 12. Semantic Web: Where are we now?
  13. 13. Did we get anywhere? • Google = meaningful search • NXP = data integration • BBC = content re-use • Walmart = SEO (RDF-a) • data.gov = data-publishing
  14. 14. NXP: data integration about 26,000 products. Diagram: Triple store, Triple store, Departments, Customers. Notice the 3-layer architecture
  15. 15. BBC: notice the 3-layer architecture
  16. 16. Did we get anywhere? • Google = meaningful search • NXP = data integration • BBC = content re-use • BestBuy = SEO (RDF-a) • data.gov = data-publishing. Also named on the slide: Oracle DB, IBM DB2; Reuters, New York Times, Guardian; Sears, Kmart, OverStock, Volkswagen, Renault; GoodRelations ontology, schema.org
  17. 17. Size Matters: 25-45 billion facts
  18. 18. The questions that we couldn’t ask 10 years ago
  19. 19. • Heterogeneity • Self-organisation, long tails • Distribution • Provenance & trust • Dynamics • Errors & Noise • Scale
  20. 20. heterogeneity is unavoidable • Linguistic • Structural • Logical • Statistical • ... Socio-economic: first to market, market-share
  21. 21. Self-organisation
  22. 22. Self-organisation
  23. 23. Self-organisation
  24. 24. Self-organisation
  25. 25. Self-organisation
  26. 26. Self-organisation: bio-medical ontologies in BioPortal (> 5 links)
  27. 27. knowledge follows a long-tail
  28. 28. incidental or universal? impact on mapping? impact on reasoning? impact on storage?
  29. 29. Distribution: Caching? Subgraphs? Payload priority? Query-planning?
  30. 30. Provenance: Representation? From provenance to trust? (Re)construction? Knowledge about knowledge?
  31. 31. Dynamics: Streams? Incremental reasoning? Non-monotonicity? Versioning?
  32. 32. Errors & noise: Maximally consistent subsets? Fuzzy Semantics? Uncertainty Semantics? Rough Semantics? Modules? Repair? Argumentation?
  33. 33. Maximally consistent subsets? Modules? Repair? Argumentation? Fuzzy Semantics? Uncertainty Semantics? Rough Semantics? Streams? Incremental reasoning? Non-monotonicity? Versioning? Representation? From provenance to trust? (Re)construction? Knowledge about knowledge? Caching? Subgraphs? Payload priority? Incidental or universal? Impact on mapping? Impact on reasoning? Impact on storage? Socio-economic: first to market, market-share
  34. 34. Methodological Hobby horse
  35. 35. Laws about the physical universe Laws about the information universe ?
  36. 36. knowledge follows a long-tail Law: F = a-br
  37. 37. Law: |T| << |A|, where T = terminological knowledge and A = assertional knowledge
  38. 38. We don’t have any good laws on complexity:
          Dataset          | Closure of T | Closure of T + A | Ratio
          LUBM             | 8 sec        | 1 h 15 min       | 562
          Linked Life Data | 332 sec      | 1 h 05 min       | 11
          FactForge        | 89 sec       | 2 h 45 min       | 111
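
A minimal sketch of how principles P1 to P4 and the married-to example (slides 6 to 11) might look in code. It assumes, beyond what the slide text states, that the diagram asserts both "Frank married-to Lynda" and "Frank married-to Hazel". The example.org URIs, the "rdf:type" string shorthand, and the helper names infer_types and infer_same_as are hypothetical and only for illustration; this is not a real reasoner.

```python
# P1 + P3: every thing gets a name, and the name is a Web address.
# The example.org namespace is a placeholder, not a real vocabulary.
EX = "http://example.org/"

# P2: relations form a graph, stored here as (subject, predicate, object).
# Assumption: the slide's diagram links Frank to both Lynda and Hazel.
triples = {
    (EX + "Frank", EX + "married-to", EX + "Lynda"),
    (EX + "Frank", EX + "married-to", EX + "Hazel"),
    (EX + "Frank", "rdf:type", EX + "Male"),
}

# P4: explicit semantics, written as simple lookup tables.
domain = {EX + "married-to": EX + "Male"}    # married-to relates males ...
range_ = {EX + "married-to": EX + "Female"}  # ... to females
functional = {EX + "married-to"}             # 1 male to 1 female


def infer_types(triples, domain, range_):
    """Derive type statements from domain and range declarations."""
    inferred = set()
    for s, p, o in triples:
        if p in domain:
            inferred.add((s, "rdf:type", domain[p]))
        if p in range_:
            inferred.add((o, "rdf:type", range_[p]))
    return inferred


def infer_same_as(triples, functional):
    """For a functional relation, objects sharing a subject must be equal."""
    objects_by_key = {}
    for s, p, o in triples:
        if p in functional:
            objects_by_key.setdefault((s, p), set()).add(o)
    same = set()
    for objects in objects_by_key.values():
        ordered = sorted(objects)
        for other in ordered[1:]:
            same.add((ordered[0], other))
    return same


types = infer_types(triples, domain, range_)
equalities = infer_same_as(triples, functional)
```

Under these assumptions, `types` contains the statement that Lynda is female, and `equalities` contains the pair (Hazel, Lynda), matching the slide's conclusion "Lynda = Hazel".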

Editor's Notes

  • @TODO@: do a slide on data-integration at NXP. @TODO@: find a slide on RDF-a in Walmart etc.
  • @TODO@: do a slide on data-integration at NXP. @TODO@: find a slide on RDF-a in Walmart etc. @TODO@: replace company names with logos?
  • @@ Add: trust. @@ Add: noisy data (inconsistent, misleading, incomplete)
  • Suggests letting a thousand ontologies blossom, with lots of connections between lots of datasets.
  • Some known information laws already apply: Zipf's law / long-tail distributions are everywhere (= the vast majority of occurrences are caused by a small minority of items). This phenomenon is sometimes a blessing, sometimes a curse: nice for compression, awful for load balancing. Knowing the law helps us deal with the phenomenon; that’s why it’s worth trying to discover these laws.
  • @add another long-tail example@ (e.g. in-degree?)
  • Physical distribution doesn’t work: the web is not a database (and never will be). @@ADD: even worse for long tail
  • Compare to physics laws: gravity (F = G m_1 m_2 / r^2), conservation of energy (dE/dt = 0), increase of entropy (dS/dt ≥ 0). We cannot yet hope for such beautifully mathematised laws, in such a concise language that fits in a very compact space. Computer science is like alchemy, a "protoscience".
  • Some known information laws already apply: Zipf's law / long-tail distributions are everywhere (= the vast majority of occurrences are caused by a small minority of items). This phenomenon is sometimes a blessing, sometimes a curse: nice for compression, awful for load balancing. Knowing the law helps us deal with the phenomenon; that’s why it’s worth trying to discover these laws.
  • This only works because terminologies are in general only simple hierarchies. (It’s easy to build examples where this doesn’t hold, but in practice it turns out to hold.) So, this law depends on the previous law. As an aside: the graph is now big enough to do statistics on it.
  • Use “complexity” as a measure, not just “size”. Spell out LLD; don’t break FactForge.
  • Semantic Web = engineering enterprise. This talk = what are the scientific observations/facts/theories after 10 years? What are the big CS (or: KR?) lessons we can learn from a decade of SemWeb? (= regard SemWeb adoption as a giant laboratory for CS laws.) Did we learn any science? (And of course: will the laws be specific to SemWeb? Hopefully not. Hopefully they are generic laws about the structure and behaviour of information!)
  • A gazillion new open questions. Don’t just try to build things, also try to understand things. Don’t just ask how, also ask why.
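
The long-tail remark in the notes above can be illustrated with a small sketch. It assumes a textbook Zipf form, frequency = scale / rank^exponent, which is not necessarily the exact formula shown on slide 36. The data is synthetic, and the function names are hypothetical.

```python
import math


def zipf_frequencies(n_items, exponent=1.0, scale=1000.0):
    """Synthetic frequencies following frequency = scale / rank**exponent."""
    return [scale / (rank ** exponent) for rank in range(1, n_items + 1)]


def loglog_slope(freqs):
    """Least-squares slope of log(frequency) against log(rank)."""
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


def head_share(freqs, fraction=0.1):
    """Share of all occurrences produced by the most frequent items."""
    ordered = sorted(freqs, reverse=True)
    head = ordered[:max(1, int(len(ordered) * fraction))]
    return sum(head) / sum(ordered)


freqs = zipf_frequencies(1000)
slope = loglog_slope(freqs)      # close to -exponent for ideal Zipf data
share = head_share(freqs, 0.1)   # occurrences covered by the top 10% of items
```

On real data the slope is only an estimate, but a roughly straight line in log-log space is the usual first check for a long tail. A large `share` is what makes such data pleasant to compress and awkward to load-balance.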
