Nuxeo World Session: Semantic Technologies - Update on Recent Research

Presentation from Nuxeo World 2010 (November 17-18, 2010).

Transcript

  • 1. Nov. 17 2010 - S. Fermigier & O. Grisel, Nuxeo Towards semantic ECM: report on the IKS and Scribo projects Monday, November 22, 2010
  • 2. Outline
      • Introduction to semantic technologies
      • Collaborative R&D within the Scribo and IKS projects
      • Fise & Apache Stanbol / Nuxeo Integration
  • 3. 1. Introduction to semantic technologies
  • 4. Illustration source: Mills Davis, “Semantic Social Computing”, sept. 2007
  • 5. Photo source: http://www.flickr.com/photos/pixelydixel/
  • 6. Invented the web in 1989 (yeah!) Photo source: http://www.flickr.com/photos/pixelydixel/
  • 7. Invented the web in 1989 (yeah!) Invented the semantic web in 1999 (duh?) Photo source: http://www.flickr.com/photos/pixelydixel/
  • 8. Historical perspective
      • From web 1.0: web of pages, aka the World Wide Web
      • To web 2.0: web of people and of participation, aka the Social Web
      • To web 3.0: web of data, of meaning and of connected knowledge, aka the Semantic Web
  • 9. Picture source: http://www.flickr.com/photos/pixelydixel/
  • 10. (no text on slide)
  • 11. (no text on slide)
  • 12. (no text on slide)
  • 13. A “layer cake” of technologies
  • 14. Linked Online Data in 2007. “Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”
  • 15. 2008. “Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”
  • 16. 2009. “Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”
  • 17. 2010. “Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”
  • 18. Good for Enterprise apps too! Diagram source: http://www.w3.org/2007/Talks/0130-sb-W3CTechSemWeb/
  • 19. Key Enablers
      • Open Data and Linked Online Data
      • Advances in automatic content analysis (linguistics, image processing)
      • Computing power (Moore’s law + MapReduce)
      • Classical logic and classical AI
  • 20. The technologies and data are available, let’s put them to use!
  • 21. Semantic ECM (diagram labels)
      • Content: Text, Sound, Image, Video
      • Meaning: Metadata, Tags, Entities, Relations, Reasoning
  • 22. Goals for Semantic ECM (& Nuxeo)
      • Repurpose existing content
      • Improve search and collaboration
      • Make information contextual
      • Extract and use information from your content
      • Make your content smarter!
  • 23. Challenges
      • Extract meaning from content
      • Enrich content with knowledge
      • Enhance interaction with content thanks to added meaning
  • 24. Business value from semantic ECM
      • Efficiency gains: 20% to 90% (ex: in search, collaboration)
      • Effectiveness gains: better returns from your assets (ex: news and images from AFP)
      • Strategic edge: growth, value capture, new services, gain unfair strategic advantage (ex: vertical ontologies for CEVAs / CCAs)
  • 25. 2. SCRIBO and IKS
  • 26. Scribo
      • Project under the French FUI program, with 9 partners and a budget of 4.7 M€
      • Goal: to develop algorithms and collaborative tools for extracting knowledge from unstructured documents and images
      • Started in 2008, finishing in Dec. 2010, with results already integrated as a Nuxeo plugin
  • 27. IKS
      • European project under the FP7, with 13 partners (6 SMEs) and an 8.5 M€ budget
      • Goal: create a semantic software “stack” that will be used by CMS vendors to add semantic features to their products
      • Started in Jan. 2009, will last until Dec. 2012
      • First tangible result: FISE, already integrated in a Nuxeo plugin
  • 28. 3. Linking Semantic Entities: Apache Stanbol / Nuxeo integration
  • 29. What are entities?
  • 30. (no text on slide)
  • 31. What is wrong with tags?
      • Many terms for same meaning
          • NYC, New York, New York City
      • Many meanings for same terms
      • Need context to remove any ambiguity
  • 32. Washington is...
  • 33. Tagging with Entities
      • Global namespace / universal meaning context
      • Interoperability across domains
      • Interoperability across applications
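The contrast between free-text tags and entities on slides 31 to 33 can be illustrated with a small sketch. It is not taken from the talk: the lookup table is hypothetical, the DBpedia URIs are chosen only for illustration, and a real system would resolve candidates against a Linked Data index rather than a hard-coded dictionary.

```python
# Hypothetical lookup table: free-text tag -> candidate entity URIs.
# The URIs are illustrative assumptions, not data from the presentation.
CANDIDATES = {
    "nyc": ["http://dbpedia.org/resource/New_York_City"],
    "new york city": ["http://dbpedia.org/resource/New_York_City"],
    "new york": [
        "http://dbpedia.org/resource/New_York_City",
        "http://dbpedia.org/resource/New_York_(state)",
    ],
    "washington": [
        "http://dbpedia.org/resource/Washington,_D.C.",
        "http://dbpedia.org/resource/Washington_(state)",
        "http://dbpedia.org/resource/George_Washington",
    ],
}


def candidates(tag):
    """Return the candidate entity URIs for a tag (empty list if unknown)."""
    return CANDIDATES.get(tag.strip().lower(), [])


def is_ambiguous(tag):
    """A tag is ambiguous when it maps to more than one entity."""
    return len(candidates(tag)) > 1


def share_entity(tag_a, tag_b):
    """Two different tags can denote the same thing if they share a URI."""
    return bool(set(candidates(tag_a)) & set(candidates(tag_b)))


if __name__ == "__main__":
    print(share_entity("NYC", "New York City"))  # many terms, same meaning
    print(is_ambiguous("Washington"))            # same term, many meanings
```

Storing the URI instead of the string is what gives the "global namespace" of slide 33: two applications that both record the same URI agree on the meaning even if their users typed different labels.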
  • 34. Demo time! Screencast online at http://blogs.nuxeo.com/dev
  • 35. How does this work?
  • 36. (no text on slide)
  • 37. 
      • Open Source Semantic Engine
      • HTTP Services
      • For content driven applications
      • OSGi: loosely coupled components
      • Analysis Engines
      • Knowledge RDF vocabularies
  • 38. What is a semantic engine?
      • Unstructured content => Knowledge
      • Language guessing
      • Topic classification (Business, Sports, Media, ...)
      • Named Entities extraction and linking
      • Relationships and properties extraction
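As a rough illustration of "unstructured content => knowledge" through a chain of loosely coupled analysis engines, here is a toy sketch. It makes no claim about how FISE or Apache Stanbol are implemented: `ContentItem`, `language_engine`, `capitalized_engine` and `analyze` are hypothetical names, and the heuristics (stopword counting, runs of capitalized words) are deliberately naive stand-ins for real language guessing and named entity extraction.

```python
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    """Text plus the enhancements that engines attach to it."""
    text: str
    enhancements: list = field(default_factory=list)


# Tiny illustrative stopword lists (an assumption, far from complete).
STOPWORDS = {
    "en": {"the", "at", "in", "works", "and"},
    "fr": {"le", "la", "et", "chez", "travaille"},
}


def language_engine(item):
    """Guess the language by counting stopword hits per language."""
    words = {w.strip(".,;:!?").lower() for w in item.text.split()}
    scores = {lang: len(words & sw) for lang, sw in STOPWORDS.items()}
    best = max(scores, key=scores.get)
    item.enhancements.append({"engine": "language", "value": best})


def capitalized_engine(item):
    """Naive entity candidates: maximal runs of capitalized words."""
    run = []
    for token in item.text.split() + [""]:  # empty sentinel flushes the last run
        word = token.strip(".,;:!?")
        if word[:1].isupper():
            run.append(word)
        elif run:
            item.enhancements.append({"engine": "entities", "value": " ".join(run)})
            run = []


def analyze(text, engines=(language_engine, capitalized_engine)):
    """Run each engine in turn; engines only communicate through the item."""
    item = ContentItem(text)
    for engine in engines:
        engine(item)
    return item


if __name__ == "__main__":
    for enhancement in analyze("John Smith works at Smith Consulting in Paris.").enhancements:
        print(enhancement)
```

The point of the design is that each engine only reads the content item and appends to it, so engines can be added, removed or reordered without knowing about each other.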
  • 39. (no text on slide)
  • 40. (no text on slide)
  • 41. RESTful is Beautiful
  • 42. curl -X POST -H "Accept: application/json" -H "Content-type: text/plain" \
          --data "John Smith works at Smith Consulting in Paris." \
          http://fise.demo.nuxeo.com/engines

        {
          "urn:enhancement-1564680b-861c-df6f-fdf9-d34a75d68dfe": {
            "http://fise.iks-project.eu/ontology/selected-text": [
              {
                "datatype": "http://www.w3.org/2001/XMLSchema#string",
                "type": "literal",
                "value": "Paris"
              }
            ],
            "http://fise.iks-project.eu/ontology/selection-context": [
              {
                "datatype": "http://www.w3.org/2001/XMLSchema#string",
                "type": "literal",
                "value": "John Smith works at Smith Consulting Paris."
              }
            ],
            "http://purl.org/dc/terms/type": [
              {
                "type": "uri",
                "value": "http://dbpedia.org/ontology/Place"
              }
            ]
          },
          …
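A client would typically flatten a response like the one on slide 42 into something easier to use. The sketch below works offline on a copy of the slide's fragment, closed with the missing braces so that it parses (the slide truncates the output with "…"). It assumes the layout visible on the slide: enhancement URN, then property URI, then a list of typed values. `first_value` and `summarize` are hypothetical helper names, not part of any FISE or Stanbol client library.

```python
import json

# Property URIs copied from the slide.
SELECTED_TEXT = "http://fise.iks-project.eu/ontology/selected-text"
DC_TYPE = "http://purl.org/dc/terms/type"

# The fragment shown on slide 42, closed so that it is valid JSON.
SAMPLE = json.loads(r"""
{
  "urn:enhancement-1564680b-861c-df6f-fdf9-d34a75d68dfe": {
    "http://fise.iks-project.eu/ontology/selected-text": [
      {"datatype": "http://www.w3.org/2001/XMLSchema#string",
       "type": "literal", "value": "Paris"}
    ],
    "http://fise.iks-project.eu/ontology/selection-context": [
      {"datatype": "http://www.w3.org/2001/XMLSchema#string",
       "type": "literal",
       "value": "John Smith works at Smith Consulting Paris."}
    ],
    "http://purl.org/dc/terms/type": [
      {"type": "uri", "value": "http://dbpedia.org/ontology/Place"}
    ]
  }
}
""")


def first_value(properties, key):
    """Return the first value stored under a property URI, or None."""
    values = properties.get(key, [])
    return values[0]["value"] if values else None


def summarize(enhancements):
    """Turn the URN -> properties mapping into a list of flat records."""
    rows = []
    for urn, properties in enhancements.items():
        text = first_value(properties, SELECTED_TEXT)
        if text is None:
            continue  # skip enhancements that do not select any text
        rows.append({
            "urn": urn,
            "text": text,
            "type": first_value(properties, DC_TYPE),
        })
    return rows


if __name__ == "__main__":
    for row in summarize(SAMPLE):
        print(row["text"], "is a", row["type"])
```

Against a live service, the same `summarize` function could be applied to the decoded body of the POST request shown in the curl command; the demo URL from the slide may no longer be reachable, so the example deliberately avoids any network call.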
  • 43. (no text on slide)
  • 44. (no text on slide)
  • 45. = fise + fast Linked Data local index + semantic rule engine + more ?
  • 46. Apache Stanbol / Nuxeo integration
  • 47. Architecture diagram (labels)
      • Nuxeo DM addon
      • Apache Stanbol: Engine 1, Engine 2, Engine 3
      • DBpedia, Freebase, Geonames
      • LDAP
      • Local IT infrastructure (LAN)
      • Numbered steps 1, 2, 3
  • 48. 
      • Implemented as an Operation for Studio
      • Entities & Relationships stored in Nuxeo Core
      • CMIS interoperability
  • 49. Soon available on marketplace.nuxeo.com
  • 50. Questions?
      • http://iks-project.eu
      • http://fise.demo.nuxeo.com
      • http://scribo.ws
      • http://incubator.apache.org/stanbol
      • http://blogs.nuxeo.com/dev