Nuxeo World Session: Semantic Technologies - Update on Recent Research
Nov. 17 2010 - S. Fermigier & O. Grisel, Nuxeo
Towards semantic ECM:
report on the IKS and Scribo projects
Monday, November 22, 2010
Outline
• Introduction to semantic technologies
• Collaborative R&D within the Scribo and
IKS projects
• Fise & Apache Stanbol / Nuxeo Integration
1. Introduction to semantic
technologies
Illustration source: Mills Davis, “Semantic Social Computing”, Sept. 2007
Photo source: http://www.flickr.com/photos/pixelydixel/
Invented the web in 1989
(yeah!)
Invented the semantic
web in 1999 (duh?)
Historical perspective
• From web 1.0: web of pages, aka the
World Wide Web
• To web 2.0: web of people and of
participation, aka the Social Web
• To web 3.0: web of data, of meaning
and of connected knowledge, aka the
Semantic Web
Picture source: http://www.flickr.com/photos/pixelydixel/
A “layer cake” of
technologies
Linked Open Data in 2007
“Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”
2008
2009
2010
Good for Enterprise apps too!
Diagram source: http://www.w3.org/2007/Talks/0130-sb-W3CTechSemWeb/
Key Enablers
• Open Data and Linked Open Data
• Advances in automatic content analysis
(linguistics, image processing)
• Computing power (Moore’s law +
MapReduce)
• Classical logic and classical AI
The technologies and data
are available,
let’s put them to use!
Semantic ECM
[Diagram: Content (Text, Sound, Image, Video) → Metadata, Tags, Entities, Relations, Reasoning → Meaning]
Goals for Semantic ECM
(& Nuxeo)
• Repurpose existing content
• Improve search and collaboration
• Make information contextual
• Extract and use information from your
content
• Make your content smarter!
Challenges
• Extract meaning from content
• Enrich content with knowledge
• Enhance interaction with content thanks to
added meaning
Business value
from semantic ECM
• Efficiency gains: 20% to 90% (e.g. in
search, collaboration)
• Effectiveness gains: better returns from
your assets (e.g. news and images from AFP)
• Strategic edge: growth, value capture,
new services, an unfair strategic advantage
(e.g. vertical ontologies for CEVAs / CCAs)
2. SCRIBO and IKS
Scribo
• Project under the French FUI program, with
9 partners and a budget of 4.7 M€
• Goal: to develop algorithms and collaborative
tools for extracting knowledge from
unstructured documents and images
• Started in 2008, finishing in Dec. 2010, with
results already integrated as a Nuxeo plugin
IKS
• European project under FP7, with 13
partners (6 SMEs) and an 8.5 M€ budget
• Goal: create a semantic software “stack” that
will be used by CMS vendors to add semantic
features to their products
• Started in Jan. 2009, will last until Dec. 2012
• First tangible result: FISE, already integrated
in a Nuxeo plugin
What is wrong with tags?
• Many terms for same meaning
• NYC, New York, New York City
• Many meanings for same terms
• Need context to remove any ambiguity
Washington is...
Tagging with Entities
• Global namespace / universal meaning context
• Interoperability across domains
• Interoperability across applications
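A minimal sketch of the idea behind the last two slides: several tags collapse onto one entity identifier, and an ambiguous tag is resolved from its context. The lookup table, the context hint words and the function name `resolve_tag` are hypothetical, and the DBpedia URIs only illustrate what a global namespace looks like; this is not how the Nuxeo plugin is implemented.

```python
import re

# Hypothetical lookup table: tag text -> candidate entities, each with
# a few context words that hint at that particular meaning.
CANDIDATES = {
    "nyc": [("http://dbpedia.org/resource/New_York_City", set())],
    "new york": [("http://dbpedia.org/resource/New_York_City", set())],
    "new york city": [("http://dbpedia.org/resource/New_York_City", set())],
    "washington": [
        ("http://dbpedia.org/resource/Washington,_D.C.", {"capital", "congress"}),
        ("http://dbpedia.org/resource/Washington_(state)", {"seattle", "state"}),
        ("http://dbpedia.org/resource/George_Washington", {"president", "general"}),
    ],
}


def resolve_tag(tag, context=""):
    """Map a free-text tag to one entity URI, using context words to disambiguate."""
    candidates = CANDIDATES.get(tag.strip().lower())
    if not candidates:
        return None
    context_words = set(re.findall(r"[a-z]+", context.lower()))
    # Score each candidate by how many of its hint words occur in the context.
    # max() keeps the first candidate when scores are tied.
    best_uri, _hints = max(candidates, key=lambda c: len(c[1] & context_words))
    return best_uri


# Many terms, same meaning: both tags resolve to the same identifier.
print(resolve_tag("NYC") == resolve_tag("New York City"))
# Same term, many meanings: the context decides.
print(resolve_tag("Washington", "cheap flights to Seattle"))
print(resolve_tag("Washington", "the first president"))
```

Because every application that uses the same identifiers means the same thing by them, entity tags can be exchanged across domains and applications in a way free-text tags cannot.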
Demo time!
Screencast online at http://blogs.nuxeo.com/dev
How does this work?
FISE
• Open Source Semantic Engine
• HTTP Services
• For content driven applications
• OSGi: loosely coupled components
• Analysis Engines
• Knowledge RDF vocabularies
What is a semantic engine?
• Unstructured content => Knowledge
• Language guessing
• Topic classification (Business, Sports, Media, ...)
• Named Entities extraction and linking
• Relationships and properties extraction
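To make the list above concrete, here is a toy sketch of a chain of loosely coupled analysis engines, each turning the same unstructured text into one piece of knowledge. Everything in it is an assumption made for illustration: the word lists, the regular expression and the names `guess_language`, `classify_topic`, `extract_entities` and `enhance` are hypothetical stand-ins, not the engines shipped with FISE, which rely on real linguistic models.

```python
import re

# Hypothetical, deliberately tiny vocabularies.
STOPWORDS = {
    "en": {"the", "at", "in", "and", "is", "works"},
    "fr": {"le", "la", "et", "est", "dans", "chez"},
}
TOPIC_KEYWORDS = {
    "Business": {"consulting", "company", "market"},
    "Sports": {"match", "team", "goal"},
}


def best_match(text, table):
    """Return the label whose vocabulary overlaps most with the text, or None."""
    found = set(re.findall(r"\w+", text.lower()))
    scores = {label: len(found & vocab) for label, vocab in table.items()}
    label = max(scores, key=scores.get)
    return label if scores[label] > 0 else None


def guess_language(text):
    return best_match(text, STOPWORDS)


def classify_topic(text):
    return best_match(text, TOPIC_KEYWORDS)


def extract_entities(text):
    # Naive rule: a run of capitalized words is a candidate named entity.
    # It will also pick up ordinary sentence-initial words.
    return re.findall(r"[A-Z][a-z]+(?: [A-Z][a-z]+)*", text)


# Each engine is independent; adding one means adding one entry here.
ENGINES = {
    "language": guess_language,
    "topic": classify_topic,
    "entities": extract_entities,
}


def enhance(text):
    """Run every engine on the text and collect the results by engine name."""
    return {name: engine(text) for name, engine in ENGINES.items()}


print(enhance("John Smith works at Smith Consulting in Paris."))
```

Linking the extracted names to entities, and extracting relationships between them, would be further engines in the same chain.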
RESTful is Beautiful
curl -X POST \
     -H "Accept: application/json" \
     -H "Content-type: text/plain" \
     --data "John Smith works at Smith Consulting in Paris." \
     http://fise.demo.nuxeo.com/engines

{
  "urn:enhancement-1564680b-861c-df6f-fdf9-d34a75d68dfe": {
    "http://fise.iks-project.eu/ontology/selected-text": [
      {
        "datatype": "http://www.w3.org/2001/XMLSchema#string",
        "type": "literal",
        "value": "Paris"
      }
    ],
    "http://fise.iks-project.eu/ontology/selection-context": [
      {
        "datatype": "http://www.w3.org/2001/XMLSchema#string",
        "type": "literal",
        "value": "John Smith works at Smith Consulting in Paris."
      }
    ],
    "http://purl.org/dc/terms/type": [
      {
        "type": "uri",
        "value": "http://dbpedia.org/ontology/Place"
      }
    ]
  },
…
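The response is an RDF graph serialized as JSON: one key per subject (here an enhancement), then one key per predicate, each holding a list of values. The sketch below walks that structure and lists the selected text and its type for each enhancement. It works on a trimmed copy of the excerpt shown on the slide rather than calling the demo server, and the helper names `first_value` and `summarize_enhancements` are hypothetical.

```python
import json

SELECTED_TEXT = "http://fise.iks-project.eu/ontology/selected-text"
DC_TYPE = "http://purl.org/dc/terms/type"

# Trimmed copy of the excerpt shown on the slide; the elided part of the
# real response is not reproduced here.
SAMPLE = json.loads("""
{
  "urn:enhancement-1564680b-861c-df6f-fdf9-d34a75d68dfe": {
    "http://fise.iks-project.eu/ontology/selected-text": [
      {"datatype": "http://www.w3.org/2001/XMLSchema#string",
       "type": "literal",
       "value": "Paris"}
    ],
    "http://purl.org/dc/terms/type": [
      {"type": "uri", "value": "http://dbpedia.org/ontology/Place"}
    ]
  }
}
""")


def first_value(properties, predicate):
    """Return the first value recorded for a predicate, or None if absent."""
    values = properties.get(predicate, [])
    return values[0]["value"] if values else None


def summarize_enhancements(graph):
    """List (selected text, type URI) pairs for subjects that select some text."""
    rows = []
    for _subject, properties in graph.items():
        text = first_value(properties, SELECTED_TEXT)
        if text is None:
            continue  # not a text annotation
        rows.append((text, first_value(properties, DC_TYPE)))
    return rows


for text, type_uri in summarize_enhancements(SAMPLE):
    print(text, "->", type_uri)
```

With a live server, the same function could be applied to the parsed body of the HTTP response instead of `SAMPLE`.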
Apache Stanbol = FISE + fast Linked Data local index + semantic rule engine + more?
Apache Stanbol / Nuxeo
integration
[Architecture diagram, labeled "Local IT infrastructure (LAN)": Nuxeo DM with the addon, Apache Stanbol (Engine 1, Engine 2, Engine 3), and the knowledge sources DBpedia, Freebase, Geonames and LDAP; steps numbered 1 to 3]
• Implemented as an Operation for Studio
• Entities & Relationships stored in Nuxeo Core
• CMIS interoperability
Soon available on
marketplace.nuxeo.com