Content Repositories vs Knowledge Bases

Transcript

  • 1. Content Repositories vs. Knowledge Bases… 12 November, 2009, Gokce Banu Laleci, SRDC
  • 2. Outline
    • Content Management Systems and Content Repositories
    • Strengths of Ontologies, Reasoners, Knowledge Bases
    • Possible Synergy: How semantic web tools can be exploited by CMSs…
    • Possible architecture…
    (c) Interactive Knowledge 2009-2012 Slide
  • 3. Content Management Systems
    • A content management system is designed to support a content management cycle:
      • creation and collection of content
      • the publication of content for access by users and/or other systems
      • the management of this content
    • Content Repository: a high-level information management system that is a superset of traditional data repositories and implements “content services”:
      • author based versioning
      • full textual searching
      • fine grained access control
      • content categorization
      • content event monitoring
    • Content repositories are implemented on top of:
      • RDBMS
      • File Systems
      • XML DBs
      • … .
  • 4. How Content is Structured in Content Repositories
    • JSR-170 : Java Content Repository API
    • Content Management Interoperability Services (CMIS)
    • JCR model (figure): a Repository holds Items; an Item is either a Node or a Property. A Node has at most one parent (0..1) and any number of child Nodes and Properties.
      • Example tree: Root Node, News, and two Article nodes
        • Article: Title = Genetic Clues to Eating Disorders; Author: John Adams; Content: Attachment
        • Article: title = Anorexic says man need more help; Author: Frank Smith; Content: Attachment
      • Node Type: name, supertypes, mixin status, orderable child nodes, primary item name
      • Child Definitions: name, required primary node types, default primary node type, auto-created, mandatory, on-parent version, protected, same-name siblings
      • Property Definitions: name, type, value constraints, default value, mandatory, protected, multiple values, on-parent version, auto-created
    • CMIS model (figure): a Repository holds Objects; an Object is a Document, Folder, Relationship or Policy. Objects carry Properties, a Document carries Content, and a Relationship has a source and a target.
      • Object Type: type id, parent, abstract, queryable, controllable
      • Document Object Type: versionable, allow content
      • Folder Object Type
      • Relationship Object Type: allowed source/target types
      • Policy Object Type
      • Property Type: property id, type, required, default value
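    The JCR side of the figure (a tree of nodes, each carrying named properties) can be sketched roughly as follows. This is an illustrative Python data structure only, not the JSR-170 API; the class name `Node` and the helpers `add_child` and `path` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a repository node: name, type, properties, children."""
    name: str
    primary_type: str = "nt:unstructured"
    properties: Dict[str, str] = field(default_factory=dict)
    # The parent link is excluded from repr/compare to avoid walking the tree in circles.
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)
    children: List["Node"] = field(default_factory=list)

    def add_child(self, child: "Node") -> "Node":
        """Attach a child node and return it, so calls can be chained."""
        child.parent = self
        self.children.append(child)
        return child

    def path(self) -> str:
        """Absolute path of this node; the root node is '/'."""
        if self.parent is None:
            return "/"
        prefix = self.parent.path()
        return prefix + self.name if prefix == "/" else prefix + "/" + self.name


# Rebuild a fragment of the example tree from the slide.
root = Node("")
news = root.add_child(Node("News"))
article = news.add_child(
    Node(
        "Article",
        properties={
            "Title": "Genetic Clues to Eating Disorders",
            "Author": "John Adams",
        },
    )
)
print(article.path())
```

    Metadata in this model lives in two places: in the position of a node in the tree, and in its properties.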
  • 5. How Metadata is added, Supported search methods..
    • Metadata
      • Organizing the content as hierarchies
      • Through properties/parameters of nodes/objects/documents
        • Free-format values, or values selected from a constrained vocabulary (which can be a taxonomy)
          • Can be used as content categories
        • By representing relationships between nodes/objects/documents
      • Taxonomies can be represented as tag hierarchies (as a hierarchy of nodes…)
      • Node/Object/Document types
        • XML Schemas
    • Search
      • Full-text Search
        • Lucene, SOLR, Text indices in databases
      • Field-based searches
      • Structured Query methods over Repository Data Model
        • SQL-based, XPath based
      • Synonym check
        • Through a list
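    A field-based search with a list-driven synonym check might look roughly like the sketch below. The synonym list, the document fields and the function names are all made up for illustration; a real CMS would typically delegate this to Lucene, Solr or a database text index.

```python
from typing import Dict, List, Set

# Hypothetical synonym list: each term maps to the terms treated as equivalent.
SYNONYMS: Dict[str, Set[str]] = {
    "illness": {"disease", "sickness"},
    "disease": {"illness", "sickness"},
}


def expand(term: str) -> Set[str]:
    """Return the term together with its listed synonyms."""
    term = term.lower()
    return {term} | SYNONYMS.get(term, set())


def field_search(docs: List[Dict[str, str]], field_name: str, term: str) -> List[str]:
    """Return ids of documents whose given field contains the term or a synonym."""
    wanted = expand(term)
    hits = []
    for doc in docs:
        words = set(doc.get(field_name, "").lower().split())
        if words & wanted:
            hits.append(doc["id"])
    return hits


docs = [
    {"id": "a1", "title": "Genetic clues to a rare disease"},
    {"id": "a2", "title": "Markets rally after rate cut"},
    {"id": "a3", "title": "Coping with chronic illness"},
]
print(field_search(docs, "title", "illness"))  # ['a1', 'a3']
```

    The list only captures what was typed into it: the search cannot conclude anything that is not stated term by term.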
  • 6. Strength of Semantic Technologies [1]
    • An ontology is an engineering artifact consisting of:
      • A vocabulary used to describe (a particular view of) some domain
      • An explicit specification of the intended meaning of the vocabulary.
        • Almost always includes how concepts should be classified
      • Constraints capturing additional knowledge about the domain
        • Through rules
    • Ideally, an ontology should:
      • Capture a shared understanding of a domain of interest
      • Provide a formal and machine manipulable model of the domain
    • Aims at “machine understanding”
      • Understanding is closely related to reasoning
      • Recognising semantic similarity in spite of syntactic differences
      • Recognising implicit consequences given explicitly stated facts
    • An ontology together with a set of instances of its classes constitutes a knowledge base
  • 7. Examples…
    • A. Content repository: Workspace1 holds NewsSubjectCodes nodes (Health, Economy Business Finance, Disaster/Accident, Education, Disease, HealthTreatment, Illness, Cancer, ViralDiseases) and NewsArticles (Article1, Article2, Article3); each article is linked to a subject code by classifiedBy
    • B. A part of the extracted ontology: NewsSubjectCodes, ArtsCultureEntertainment, DisasterAccident, EconomyBusinessFinance, Education, EnvironmentalIssues, Health, HealthTreatment, Illness, ViralDisease, Cancer, …, Medicine, SocialIssues, Disease; individuals such as SwineFlu are attached by instanceOf
    • Rule: if a Disease isCausedBy a PathogenicAgent, then it is an InfectiousDisease
    • Facts:
      • Virus is a PathogenicAgent
      • Fungi is a PathogenicAgent
      • ViralDisease isCausedBy Virus
    • Search: find me the “Health” related articles
      • Results: Article1, Article2, Article3 (due to subsumption relations in the ontology)
    • Search: find me the articles related to “Infectious Diseases”
      • Results: Article3
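    The two searches on this slide can be reproduced with a very small hand-written reasoner, sketched below. The rule and the facts are taken from the slide; the nesting of the subject codes and the assignment of articles to codes are assumptions, since the figure's layout did not survive. A real system would use an OWL reasoner or rule engine instead.

```python
from typing import Dict, List, Set

# Assumed parent links: the nesting under Health is a guess at the original figure.
SUBCLASS_OF: Dict[str, str] = {
    "HealthTreatment": "Health",
    "Illness": "Health",
    "ViralDisease": "Illness",
    "Cancer": "Illness",
}

# Facts stated on the slide.
PATHOGENIC_AGENTS: Set[str] = {"Virus", "Fungi"}
CAUSED_BY: Dict[str, str] = {"ViralDisease": "Virus"}

# Assumed classification of the three articles (not recoverable from the slide).
CLASSIFIED_BY: Dict[str, str] = {
    "Article1": "HealthTreatment",
    "Article2": "Cancer",
    "Article3": "ViralDisease",
}


def ancestors(concept: str) -> Set[str]:
    """All superclasses reachable through SUBCLASS_OF (subsumption)."""
    found: Set[str] = set()
    while concept in SUBCLASS_OF:
        concept = SUBCLASS_OF[concept]
        found.add(concept)
    return found


def infectious_diseases() -> Set[str]:
    """Apply the rule: caused by a pathogenic agent implies infectious disease."""
    return {d for d, agent in CAUSED_BY.items() if agent in PATHOGENIC_AGENTS}


def search(concept: str) -> List[str]:
    """Articles whose subject code is, or is inferred to be, a kind of `concept`."""
    infectious = infectious_diseases()
    hits = []
    for article, code in sorted(CLASSIFIED_BY.items()):
        types = {code} | ancestors(code)
        if types & infectious:
            types.add("InfectiousDisease")
        if concept in types:
            hits.append(article)
    return hits


print(search("Health"))             # ['Article1', 'Article2', 'Article3']
print(search("InfectiousDisease"))  # ['Article3']
```

    Neither result is stored anywhere: the first follows from subsumption, the second from applying the rule to the stated facts.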
  • 8. How Semantic Technologies can be exploited by CMSs…
  • 9. Approaches for semantically enabled content management
    • Semantic / Ontology enabled Web Portals [2, 3]
      • Using ontologies as a backbone of Web portals
        • Designing the Schema based on Ontologies
        • Ontology enabled Data Collection
        • Ontology based Navigating
        • Ontology based Search mechanisms supported through reasoning
  • 10. Approaches for semantically enabled content management
    • Semantic Wikis [4, 5]
      • Ontology enabled links
      • Ontology enabled enhanced search and browsing
    • Semantic / Ontology enabled CMSs [6, 7]
      • Developing a domain Ontology
      • Ontology assisted content creation
      • Ontology enabled navigation
      • Ontology integrated search
  • 11. How about already existing CMSs?
    • Content repositories already provide a certain amount of semantics for content items
      • Through content hierarchies, properties, taxonomies, node/object types…
      • However, these semantics are not “machine understandable”: they cannot be reasoned over…
    • There is a need for an “integrated semantic engineering method”
      • Enabling CMS developers to easily use the semantic functionality provided by ontologies and reasoners, without duplicating data and effort, and without a major change in their systems
  • 12. IKS Approach for Extracting the Semantics from CMSs as Ontologies
    • Node types / object types / document types can be automatically converted into OWL classes
      • Properties become object properties and datatype properties
      • Restrictions when necessary
      • Nodes of these nodetypes can be created as instances…
    • A similar approach has been provided for the Drupal system [8]
    • How about the semantics other than node/object types?
      • Links between content items
      • Taxonomies
      • Content hierarchies
    • IKS should provide a generic approach for a variety of different CMSs…
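    The automatic conversion of node types into OWL classes can be sketched as below. This is only an illustration of the idea, not the IKS implementation: the dictionary layout of the node type, the `Resource` supertype, the `target` key and the function name are assumptions, and the output is plain Turtle-style text rather than the result of an RDF library.

```python
from typing import Dict, List

# Hypothetical, simplified node type definition; real definitions carry more attributes.
HOTEL_TYPE: Dict = {
    "name": "HotelDescription",
    "supertypes": ["Resource"],
    "properties": [
        {"name": "facility", "type": "String"},
        {"name": "sisterHotel", "type": "Reference", "target": "HotelDescription"},
    ],
}

# Assumed mapping from repository value types to XML Schema datatypes.
XSD: Dict[str, str] = {
    "String": "xsd:string",
    "Long": "xsd:long",
    "Boolean": "xsd:boolean",
}


def node_type_to_turtle(node_type: Dict) -> List[str]:
    """Emit Turtle-style statements for one node type and its property definitions."""
    name = node_type["name"]
    lines = [f":{name} a owl:Class ."]
    for sup in node_type["supertypes"]:
        lines.append(f":{name} rdfs:subClassOf :{sup} .")
    for prop in node_type["properties"]:
        if prop["type"] == "Reference":
            # A reference to another node becomes an object property.
            lines.append(
                f":{prop['name']} a owl:ObjectProperty ; "
                f"rdfs:domain :{name} ; rdfs:range :{prop['target']} ."
            )
        else:
            # A literal-valued property becomes a datatype property.
            lines.append(
                f":{prop['name']} a owl:DatatypeProperty ; "
                f"rdfs:domain :{name} ; rdfs:range {XSD[prop['type']]} ."
            )
    return lines


for line in node_type_to_turtle(HOTEL_TYPE):
    print(line)
```

    Links between content items, taxonomies and content hierarchies are not covered by a type-level conversion like this one, which is the gap the questions above point at.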
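  • 13. (Figure: an example workspace and its RDF/XML representation)
    • Workspace1: node AsteriaTourismPortal with hotel nodes IbisHotel, Hilton, NovHotel, Sheroton
      • primaryType of the hotel nodes: HotelDescription
      • Example values: facility=“Pool”; sisterHotel links one hotel node to another
    • Node types shown: unstructured, File, Resource, HotelDescription (linked by supertypes)
      • propertyDefinition: Name=“sisterHotel”, with a requiredType
      • propertyDefinition: Name=“facility”
    • Resulting RDF/XML:
        <HotelDescription rdf:ID="Novotel">
          <sisterHotel rdf:resource="#IbisHotel"/>
          <facility rdf:datatype="&xsd;string">Pool</facility>
        </HotelDescription>
        ……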
  • 14. (Figure: the same workspace with a taxonomy attached)
    • Workspace1, node types and property definitions as on slide 13
    • Taxonomy nodes: TourismServicesClassification, FlightBooking, Tours, Hotel, 4StarHotel, 3StarHotel; a hotel node points to a taxonomy node through the property “type”
    • Hotel nodes represented as instances…
        <HotelDescription rdf:ID="Novotel">
          <sisterHotel rdf:resource="#IbisHotel"/>
          <facility rdf:datatype="&xsd;string">Pool</facility>
        </HotelDescription>
        <4StarHotel rdf:about="#Novotel"/>
        ……
    • Taxonomy nodes represented as classes…
        <owl:Class rdf:ID="TourismServicesClassification"/>
        <owl:Class rdf:ID="Hotel">
          <rdfs:subClassOf rdf:resource="#TourismServicesClassification"/>
        </owl:Class>
        <owl:Class rdf:ID="4StarHotel">
          <rdfs:subClassOf rdf:resource="#Hotel"/>
        </owl:Class>
        … ..
    • How can I know the semantics of “type”?
    • What if the property were “suitableFor”, bound to a classification of people?
        <HotelDescription rdf:ID="Novotel">
          <sisterHotel rdf:resource="#IbisHotel"/>
          <facility rdf:datatype="&xsd;string">Pool</facility>
          <suitableFor rdf:resource="#YoungCouples"/>
        </HotelDescription>
        <4StarHotel rdf:about="#Novotel"/>
        ……
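  • 15. Mapping GUI
    • Flex RIA working on the content repository model, which is read from the content repository through JCR
    • Produces a mapping definition made of bridges: ConceptBridge, PropertyBridge, SubsumptionBridge, InstanceBridge
    • The mapping definition is passed to the mapping engine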
  • 16. Mapping Engine
    • Inputs: the content repository (through JCR) and the mapping definition
    • Processors: ConceptBridge processor, SubsumptionBridge processor, InstanceBridge processor, PropertyBridge processors, enforced PropertyBridge processor
    • The processors issue JCR queries; the resulting OWL representation is stored in the IKS persistence store
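    One way to picture the mapping engine is as a dispatcher that hands each bridge in the mapping definition to a matching processor. The sketch below is an assumption about how such a dispatcher could look, not the IKS code: the slides name the bridge kinds but do not define them, so the `Bridge` fields and the triples each processor emits are hypothetical, and property bridges are left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class Bridge:
    kind: str         # bridge kind named on the slides: "concept", "subsumption", "instance"
    source: str       # hypothetical: name of a repository item
    target: str = ""  # hypothetical: ontology class the item relates to, if any


def concept_processor(bridge: Bridge) -> List[Triple]:
    """Declare the repository item as an OWL class."""
    return [(bridge.source, "rdf:type", "owl:Class")]


def subsumption_processor(bridge: Bridge) -> List[Triple]:
    """Declare the repository item as a subclass of the target class."""
    return [(bridge.source, "rdfs:subClassOf", bridge.target)]


def instance_processor(bridge: Bridge) -> List[Triple]:
    """Declare the repository item as an individual of the target class."""
    return [(bridge.source, "rdf:type", bridge.target)]


PROCESSORS: Dict[str, Callable[[Bridge], List[Triple]]] = {
    "concept": concept_processor,
    "subsumption": subsumption_processor,
    "instance": instance_processor,
}


def run_mapping(definition: List[Bridge]) -> List[Triple]:
    """Run every bridge through its processor and collect the emitted triples."""
    triples: List[Triple] = []
    for bridge in definition:
        triples.extend(PROCESSORS[bridge.kind](bridge))
    return triples


definition = [
    Bridge("concept", "Hotel"),
    Bridge("subsumption", "4StarHotel", "Hotel"),
    Bridge("instance", "Novotel", "4StarHotel"),
]
for triple in run_mapping(definition):
    print(triple)
```

    Keeping the mapping definition as data, separate from the processors, is what lets the same engine serve differently structured repositories.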
  • 17. Mapping Engine (synchronization)
    • Same processors as on the previous slide: ConceptBridge, SubsumptionBridge, InstanceBridge, PropertyBridge and enforced PropertyBridge processors
    • Instead of querying, the engine receives JCR observations from the content repository:
      • Node Added
      • Property Added
    • The changes are written to the IKS persistence store
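    The observation-driven synchronization can be sketched as a listener that turns repository events into knowledge base updates. The sketch below uses plain dictionaries as events and an in-memory set of triples; the event keys, the class name `KnowledgeBaseSync` and the example path are hypothetical and only mirror the two observation kinds named on the slide.

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]


class KnowledgeBaseSync:
    """Hypothetical listener that mirrors repository events into a set of triples."""

    def __init__(self) -> None:
        self.triples: Set[Triple] = set()

    def on_event(self, event: Dict[str, str]) -> None:
        """Handle one repository event; unknown event kinds are ignored."""
        if event["type"] == "NodeAdded":
            # A new node becomes an individual of the class for its node type.
            self.triples.add((event["path"], "rdf:type", event["node_type"]))
        elif event["type"] == "PropertyAdded":
            # A new property becomes a statement about the node it belongs to.
            self.triples.add((event["path"], event["name"], event["value"]))


sync = KnowledgeBaseSync()
sync.on_event(
    {
        "type": "NodeAdded",
        "path": "/AsteriaTourismPortal/Novotel",
        "node_type": "HotelDescription",
    }
)
sync.on_event(
    {
        "type": "PropertyAdded",
        "path": "/AsteriaTourismPortal/Novotel",
        "name": "facility",
        "value": "Pool",
    }
)
for triple in sorted(sync.triples):
    print(triple)
```

    Because only changes are pushed, the knowledge base stays current without re-extracting the whole repository.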
  • 18. Initially Envisioned Architecture
    • CMS with a content repository, accessed through JCR or CMIS
    • Semantic Extractor / Synchronization
    • Persistency Store, with a RESTful API (ontology administration, query)
    • Lifted Ontology, with a RESTful API + GUI
    • Ontology Lifting / Alignment (GUI + engine): Domain Ontology, Horizontal Ontology, Harmonized Ontology
    • Reasoners, rule engines; DBPedia, WordNet
    • Search building blocks: structured query, LuceneSAIL / LARQ, semantic similarity metrics, hybrid approach
    • Example horizontal application: RIA faceted search GUI
  • 19. Merging with External Domain Ontologies
    • News subject codes (extracted from the repository): NewsSubjectCodes, ArtsCultureEntertainment, DisasterAccident, EconomyBusinessFinance, Education, EnvironmentalIssues, Health, HealthTreatment, Illness, EatingDisorder, Obesity, Medicine, SocialIssues, Disease, Neurological Disease
    • MeSH biomedical ontology: MeSH, Anatomy, Diseases, Organisms, BehaviorMechanisms, Psychiatry, BehaviorDisciplines, MentalDisorders, AnxietyDisorders, EatingDisorders, SleepingDisorders, SomatoformDisorders
    • Concepts of the two ontologies are linked by equivalentTo; articles are attached to concepts by instanceOf
    • Example articles:
      • MotorNeuroneDiseaseGeneClue: “… Professor Christopher Shaw, from the Institute of Psychiatry at Kings College London, said…”
      • GeneticCluesToEatingDisorders: “… Doctors studying the causes of the eating disorders anorexia and bulimia believe it has less to do with media images of slim-figured models and more to do with biological and genetic factors…”
      • Article_ED
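    The effect of an equivalentTo link between the extracted ontology and MeSH can be sketched as follows: once the two EatingDisorder concepts are declared equivalent, a query phrased in MeSH terms reaches articles that were only ever classified with news subject codes. The `news:` and `mesh:` prefixes, the nesting of the concepts and the classification of the two articles are assumptions made for this illustration.

```python
from typing import Dict, List, Set, Tuple

# Assumed superclass links in both ontologies (the figure's nesting was lost).
SUBCLASS_OF: Dict[str, str] = {
    "news:EatingDisorder": "news:Illness",
    "news:NeurologicalDisease": "news:Disease",
    "mesh:EatingDisorders": "mesh:MentalDisorders",
    "mesh:MentalDisorders": "mesh:Psychiatry",
}

# Alignment between the two ontologies, as suggested by the equivalentTo link.
EQUIVALENT: List[Tuple[str, str]] = [("news:EatingDisorder", "mesh:EatingDisorders")]

# Assumed classification of the example articles.
INSTANCE_OF: Dict[str, str] = {
    "GeneticCluesToEatingDisorders": "news:EatingDisorder",
    "MotorNeuroneDiseaseGeneClue": "news:NeurologicalDisease",
}


def all_types(concept: str) -> Set[str]:
    """Concepts reachable from `concept` via superclass and equivalence links."""
    seen: Set[str] = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in SUBCLASS_OF:
            stack.append(SUBCLASS_OF[current])
        for left, right in EQUIVALENT:
            if current == left:
                stack.append(right)
            elif current == right:
                stack.append(left)
    return seen


def search(concept: str) -> List[str]:
    """Articles that are instances of `concept` in the merged ontology."""
    return sorted(a for a, c in INSTANCE_OF.items() if concept in all_types(c))


print(search("mesh:MentalDisorders"))  # ['GeneticCluesToEatingDisorders']
```

    Without the equivalence, the same query returns nothing, because no article is classified with a MeSH concept directly.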
  • 20. Exploiting DBPedia relationships
    • Articles tagged by dbp:Chancellorof_Germany (subject tags shown: Iptc:Politics, Iptc:Economy):
      • MerkelOffersStateAidForOpel: “… German Chancellor has given assurances that any investor in General Motors (GM) subsidiary Opel will have state support …”
      • UKsaysMerkelbacksFiscalBoost: “… And he added that German Chancellor was "fully engaged" with the European economic debate …”
    • Article tagged by dbp:Angela_Merkel and returned by Solr:
      • GermanyAgreesBadBankScheme: “… Reports have said that Angela Merkel's government wants to see this achieved before the summer recess starts in early July …”
    • DBPedia relationship shown between the two tags: dbprop:order
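    The idea on this slide, widening a search from one DBPedia entity to entities related to it, can be sketched as below. The tags per article follow the slide; the assumption that dbprop:order connects dbp:Angela_Merkel to dbp:Chancellorof_Germany, and the function and variable names, are mine.

```python
from typing import Dict, List, Set

# Entity tags per article, as shown on the slide.
TAGS: Dict[str, Set[str]] = {
    "MerkelOffersStateAidForOpel": {"dbp:Chancellorof_Germany"},
    "UKsaysMerkelbacksFiscalBoost": {"dbp:Chancellorof_Germany"},
    "GermanyAgreesBadBankScheme": {"dbp:Angela_Merkel"},
}

# Assumed to be derived from a dbprop:order statement in DBPedia.
RELATED: Dict[str, Set[str]] = {
    "dbp:Angela_Merkel": {"dbp:Chancellorof_Germany"},
}


def search(entity: str, use_relationships: bool = True) -> List[str]:
    """Articles tagged with the entity, optionally widened to related entities."""
    wanted = {entity}
    if use_relationships:
        wanted |= RELATED.get(entity, set())
    return sorted(article for article, tags in TAGS.items() if tags & wanted)


print(search("dbp:Angela_Merkel", use_relationships=False))
print(search("dbp:Angela_Merkel"))
```

    With the relationship switched off, only the article that names Angela Merkel is found; with it on, the two articles that mention only the “German Chancellor” are returned as well.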
  • 21. Future Plans
    • Current System is for JCR enabled content repositories
      • Once configured, the extracted metadata is automatically and continuously kept synchronized with the knowledge base
        • Based on observation mechanism
      • It can easily be extended for CMIS enabled content repositories
    • Future Plans
      • Propose RESTful interfaces
        • To present a dump of data and metadata to the knowledge base
        • To notify the knowledge base of updates, additions and deletions of data and metadata...
  • 22. Thank you, Questions… Contact Information: Gokce B. Laleci, Phd [email_address] Viewlets of Demonstration: http://www.srdc.com.tr/iks/screencast/
  • 23. References..
    • [1] Ian Horrocks, Ontology Reasoning: the Why and the How
    • [2] Y. Jin, S. Decker, G. Wiederhold, OntoWebber: Model-Driven Ontology-Based Web Site Management
    • [3] S. Staab, J. Angele, S. Decker, M. Erdmann, A. Hotho, A. Maedche, H. P. Schnurr, R. Studer, Y. Sure, Semantic community Web portals
    • [4] Max Völkel, Markus Krötzsch, Denny Vrandecic, Heiko Haller, Rudi Studer, Semantic Wikipedia
    • [5] Sebastian Schaffert, IkeWiki: A Semantic Wiki for Collaborative Knowledge Management
    • [6] Duc Minh Le, Lydia Lau, An Open Architecture for Ontology-Enabled Content Management Systems: A Case Study in Managing Learning Objects
    • [7] Roberto García, Juan Manuel Gimeno, Ferran Perdrix, Rosa Gil, Marta Oliva, The Rhizomer Semantic Content Management System
    • [8] Stephane Corlosquet, Renaud Delbru, Tim Clark, Axel Polleres, Stefan Decker, Produce and Consume Linked Data with Drupal
  • 24. The IKS Consortium (20.11.09)
    • Project lead and coordination: Salzburg Research, Wernher Behrendt
      • Salzburg Research Forschungsgesellschaft m.b.H., Jakob Haringer Straße 5/3, 5020 Salzburg, Austria
      • T +43.662.2288-409 | F +43.662.2288-222 | [email_address] | www.salzburgresearch.at
    • Deutsches Forschungsinstitut für Künstliche Intelligenz (DFKI)
    • Universität St. Gallen
    • Consiglio Nazionale delle Ricerche (CNR)
    • Software Quality Lab, Universität Paderborn
    • Software Research and Development Consultancy Ltd (SRDC)
    • Hochschule Furtwangen
    • Nuxeo Sa.
    • Alkacon Software GmbH
    • TXT Polymedia
    • Pisano Holding GmbH
    • Nemein Oy
    • Day Software AG