Work Together Effectively
Cross Media Concept and Entity Driven Search for Enterprise
Chalitha Perera and Dileepa Jayakody
R&D Engineers
•  Headquartered in London with an office in Colombo, Sri Lanka
•  Focused on delivering enterprise content management solutions
•  Our Skills
Zaizi R&D Department
•  Giving sense to the content  
–  Enriching it semantically
•  Adding value to ECM/CMS
–  More structured content, easy to manage, link and search
•  Improving search
–  Across different domains, data sources, User Experience
•  Machine Learning applied research
Agenda
•  Problem
•  Solution
•  Sensefy and MICO
•  Demo
•  Q&A
Problem
•  Unstructured Text Content
–  Text documents, PDFs, Word …
•  Rapid growth in multimedia content
•  Heterogeneous Data Sources
–  ECMs (Alfresco, SharePoint), File System, Confluence, JIRA …
•  Data is not useful without effective methods for
–  Knowledge Extraction
–  Information Retrieval
Current Enterprise Search Limitations
•  Limited to keyword-based search
•  Search context is not considered
•  Ambiguity of terms
•  Low precision
•  Inability to properly handle multimedia files
Desired Traits of the Solution
•  Semantically enhance documents
–  Unstructured text
–  Multimedia documents
•  Cross media search
•  Search with semantic concepts and entities
•  Federated Search
–  Search across different content repositories
–  User permissions
Sensefy
•  Semantic Enterprise Search Engine
•  Cross Media Search
•  Federated Search
•  Smart Search Assistance
•  Open Source
Sensefy Architecture
Repository Crawler
•  Four types of connectors
–  Repository Connectors
–  Authority Connectors
–  Transformation Connectors
–  Output Connectors
•  Connect different source repositories with different target indexes
–  Source repositories (Alfresco, SharePoint, Confluence, etc.)
–  Target indexes (Solr, Elasticsearch, Amazon CloudSearch)
•  Security Model to enforce source repository security policies
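The connector roles above can be sketched in a few lines. This is a minimal illustration, not Sensefy's actual crawler code: the `Document` class, the `crawl` and `visible` functions, and the `allow_tokens` field are hypothetical names chosen for the example. It assumes a security model in which each document carries the access tokens copied from its source repository, and results are filtered against the tokens an authority connector returns for the user.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Set


@dataclass
class Document:
    doc_id: str
    text: str
    # Assumption: access tokens copied from the source repository's ACLs.
    allow_tokens: Set[str] = field(default_factory=set)
    metadata: Dict[str, str] = field(default_factory=dict)


def crawl(
    source: Callable[[], Iterable[Document]],          # repository connector
    transforms: List[Callable[[Document], Document]],  # transformation connectors
    output: Callable[[Document], None],                # output connector
) -> int:
    """Pull documents from a source, transform them, and push them to an index."""
    count = 0
    for doc in source():
        for transform in transforms:
            doc = transform(doc)
        output(doc)
        count += 1
    return count


def visible(index: List[Document], user_tokens: Set[str]) -> List[Document]:
    """Keep only documents that share at least one token with the user."""
    return [doc for doc in index if doc.allow_tokens & user_tokens]


# --- Toy connectors standing in for real repositories and indexes ---
def demo_source() -> Iterable[Document]:
    yield Document("a1", "Quarterly report", {"group:finance"})
    yield Document("a2", "Team lunch menu", {"group:everyone"})


def add_length(doc: Document) -> Document:
    doc.metadata["length"] = str(len(doc.text))
    return doc


def demo_authority(user: str) -> Set[str]:
    # Authority connector: maps a user to the tokens they hold.
    tokens = {"group:everyone"}
    if user == "alice":
        tokens.add("group:finance")
    return tokens


index: List[Document] = []          # an in-memory list plays the target index
crawled = crawl(demo_source, [add_length], index.append)
bob_results = [d.doc_id for d in visible(index, demo_authority("bob"))]
alice_results = [d.doc_id for d in visible(index, demo_authority("alice"))]
```

Filtering at query time against tokens stored with each document is one common way to carry source permissions into a shared index; the slides do not say which mechanism Sensefy uses.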
Media In Context (MICO) Platform
•  MICO provides an integrated platform for
–  Cross media analysis
–  Metadata publishing
–  Metadata querying
•  Sensefy uses MICO as the cross media analysis engine to extract entities and concepts from multimedia
Cross Media Extraction Pipeline
Semantic Content Enrichment
•  Named Entity Recognition
–  People, places, organizations and concepts
•  Entity Linking
–  DBpedia, YAGO, custom enterprise knowledge bases
•  Entity Disambiguation
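The three enrichment steps can be illustrated with a toy linker. This is only a sketch under stated assumptions: the knowledge base is a hand-written dictionary with placeholder `kb:` identifiers (not real DBpedia or YAGO URIs), recognition is a plain label lookup, and disambiguation simply counts context words shared with the surrounding text. Real systems use far richer models; the function names here are hypothetical.

```python
import re
from typing import Optional

# Toy knowledge base. Identifiers and context words are illustrative placeholders.
KNOWLEDGE_BASE = {
    "kb:Cristiano_Ronaldo": {
        "labels": {"ronaldo", "cristiano ronaldo"},
        "context": {"portugal", "juventus", "manchester"},
    },
    "kb:Ronaldo_Nazario": {
        "labels": {"ronaldo"},
        "context": {"brazil", "inter", "1998", "2002"},
    },
}


def tokenize(text: str) -> list:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def link_entity(mention: str, text: str) -> Optional[str]:
    """Return the candidate whose context words overlap most with the text."""
    words = set(tokenize(text))
    candidates = [
        (len(entry["context"] & words), entity_id)
        for entity_id, entry in KNOWLEDGE_BASE.items()
        if mention.lower() in entry["labels"]       # entity linking: label match
    ]
    if not candidates:
        return None                                 # mention is not in the KB
    # Disambiguation: highest overlap wins (ties fall back to identifier order).
    _, best = max(candidates)
    return best


brazil_case = link_entity("Ronaldo", "Ronaldo scored for Brazil in the 2002 final")
portugal_case = link_entity("Ronaldo", "Ronaldo captained Portugal and later joined Juventus")
unknown_case = link_entity("Messi", "Messi scored twice")
```

The same surface form "Ronaldo" resolves to different entities depending on the surrounding words, which is the ambiguity the previous slides describe.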
Entity Search with Suggestions
•  Named Entity Suggestions
•  Ability to query with disambiguated entities
•  Search results with high precision
–  Keyword search for “ronaldo” returns documents about both “Cristiano Ronaldo” and “Ronaldo”
–  Entity search returns only the documents related to the selected entity
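The precision difference can be shown on a tiny in-memory collection. The documents, the `entities` annotations, and the `kb:` identifiers below are invented for illustration, and the two functions are hypothetical stand-ins for a real index query.

```python
# Each document carries the entity identifiers attached during enrichment.
DOCS = [
    {"id": 1, "text": "Ronaldo scores a hat-trick for Portugal",
     "entities": {"kb:Cristiano_Ronaldo"}},
    {"id": 2, "text": "Ronaldo wins the 2002 World Cup with Brazil",
     "entities": {"kb:Ronaldo_Nazario"}},
    {"id": 3, "text": "Transfer rumours round-up", "entities": set()},
]


def keyword_search(term: str) -> list:
    """Match the raw term against document text, ignoring case."""
    return [d["id"] for d in DOCS if term.lower() in d["text"].lower()]


def entity_search(entity_id: str) -> list:
    """Match a disambiguated entity identifier against document annotations."""
    return [d["id"] for d in DOCS if entity_id in d["entities"]]


keyword_hits = keyword_search("ronaldo")
entity_hits = entity_search("kb:Cristiano_Ronaldo")
```

The keyword query matches both footballers, while the entity query matches only the document annotated with the selected entity.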
Entity Search with Suggestions
•  Combine entities and concepts for more complex queries
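One way to read "combine entities and concepts" is as a conjunctive filter over document annotations. The sketch below assumes each indexed document has separate `entities` and `concepts` sets; those field names, the identifiers, and the `combined_search` function are hypothetical and not taken from Sensefy.

```python
from typing import Iterable, List

# Annotated documents; all identifiers are illustrative placeholders.
ANNOTATED_DOCS = [
    {"id": "d1", "entities": {"kb:Cristiano_Ronaldo"}, "concepts": {"kb:Transfer"}},
    {"id": "d2", "entities": {"kb:Cristiano_Ronaldo"}, "concepts": {"kb:Injury"}},
    {"id": "d3", "entities": {"kb:Ronaldo_Nazario"}, "concepts": {"kb:Transfer"}},
]


def combined_search(entities: Iterable[str] = (), concepts: Iterable[str] = ()) -> List[str]:
    """Return documents annotated with every requested entity and concept."""
    wanted_entities = set(entities)
    wanted_concepts = set(concepts)
    return [
        d["id"]
        for d in ANNOTATED_DOCS
        if wanted_entities <= d["entities"] and wanted_concepts <= d["concepts"]
    ]


transfer_news = combined_search(
    entities=["kb:Cristiano_Ronaldo"], concepts=["kb:Transfer"]
)
all_transfers = combined_search(concepts=["kb:Transfer"])
```

Requiring every selected entity and concept (logical AND) narrows results; other combinations such as OR are equally possible and the slide does not specify which one is used.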
DEMO
Q&A
Thank you.

Chalitha Perera | Cross Media Concept and Entity Driven Search for Enterprise
