OSS Enterprise Search EU Tour

Presentation from the Sourcesense - Lucid OSS Enterprise Search EU Tour in London, Amsterdam and Rome, covering "Cool stuff we've done".


  1. OSS Enterprise Search EU Tour
     Spreading Enterprise Search Solutions around Europe
     London – Amsterdam – Rome, 25 – 26 – 28 October
     Upayavira, Maurizio Pillitu, Tommaso Teofili
  2. Summary
     ✓ Sourcesense is involved in many Open Source projects
     ✓ We continuously spot opportunities to integrate them
     ✓ As contributors to Apache UIMA and CMIS, we saw opportunities to integrate them with Lucene/Solr...
     ✓ ...and so we did!
     ✓ Everything is already released as OSS or will be shortly
  3. Solr - UIMA
     Semantic content extraction while indexing
  4. Solr - UIMA
     ✓ A Solr plugin that automatically extracts relevant knowledge from documents while indexing them
     ✓ Recognize and search on a document's language, sentences, keywords, concepts, named entities, ...
     ✓ Extensible architecture provided by Apache UIMA to extract and index more information via configuration
     ✓ Proposed as an Apache Solr patch (issue SOLR-2129)
  5. Solr – UIMA use cases
     ✓ Automatically enable language-specific document search
     ✓ Easy sentence-scoped search
     ✓ Full-text search on concepts, keywords or other named entities (cities, persons, companies)
     ✓ Semantic faceting
     ✓ Plug in other semantic enrichment engines (no further architectural layers required)
  6. CMIS
     ✓ Interoperability between different Enterprise Content Management Systems
     ✓ OASIS specification since May 1, 2010
     ✓ Standard data model
     ✓ SOAP Web Services and RESTful AtomPub bindings
     ✓ Java, JavaScript, PHP, Python, .NET implementations
  7. Why CMIS
     ✓ Allows building applications that work against multiple repositories
     ✓ Decouples Web Services from the Content Management System
     ✓ Avoids yet another custom WS tier
     ✓ Standardized and certified interfaces
     ✓ Platform and language agnostic
  8. Solr CMIS Integration
     ✓ Retrieves documents from multiple CMIS repositories
     ✓ Configurable mapping of cmis:document to solr:document
     ✓ Leverages the Solr Multicore feature
     ✓ Smooth integration with pre-existing data
     ✓ Keeps Solr indexes up to date with CMIS repository changes
  9. “From scratch Deployment”
     ● The need:
       ✓ Reliable, resilient, scalable search solution
       ✓ Ability to roll out new 'rows' at will
     ● The solution:
       ✓ Virtualisation
       ✓ Automation
       ✓ Tools: Capistrano, bash, potentially Puppet/Chef
  10. Sample Solr Setup
      [Diagram. Labels: Load balancer; two groups, each showing co-ordinator, shard1, shard2, shard3]
  11. DeployX Stages
      [Diagram. Stages: Instantiate VM, Configure host, Deploy application, Duplicate data, Add to pool. Annotations: Push button, In parallel]
  12. Demo
      ✓ Extract CMIS documents
      ✓ Index on Solr
      ✓ Enrich with UIMA
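
The "configurable mapping" of cmis:document to solr:document mentioned in the Solr CMIS Integration slide could look roughly like the sketch below. This is not the Sourcesense implementation, only an illustration under stated assumptions: the `cmis:*` property IDs are standard CMIS properties, but the Solr field names, the `FIELD_MAPPING` table and both helper functions are hypothetical. The output is a JSON array of documents, a form that recent Solr versions accept on the JSON update handler.

```python
import json

# Hypothetical mapping from CMIS property IDs to Solr field names.
# The cmis:* IDs are standard CMIS properties; the Solr field names
# on the right are assumptions made for this illustration.
FIELD_MAPPING = {
    "cmis:objectId": "id",
    "cmis:name": "title",
    "cmis:createdBy": "author",
    "cmis:lastModificationDate": "last_modified",
}


def cmis_to_solr(cmis_properties, mapping=FIELD_MAPPING):
    """Return a Solr document dict holding only the mapped CMIS properties."""
    return {
        solr_field: cmis_properties[cmis_id]
        for cmis_id, solr_field in mapping.items()
        if cmis_id in cmis_properties
    }


def solr_update_body(docs):
    """Serialize Solr documents as a JSON array for an update request."""
    return json.dumps(docs)
```

Properties with no entry in the mapping (for example `cmis:objectTypeId` here) are simply dropped, so the mapping table is the only thing to change when a repository exposes different metadata.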
