Searching Alfresco with Solr Cloud 4, Elastic Search and Amazon Cloud Search
 

This presentation demonstrates how the Zaizi Intelligent Search solution can be used to index and search content stored in Alfresco, any other CMIS repository or a file system, using Apache Solr Cloud 4, Elastic Search or Amazon Cloud Search, while still preserving the confidentiality of the documents based on the permissions configured in Alfresco or the other repositories.

Searching Alfresco with Solr Cloud 4, Elastic Search and Amazon Cloud Search: Presentation Transcript

  • 1. Alfresco with Solr Cloud 4, Elastic Search and Amazon Cloud Search. Friday, 21 June 13
  • 2. Agenda
      • Search Background
      • Current Approach and Limitations
      • Our Solution
      • Demo
      • Conclusions and Future
      • Q&A
  • 3. About Zaizi
      • Zaizi is a consultancy and systems integrator specialising in assembling smart content solutions using Alfresco, Ephesoft, Solr and Drupal.
      • Our team have experience building and delivering a wide range of enterprise solutions including document and web content management systems, portals and corporate extranets.
      • We are an Alfresco & Ephesoft certified Platinum Partner and a Red Hat Enterprise Linux Ready Partner.
      • Alfresco Partner of the Year 2012!
  • 4. Scaling Search in Alfresco: Background
      • Up to Alfresco 3.4:
          • Lucene as Search component
          • Indexes were managed within Alfresco context
          • Permissions were checked after Lucene returned all results
  • 6. Scaling Search in Alfresco: Current
      • From Alfresco 4.0:
          • Solr as Search Subsystem
          • Indexes are managed outside Alfresco context
          • It can be scaled outside Alfresco repo
          • No need to have different indexes for every Alfresco cluster node
          • Permissions are checked at query time
          • Results are brought filtered to Alfresco
          • No in-transaction index
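The difference between checking permissions after the engine has returned every hit (up to Alfresco 3.4) and checking them at query time (from Alfresco 4.0) can be sketched with a toy in-memory index. This is an illustration only, not Alfresco's or Solr's implementation: the `Doc` type, the `readers` field and both search functions are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    text: str
    readers: frozenset  # hypothetical field: authorities allowed to read the document


def matches(doc, term):
    """Naive full-text match standing in for the real search engine."""
    return term.lower() in doc.text.lower()


def search_post_filter(index, term, authorities, limit):
    """Engine returns every hit; permissions are checked afterwards."""
    hits = [d for d in index if matches(d, term)]
    allowed = [d for d in hits if d.readers & authorities]
    # Second value: how many hits the engine had to hand back to the repository.
    return [d.doc_id for d in allowed[:limit]], len(hits)


def search_query_time_filter(index, term, authorities, limit):
    """The permission filter is part of the query, so only readable hits come back."""
    hits = [d for d in index if matches(d, term) and bool(d.readers & authorities)]
    page = hits[:limit]
    return [d.doc_id for d in page], len(page)


# Sample data (made up for the example).
INDEX = [
    Doc("a", "Solr scaling guide", frozenset({"GROUP_eng"})),
    Doc("b", "Solr security notes", frozenset({"GROUP_hr"})),
    Doc("c", "Lucene internals", frozenset({"GROUP_eng"})),
    Doc("d", "Solr cloud sharding", frozenset({"alice"})),
]
ALICE = frozenset({"alice", "GROUP_eng"})
```

Both functions return the same documents for the same user; the second one simply never ships unreadable hits back to the repository, which is the point the slide makes with "Results are brought filtered to Alfresco".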
  • 8. Scaling Search in Alfresco: Benefits Vs Limitations
      • Benefits:
          • It scales across the repo
          • It can be set up as a separate instance
              • Own App Server, own LB…
          • It avoids performance issues for large Alfresco installations
      • Limitations:
          • It does not scale between different repos or other providers
          • Other technologies scale better
          • It’s tied to an old Solr version
              • 1.4.1 Vs Current 4.3
          • It does not cover some basic use cases or scenarios
          • No Cloud benefits
  • 9. Scaling Search in Alfresco. Scenario: Several Alfresco instances
      • Current Approach:
          • Each Alfresco has its own Search subsystem
          • They can’t share indexes
      • Implications:
          • Federated search is not an option
          • Results can’t be merged
          • If so, what resultset should be first?
      • Conclusion: Results could be presented to users in different tabs or “manually” merged. Not the best approach.
  • 10. Scaling Search in Alfresco. Scenario: Alfresco + Other data providers
      • Current Approach:
          • Alfresco has its own Search subsystem
          • Other repository may have (or not) its own Search subsystem
      • Implications:
          • Different data providers mean different formats
              • E.g. Filesystem does not support CMIS
          • Alfresco can’t reach external data
      • Conclusion: No way to merge results and present them uniformly to end users.
  • 11. Scaling Search in Alfresco. Scenario: Alfresco + O(TB) data
      • Current Approach:
          • Alfresco has its own Search subsystem
          • All data is in one (or several if cluster) Solr instance
      • Implications:
          • Every Solr node manages the whole index
          • No chance to apply scale techniques for indexing:
              • Sharding
              • Replication
      • Conclusion: Huge servers are required and performance might be compromised.
  • 12. Scaling Search in Alfresco: Alfresco + Solr approach
      • Quite a good architecture
          • Performance issues are solved
          • Different architectures depending on business requirements
      • However…
          • It does not cover some use cases or scenarios
          • It does not leverage Cloud benefits or latest technologies
          • With huge data volume there are other approaches
      • How can we solve limitations and enhance benefits?
  • 13. Scaling Search in Alfresco: The ZAIZI Solution
      • Decouples the Search solution from Alfresco
          • Allows implementing different Search solutions
          • Allows changing the Search solution without changing anything in Alfresco
              • Not even a property!
      • Provides an API to integrate it with Alfresco as search engine
          • Even other repository vendors! E.g. Filesystem, Sharepoint, Documentum, Filenet, Drupal…
      • Preserves security permissions in the results
          • Alfresco permissions are indexed and used during search
      • It’s included in our Semantic solution: Sensefy!
  • 14. Scaling Search in Alfresco: Not possible without… Apache ManifoldCF
      • Open Source Apache Software Foundation Framework
      • Aims to connect source content repositories with target repositories or indexes
      • Based on the use of connectors:
          • Repository Connectors
          • Output Connectors
          • Security connectors
              • Make target repositories enforce the security policies of the sources
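The connector idea on this slide can be sketched as a small pipeline: a repository connector reads documents from a source, and an output connector writes them to a target index. The interfaces below are hypothetical Python stand-ins invented for the illustration; they are not the ManifoldCF API, which is a Java framework with its own connector classes.

```python
from typing import Iterable, Protocol


class RepositoryConnector(Protocol):
    """Hypothetical interface: reads documents from a source repository."""

    def fetch(self) -> Iterable[dict]: ...


class OutputConnector(Protocol):
    """Hypothetical interface: writes documents to a target index."""

    def index(self, doc: dict) -> None: ...


class InMemoryRepository:
    """Stand-in for a source such as an Alfresco instance or a file system."""

    def __init__(self, name, docs):
        self.name = name
        self._docs = list(docs)

    def fetch(self):
        for doc in self._docs:
            # Tag each document with its origin so results stay traceable.
            yield {**doc, "source": self.name}


class InMemoryIndex:
    """Stand-in for a target such as Solr Cloud, Elastic Search or Cloud Search."""

    def __init__(self):
        self.docs = {}

    def index(self, doc):
        self.docs[(doc["source"], doc["id"])] = doc


def run_job(repository: RepositoryConnector, output: OutputConnector) -> int:
    """Push every document from the repository into the output; return the count."""
    count = 0
    for doc in repository.fetch():
        output.index(doc)
        count += 1
    return count
```

Because `run_job` only depends on the two small interfaces, swapping the target index means swapping the output connector and leaving the source side untouched, which is the decoupling the slides describe.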
  • 15. Scaling Search in Alfresco. Apache Manifold: What we have done
      • Repository Connector:
          • Alfresco Repository Connector: New implementation
              • Removing dependency with Alfresco Solr API
      • Output connectors:
          • Cloud Search Output Connector: Design & Development
          • Elastic Search Output Connector: Improvements
          • Solr Cloud Output Connector: Configuration for Alfresco
      • Authority Connector:
          • Alfresco Authority Connector: Design & Development
              • Similar approach to Alfresco Solr
              • ACL reads for Users and Groups in Alfresco
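A rough sketch of how indexed permissions plus an authority lookup can secure search results: each document is stored with the authorities that may or may not read it, and at query time the user's authorities (user name plus groups) are resolved and matched against those lists. The field names `allow` and `deny`, the `AUTHORITIES` table and the sample data are assumptions made for the example, not the actual schema of the connectors mentioned above.

```python
# Hypothetical authority table: what an authority connector would resolve
# for each user (the user itself plus its groups).
AUTHORITIES = {
    "alice": {"alice", "GROUP_eng", "GROUP_EVERYONE"},
    "bob": {"bob", "GROUP_EVERYONE"},
}

# Documents as they might look in the index, with ACL tokens stored alongside
# the content (field names are illustrative).
INDEX = [
    {"id": "spec", "text": "Search design spec", "allow": {"GROUP_eng"}, "deny": set()},
    {"id": "news", "text": "Search team news", "allow": {"GROUP_EVERYONE"}, "deny": set()},
    {"id": "review", "text": "Search review of bob", "allow": {"GROUP_EVERYONE"}, "deny": {"bob"}},
]


def user_tokens(user):
    """Resolve the access tokens for a user; unknown users get none."""
    return AUTHORITIES.get(user, set())


def visible(doc, tokens):
    """A document is visible if some token is allowed and no token is denied."""
    return bool(doc["allow"] & tokens) and not (doc["deny"] & tokens)


def secure_search(index, term, user):
    """Full-text match combined with the permission check in a single pass."""
    tokens = user_tokens(user)
    term = term.lower()
    return [d["id"] for d in index if term in d["text"].lower() and visible(d, tokens)]
```

For example, `secure_search(INDEX, "search", "bob")` leaves out both the engineering-only document and the one that explicitly denies `bob`.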
  • 17. Scaling Search in Alfresco. Scenario: Several Alfresco instances
      • Zaizi Approach:
          • Our solution acts as the search box
          • Which manages a single index
      • Implications:
          • All documents are driven to the same index
          • Users can select results from either all Alfresco instances or a subset
      • Conclusion: Search can be done across repositories since search and index are managed by Manifold. The index could be Elastic Search, Solr Cloud, Amazon Cloud services or whichever best suits customer requirements.
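With every instance feeding one index, selecting "all Alfresco instances or a subset" becomes a filter on a source field rather than a merge of separate result sets. A minimal sketch, assuming each indexed document carries a hypothetical `source` field naming the instance it came from (the instance names and documents are made up):

```python
# One shared index holding documents from several (hypothetical) instances.
SINGLE_INDEX = [
    {"source": "alfresco-london", "id": "1", "text": "Solr Cloud rollout plan"},
    {"source": "alfresco-madrid", "id": "1", "text": "Solr upgrade checklist"},
    {"source": "alfresco-madrid", "id": "2", "text": "Holiday calendar"},
    {"source": "alfresco-boston", "id": "7", "text": "Solr sizing notes"},
]


def search(index, term, sources=None):
    """Return (source, id) pairs for matching documents.

    sources=None searches every instance; otherwise only the given subset.
    """
    term = term.lower()
    return [
        (d["source"], d["id"])
        for d in index
        if term in d["text"].lower() and (sources is None or d["source"] in sources)
    ]
```

Calling `search(SINGLE_INDEX, "solr")` covers all three instances, while `search(SINGLE_INDEX, "solr", {"alfresco-madrid"})` restricts the same query to one of them.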
  • 19. Scaling Search in Alfresco. Scenario: Alfresco + Other data providers
      • Zaizi Approach:
          • Both Alfresco and other repositories share the Search subsystem (Manifold)
      • Implications:
          • Results from Alfresco and other providers will have the same format in our Solution
              • They will speak ‘our’ language
              • E.g. Filesystem does not support CMIS
          • Alfresco reaches external data when communicating with our solution
      • Conclusion: Results are present and accessible between data providers.
  • 21. Scaling Search in Alfresco. Scenario: Alfresco + O(TB) data
      • Zaizi Approach:
          • Alfresco uses our solution
          • Data is indexed in the search solution which suits best:
              • Amazon Cloud, Solr Cloud, Elastic Search…
      • Implications:
          • Cloud Search solution manages the index
          • Indexing techniques can be applied according to use cases
              • Sharding, Replication
      • Conclusion: A search strategy can be adopted and easily implemented with the search solution which fits best.
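Sharding and replication, the two techniques named on this slide, can be illustrated with a toy index: each document is routed to exactly one shard by hashing its id, every shard is copied to several replicas, and a query fans out to one replica per shard and merges the partial results. This is a conceptual sketch only; the class and function names are invented, and real engines such as Solr Cloud or Elastic Search handle routing, replica selection and failover themselves.

```python
import zlib


def shard_for(doc_id, num_shards):
    """Route a document to a shard with a stable hash of its id."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


class ShardedIndex:
    def __init__(self, num_shards, replicas):
        # shards[s][r] is replica r of shard s, mapping doc_id -> text.
        self.shards = [[{} for _ in range(replicas)] for _ in range(num_shards)]
        self._round = 0  # used to rotate queries across replicas

    def add(self, doc_id, text):
        """Sharding: one shard owns the document. Replication: all its copies get it."""
        shard = self.shards[shard_for(doc_id, len(self.shards))]
        for replica in shard:
            replica[doc_id] = text

    def search(self, term):
        """Query one replica of every shard and merge the partial results."""
        term = term.lower()
        results = []
        for replicas in self.shards:
            replica = replicas[self._round % len(replicas)]
            results.extend(d for d, text in replica.items() if term in text.lower())
        self._round += 1
        return sorted(results)
```

Each shard holds only part of the index, so no single node has to manage the whole of it, and the replicas let read traffic be spread across copies.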
  • 22. Scaling Search in Alfresco. Apache Manifold: Other benefits
      • Can extract, index and map information from any other sources
          • Apache Stanbol, RedLink, any other data enricher
      • Our solution will gather everything in one place
          • Documents, entities…
      • Permissions are checked just once
          • Everything is in the same place, even user authorization capabilities
      • Performance and scalability are improved
          • Faceted search and other search capabilities are combined with this permission feature
  • 23. DEMO
  • 24. Scaling Search in Alfresco: Conclusions
      • Zaizi solution allows searching and indexing in the most popular Cloud Search solutions
          • Other Search solutions can be integrated as well
      • Zaizi solution allows retrieving information from the most popular repositories
          • Other Data providers can be integrated too
      • It solves plenty of current issues related to search and indexing in Alfresco
          • Can be used outside Alfresco or even with Alfresco and any other data repository
      • Zaizi solution manages permissions and security from the most popular repositories and the latest Cloud search technologies
      • Fully supported by us!
  • 26. Scaling Search in Alfresco: Next Steps
      • Search engines Benchmarking
          • Alfresco search subsystem
      • Powerful User Interface
      • Integration with Sensefy
          • Semantic search and enrichment
      • New repository connectors and output connectors
  • 27. Scaling Search in Alfresco: Upcoming webinars
      • Dynamic Semantic Web Publishing using Crafter and Alfresco
          • When: Thursday, June 27, 2013 2:00 pm
      • Adaptive Content Centric Case Management for Financial Services using Alfresco Workdesk – Process deals, loans, claims and cases faster and more efficiently
          • When: Thursday, July 4, 2013 2:00 pm
  • 28. Scaling Search in Alfresco: Contact
      • If you want to know more, just contact us!
      • ZAIZI
      • enquiries@zaizi.com
      • +44 020 3582 8330
  • 29. Thank You!