Super size your search
 


Using Apache ManifoldCF with Alfresco to provide a cloud-scalable search solution using Apache Solr, Elasticsearch or Amazon CloudSearch

    Presentation Transcript

    • Super Size Your Search 14th November 2013 Fran Alvarez (Zaizi) #SummitNow
    • Agenda
      • Myself & My company
      • Background
      • Our Solution
      • Scenario
      • Demo
      • Conclusions
    • About me
      • Director of Zaizi Iberia and Chief Architect
      • Alfresco Certified Engineer
      • Responsible for large Alfresco architectures
      • Semantic Consultant for Sensefy
      • Alfresco Meetups Organizer
    • We are an Open Source Development Company that helps people work together more effectively
      • HQ: London (UK)
      • Singapore
      • Seville (Spain)
      • Colombo (Sri Lanka)
    • What we offer
      • Open Source System Integrator
      • Specialist in ECM
      • Platinum Alfresco partner
      • Best Systems Integrator Partner EMEA 2012
      • Best Systems Integrator Partner EMEA 2013
      • Million $ Club in 2013
      • Support 24/7
    • Background: Let’s put a bit of context
    • Those Old Days…
      • Only Lucene in Alfresco 3.4
      • Indexes were managed within Alfresco context
      • Permissions were checked after Lucene returned all results
    • Present
      • Solr as Search Subsystem
      • Indexes are managed outside Alfresco context
      • Permissions are checked at query time
      • No in-transaction index
    • Alfresco 4 is… Common Enemies
      • Find a single document
      • Return large data sets
      • Filter by permissions
      • Be fast!
      • “Sometimes one superhero is not enough”
    • Alfresco + Solr Approach
      • Quite a good architecture
      • Takes care of both performance and usability
      • Flexibility in deployment and installations
      • However… sometimes we just need to use something else
    • Future: Don’t freak out dude! We can arrange something
    • Our solution
      • Use Apache ManifoldCF
      • Decoupled from Alfresco
      • Can be integrated with either Alfresco or any other repository vendor
      • Preserve security and permissions within results
      • API to manage Manifold Services
      • API for searching, decoupled from the chosen search engine
      • Simple Bundled UI
      • Lots of Manifold Customization
      • It’s included in our Semantic solution: Sensefy!
    • Apache ManifoldCF
      • Open Source Apache Software Foundation project
      • Get content from repos
      • Push content to search services
      • Based on “Connector” and “Job” concept
      • Crawling model (add, change, delete)
      • And respect permissions, bitch!
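
The crawling model named on this slide (add, change, delete) can be illustrated with a minimal sketch. This is not ManifoldCF's API (ManifoldCF is a Java framework with its own connector interfaces); the function name `diff_crawl` and the idea of comparing two `{document_id: version}` snapshots are assumptions made purely for illustration.

```python
from typing import Dict, List, Tuple


def diff_crawl(previous: Dict[str, str], current: Dict[str, str]) -> List[Tuple[str, str]]:
    """Compare two {document_id: version} snapshots of a repository.

    Returns a sorted list of (action, document_id) pairs, where action is
    "add", "change" or "delete". Unchanged documents produce no action.
    Hypothetical helper, not part of any real crawler API.
    """
    actions: List[Tuple[str, str]] = []
    for doc_id, version in current.items():
        if doc_id not in previous:
            # Document was not seen in the previous crawl.
            actions.append(("add", doc_id))
        elif previous[doc_id] != version:
            # Document exists but its version string differs.
            actions.append(("change", doc_id))
    for doc_id in previous:
        if doc_id not in current:
            # Document disappeared from the repository.
            actions.append(("delete", doc_id))
    return sorted(actions)


if __name__ == "__main__":
    last_crawl = {"a": "v1", "b": "v1", "c": "v1"}
    this_crawl = {"a": "v1", "b": "v2", "d": "v1"}
    for action, doc_id in diff_crawl(last_crawl, this_crawl):
        print(action, doc_id)
```

Only the resulting actions would need to be pushed to the search server, which is the point of an incremental crawl: unchanged documents are never re-indexed.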
    • ManifoldCF Overview (diagram labels): Repository 1 to 4; Apache ManifoldCF; Search Server 1 to 3; Authority Service; Authority 1 and 2; user-specific search results
    • ManifoldCF – Architecture (diagram built up over several slides)
      • Repository Connector: query to retrieve contents (Repository)
      • Output Connector: metadata mapping, content ingestion (Search Server)
      • Authority Connector: retrieve content ACEs (ACLs)
      • Job
      • verbal description
      • crawling model
      • scheduling
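
The interplay between indexed ACLs and the authority lookup can be sketched as follows. This is a simplified illustration under stated assumptions, not the ManifoldCF or Alfresco implementation: the in-memory `INDEX`, the `AUTHORITY` table, the `allow`/`deny` field names and the `search` function are all hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical index: each document stores the access tokens captured at crawl time.
INDEX: List[Dict] = [
    {"id": "doc-1", "title": "Budget 2013", "allow": {"GROUP_FINANCE"}, "deny": set()},
    {"id": "doc-2", "title": "Budget summary", "allow": {"GROUP_EVERYONE"}, "deny": {"guest"}},
    {"id": "doc-3", "title": "Holiday calendar", "allow": {"GROUP_EVERYONE"}, "deny": set()},
]

# Hypothetical authority data: user name -> access tokens (the user plus their groups).
AUTHORITY: Dict[str, Set[str]] = {
    "alice": {"alice", "GROUP_FINANCE", "GROUP_EVERYONE"},
    "bob": {"bob", "GROUP_EVERYONE"},
    "guest": {"guest", "GROUP_EVERYONE"},
}


def user_tokens(username: str) -> Set[str]:
    """Stand-in for an authority lookup; unknown users get no tokens."""
    return AUTHORITY.get(username, set())


def search(term: str, username: str) -> List[str]:
    """Return ids of documents matching `term` that `username` may see.

    A document is visible when the user holds at least one allow token
    and none of the deny tokens. Deny wins over allow.
    """
    tokens = user_tokens(username)
    hits: List[str] = []
    for doc in INDEX:
        if term.lower() not in doc["title"].lower():
            continue
        if doc["deny"] & tokens:
            continue
        if doc["allow"] & tokens:
            hits.append(doc["id"])
    return hits


if __name__ == "__main__":
    for user in ("alice", "bob", "guest"):
        print(user, search("budget", user))
```

Because the tokens are stored next to each document, the permission check happens inside the query itself instead of as a post-filter over all results.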
    • Our ManifoldCF Contribution
      • Alfresco Repository Connector: New implementation
      • Amazon Cloud Search Output Connector
      • Alfresco Authority Connector: Design & Development
    • Some of our most famous villains
    • Several Alfresco instances: Current
      • Alfresco instances don’t share indexes
      • Indexes can’t be merged
      • Can’t have federated search
      • No good approach for presenting results to users
    • Several Alfresco instances: Our solution
      • One index to rule them all
      • Data origin is irrelevant (or not if we don’t)
      • Single search across repositories
      • You choose your search engine!
    • Alfresco + Other data providers: Current
      • Alfresco Search subsystem != Other provider Search services
      • Alfresco can’t reach external data
      • No way to merge results uniformly for end users
    • Alfresco + Other data providers: Our solution
      • Search engine is shared
      • All of them speak ‘our language’
      • Alfresco can reach external data through …
      • Results are present and accessible across data providers
    • Alfresco + O(TB) data: Current
      • Alfresco Search subsystem
      • Single or clustered Solr
      • Every Solr instance manages its own index
      • No chance to apply scaling techniques
      • Huge servers are required and performance might be compromised
    • Alfresco + O(TB) data: Our solution
      • Alfresco uses our index
      • Indexing techniques can be applied according to use cases
      • Sharding, Replication…
      • Search strategy can be adapted to the most suitable search solution
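
As a rough illustration of the sharding idea mentioned above, the sketch below routes documents to shards by hashing their ids. Real search engines such as Solr and Elasticsearch ship their own routing; the functions `shard_for` and `route`, and the choice of MD5 as a stable hash, are assumptions made only for this example.

```python
import hashlib
from typing import Dict, Iterable, List


def shard_for(doc_id: str, num_shards: int) -> int:
    """Map a document id to a shard number in [0, num_shards).

    A stable hash (MD5 here, used only for distribution, not security)
    keeps the mapping identical across processes and restarts.
    """
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    digest = hashlib.md5(doc_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def route(doc_ids: Iterable[str], num_shards: int) -> Dict[int, List[str]]:
    """Group document ids by the shard they would be indexed on."""
    shards: Dict[int, List[str]] = {i: [] for i in range(num_shards)}
    for doc_id in doc_ids:
        shards[shard_for(doc_id, num_shards)].append(doc_id)
    return shards


if __name__ == "__main__":
    ids = [f"alfresco-node-{n}" for n in range(10)]
    for shard, members in route(ids, 3).items():
        print(shard, members)
```

Each shard then holds only a slice of the full index, so no single server has to carry all of the data; replication would add copies of each shard on top of this.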
    • Other benefits
      • Extract, index and map information from any other sources
      • Putting them together in a single index
      • Permissions are checked just once
      • Search capabilities: facets, highlighting…
      • Diagram labels: Red Link, Apache ManifoldCF, Search Server, Authority Service, Alfresco, Alfresco Permissions
    • Demo
    • Demo: Architecture
    • Demo: Who are these guys?
      • Gareth Bale, footballer: Real Madrid’s latest star
      • Christian Bale, actor: Christopher Nolan’s Batman
    • Conclusions
      • Searching & Indexing in most popular Cloud Search solutions
      • Retrieving information from most popular repositories and data providers altogether
      • Manage permission and security for data
      • Fully supported by us!
    • What’s coming: How can we improve it, dude?
      • Powerful UI
      • New connectors
      • Large data volume benchmarking
      • Share integration
    • We are not Batman, but we can be your Superhero
      • Zaizi Ltd.
      • enquiries@zaizi.com
      • falvarez@zaizi.com
      • (+44) 20-3582-8330
      • Fran Álvarez (+34) 666-424-364
    • Thank you!
      • Would you like to help us with this one?