
How to Gain Greater Business Intelligence from Lucene/Solr



Presented by Patrick Beaucamp | Bpm-Conseil

Vanilla, an open source business intelligence application, offers unique features such as report indexing through an embedded Lucene integration. Using Vanilla and Lucene, developers can manage both report indexing and external document indexing, which saves end users time when they search for specific keywords such as "product code" or "customer code." Vanilla can build upon an existing Solr/Lucene installation that takes care of all the indexing processes while Vanilla takes care of report and dashboard creation. During this presentation, attendees will learn how we moved from an embedded Lucene API to a Solr/Lucene platform, and the technical and business benefits of this architecture in terms of clustering, caching and access mode.

Published in: Technology


  1. Patrick Beaucamp, Founder of the Vanilla Project
     Mail: Patrick.beaucamp@bpm-conseil.com
     How to Gain Greater Business Intelligence with Vanilla from Solr/Lucene
     LuceneRevolution, Boston
  2. Presentation Agenda
     Vanilla powered by Lucene
     - Report indexing, search interface
     - External document management
     - Evolution & constraints
     Step to Solr/Lucene adoption
     - Indexing, storage, search
     - Embedded Solr/Lucene
     - External Solr/Lucene platform
     Key benefits for Vanilla powered by Solr/Lucene
     - Cluster architecture
     - Cache mechanism
     - Support for enhanced search language
  3. Some Vanilla Features
     - Flash maps and charts: reports, cubes and dashboards
     - Vanilla apps: Android and iPhone
  4. Vanilla Powered by Lucene (1/6)
     Vanilla is a full business intelligence platform that provides:
     - Reporting, OLAP, dashboards, KPIs, map visualisation
     - ETL, workflow, document management, search engine
  5. Vanilla Powered by Lucene (2/6): Report Indexing
     - Search engine is Apache Lucene (summer 2010)
     - External documents and Vanilla reports are indexed
     - Different indexing strategies for documents:
       – No indexing
       – Real-time indexing
       – Late indexing
     Two modules manage the indexing strategy:
     - Enterprise Services to set document properties
     - Norparena to manage indexing
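
The three indexing strategies listed on this slide could be modelled roughly as follows. This is a minimal sketch and not Vanilla's actual code: `IndexingStrategy`, `DocumentIndexer` and its methods are hypothetical names, and a plain dict stands in for the Lucene index.

```python
from enum import Enum


class IndexingStrategy(Enum):
    NONE = "none"            # the document is never indexed
    REAL_TIME = "real_time"  # indexed as soon as it is submitted
    LATE = "late"            # queued, then indexed in a later batch run


class DocumentIndexer:
    """Hypothetical indexer; a dict stands in for the Lucene index."""

    def __init__(self):
        self.index = {}    # doc_id -> text
        self.pending = []  # (doc_id, text) pairs waiting for a late run

    def submit(self, doc_id, text, strategy):
        if strategy is IndexingStrategy.REAL_TIME:
            self.index[doc_id] = text
        elif strategy is IndexingStrategy.LATE:
            self.pending.append((doc_id, text))
        # IndexingStrategy.NONE: nothing is stored

    def run_late_indexing(self):
        """Index everything queued so far and return how many documents were processed."""
        count = len(self.pending)
        for doc_id, text in self.pending:
            self.index[doc_id] = text
        self.pending.clear()
        return count
```

In this sketch the strategy is passed per document, which mirrors the slide's idea of setting it as a document property.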
  6. Vanilla Powered by Lucene (3/6): Search Interface
     - Search interface available from the Vanilla Portal
     - Search runs against the Lucene index (inside Vanilla)
     - Search results are combined with security on documents:
       – List contains all documents
       – Documents are ordered by popularity
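
One way to read "combined with security" and "ordered by popularity" is sketched below. It assumes that hits the user is not allowed to read are dropped and that popularity is a simple counter; the `Hit` type, its fields and `visible_hits` are hypothetical names, not part of Vanilla or Lucene.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    doc_id: str
    allowed_groups: frozenset  # groups permitted to read the document
    popularity: int            # assumed counter, e.g. how often the document was opened


def visible_hits(hits, user_groups):
    """Keep only the hits the user may read, most popular first."""
    groups = set(user_groups)
    readable = [h for h in hits if h.allowed_groups & groups]
    return sorted(readable, key=lambda h: h.popularity, reverse=True)
```

Filtering after the index lookup keeps the security rules outside the search engine, which matches the embedded setup described on this slide.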
  7. Vanilla Powered by Lucene (4/6): External Document Management
     - Various document formats are available (Lucene)
     - Additional properties can be set on documents, for later use in search criteria
     - Check-in / check-out on documents for versioning
     - Search is run on the latest document version
  8. Vanilla Powered by Lucene (5/6): Evolution and Constraints
     - No clustering available for the search engine (embedded API), as opposed to Vanilla Report Services
     - Limitations in language and keywords (internal search)
     - No cache to manage search result sets, as opposed to Vanilla datasets, powered by Memcached
     - Requests from customers to be compliant with enterprise search engines
     → Need to set up an external search architecture
  9. Vanilla Powered by Lucene (6/6): Embedded Lucene API inside the Vanilla Platform - Video
  10. Step to Solr/Lucene Adoption (1/9)
      Solr/Lucene is the natural evolution of any embedded Lucene platform
      Solr version: 3.5
      Indexing: a Vanilla Lucene index can be transferred to and read by Solr/Lucene (a Solr/Lucene index is not usable inside the Vanilla Platform)
      Storage: Vanilla search indexes can be managed by a Solr/Lucene platform
      Search: the search language is compliant
  11. Step to Solr/Lucene Adoption (2/9): Embedded Solr/Lucene inside the Vanilla Platform
      - No need for any change in Vanilla code: use of the SolrJ API
      - Immediately provides additional features such as new keywords
      - Potential upgrade to Solr/Lucene Enterprise
  12. Step to Solr/Lucene Adoption (3/9): From Embedded Lucene to Embedded Solr/Lucene inside the Vanilla Platform
  13. Step to Solr/Lucene Adoption (4/9): Embedded Solr/Lucene inside the Vanilla Platform - Video
  14. Step to Solr/Lucene Adoption (5/9): Solr/Lucene Platform with a Vanilla Platform
      Changes are needed in Vanilla code to separate the document management, indexing and search APIs → 10 man-days workload
      Document Management API
      - Easy to move to CMIS compliance
      Indexing & Search API
      - Solr/Lucene oriented and compliant, but now open to any other search platform
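
The split described on this slide, with document management on one side and indexing and search on the other, can be illustrated with a small interface. This is a sketch under assumptions, not Vanilla's API: `SearchBackend`, `InMemoryBackend` and `DocumentManager` are hypothetical names, and the toy backend only does substring matching.

```python
from abc import ABC, abstractmethod


class SearchBackend(ABC):
    """Hypothetical indexing & search API, independent of any one engine."""

    @abstractmethod
    def index(self, doc_id, text):
        """Add or replace a document in the index."""

    @abstractmethod
    def search(self, term):
        """Return the ids of documents matching the term."""


class InMemoryBackend(SearchBackend):
    """Toy backend; a Solr-backed class would implement the same two methods."""

    def __init__(self):
        self._docs = {}

    def index(self, doc_id, text):
        self._docs[doc_id] = text.lower()

    def search(self, term):
        needle = term.lower()
        return sorted(d for d, text in self._docs.items() if needle in text)


class DocumentManager:
    """Hypothetical document management layer; it only sees the SearchBackend interface."""

    def __init__(self, backend):
        self._backend = backend

    def add_document(self, doc_id, text):
        self._backend.index(doc_id, text)

    def find(self, term):
        return self._backend.search(term)
```

Because `DocumentManager` depends only on the abstract interface, swapping the search platform means writing one new backend class rather than touching document management code.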
  15. Step to Solr/Lucene Adoption (6/9): Coding Before
      Example of code (API) before the split
      - Direct use of the Lucene API
      - Parse the document content using Apache Tika
      - Generate Lucene queries
  16. Step to Solr/Lucene Adoption (7/9): Coding After
      Example of code (API) after the split
      - Easy-to-use SolrJ API
      - Distributed search
      - Indexing with automatic parsing (using Apache Tika)
  17. Step to Solr/Lucene Adoption (8/9): Solr/Lucene Platform with Vanilla Platform - Screenshot
  18. Step to Solr/Lucene Adoption (9/9): Solr/Lucene Platform with Vanilla Platform - Video
  19. Key Benefits for Vanilla Powered by Solr/Lucene (1/4)
      Clustering search architecture, outside of Vanilla
      Search results clustering implementation (CarrotClusteringEngine) is based on the Carrot2 framework.
  20. Key Benefits for Vanilla Powered by Solr/Lucene (2/4)
      Additional query language to perform search
      Solr uses the Lucene search library and extends it!
      - A real data schema, with numeric types, dynamic fields, unique keys
      - Powerful extensions to the Lucene query language
      - Faceted search and filtering
      - Geospatial search
      - Advanced, configurable text analysis
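
To make faceted search and filtering concrete, the sketch below builds a Solr select URL using the standard `q`, `fq`, `facet`, `facet.field`, `rows` and `wt` request parameters. The helper name `build_select_url`, the base URL and the field names (`customer_code`, `doc_type`) are assumptions for illustration; a real schema would define its own fields.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, filters=(), facet_fields=(), rows=10):
    """Build a Solr /select URL with optional filter queries and field facets."""
    params = [("q", query), ("rows", str(rows)), ("wt", "json")]
    # Each fq narrows the result set without affecting relevance scoring.
    params += [("fq", f) for f in filters]
    if facet_fields:
        params.append(("facet", "true"))
        params += [("facet.field", f) for f in facet_fields]
    return f"{base_url.rstrip('/')}/select?{urlencode(params)}"


# Hypothetical host and fields: search one customer code, restricted to reports,
# and ask for counts per document type.
url = build_select_url(
    "http://localhost:8983/solr",
    "customer_code:C042",
    filters=["doc_type:report"],
    facet_fields=["doc_type"],
)
```

The function only builds the URL; fetching it requires a running Solr instance.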
  21. Key Benefits for Vanilla Powered by Solr/Lucene (3/4)
      New methods to manage result sets (binary, XML, JSON)
      Solr is an enterprise search server with a REST-like API.
      You put documents in it (called "indexing") via XML, JSON or binary over HTTP.
      You query it via HTTP GET and receive XML, JSON, or binary results.
      - Advanced full-text search capabilities
      - Optimized for high-volume web traffic
      - Standards-based open interfaces: XML, JSON and HTTP
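
Indexing a document over HTTP can be sketched as below: the function builds, but does not send, a JSON update request. The helper name `build_add_request`, the host and the document fields are assumptions; the default path `/update/json` is the JSON update handler of Solr 3.x, while later versions also accept JSON on `/update`, so the path is left as a parameter.

```python
import json
from urllib.request import Request


def build_add_request(base_url, doc, update_path="/update/json"):
    """Build (but do not send) an HTTP POST that adds one document and commits."""
    body = json.dumps({"add": {"doc": doc}}).encode("utf-8")
    url = f"{base_url.rstrip('/')}{update_path}?commit=true"
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical document; field names depend on the Solr schema in use.
request = build_add_request(
    "http://localhost:8983/solr",
    {"id": "report-42", "title": "Monthly sales", "customer_code": "C042"},
)
```

Passing `request` to `urllib.request.urlopen` would send it to a running Solr instance.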
  22. Key Benefits for Vanilla Powered by Solr/Lucene (4/4): Cache Mechanism
      Solr caches are associated with an Index Searcher
      Three cache implementations:
      - solr.LRUCache (LRU = Least Recently Used, in memory)
      - solr.FastLRUCache
      - solr.LFUCache (Least Frequently Used)
      Many configuration parameters for cache optimisation
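
The least-recently-used policy named on this slide can be illustrated with a few lines of Python. This is a sketch of the eviction policy only, not Solr's `solr.LRUCache` implementation; the class and its `size` parameter are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-size cache that evicts the least recently used entry when full."""

    def __init__(self, size):
        if size < 1:
            raise ValueError("size must be at least 1")
        self.size = size
        self._entries = OrderedDict()  # oldest use first, most recent use last

    def get(self, key, default=None):
        if key not in self._entries:
            return default
        self._entries.move_to_end(key)  # a read counts as a use
        return self._entries[key]

    def put(self, key, value):
        self._entries[key] = value
        self._entries.move_to_end(key)
        if len(self._entries) > self.size:
            self._entries.popitem(last=False)  # drop the least recently used entry

    def __len__(self):
        return len(self._entries)
```

For a search result cache, the key would typically be the query and the value its result set, so repeated queries skip the index lookup until they are evicted.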
  23. Next Steps
      Upgrade to Solr 4.0
      New features for document cycle management
      Roadmap for better internationalisation:
      - 10 languages available (not Japanese)
      - Search translation management
  24. Documentation and tutorials are available on our website. Thanks for your attention.