SlideShare a Scribd company logo
Sören Schneider, Alkacon Software 
WORKSHOP TRACK Using the SOLR Collector 
27.11.2014
1.Brief Introduction Into Solr 
2.Common Mistakes Using OpenCms & Solr 
3.Using the Solr Collector (DEMO) 
4.Spellchecking in OpenCms Using Solr 
Agenda
●Solr is a very versatile and powerfool search engine that supports various features 
●This functionality comes with the price of increased complexity to handle Solr 
●Many customizations available 
●All fields composing a single document are typed 
Brief Solr Introduction
●Data structures of Solr‘s documents are defined the file schema.xml 
●Performing changes on this file requires reindexing 
●Dynamic Fields cope with that limitiation 
●Can be used without being explicitely defined in the schema using wildcards 
Defining Solr‘s Data Structure
Solr: Indexing Content 
a: date 
b: text 
c: string 
Solr processing (through analyzers, filters and tokenizers) 
a: date 
b: string 
c: string
●„Direct“ usage of OpenCms & Solr requires a basic understanding of Solr 
●Use proper datatypes in respect of individual usecase, gain knowledge of filters 
●Know the query syntax (for appropriate datatypes) 
●Most common mistakes of OpenCms users result in insufficient knowledge of Solr basics 
OpenCms & Solr
1.Using inproper types 
●„text“ vs „string“ 
●Formulating correct queries 
2.Issues regarding mapping OpenCms <->Solr 
3.(Encoding Problems) 
Common Mistakes Using Solr & OpenCms
●String 
●Stores its content as exact string 
●No tokenization / processing is being performed 
●Useful when searching for exact value 
●Text 
●Tokenization and processing is performed 
●Useful when a part of the content is searched for 
„text“ vs „string“
●OpenCms‘s copies the entire XML content into a single(!) locale-aware Solr field of type „text“ for each locale 
●Particular information of a resource is made searchable in OpenCms using two approaches 
●Automatic mapping of properties to Solr fields 
●Manual definintion of mappings 
Making Your Content Searchable
Indexing Content w/o Searchsettings 
Solr processing (through analyzers, filters and tokenizers) 
x: text 
a: date 
b: string 
c: string
Indexing Content with Searchsettings 
a: date 
b: text 
c: string 
Solr processing (through analyzers, filters and tokenizers) 
a: date 
b: string 
c: string
●Mapping happens in the scheme of the appropriate resource type 
●Excerpt 
Solr – OpenCms Interaction: Mapping 
<xsd:schema 
… 
<xsd:annotation 
<xsd:appinfo 
<searchsettings> 
<searchsetting element= "City" searchcontent="true"> 
<solrfield targetfield= "city" sourcefield="_s" 
</searchsetting> … 
Resource type element name
Element Mapping Attributes 
Attribute Name 
Effect on the Solr Field 
targetfield* 
The resulting name 
locale 
Write content only for specific locale 
sourcefield 
Defines the resulting type 
copyfields 
Copies the value to a different field 
default 
Sets a default value 
boost 
Sets a boost for the field
●Users complain about problems regarding certain Characters – mostly German Umlauts – in Solr results 
●In nearly all cases the sole problem lies within the integration of Solr to the servlet cotainer which is not happening in UTF-8 
●Extra note for Tomcat users: Please check whether you appended the required attributes all appropriate „<Connector>“s ;-) 
Using UTF-8 in Solr
●Live Demo 
15 
Live Demo 
Demo 
Demo 
Demo 
Demo 
デモ
WYSIWYG Spellchecker
●The Spellchecker has been realized using Solr 
●Solr already provides a flexible component named „SpellCheckComponent“ 
●This component supports inline spellchecking of Solr queries 
●Source for suggestions can be specified by Solr fields or text files 
WYSIWIG Spellchecker
●The „SpellCheckComponent“ is widely used to implement the „Did you mean?“-feature known by popular search engines 
●The component is 
●Reliable and mature 
●Fast 
●Plus, Solr is already available in OpenCms 
Why using Solr as Spellchecker
●If both usecases use the same component, how do the implementations actually differ? 
●„Did you mean?“ builds source of suggested words based on the entire data, the search runs on. Usually only a single hit is returned. 
●The WYSIWYG spellchecker builds ist source of suggestions based on a data that solely contains the dictionary for a single language 
Differences Between Usecases in Regards of Implementation
●Spellchecking has been realized using another Solr core that resides in WEB-INF/spellcheck 
●As the only purpose of this core is to contain spellcheck information, the schema.xml file is as simple as it gets 
●Why using another Solr core instead of the default core that‘s used by OpenCms? 
●Dictionaries are stored as one Solr index per language 
How to model this scenario using Solr?
●Sadly, the spellchecking interfaces of tinyMCE and Solr are incompatible 
Problems regarding tinyMCE and Solr 
Solr 
tinyMCE
Comparison Spellcheck Responses 
{ 
"id":"c0", 
"result":{„hsoue":[„house„, „has“]} 
} 
"spellcheck":{ "suggestions":[ „hsoue",{"numFound":5, "startOffset":0, "endOffset":4, "origFreq":0, "suggestion":[{"word":„house","freq": 53}, {"word":"has","freq":271}, … ]}, "correctlySpelled",false, "collation","hsue„ ]},
●A new component had to be realized in OpenCms that basically 
●Accepts spellcheck requests from tinyMCE 
●Handles tinyMCE and Solr communication and message conversion 
●Checks and (re-)builds spellcheck indices 
●The appropriate code is found in org.opencms.search.solr.spellcheck 
Glueing the Pieces together
●Dictionaries can be edited easily in OpenCms 
●Those indices are automatically filled by flat text files, one word per line 
●Support for multiple languages 
●To access the dicts, have a look at the directory org.opencms.workplace.spellcheck/resources/ 
Spellchecker in OpenCms
●Adding a new language 
1.Create new Solr field in schema.xml 
2.Create new dictionary file inside VFS 
3.Restart OpenCms 
●Adding words to the custom dict 
Extending the Spellchecker
●Any Questions? 
26 
Any Questions? 
Fragen? 
Questions ? 
Questiones? 
¿Preguntas? 
質問
Sören Schneider 
Alkacon Software GmbH 
http://www.alkacon.com 
http://www.opencms.org 
Thank you very much for your attention! 
27

More Related Content

What's hot

Managing data and operation distribution in MongoDB
Managing data and operation distribution in MongoDBManaging data and operation distribution in MongoDB
Managing data and operation distribution in MongoDB
Antonios Giannopoulos
 
Php classes in mumbai
Php classes in mumbaiPhp classes in mumbai
Php classes in mumbai
aadi Surve
 
Sharding in MongoDB 4.2 #what_is_new
 Sharding in MongoDB 4.2 #what_is_new Sharding in MongoDB 4.2 #what_is_new
Sharding in MongoDB 4.2 #what_is_new
Antonios Giannopoulos
 
Umleitung: a tiny mochiweb/CouchDB app
Umleitung: a tiny mochiweb/CouchDB appUmleitung: a tiny mochiweb/CouchDB app
Umleitung: a tiny mochiweb/CouchDB app
Lenz Gschwendtner
 
PHP7 Presentation
PHP7 PresentationPHP7 Presentation
PHP7 Presentation
David Sanchez
 
Salesforce CLI Cheat Sheet
Salesforce CLI Cheat Sheet Salesforce CLI Cheat Sheet
Salesforce CLI Cheat Sheet
Keir Bowden
 
MuleSoft ESB Scripting Example
MuleSoft ESB Scripting ExampleMuleSoft ESB Scripting Example
MuleSoft ESB Scripting Example
akashdprajapati
 
Accessing mongo DB In Mule ESB
Accessing mongo DB In Mule ESBAccessing mongo DB In Mule ESB
Accessing mongo DB In Mule ESB
Srinu Prasad
 
Django Rest Framework and React and Redux, Oh My!
Django Rest Framework and React and Redux, Oh My!Django Rest Framework and React and Redux, Oh My!
Django Rest Framework and React and Redux, Oh My!
Eric Palakovich Carr
 
web2py:Web development like a boss
web2py:Web development like a bossweb2py:Web development like a boss
web2py:Web development like a boss
Francisco Ribeiro
 
PHP 5.6 New and Deprecated Features
PHP 5.6  New and Deprecated FeaturesPHP 5.6  New and Deprecated Features
PHP 5.6 New and Deprecated Features
Mark Niebergall
 
10 Most Important Features of New PHP 5.6
10 Most Important Features of New PHP 5.610 Most Important Features of New PHP 5.6
10 Most Important Features of New PHP 5.6
Webline Infosoft P Ltd
 
Php Unit With Zend Framework Zendcon09
Php Unit With Zend Framework   Zendcon09Php Unit With Zend Framework   Zendcon09
Php Unit With Zend Framework Zendcon09
Michelangelo van Dam
 
Becoming A Drupal Master Builder
Becoming A Drupal Master BuilderBecoming A Drupal Master Builder
Becoming A Drupal Master Builder
Philip Norton
 
Xmla4js
Xmla4jsXmla4js
Xmla4js
Roland Bouman
 
Catalyst MVC
Catalyst MVCCatalyst MVC
Catalyst MVC
Sheeju Alex
 
A WordPress workshop at Cefalo
A WordPress workshop at Cefalo A WordPress workshop at Cefalo
A WordPress workshop at Cefalo
Beroza Paul
 

What's hot (20)

Managing data and operation distribution in MongoDB
Managing data and operation distribution in MongoDBManaging data and operation distribution in MongoDB
Managing data and operation distribution in MongoDB
 
Php classes in mumbai
Php classes in mumbaiPhp classes in mumbai
Php classes in mumbai
 
Sharding in MongoDB 4.2 #what_is_new
 Sharding in MongoDB 4.2 #what_is_new Sharding in MongoDB 4.2 #what_is_new
Sharding in MongoDB 4.2 #what_is_new
 
Umleitung: a tiny mochiweb/CouchDB app
Umleitung: a tiny mochiweb/CouchDB appUmleitung: a tiny mochiweb/CouchDB app
Umleitung: a tiny mochiweb/CouchDB app
 
Tp web
Tp webTp web
Tp web
 
PHP7 Presentation
PHP7 PresentationPHP7 Presentation
PHP7 Presentation
 
Salesforce CLI Cheat Sheet
Salesforce CLI Cheat Sheet Salesforce CLI Cheat Sheet
Salesforce CLI Cheat Sheet
 
MuleSoft ESB Scripting Example
MuleSoft ESB Scripting ExampleMuleSoft ESB Scripting Example
MuleSoft ESB Scripting Example
 
Ajax
AjaxAjax
Ajax
 
Accessing mongo DB In Mule ESB
Accessing mongo DB In Mule ESBAccessing mongo DB In Mule ESB
Accessing mongo DB In Mule ESB
 
Django Rest Framework and React and Redux, Oh My!
Django Rest Framework and React and Redux, Oh My!Django Rest Framework and React and Redux, Oh My!
Django Rest Framework and React and Redux, Oh My!
 
web2py:Web development like a boss
web2py:Web development like a bossweb2py:Web development like a boss
web2py:Web development like a boss
 
PHP 5.6 New and Deprecated Features
PHP 5.6  New and Deprecated FeaturesPHP 5.6  New and Deprecated Features
PHP 5.6 New and Deprecated Features
 
Lumberjack XPath 101
Lumberjack XPath 101Lumberjack XPath 101
Lumberjack XPath 101
 
10 Most Important Features of New PHP 5.6
10 Most Important Features of New PHP 5.610 Most Important Features of New PHP 5.6
10 Most Important Features of New PHP 5.6
 
Php Unit With Zend Framework Zendcon09
Php Unit With Zend Framework   Zendcon09Php Unit With Zend Framework   Zendcon09
Php Unit With Zend Framework Zendcon09
 
Becoming A Drupal Master Builder
Becoming A Drupal Master BuilderBecoming A Drupal Master Builder
Becoming A Drupal Master Builder
 
Xmla4js
Xmla4jsXmla4js
Xmla4js
 
Catalyst MVC
Catalyst MVCCatalyst MVC
Catalyst MVC
 
A WordPress workshop at Cefalo
A WordPress workshop at Cefalo A WordPress workshop at Cefalo
A WordPress workshop at Cefalo
 

Similar to OpenCms Days 2014 - Using the SOLR collector

Apache Solr Workshop
Apache Solr WorkshopApache Solr Workshop
Apache Solr Workshop
Saumitra Srivastav
 
Solr5
Solr5Solr5
Data Science with Solr and Spark
Data Science with Solr and SparkData Science with Solr and Spark
Data Science with Solr and Spark
Lucidworks
 
Basics of Solr and Solr Integration with AEM6
Basics of Solr and Solr Integration with AEM6Basics of Solr and Solr Integration with AEM6
Basics of Solr and Solr Integration with AEM6DEEPAK KHETAWAT
 
Solr 8 interview
Solr 8 interview Solr 8 interview
Solr 8 interview
Alihossein shahabi
 
Get the most out of Solr search with PHP
Get the most out of Solr search with PHPGet the most out of Solr search with PHP
Get the most out of Solr search with PHP
Paul Borgermans
 
Getting Started with Solr
Getting Started with SolrGetting Started with Solr
Getting Started with Solr
Travis Carlson
 
Apache Solr Workshop
Apache Solr WorkshopApache Solr Workshop
Apache Solr WorkshopJSGB
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to Solr
Erik Hatcher
 
Data Engineering with Solr and Spark
Data Engineering with Solr and SparkData Engineering with Solr and Spark
Data Engineering with Solr and Spark
Lucidworks
 
AngularJS 1.x - your first application (problems and solutions)
AngularJS 1.x - your first application (problems and solutions)AngularJS 1.x - your first application (problems and solutions)
AngularJS 1.x - your first application (problems and solutions)
Igor Talevski
 
Rebuilding Solr 6 Examples - Layer by Layer: Presented by Alexandre Rafalovit...
Rebuilding Solr 6 Examples - Layer by Layer: Presented by Alexandre Rafalovit...Rebuilding Solr 6 Examples - Layer by Layer: Presented by Alexandre Rafalovit...
Rebuilding Solr 6 Examples - Layer by Layer: Presented by Alexandre Rafalovit...
Lucidworks
 
Introduction to Lucene & Solr and Usecases
Introduction to Lucene & Solr and UsecasesIntroduction to Lucene & Solr and Usecases
Introduction to Lucene & Solr and Usecases
Rahul Jain
 
Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with Solr
Lucidworks (Archived)
 
Rapid prototyping with solr - By Erik Hatcher
Rapid prototyping with solr -  By Erik Hatcher Rapid prototyping with solr -  By Erik Hatcher
Rapid prototyping with solr - By Erik Hatcher
lucenerevolution
 
Solr Recipes
Solr RecipesSolr Recipes
Solr Recipes
Erik Hatcher
 
Webinar: What's New in Solr 7
Webinar: What's New in Solr 7 Webinar: What's New in Solr 7
Webinar: What's New in Solr 7
Lucidworks
 
CoreML for NLP (Melb Cocoaheads 08/02/2018)
CoreML for NLP (Melb Cocoaheads 08/02/2018)CoreML for NLP (Melb Cocoaheads 08/02/2018)
CoreML for NLP (Melb Cocoaheads 08/02/2018)
Hon Weng Chong
 
The Apache Solr Smart Data Ecosystem
The Apache Solr Smart Data EcosystemThe Apache Solr Smart Data Ecosystem
The Apache Solr Smart Data Ecosystem
Trey Grainger
 

Similar to OpenCms Days 2014 - Using the SOLR collector (20)

Apache Solr Workshop
Apache Solr WorkshopApache Solr Workshop
Apache Solr Workshop
 
Solr5
Solr5Solr5
Solr5
 
Data Science with Solr and Spark
Data Science with Solr and SparkData Science with Solr and Spark
Data Science with Solr and Spark
 
Basics of Solr and Solr Integration with AEM6
Basics of Solr and Solr Integration with AEM6Basics of Solr and Solr Integration with AEM6
Basics of Solr and Solr Integration with AEM6
 
Solr 8 interview
Solr 8 interview Solr 8 interview
Solr 8 interview
 
Get the most out of Solr search with PHP
Get the most out of Solr search with PHPGet the most out of Solr search with PHP
Get the most out of Solr search with PHP
 
Getting Started with Solr
Getting Started with SolrGetting Started with Solr
Getting Started with Solr
 
Apache Solr Workshop
Apache Solr WorkshopApache Solr Workshop
Apache Solr Workshop
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to Solr
 
Data Engineering with Solr and Spark
Data Engineering with Solr and SparkData Engineering with Solr and Spark
Data Engineering with Solr and Spark
 
AngularJS 1.x - your first application (problems and solutions)
AngularJS 1.x - your first application (problems and solutions)AngularJS 1.x - your first application (problems and solutions)
AngularJS 1.x - your first application (problems and solutions)
 
Rebuilding Solr 6 Examples - Layer by Layer: Presented by Alexandre Rafalovit...
Rebuilding Solr 6 Examples - Layer by Layer: Presented by Alexandre Rafalovit...Rebuilding Solr 6 Examples - Layer by Layer: Presented by Alexandre Rafalovit...
Rebuilding Solr 6 Examples - Layer by Layer: Presented by Alexandre Rafalovit...
 
Introduction to Lucene & Solr and Usecases
Introduction to Lucene & Solr and UsecasesIntroduction to Lucene & Solr and Usecases
Introduction to Lucene & Solr and Usecases
 
Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with Solr
 
Rapid prototyping with solr - By Erik Hatcher
Rapid prototyping with solr -  By Erik Hatcher Rapid prototyping with solr -  By Erik Hatcher
Rapid prototyping with solr - By Erik Hatcher
 
Solr Recipes
Solr RecipesSolr Recipes
Solr Recipes
 
Webinar: What's New in Solr 7
Webinar: What's New in Solr 7 Webinar: What's New in Solr 7
Webinar: What's New in Solr 7
 
CoreML for NLP (Melb Cocoaheads 08/02/2018)
CoreML for NLP (Melb Cocoaheads 08/02/2018)CoreML for NLP (Melb Cocoaheads 08/02/2018)
CoreML for NLP (Melb Cocoaheads 08/02/2018)
 
Objective-C
Objective-CObjective-C
Objective-C
 
The Apache Solr Smart Data Ecosystem
The Apache Solr Smart Data EcosystemThe Apache Solr Smart Data Ecosystem
The Apache Solr Smart Data Ecosystem
 

More from Alkacon Software GmbH & Co. KG

OpenCms Days 2016: Multilingual websites with OpenCms
OpenCms Days 2016:   Multilingual websites with OpenCmsOpenCms Days 2016:   Multilingual websites with OpenCms
OpenCms Days 2016: Multilingual websites with OpenCms
Alkacon Software GmbH & Co. KG
 
OpenCms Days 2016: Participation and transparency portals with OpenCms
OpenCms Days 2016: Participation and transparency portals with OpenCmsOpenCms Days 2016: Participation and transparency portals with OpenCms
OpenCms Days 2016: Participation and transparency portals with OpenCms
Alkacon Software GmbH & Co. KG
 
OpenCms Days 2016: OpenCms at the swiss seismological service
OpenCms Days 2016: OpenCms at the swiss seismological serviceOpenCms Days 2016: OpenCms at the swiss seismological service
OpenCms Days 2016: OpenCms at the swiss seismological service
Alkacon Software GmbH & Co. KG
 
OpenCms Days 2016: Next generation content repository
OpenCms Days 2016: Next generation content repository OpenCms Days 2016: Next generation content repository
OpenCms Days 2016: Next generation content repository
Alkacon Software GmbH & Co. KG
 
OpenCms Days 2016: Keynote - Introducing OpenCms 10.5
OpenCms Days 2016:   Keynote - Introducing OpenCms 10.5OpenCms Days 2016:   Keynote - Introducing OpenCms 10.5
OpenCms Days 2016: Keynote - Introducing OpenCms 10.5
Alkacon Software GmbH & Co. KG
 
OpenCms Days 2015 OpenCms X marks the spot
OpenCms Days 2015 OpenCms X marks the spotOpenCms Days 2015 OpenCms X marks the spot
OpenCms Days 2015 OpenCms X marks the spot
Alkacon Software GmbH & Co. KG
 
OpenCms Days 2015 Next generation repository
OpenCms Days 2015  Next generation repositoryOpenCms Days 2015  Next generation repository
OpenCms Days 2015 Next generation repository
Alkacon Software GmbH & Co. KG
 
OpenCms Days 2015 Creating Apps for the OpenCms 10 workplace
OpenCms Days 2015  Creating Apps for the OpenCms 10 workplace OpenCms Days 2015  Creating Apps for the OpenCms 10 workplace
OpenCms Days 2015 Creating Apps for the OpenCms 10 workplace
Alkacon Software GmbH & Co. KG
 
OpenCms Days 2015 OCEE explained
OpenCms Days 2015 OCEE explainedOpenCms Days 2015 OCEE explained
OpenCms Days 2015 OCEE explained
Alkacon Software GmbH & Co. KG
 
OpenCms Days 2015 Workflow using Docker and Jenkins
OpenCms Days 2015 Workflow using Docker and JenkinsOpenCms Days 2015 Workflow using Docker and Jenkins
OpenCms Days 2015 Workflow using Docker and Jenkins
Alkacon Software GmbH & Co. KG
 
OpenCms Days 2015 Modern templates with nested containers
OpenCms Days 2015 Modern templates with nested containersOpenCms Days 2015 Modern templates with nested containers
OpenCms Days 2015 Modern templates with nested containers
Alkacon Software GmbH & Co. KG
 
OpenCms Days 2015 Hidden features of OpenCms
OpenCms Days 2015 Hidden features of OpenCmsOpenCms Days 2015 Hidden features of OpenCms
OpenCms Days 2015 Hidden features of OpenCms
Alkacon Software GmbH & Co. KG
 
OpenCms Days 2015 OpenGovernment
OpenCms Days 2015 OpenGovernmentOpenCms Days 2015 OpenGovernment
OpenCms Days 2015 OpenGovernment
Alkacon Software GmbH & Co. KG
 
OpenCms Days 2015 OpenCms at erarta
OpenCms Days 2015 OpenCms at erarta OpenCms Days 2015 OpenCms at erarta
OpenCms Days 2015 OpenCms at erarta
Alkacon Software GmbH & Co. KG
 
OpenCms Days 2015 How do you develop for OpenCms?
OpenCms Days 2015 How do you develop for OpenCms?OpenCms Days 2015 How do you develop for OpenCms?
OpenCms Days 2015 How do you develop for OpenCms?
Alkacon Software GmbH & Co. KG
 
OpenCms Days 2015 Arkema, a leading chemicals company
OpenCms Days 2015 Arkema, a leading chemicals companyOpenCms Days 2015 Arkema, a leading chemicals company
OpenCms Days 2015 Arkema, a leading chemicals company
Alkacon Software GmbH & Co. KG
 
OpenCms Days 2014 - How Techem handles international customer portals
OpenCms Days 2014 - How Techem handles international customer portalsOpenCms Days 2014 - How Techem handles international customer portals
OpenCms Days 2014 - How Techem handles international customer portals
Alkacon Software GmbH & Co. KG
 
OpenCms Days 2014 - Enhancing OpenCms front end development with Sass and Grunt
OpenCms Days 2014 - Enhancing OpenCms front end development with Sass and GruntOpenCms Days 2014 - Enhancing OpenCms front end development with Sass and Grunt
OpenCms Days 2014 - Enhancing OpenCms front end development with Sass and Grunt
Alkacon Software GmbH & Co. KG
 
OpenCms Days 2014 - OpenCms cloud setup with the FI-TS
OpenCms Days 2014 - OpenCms cloud setup with the FI-TSOpenCms Days 2014 - OpenCms cloud setup with the FI-TS
OpenCms Days 2014 - OpenCms cloud setup with the FI-TS
Alkacon Software GmbH & Co. KG
 
OpenCms Days 2014 - OpenCms Module Development and Deployment with IntelliJ, ...
OpenCms Days 2014 - OpenCms Module Development and Deployment with IntelliJ, ...OpenCms Days 2014 - OpenCms Module Development and Deployment with IntelliJ, ...
OpenCms Days 2014 - OpenCms Module Development and Deployment with IntelliJ, ...
Alkacon Software GmbH & Co. KG
 

More from Alkacon Software GmbH & Co. KG (20)

OpenCms Days 2016: Multilingual websites with OpenCms
OpenCms Days 2016:   Multilingual websites with OpenCmsOpenCms Days 2016:   Multilingual websites with OpenCms
OpenCms Days 2016: Multilingual websites with OpenCms
 
OpenCms Days 2016: Participation and transparency portals with OpenCms
OpenCms Days 2016: Participation and transparency portals with OpenCmsOpenCms Days 2016: Participation and transparency portals with OpenCms
OpenCms Days 2016: Participation and transparency portals with OpenCms
 
OpenCms Days 2016: OpenCms at the swiss seismological service
OpenCms Days 2016: OpenCms at the swiss seismological serviceOpenCms Days 2016: OpenCms at the swiss seismological service
OpenCms Days 2016: OpenCms at the swiss seismological service
 
OpenCms Days 2016: Next generation content repository
OpenCms Days 2016: Next generation content repository OpenCms Days 2016: Next generation content repository
OpenCms Days 2016: Next generation content repository
 
OpenCms Days 2016: Keynote - Introducing OpenCms 10.5
OpenCms Days 2016:   Keynote - Introducing OpenCms 10.5OpenCms Days 2016:   Keynote - Introducing OpenCms 10.5
OpenCms Days 2016: Keynote - Introducing OpenCms 10.5
 
OpenCms Days 2015 OpenCms X marks the spot
OpenCms Days 2015 OpenCms X marks the spotOpenCms Days 2015 OpenCms X marks the spot
OpenCms Days 2015 OpenCms X marks the spot
 
OpenCms Days 2015 Next generation repository
OpenCms Days 2015  Next generation repositoryOpenCms Days 2015  Next generation repository
OpenCms Days 2015 Next generation repository
 
OpenCms Days 2015 Creating Apps for the OpenCms 10 workplace
OpenCms Days 2015  Creating Apps for the OpenCms 10 workplace OpenCms Days 2015  Creating Apps for the OpenCms 10 workplace
OpenCms Days 2015 Creating Apps for the OpenCms 10 workplace
 
OpenCms Days 2015 OCEE explained
OpenCms Days 2015 OCEE explainedOpenCms Days 2015 OCEE explained
OpenCms Days 2015 OCEE explained
 
OpenCms Days 2015 Workflow using Docker and Jenkins
OpenCms Days 2015 Workflow using Docker and JenkinsOpenCms Days 2015 Workflow using Docker and Jenkins
OpenCms Days 2015 Workflow using Docker and Jenkins
 
OpenCms Days 2015 Modern templates with nested containers
OpenCms Days 2015 Modern templates with nested containersOpenCms Days 2015 Modern templates with nested containers
OpenCms Days 2015 Modern templates with nested containers
 
OpenCms Days 2015 Hidden features of OpenCms
OpenCms Days 2015 Hidden features of OpenCmsOpenCms Days 2015 Hidden features of OpenCms
OpenCms Days 2015 Hidden features of OpenCms
 
OpenCms Days 2015 OpenGovernment
OpenCms Days 2015 OpenGovernmentOpenCms Days 2015 OpenGovernment
OpenCms Days 2015 OpenGovernment
 
OpenCms Days 2015 OpenCms at erarta
OpenCms Days 2015 OpenCms at erarta OpenCms Days 2015 OpenCms at erarta
OpenCms Days 2015 OpenCms at erarta
 
OpenCms Days 2015 How do you develop for OpenCms?
OpenCms Days 2015 How do you develop for OpenCms?OpenCms Days 2015 How do you develop for OpenCms?
OpenCms Days 2015 How do you develop for OpenCms?
 
OpenCms Days 2015 Arkema, a leading chemicals company
OpenCms Days 2015 Arkema, a leading chemicals companyOpenCms Days 2015 Arkema, a leading chemicals company
OpenCms Days 2015 Arkema, a leading chemicals company
 
OpenCms Days 2014 - How Techem handles international customer portals
OpenCms Days 2014 - How Techem handles international customer portalsOpenCms Days 2014 - How Techem handles international customer portals
OpenCms Days 2014 - How Techem handles international customer portals
 
OpenCms Days 2014 - Enhancing OpenCms front end development with Sass and Grunt
OpenCms Days 2014 - Enhancing OpenCms front end development with Sass and GruntOpenCms Days 2014 - Enhancing OpenCms front end development with Sass and Grunt
OpenCms Days 2014 - Enhancing OpenCms front end development with Sass and Grunt
 
OpenCms Days 2014 - OpenCms cloud setup with the FI-TS
OpenCms Days 2014 - OpenCms cloud setup with the FI-TSOpenCms Days 2014 - OpenCms cloud setup with the FI-TS
OpenCms Days 2014 - OpenCms cloud setup with the FI-TS
 
OpenCms Days 2014 - OpenCms Module Development and Deployment with IntelliJ, ...
OpenCms Days 2014 - OpenCms Module Development and Deployment with IntelliJ, ...OpenCms Days 2014 - OpenCms Module Development and Deployment with IntelliJ, ...
OpenCms Days 2014 - OpenCms Module Development and Deployment with IntelliJ, ...
 

Recently uploaded

Enterprise Resource Planning System in Telangana
Enterprise Resource Planning System in TelanganaEnterprise Resource Planning System in Telangana
Enterprise Resource Planning System in Telangana
NYGGS Automation Suite
 
Empowering Growth with Best Software Development Company in Noida - Deuglo
Empowering Growth with Best Software  Development Company in Noida - DeugloEmpowering Growth with Best Software  Development Company in Noida - Deuglo
Empowering Growth with Best Software Development Company in Noida - Deuglo
Deuglo Infosystem Pvt Ltd
 
Launch Your Streaming Platforms in Minutes
Launch Your Streaming Platforms in MinutesLaunch Your Streaming Platforms in Minutes
Launch Your Streaming Platforms in Minutes
Roshan Dwivedi
 
Fundamentals of Programming and Language Processors
Fundamentals of Programming and Language ProcessorsFundamentals of Programming and Language Processors
Fundamentals of Programming and Language Processors
Rakesh Kumar R
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
Adele Miller
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
Łukasz Chruściel
 
GraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph TechnologyGraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph Technology
Neo4j
 
Mobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona InfotechMobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona Infotech
Drona Infotech
 
Hand Rolled Applicative User Validation Code Kata
Hand Rolled Applicative User ValidationCode KataHand Rolled Applicative User ValidationCode Kata
Hand Rolled Applicative User Validation Code Kata
Philip Schwarz
 
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
Alina Yurenko
 
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing SuiteAI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
Google
 
Artificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension FunctionsArtificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension Functions
Octavian Nadolu
 
OpenMetadata Community Meeting - 5th June 2024
OpenMetadata Community Meeting - 5th June 2024OpenMetadata Community Meeting - 5th June 2024
OpenMetadata Community Meeting - 5th June 2024
OpenMetadata
 
openEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain SecurityopenEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain Security
Shane Coughlan
 
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Łukasz Chruściel
 
Graspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code AnalysisGraspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code Analysis
Aftab Hussain
 
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Crescat
 
What is Augmented Reality Image Tracking
What is Augmented Reality Image TrackingWhat is Augmented Reality Image Tracking
What is Augmented Reality Image Tracking
pavan998932
 
Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604
Fermin Galan
 

Recently uploaded (20)

Enterprise Resource Planning System in Telangana
Enterprise Resource Planning System in TelanganaEnterprise Resource Planning System in Telangana
Enterprise Resource Planning System in Telangana
 
Empowering Growth with Best Software Development Company in Noida - Deuglo
Empowering Growth with Best Software  Development Company in Noida - DeugloEmpowering Growth with Best Software  Development Company in Noida - Deuglo
Empowering Growth with Best Software Development Company in Noida - Deuglo
 
Launch Your Streaming Platforms in Minutes
Launch Your Streaming Platforms in MinutesLaunch Your Streaming Platforms in Minutes
Launch Your Streaming Platforms in Minutes
 
Fundamentals of Programming and Language Processors
Fundamentals of Programming and Language ProcessorsFundamentals of Programming and Language Processors
Fundamentals of Programming and Language Processors
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
 
GraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph TechnologyGraphSummit Paris - The art of the possible with Graph Technology
GraphSummit Paris - The art of the possible with Graph Technology
 
Mobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona InfotechMobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona Infotech
 
Hand Rolled Applicative User Validation Code Kata
Hand Rolled Applicative User ValidationCode KataHand Rolled Applicative User ValidationCode Kata
Hand Rolled Applicative User Validation Code Kata
 
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
 
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing SuiteAI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
 
Artificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension FunctionsArtificia Intellicence and XPath Extension Functions
Artificia Intellicence and XPath Extension Functions
 
OpenMetadata Community Meeting - 5th June 2024
OpenMetadata Community Meeting - 5th June 2024OpenMetadata Community Meeting - 5th June 2024
OpenMetadata Community Meeting - 5th June 2024
 
openEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain SecurityopenEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain Security
 
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
 
Graspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code AnalysisGraspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code Analysis
 
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
 
What is Augmented Reality Image Tracking
What is Augmented Reality Image TrackingWhat is Augmented Reality Image Tracking
What is Augmented Reality Image Tracking
 
Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604
 

OpenCms Days 2014 - Using the SOLR collector

  • 1. Sören Schneider, Alkacon Software WORKSHOP TRACK Using the SOLR Collector 27.11.2014
  • 2. 1.Brief Introduction Into Solr 2.Common Mistakes Using OpenCms & Solr 3.Using the Solr Collector (DEMO) 4.Spellchecking in OpenCms Using Solr Agenda
  • 3. ●Solr is a very versatile and powerfool search engine that supports various features ●This functionality comes with the price of increased complexity to handle Solr ●Many customizations available ●All fields composing a single document are typed Brief Solr Introduction
  • 4. ●Data structures of Solr‘s documents are defined the file schema.xml ●Performing changes on this file requires reindexing ●Dynamic Fields cope with that limitiation ●Can be used without being explicitely defined in the schema using wildcards Defining Solr‘s Data Structure
  • 5. Solr: Indexing Content a: date b: text c: string Solr processing (through analyzers, filters and tokenizers) a: date b: string c: string
  • 6. ●„Direct“ usage of OpenCms & Solr requires a basic understanding of Solr ●Use proper datatypes in respect of individual usecase, gain knowledge of filters ●Know the query syntax (for appropriate datatypes) ●Most common mistakes of OpenCms users result in insufficient knowledge of Solr basics OpenCms & Solr
  • 7. 1.Using inproper types ●„text“ vs „string“ ●Formulating correct queries 2.Issues regarding mapping OpenCms <->Solr 3.(Encoding Problems) Common Mistakes Using Solr & OpenCms
  • 8. ●String ●Stores its content as exact string ●No tokenization / processing is being performed ●Useful when searching for exact value ●Text ●Tokenization and processing is performed ●Useful when a part of the content is searched for „text“ vs „string“
  • 9. ●OpenCms‘s copies the entire XML content into a single(!) locale-aware Solr field of type „text“ for each locale ●Particular information of a resource is made searchable in OpenCms using two approaches ●Automatic mapping of properties to Solr fields ●Manual definintion of mappings Making Your Content Searchable
  • 10. Indexing Content w/o Searchsettings Solr processing (through analyzers, filters and tokenizers) x: text a: date b: string c: string
  • 11. Indexing Content with Searchsettings a: date b: text c: string Solr processing (through analyzers, filters and tokenizers) a: date b: string c: string
  • 12. ●Mapping happens in the scheme of the appropriate resource type ●Excerpt Solr – OpenCms Interaction: Mapping <xsd:schema … <xsd:annotation <xsd:appinfo <searchsettings> <searchsetting element= "City" searchcontent="true"> <solrfield targetfield= "city" sourcefield="_s" </searchsetting> … Resource type element name
  • 13. Element Mapping Attributes Attribute Name Effect on the Solr Field targetfield* The resulting name locale Write content only for specific locale sourcefield Defines the resulting type copyfields Copies the value to a different field default Sets a default value boost Sets a boost for the field
  • 14. ●Users complain about problems regarding certain Characters – mostly German Umlauts – in Solr results ●In nearly all cases the sole problem lies within the integration of Solr to the servlet cotainer which is not happening in UTF-8 ●Extra note for Tomcat users: Please check whether you appended the required attributes all appropriate „<Connector>“s ;-) Using UTF-8 in Solr
  • 15. ●Live Demo 15 Live Demo Demo Demo Demo Demo デモ
  • 17. ●The Spellchecker has been realized using Solr ●Solr already provides a flexible component named „SpellCheckComponent“ ●This component supports inline spellchecking of Solr queries ●Source for suggestions can be specified by Solr fields or text files WYSIWIG Spellchecker
  • 18. ●The „SpellCheckComponent“ is widely used to implement the „Did you mean?“-feature known by popular search engines ●The component is ●Reliable and mature ●Fast ●Plus, Solr is already available in OpenCms Why using Solr as Spellchecker
  • 19. ●If both usecases use the same component, how do the implementations actually differ? ●„Did you mean?“ builds source of suggested words based on the entire data, the search runs on. Usually only a single hit is returned. ●The WYSIWYG spellchecker builds ist source of suggestions based on a data that solely contains the dictionary for a single language Differences Between Usecases in Regards of Implementation
  • 20. ●Spellchecking has been realized using another Solr core that resides in WEB-INF/spellcheck ●As the only purpose of this core is to contain spellcheck information, the schema.xml file is as simple as it gets ●Why using another Solr core instead of the default core that‘s used by OpenCms? ●Dictionaries are stored as one Solr index per language How to model this scenario using Solr?
  • 21. ●Sadly, the spellchecking interfaces of tinyMCE and Solr are incompatible Problems regarding tinyMCE and Solr Solr tinyMCE
  • 22. Comparison Spellcheck Responses { "id":"c0", "result":{„hsoue":[„house„, „has“]} } "spellcheck":{ "suggestions":[ „hsoue",{"numFound":5, "startOffset":0, "endOffset":4, "origFreq":0, "suggestion":[{"word":„house","freq": 53}, {"word":"has","freq":271}, … ]}, "correctlySpelled",false, "collation","hsue„ ]},
  • 23. ●A new component had to be realized in OpenCms that basically ●Accepts spellcheck requests from tinyMCE ●Handles tinyMCE and Solr communication and message conversion ●Checks and (re-)builds spellcheck indices ●The appropriate code is found in org.opencms.search.solr.spellcheck Glueing the Pieces together
  • 24. ●Dictionaries can be edited easily in OpenCms ●Those indices are automatically filled by flat text files, one word per line ●Support for multiple languages ●To access the dicts, have a look at the directory org.opencms.workplace.spellcheck/resources/ Spellchecker in OpenCms
  • 25. ●Adding a new language 1.Create new Solr field in schema.xml 2.Create new dictionary file inside VFS 3.Restart OpenCms ●Adding words to the custom dict Extending the Spellchecker
  • 26. ●Any Questions? 26 Any Questions? Fragen? Questions ? Questiones? ¿Preguntas? 質問
  • 27. Sören Schneider Alkacon Software GmbH http://www.alkacon.com http://www.opencms.org Thank you very much for your attention! 27