Apache solr liferay
A 2009 presentation which I just found in archives


Published in: Technology

  • 1. Apache Solr
      Enterprise search platform from the Apache Lucene project
      Rivet Logic Corporation
      1800 Alexander Bell Drive, Suite 400
      Reston, VA 20191
      Ph: 703.955.3480 Fax: 703.234.7711
  • 2. What is Solr?
      ● Search server
      ● Built upon Apache Lucene
      ● Very fast
      ● Scalable in both query load and collection size
      ● Interoperable
      ● Extensible
      ● Lucene power exposed over HTTP
      ● Spell checking, highlighting, faceting, etc.
      ● Caching
      ● Replication
      ● Distributed search
  • 3. How stuff works?
  • 4. schema.xml
      ● Field types
          ○ <fieldType name="text" class="solr.TextField" indexed="true" />
      ● Fields
          ○ <field name="technologies" type="text" indexed="true" stored="true" multiValued="true"/>
      ● Unique key (optional)
          ○ <uniqueKey>id</uniqueKey>
      ● Copy fields
          ○ <copyField source="developers" dest="df"/>
      ● Dynamic fields
          ○ <dynamicField name="*_dt" type="date" indexed="true" stored="true"/>
      ● Similarity configuration
          ○ Similarity is the scoring routine for each document vs. a query
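The dynamicField entry on the schema.xml slide can be read as a name pattern: any field whose name ends in _dt is handled as a date field without being declared one by one. Below is a minimal sketch of that matching rule in plain Java. It assumes patterns with a single leading or trailing `*` only; the class and method names are hypothetical and this is not Solr's own implementation.

```java
public class DynamicFieldMatch {

    /**
     * Returns true if fieldName matches a pattern that has one leading
     * or one trailing '*' wildcard (or no wildcard at all).
     */
    public static boolean matches(String pattern, String fieldName) {
        if (pattern.startsWith("*")) {
            // "*_dt" matches any name ending in "_dt"
            return fieldName.endsWith(pattern.substring(1));
        }
        if (pattern.endsWith("*")) {
            // "attr_*" matches any name starting with "attr_"
            return fieldName.startsWith(pattern.substring(0, pattern.length() - 1));
        }
        // No wildcard: the name must match exactly
        return pattern.equals(fieldName);
    }

    public static void main(String[] args) {
        System.out.println(matches("*_dt", "created_dt"));
        System.out.println(matches("*_dt", "title"));
    }
}
```

With the pattern from the slide, a field named created_dt would match and a field named title would not.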
  • 5. solrconfig.xml
      ● Lucene indexing parameters
          ○ <mergeFactor>10</mergeFactor>
          ○ <ramBufferSizeMB>32</ramBufferSizeMB>
      ● Cache settings
          ○ <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="32"/>
      ● Request handler configuration
          ○ <requestHandler name="dismax" class="solr.SearchHandler">
      ● HTTP cache settings
          ○ <httpCaching lastModifiedFrom="openTime" etagSeed="Solr">
      ● Search components, response writers, query parsers
          ○ <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
          ○ <queryResponseWriter name="velocity" class="org.apache.solr.request.VelocityResponseWriter"/>
          ○ <queryParser name="lucene" class=""/>
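The queryResultCache entry names an LRU cache with a fixed size. As a rough illustration of what least-recently-used eviction means, here is a small sketch built on the JDK's LinkedHashMap. The class name is hypothetical, autowarming is not modeled, and this is not how solr.LRUCache itself is written.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private final int maxSize;

    public LruCacheSketch(int maxSize) {
        // accessOrder = true: iteration order runs from least to most recently used
        super(16, 0.75f, true);
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each put; evict the least recently used entry when over capacity
        return size() > maxSize;
    }

    public static void main(String[] args) {
        LruCacheSketch<String, Integer> cache = new LruCacheSketch<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // "a" becomes the most recently used entry
        cache.put("c", 3); // capacity exceeded, so "b" is evicted
        System.out.println(cache.keySet());
    }
}
```

A size of 512, as on the slide, would simply be passed as the constructor argument.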
  • 6. Request Handler
      <requestHandler name="/itas" class="solr.SearchHandler">
        <lst name="defaults">
          <str name="v.template">browse</str>
          <str name=""></str>
          <str name="title">Solritas</str>
          <str name="wt">velocity</str>
          <str name="defType">dismax</str>
          <str name="q.alt">*:*</str>
          <str name="rows">10</str>
          <str name="fl">*,score</str>
          <str name="facet">on</str>
          <str name="facet.field">df</str>
          <str name="facet.mincount">1</str>
          <str name="hl">true</str>
          <str name="hl.fl">developers</str>
          <str name="qf">
            text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
          </str>
        </lst>
      </requestHandler>
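The defaults list in a request handler supplies parameter values that apply only when the incoming request does not set them itself. The sketch below illustrates that precedence with plain Java maps. It is a simplification that assumes single-valued parameters, and the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HandlerDefaults {

    /** Start from the handler defaults, then let request parameters override them. */
    public static Map<String, String> effectiveParams(Map<String, String> defaults,
                                                      Map<String, String> request) {
        Map<String, String> merged = new LinkedHashMap<>(defaults);
        merged.putAll(request);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> defaults = new LinkedHashMap<>();
        defaults.put("wt", "velocity");
        defaults.put("rows", "10");

        Map<String, String> request = new LinkedHashMap<>();
        request.put("q", "binesh");
        request.put("rows", "5"); // overrides the default of 10

        System.out.println(effectiveParams(defaults, request));
    }
}
```

Under this reading, a request to /itas that passes rows=5 gets five rows, while wt stays at velocity because the request did not set it.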
  • 7. Response Writer
      ● A Response Writer generates the formatted response of a search
      ● The wt parameter selects the Response Writer to be used
      ● json, php, phps, python, ruby, xml, xslt, velocity
          <queryResponseWriter name="xslt" class="org.apache.solr.request.XSLTResponseWriter">
            <int name="xsltCacheLifetimeSeconds">5</int>
          </queryResponseWriter>
  • 8. Analyzers, Tokenizers, Filters
      ● The Analyzer class is a native Lucene concept that determines how tokens are produced from a piece of text
          <fieldType name="nametext" class="solr.TextField">
            <analyzer class="org.apache.lucene.analysis.WhitespaceAnalyzer"/>
          </fieldType>
      ● The job of a tokenizer is to break up a stream of text into tokens
      ● A filter looks at each token in the stream sequentially and decides whether to pass it along, replace it, or discard it
          <fieldType name="text" class="solr.TextField">
            <analyzer>
              <tokenizer class="solr.StandardTokenizerFactory"/>
              <filter class="solr.StandardFilterFactory"/>
            </analyzer>
          </fieldType>
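The tokenizer-then-filters pipeline can be pictured as a chain of small functions over a token list. The following is a conceptual sketch in plain Java, not the Lucene API: it splits on whitespace, then applies a lowercase filter (replace) and a stop-word filter (discard). All names and the stop-word list are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class AnalysisChainSketch {
    // Hypothetical stop-word list, for illustration only
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("the", "a", "an"));

    /** Tokenizer: break the text into whitespace-separated tokens. */
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Filter that replaces each token with its lowercase form. */
    public static List<String> lowerCaseFilter(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            out.add(t.toLowerCase(Locale.ROOT));
        }
        return out;
    }

    /** Filter that discards stop words and passes everything else along. */
    public static List<String> stopFilter(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            if (!STOP_WORDS.contains(t)) {
                out.add(t);
            }
        }
        return out;
    }

    /** The full chain: tokenizer first, then each filter in order. */
    public static List<String> analyze(String text) {
        return stopFilter(lowerCaseFilter(tokenize(text)));
    }

    public static void main(String[] args) {
        System.out.println(analyze("The Quick brown Fox"));
    }
}
```

The order of the filters matters here: lowercasing runs before the stop filter so that "The" is recognized as the stop word "the".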
  • 9. Other features
      ● Highlighting
          ○ &hl=true&hl.fl=developers
      ● Synonyms
          ○ <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
      ● Spell check
          ○ The spell check component can return a list of alternative spelling suggestions
          ○ <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
      ● Content Streams
          ○ Allow the Solr server to fetch local or remote data itself; remote streaming must be enabled in solrconfig.xml
      ● Solr Cell
          ○ Leverages Tika to extract and index rich documents such as Word, PDF, HTML, and many other types
      ● More like this
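  • 10. Indexing with SolrJ
      import java.net.URL;
      import org.apache.solr.client.solrj.SolrServer;
      import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
      import org.apache.solr.common.SolrInputDocument;

      SolrServer solr = new CommonsHttpSolrServer(new URL("http://localhost:8983/solr"));
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("id", "EXAMPLEDOC01");
      doc.addField("title", "NOVAJUG SolrJ Example");
      solr.add(doc);
      solr.commit();   // after a batch, not per document
      solr.optimize(); // periodically, if/when needed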
  • 11. Data Import Handler
      ● Indexes relational databases, XML data, and e-mail sources
      ● Supports full and incremental/delta indexing
      ● Highly extensible with custom data sources, transformers, etc.
  • 12. Replication
      ● Master is polled
      ● Replicant pulls the Lucene index and, optionally, the Solr configuration files
      ● Query throughput scaling: replicate and load balance
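"Replicate and load balance" means queries are spread across several copies of the same index. One simple way to picture the load-balancing half is a round-robin picker over replica URLs, sketched below in plain Java. The host names are made up, the class name is hypothetical, and a real deployment would more likely use a dedicated load balancer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class RoundRobinReplicas {
    private final List<String> replicaUrls;
    private final AtomicInteger next = new AtomicInteger();

    public RoundRobinReplicas(List<String> replicaUrls) {
        if (replicaUrls.isEmpty()) {
            throw new IllegalArgumentException("at least one replica is required");
        }
        this.replicaUrls = new ArrayList<>(replicaUrls);
    }

    /** Returns the replica that should serve the next query. */
    public String nextReplica() {
        // floorMod keeps the index non-negative even if the counter overflows
        int index = Math.floorMod(next.getAndIncrement(), replicaUrls.size());
        return replicaUrls.get(index);
    }

    public static void main(String[] args) {
        // Hypothetical replica hosts, for illustration only
        RoundRobinReplicas replicas = new RoundRobinReplicas(Arrays.asList(
                "http://replica1:8983/solr",
                "http://replica2:8983/solr"));
        for (int i = 0; i < 4; i++) {
            System.out.println(replicas.nextReplica() + "/select/?q=binesh");
        }
    }
}
```

Index updates would still go to the master only; the picker is meant for the query side.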
  • 13. Demo
      ● Download Solr
      ● Start Solr
          ○ cd <solr_home>/example
          ○ java -jar start.jar
      ● Post documents
          ○ cd <solr_home>/example/exampledocs
          ○ java -jar post.jar *.xml
          ○ java -jar post.jar cw.xml
      ● Access Solr
          ○ http://localhost:8983/solr/admin/
      ● Querying Solr
          ○ http://localhost:8983/solr/select/?q=binesh
          ○ http://localhost:8983/solr/select/?q=binny
          ○ http://localhost:8983/solr/select/?q=binesh&facet=true&facet.field=df&facet.mincount=1
          ○ http://localhost:8983/solr/itas/
      ● Luke
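The demo queries are ordinary HTTP GET URLs, so they can also be assembled in code. Here is a minimal sketch that builds the faceted query from the slide with the JDK's URLEncoder. The class and method names are hypothetical, and it only constructs the URL string without sending the request.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class SelectUrlBuilder {

    /** Appends URL-encoded query parameters to the /select/ path of a Solr base URL. */
    public static String build(String baseUrl, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(baseUrl).append("/select/?");
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (!first) {
                sb.append('&');
            }
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
            first = false;
        }
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on the JVM
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "binesh");
        params.put("facet", "true");
        params.put("facet.field", "df");
        params.put("facet.mincount", "1");
        System.out.println(build("http://localhost:8983/solr", params));
    }
}
```

A LinkedHashMap is used so the parameters appear in the same order as in the slide's URL.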
  • 14. Liferay + Solr: Motivation
      ● Centralizing the search index in a clustered Liferay environment
      ● Performance improvement
          ○ Re-indexing costs too much for large DBs
          ○ The indexes of Liferay deployments in a cluster are often not synchronized
  • 15. Liferay + Solr: Configuration 1
      Install Solr
      Set up environment variables
      ● SOLR_HOME = /${solr installed folder}
      ● JAVA_OPTS = "$JAVA_OPTS -Dsolr.solr.home=$SOLR_HOME/example/solr/data"
      solr.xml
      ● Place the file under ${tomcat}/conf/Catalina/localhost/ with the following content
          <?xml version="1.0" encoding="utf-8"?>
          <Context docBase="$SOLR_HOME/apache-solr-1.4.0.war" debug="0" crossContext="true">
            <Environment name="solr/home" type="java.lang.String" value="$SOLR_HOME" override="true" />
          </Context>
  • 16. Liferay + Solr: Configuration 2
      schema.xml
      ● This file tells Solr how to index the data coming from Liferay, and can be customized for your installation.
      ● Copy this file from the solr-web plugin to $SOLR_HOME/conf (you may have to create the conf directory) in your Solr home folder.
          ...
          <fields>
            <field name="comments" type="text" indexed="true" stored="true" />
            <field name="content" type="text" indexed="true" stored="true" />
            <field name="description" type="text" indexed="true" stored="true" />
            <field name="name" type="text" indexed="true" stored="true" />
            <field name="properties" type="text" indexed="true" stored="true" />
            <field name="title" type="text" indexed="true" stored="true" />
            <field name="uid" type="string" indexed="true" stored="true" />
            <field name="url" type="text" indexed="true" stored="true" />
            <field name="userName" type="text" indexed="true" stored="true" />
            <field name="version" type="text" indexed="true" stored="true" />
            <dynamicField name="*" type="string" indexed="true" stored="true" />
          </fields>
          <uniqueKey>uid</uniqueKey>
          <defaultSearchField>content</defaultSearchField>
          ...
          <copyField source="comments" dest="content"/>
          ...
  • 17. Liferay + Solr: Configuration 3
      Copy WAR file
      ● Copy the WAR file $SOLR_HOME/dist/apache-solr-${solr.version}.war into $SOLR_HOME/example, where ${solr.version} is the Solr version number, e.g. 1.4.0.
      Start Liferay/Tomcat
      ● Solr will be picked up and "solr" will be deployed automatically under the ${tomcat}/webapps folder
      Install the solr-web Liferay plugin
      ● The latest Liferay plugin can be checked out from the following location
      ● Build the checked-out plugin and deploy it
  • 18. Liferay + Solr: Configuration 4
      Final step
      ● We need to rebuild the Liferay search indexes
      ● Control Panel > Server Administration
  • 19. Liferay + Solr: How it works
      solr-spring.xml (from the solr-web plugin)
          ...
          <bean id="solrServer" class="">
            <constructor-arg type="java.lang.String" value="http://localhost:8080/solr" />
          </bean>
          <bean id="indexSearcher.solr" class="">
            <property name="solrServer" ref="solrServer" />
          </bean>
          <bean id="indexWriter.solr" class="">
            <property name="commit" value="true" />
            <property name="solrServer" ref="solrServer" />
          </bean>
          ...
  • 20. Liferay + Solr: Back to the default?
      ● Simply undeploy the solr-web plugin
      ● Rebuild the search indexes from the Control Panel, as described in the previous step