Introduction to Solr

  1. Solr5 (Phoebe Shih, 2016.10.11)
  2. Introduction
     Open-source enterprise search platform
     Solr4 -> Solr5: from a WAR deployed in a servlet container to a standalone Java server application
     Full-text search
     Real-time indexing
     Dynamic clustering
     ...
     REST APIs
  3. Full Text Search
     Document/Text Retrieval
     Inverted Index: maps each term to the documents that contain it
     [Diagram: example terms "aid", "help", "fox"; columns "Documents" and "Terms"]
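The inverted index idea on this slide can be sketched in a few lines of Python. This is only an illustration of the data structure, not how Lucene or Solr stores its index; the sample documents and the function names `build_inverted_index` and `search` are made up for the example, reusing the slide's terms "aid", "help", and "fox".

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the sorted list of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive whitespace tokenization; a real analyzer does much more.
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def search(index, term):
    """Return the ids of documents containing the term (empty list if none)."""
    return index.get(term.lower(), [])


# Hypothetical sample documents.
docs = {
    1: "the quick brown fox",
    2: "help the fox",
    3: "first aid and help",
}
index = build_inverted_index(docs)
print(search(index, "fox"))   # [1, 2]
print(search(index, "help"))  # [2, 3]
print(search(index, "aid"))   # [3]
```

A lookup is a single dictionary access on the term, which is why full-text search does not need to scan every document at query time.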
  4. Installation
     Solr-5.5.2
     Requires Java Runtime Environment (JRE) version 1.7 or higher
     Start: ./bin/solr start
     Admin UI: http://localhost:8983/solr/
  5. Quick Start
     Create a Core
         ./bin/solr create -c {core}
     Manage Schema
         {solr-home-directory}/server/solr/{core}/conf/managed-schema
     Add Documents
         ./bin/post -c {core} example/exampledocs/*.xml
     Query
         http://localhost:8983/solr/{core}/select?q=*:*
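As a rough sketch of the Query step, the snippet below builds a `/select` URL like the one on the slide from Python. It only constructs the URL and does not contact a server. The helper name `select_url` and the core name `mycore` are made up for the example; `rows` and `wt` are standard Solr query parameters.

```python
from urllib.parse import urlencode


def select_url(core, q="*:*", base="http://localhost:8983/solr", **params):
    """Build a Solr /select URL; special characters in values are percent-encoded."""
    query = urlencode({"q": q, **params})
    return f"{base}/{core}/select?{query}"


# "mycore" is a placeholder core name.
print(select_url("mycore"))
print(select_url("mycore", q="title:solr", rows=5, wt="json"))
```

The `*:*` in the slide's URL appears percent-encoded as `%2A%3A%2A` in the output; Solr decodes it back to the match-all query.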
  6. Configuration
     jetty:  {solr-home-directory}/etc/jetty.xml
     log:    {solr-home-directory}/resources/
     server: {solr-home-directory}/solr/solr.xml
     solr:   {solr-home-directory}/solr/{core}/conf/solrconfig.xml
     schema: {solr-home-directory}/solr/{core}/conf/managed-schema
  7. Schema
     <schema name="example-data-driven-schema" version="1.6">
       <field name="rowkey_id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
       <copyField source="sourceFieldName" dest="destinationFieldName"/>
       <dynamicField name="*_i" type="int" indexed="true" stored="true"/>
       <uniqueKey>rowkey_id</uniqueKey>
       <fieldType name="boolean" class="solr.BoolField" sortMissingLast="true"/>
     </schema>
  8. jieba
     Jars placed in {solr-home-directory}/solr-webapp/webapp/WEB-INF/lib:
         analyzer-solr-1.0.jar
         jieba-analysis-1.0.0.jar
     Analysis screen: http://localhost:8983/solr/#/{core}/analysis
     Field type in {solr-home-directory}/solr/{core}/conf/managed-schema:
     <fieldType name="text_jieba" class="solr.TextField" positionIncrementGap="100">
       <analyzer type="index">
         <tokenizer class="analyzer.solr5.jieba.JiebaTokenizerFactory" segMode="SEARCH" userDict="/etc/solr/dict_big.txt"/>
         <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
         <filter class="solr.NGramFilterFactory" minGramSize="1" maxGramSize="6"/>
         <filter class="solr.LowerCaseFilterFactory"/>
         <filter class="solr.SnowballPorterFilterFactory" language="English"/>
       </analyzer>
       <analyzer type="query">
         <tokenizer class="analyzer.solr5.jieba.JiebaTokenizerFactory" segMode="SEARCH" userDict="/etc/solr/dict_big.txt"/>
         <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
         <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
         <filter class="solr.LowerCaseFilterFactory"/>
         <filter class="solr.SnowballPorterFilterFactory" language="English"/>
       </analyzer>
     </fieldType>
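To give a feel for what the index-time `solr.NGramFilterFactory` step (minGramSize="1", maxGramSize="6") contributes, here is a minimal Python sketch of n-gram expansion for a single token. It is an illustration only: the function name `ngrams` is made up, and the order in which Lucene emits grams may differ from this sketch.

```python
def ngrams(token, min_gram=1, max_gram=6):
    """Return every substring of token whose length is between min_gram and max_gram."""
    grams = []
    for start in range(len(token)):
        for size in range(min_gram, max_gram + 1):
            if start + size <= len(token):
                grams.append(token[start:start + size])
    return grams


# Smaller sizes than the slide's 1..6, to keep the output short.
print(ngrams("solr", 1, 2))  # ['s', 'so', 'o', 'ol', 'l', 'lr', 'r']
```

Because the n-gram filter appears only in the index analyzer, a query token can match any indexed fragment of a longer token, at the cost of a larger index.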
  9. Reindex using DataImport
     Jars copied to {solr-home-directory}/solr-webapp/webapp/WEB-INF/lib:
         {pkg-folder}/dist/solr-dataimporthandler-{version}.jar
         {pkg-folder}/dist/solr-dataimporthandler-extras-{version}.jar
     Admin UI: http://localhost:8983/solr/#/{core}/dataimport//dataimport
     {folder}/server/solr/{core}/conf/solrconfig.xml:
     <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
       <lst name="defaults">
         <str name="config">data-config.xml</str>
         <str name="clean">false</str>
       </lst>
     </requestHandler>
     {folder}/server/solr/{core}/conf/data-config.xml:
     <dataConfig>
       <document>
         <entity name="local" processor="SolrEntityProcessor" url="http://localhost:8983/solr/{core}" query="*:*" fl="*,old_version:_version_" rows="2000"/>
       </document>
     </dataConfig>
  10. SolrCloud: Cluster & Node
      [Diagram labels: Cluster, Node 1, Node 2, Solr Server]
  11. SolrCloud: Core & Collection
      [Diagram labels: Node1 "postchat", Node2 "userpost", Node3 "userchat"; core; Collection of Chat]
  12. SolrCloud: Shard & Replica
      [Diagram labels: Collection; Shard1 and Shard2, each with a Leader and a Replica]
  13. SolrCloud: ZooKeeper
      Ensemble of 2F+1 machines (tolerates F failed machines)
      conf/zoo.cfg
          tickTime=2000
          dataDir=/var/lib/zookeeper
          clientPort=2181
          server.1=zoo1:2888:3888
          server.2=zoo2:2888:3888
          server.3=zoo3:2888:3888
      /var/lib/zookeeper/myid
          1
      bin/ start
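The "2F+1 machines" rule follows from majority voting: the ensemble stays available only while more than half of its servers are up. A small Python sketch of the arithmetic (the function names are made up for the example):

```python
def ensemble_size(failures_to_tolerate):
    """Machines needed so the ensemble survives F failures: 2F + 1."""
    return 2 * failures_to_tolerate + 1


def quorum(ensemble):
    """Smallest strict majority of the ensemble."""
    return ensemble // 2 + 1


def tolerated_failures(ensemble):
    """How many servers can fail while a majority is still alive."""
    return (ensemble - 1) // 2


for n in (3, 4, 5):
    print(n, quorum(n), tolerated_failures(n))
# 3 2 1
# 4 3 1
# 5 3 2
```

Note that a fourth server tolerates no more failures than three do, which is why odd ensemble sizes such as the three servers in the zoo.cfg above are the usual choice.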
  14. SolrCloud
      Start in Cloud Mode
          ./bin/solr start -c -z [zkhost]
      Create a Collection
          ./bin/solr create [-c name] [-d confdir] [-n configName] [-shards #] [-replicationFactor #]
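The `-shards` and `-replicationFactor` options together determine how many cores the collection needs: shards times replication factor. The Python sketch below lists those cores and spreads them round-robin over the nodes. This is a simplified illustration, not Solr's actual placement logic; the function `plan_collection`, the collection name `chat`, and the node names are hypothetical.

```python
def plan_collection(name, num_shards, replication_factor, nodes):
    """List one (core, node) pair per replica, assigning nodes round-robin.

    Simplified illustration only; Solr decides real placement itself.
    """
    placement = []
    slot = 0
    for shard in range(1, num_shards + 1):
        for replica in range(1, replication_factor + 1):
            core = f"{name}_shard{shard}_replica{replica}"
            placement.append((core, nodes[slot % len(nodes)]))
            slot += 1
    return placement


# Hypothetical collection with 2 shards, 2 copies of each, on 3 nodes.
plan = plan_collection("chat", 2, 2, ["node1", "node2", "node3"])
for core, node in plan:
    print(core, "->", node)
print(len(plan))  # 4 cores in total
```

With 2 shards and a replication factor of 2, four cores are spread across the cluster, and each shard has two copies, one of which acts as leader.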
  15. Directory Structure
      Standalone
          <solr-home-directory>
          ├── solr.xml
          ├── core1
          │   ├──
          │   ├── conf
          │   │   ├── solrconfig.xml
          │   │   └── managed-schema
          │   └── data
          └── core2
              ├──
              ├── conf
              │   ├── solrconfig.xml
              │   └── managed-schema
              └── data
      Cloud
          <solr-home-directory>
          ├── solr.xml
          ├── core1
          │   ├──
          │   └── data
          └── core2
              ├──
              └── data
  16. Taking to Production
      Install lsof
          sudo yum install lsof
      Extract Install Script
          tar zxvf solr-5.5.2.tgz solr-5.5.2/bin/ --strip-components=2
      Install & Start
          sudo ./ solr-{version}.tgz
      Solr Parameters (the defaults are equivalent to):
          sudo ./ solr-{version}.tgz -i /opt -d /var/solr -u solr -s solr -p 8983
  17. Reference Solr Reference Guide jieba Java jieba for Solr5