Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Solr5

122 views

Published on

Introduction of Solr

Published in: Software
  • Be the first to comment

  • Be the first to like this

Solr5

  1. 1. Solr5 Phoebe Shih 2016.10.11
  2. 2. Introduction Open Source Enterprise Search Platform Solr4 -> Solr5 : from war to java server applcation Full-Text Search Real-Time indexing Dynamic clustering ... REST APIs
  3. 3. Full Text Search Document/Text Retrieval Inverted Index aid help fox Documents Terms
  4. 4. Installation Solr-5.5.2 Java Runtime Environment (JRE) version 1.7 or higher http://lucene.apache.org/solr/downloads.html ./bin/solr start http://localhost:8983/solr/
  5. 5. Quick Start Create a Core ./bin/solr create -c {core} Manage Schema {solr-home-directory}/server/solr/{core}/conf/managed-schema Add Documents ./bin/post -c {core} example/exampledocs/*.xml Query http://localhost:8983/solr/{core}/select?q=*:*
  6. 6. Configuration jetty {solr-home-directory}/etc/jetty.xml log {solr-home-directory}/resources/log4j.properties server {solr-home-directory}/solr/solr.xml solr {solr-home-directory}/solr/{core}/conf/solrconfig.xml schema {solr-home-directory}/solr/{core}/conf/managed-schema
  7. 7. Schema <schema name="example-data-driven-schema" version="1.6"> <field name="rowkey_id" type="string" indexed="true" stored="true" required="true" multiValued="false" /> <copyField source="sourceFieldName" dest="destinationFieldName"/> <dynamicField name="*_i" type="int" indexed="true" stored="true"/> <uniqueKey>rowkey_id</uniqueKey> <fieldType name="boolean" class="solr.BoolField" sortMissingLast="true"/> </schema>
  8. 8. jieba {solr-home-directory}/solr-webapp/webapp/WEB-INF/lib analyzer-solr-1.0.jar jieba-analysis-1.0.0.jar {solr-home-directory}/solr/{core}/conf/managed-schema http://localhost:8983/solr/#/{core}/analysis <fieldType name="text_jieba" class="solr.TextField" positionIncrementGap="100"> <analyzer type="index"> <tokenizer class="analyzer.solr5.jieba.JiebaTokenizerFactory" segMode="SEARCH" userDict="/etc/solr/dict_big.txt"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" /> <filter class="solr.NGramFilterFactory" minGramSize="1" maxGramSize="6"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.SnowballPorterFilterFactory" language="English"/> </analyzer> <analyzer type="query"> <tokenizer class="analyzer.solr5.jieba.JiebaTokenizerFactory" segMode="SEARCH" userDict="/etc/solr/dict_big.txt"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" /> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.SnowballPorterFilterFactory" language="English"/> </analyzer> </fieldType>
  9. 9. Reindex using DataImport {solr-home-directory}/solr-webapp/webapp/WEB-INF/lib {pkg-folder}/dist/solr-dataimporthandler-{version}.jar {pkg-folder}/dist/solr-dataimporthandler-extras-{version}.jar {folder}/server/solr/{core}/conf/solrconfig.xml {folder}/server/solr/{core}/conf/data-config.xml http://localhost:8983/solr/#/{core}/dataimport//dataimport <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler"> <lst name="defaults"> <str name="config">data-config.xml</str> <str name="clean">false</str> </lst> </requestHandler> <dataConfig> <document> <entity name="local" processor="SolrEntityProcessor" url="http://localhost:8983/solr/{core}" query="*:*" fl="*,old_version:_version_" rows="2000"/> </document> </dataConfig>
  10. 10. SolrCloud Cluster & Node Node 1 Node 2 Node 2 Cluster sSolr Server
  11. 11. SolrCloud Core & Collection Node3 userchat Node2 userpost Node1 postchat core Collection of Chat
  12. 12. SolrCloud Shard & Replica Collection Shard1 Leader Shard2 Replica Replica Leader
  13. 13. SolrCloud: Zookeeper 2F+1 machines conf/zoo.cfg tickTime=2000 dataDir=/var/lib/zookeeper clientPort=2181 server.1=zoo1:2888:3888 server.2=zoo2:2888:3888 server.3=zoo3:2888:3888 /var/lib/zookeeper/myid 1 bin/zkServer.sh start
  14. 14. SolrCloud Start in Cloud Mode ./bin/solr start -c -z [zkhost] Create a Collection ./bin/solr create [-c name] [-d confdir] [-n configName] [- shards #] [-replicationFactor #]
  15. 15. Directory Structure Standalone Cloud <solr-home-directory> ├── solr.xml ├── core1 │ ├── core.properties │ ├── conf │ │ ├── solrconfig.xml │ │ └── managed-schema │ └── data └── core2 ├── core.properties ├── conf │ ├── solrconfig.xml │ └── managed-schema └── data <solr-home-directory> ├── solr.xml ├── core1 │ ├── core.properties │ └── data └── core2 ├── core.properties └── data
  16. 16. Taking to Production Install lsof sudo yum install lsof Extract Install Script tar zxvf solr-5.5.2.tgz solr-5.5.2/bin/install_solr_service.sh -- strip-components=2 Solr Parameters solr.in.sh Install & Start sudo ./install_solr_service.sh solr-{version}.tgz Default: sudo ./install_solr_service.sh solr-{version}.tgz -i /opt -d /var/solr -u solr -s solr -p 8983
  17. 17. Reference Solr Reference Guide https://cwiki.apache.org/confluence/display/solr/Apache+S olr+Reference+Guide jieba Java https://github.com/huaban/jieba-analysis jieba for Solr5 https://github.com/sing1ee/analyzer-solr

×