APACHE SOLR
A LITTLE INTRODUCTION TO SEARCH ENGINES
TECHNOLOGIES
Mainly, we are going to talk about these "technologies":
● Apache SOLR
● Maven
● REST services, Java, Lucene, Google, and many more...
Lucene
● Library that provides high-performance text operations.
● Inverted index.
● Plain text.
● Different text operations, e.g. removing articles.
● Since Java 1.4.
● Indexes stored on disk.
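To make the "inverted index" bullet concrete, here is a minimal sketch in Python. It is not Lucene's implementation, only an illustration of the idea: each term maps to the documents that contain it. The function names and the sample documents are made up for this example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the sorted ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def search(index, query):
    """Return the ids of documents that contain every query term (AND semantics)."""
    terms = query.lower().split()
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


# Hypothetical sample documents, keyed by document id.
docs = {
    1: "solr is a search server",
    2: "lucene is a search library",
    3: "maven builds java projects",
}
index = build_inverted_index(docs)
matches = search(index, "search library")
```

A lookup only touches the entries for the query terms, not every document, which is why this structure suits text search.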
Lucene
● Less dependence on index size.
● Greater text search flexibility: synonyms, phonetic searches...
● Ranking.
Lucene handles all of these features better than databases do when our applications need text search.
SOLR
● HTTP access to Lucene
● Caches to improve performance
● Admin web interface
● XML configuration
● Faceted search
● Distributed service
● SolrJ client
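Faceted search may be the least familiar item in this list. The idea is to count how many matching documents fall under each value of a field, so the user can narrow the results. The sketch below shows that idea in plain Python; it does not call Solr, and the `facet_counts` helper and the product data are invented for illustration.

```python
from collections import Counter


def facet_counts(docs, field):
    """Count how many documents carry each value of the given field."""
    return Counter(doc[field] for doc in docs if field in doc)


# Hypothetical result set for some query.
products = [
    {"name": "Abbey Road", "category": "music"},
    {"name": "Kind of Blue", "category": "music"},
    {"name": "Dune", "category": "books"},
]
counts = facet_counts(products, "category")
```

A search page would typically render these counts as clickable filters, e.g. "music (2)" and "books (1)".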
SOLR
Basic search configuration

Parameter   Description
q           What is searched (e.g. name:Music)
rows        Maximum number of results shown
facet       Enables faceted search
sort        Sorts the results, e.g. name asc
fl          Fields returned by the server
fq          Filter query; its results are cached
...         Many more parameters
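The parameters above travel as ordinary HTTP query-string parameters. As a rough sketch, the Python below assembles such a URL without sending it. The base URL `http://localhost:8983/solr`, the `/select` path and the `build_select_url` helper are assumptions for this example; adjust them to your own installation.

```python
from urllib.parse import urlencode


def build_select_url(base_url, q, rows=10, sort=None, fl=None, fq=None, facet_field=None):
    """Assemble a search URL from the basic parameters; nothing is sent over the network."""
    params = [("q", q), ("rows", rows)]
    if sort:
        params.append(("sort", sort))
    if fl:
        # fl takes a comma-separated list of field names.
        params.append(("fl", ",".join(fl)))
    for filter_query in fq or []:
        # fq may be repeated, one parameter per filter.
        params.append(("fq", filter_query))
    if facet_field:
        params.append(("facet", "true"))
        params.append(("facet.field", facet_field))
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Assumed base URL of a local Solr instance.
url = build_select_url(
    "http://localhost:8983/solr",
    q="name:Music",
    rows=5,
    sort="name asc",
    fl=["id", "name"],
    fq=["category:music"],
    facet_field="category",
)
```

`urlencode` takes care of escaping, so the colon in `name:Music` and the space in `name asc` are safe to pass as written.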
SOLR
Advanced search:
● It is configured through the XML files.
 – Tokens
 – Stemming
 – Synonyms
 – Stop words
 – N-grams
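In Solr these steps are declared in the XML files, not written by hand. Still, a toy version in Python may help show what each step does to the text. Everything below is an assumption made for illustration: the stop-word list, the synonym map and, above all, the "stemmer", which only strips a plural "s" and stands in for a real stemming algorithm.

```python
import re

# Tiny illustrative word lists, not Solr's defaults.
STOP_WORDS = {"a", "an", "the", "of", "and"}
SYNONYMS = {"tv": "television"}


def tokenize(text):
    """Split text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def remove_stop_words(tokens):
    """Drop very common words that carry little meaning."""
    return [t for t in tokens if t not in STOP_WORDS]


def apply_synonyms(tokens):
    """Replace each token with its canonical synonym, if one is defined."""
    return [SYNONYMS.get(t, t) for t in tokens]


def naive_stem(token):
    """Strip a trailing plural 's' only; a toy stand-in for a real stemmer."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def ngrams(token, n):
    """Return the character n-grams of a token, in order."""
    return [token[i:i + n] for i in range(len(token) - n + 1)]


def analyze(text):
    """Run the whole chain: tokens, stop words, synonyms, stemming."""
    tokens = apply_synonyms(remove_stop_words(tokenize(text)))
    return [naive_stem(t) for t in tokens]


terms = analyze("The sounds of a TV")  # ['sound', 'television']
grams = ngrams("solr", 2)              # ['so', 'ol', 'lr']
```

The same chain is normally applied both when indexing and when querying, so that "TV sounds" and "television sound" end up as the same terms.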
SOLR
The Solr Home directory typically contains the following subdirectories...

conf/
This directory is mandatory and must contain your solrconfig.xml and schema.xml. Any other optional configuration files would also be kept here.

data/
This directory is the default location where Solr will keep your index, and is used by the replication scripts for dealing with snapshots. You can override this location in the solrconfig.xml and scripts.conf files. Solr will create this directory if it does not already exist.

lib/
This directory is optional. If it exists, Solr will load any Jars found in this directory and use them to resolve any "plugins" specified in your solrconfig.xml or schema.xml (i.e. Analyzers, Request Handlers, etc...). Alternatively you can use the <lib> syntax in solrconfig.xml to direct Solr to your plugins. See the example solrconfig.xml file for details.

bin/
This directory is optional. It is the default location used for keeping the replication scripts.
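The layout described above can be checked with a few lines of Python before starting the server. This is only a sketch: `check_solr_home` is a hypothetical helper, not part of Solr, and the temporary directory stands in for a real Solr Home.

```python
import tempfile
from pathlib import Path

# conf/ must hold these two files; the other directories are optional or created by Solr.
REQUIRED_CONF_FILES = ("solrconfig.xml", "schema.xml")
OPTIONAL_DIRS = ("data", "lib", "bin")


def check_solr_home(home):
    """Return (missing mandatory files, optional directories that are present)."""
    home = Path(home)
    missing = [
        f"conf/{name}"
        for name in REQUIRED_CONF_FILES
        if not (home / "conf" / name).is_file()
    ]
    present_optional = [d for d in OPTIONAL_DIRS if (home / d).is_dir()]
    return missing, present_optional


# Build a throwaway, deliberately incomplete Solr Home to exercise the check.
with tempfile.TemporaryDirectory() as tmp:
    conf = Path(tmp) / "conf"
    conf.mkdir()
    (conf / "solrconfig.xml").write_text("<config/>")
    (Path(tmp) / "lib").mkdir()
    missing, optional = check_solr_home(tmp)
```

Here `missing` reports `conf/schema.xml`, because the example home was created without it, and `optional` lists only `lib`.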