By Tapan Avasthi
Getting Started with APACHE SOLR 1
Apache SOLR Introduction and
In layman's terms, a search engine is a program that retrieves a list of
documents matching the keywords we enter. The purpose of a search engine is
to promote your expertise and your business to the rest of the globe and to
gain revenue from that audience.
Apache SOLR 4.0 is designed for accurate, enhanced, and rapid search. Here,
we discuss a few key features that were recently introduced in SOLR 4.0. It
is built upon a Java library called Apache Lucene. Apache SOLR is a popular
search engine for the web platform because it can act as an indexing server,
handle multiple search tasks, and recommend content based on the search
queries of end users. It basically works with two elements: HTTP and XML.
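
To make the HTTP side concrete, here is a minimal Python sketch that builds the URL of a search request. It assumes a Solr server at the default local address http://localhost:8983/solr and the standard /select handler; the function name build_select_url and the sample query are illustrative only.

```python
from urllib.parse import urlencode

# Assumed default address of a local Solr server; adjust for your setup.
SOLR_BASE = "http://localhost:8983/solr"


def build_select_url(query, rows=10, fmt="xml"):
    """Return the URL for a search request against Solr's /select handler."""
    # q = the query string, rows = page size, wt = response writer (xml, json, ...)
    params = {"q": query, "rows": rows, "wt": fmt}
    return f"{SOLR_BASE}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical query: documents whose "name" field contains "ipod".
    print(build_select_url("name:ipod"))
```

Opening the printed URL in a browser, or fetching it with any HTTP client, should return the matching documents as XML when the server is running.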
Apache SOLR supports a better user experience through pagination, data
indexing, sorting, highlighting, auto-suggestion, data modeling, etc.
Apache SOLR 4.0 comes with some new features which help to perform more
accurate searches and enhance search criteria for the real business market.
o Partial Document Update
o Easy Replication with Apache ZooKeeper
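
As a rough illustration of a partial document update, the Python sketch below builds a JSON body that changes a single field of an existing document using the "set" modifier. It is a sketch under assumptions, not a complete recipe: the document id, field name, and value are hypothetical, and partial updates only work when the Solr schema and configuration allow them (for example, the fields involved must be stored).

```python
import json


def build_partial_update(doc_id, field, new_value):
    """Build a JSON body that replaces one field on an existing document."""
    # The {"set": value} modifier asks Solr to replace just this field
    # instead of re-sending the whole document.
    return json.dumps([{"id": doc_id, field: {"set": new_value}}])


if __name__ == "__main__":
    # Hypothetical document "book1" whose "price" field is updated.
    print(build_partial_update("book1", "price", 9.99))
```

The resulting body would be posted to the update handler with a JSON content type.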
Apache SOLR Configuration
The prerequisite for getting Apache SOLR up and running quickly on your
system is described below.
Java is a programming language and free computing platform, expressly
designed for use in the distributed environment of the Internet.
If you are using a Linux flavor or Mac OS, Java may already be provided by
the vendor in the installation package. If you are using the Windows operating
system, you may need to download and install Java in your environment manually.
Download the latest version of Java from
downloads-1880260.html and choose the package that matches your environment
(32-bit or 64-bit operating system). Afterwards, set JAVA_HOME in your system
environment variables as below:
Go to Control Panel\All Control Panel Items\System and click on
Advanced system settings.
Then click on the Environment Variables button on the Advanced tab.
Then also add JAVA_HOME to the Path variable as below.
Once your Java path is set, you are ready to install Apache SOLR on
your operating system.
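
Before moving on, it can help to confirm that JAVA_HOME is really visible to programs. The following Python sketch is one way to check; the helper name java_home_status is hypothetical, and the check only looks for a bin directory under JAVA_HOME, which is an assumption about a typical Java installation layout.

```python
import os


def java_home_status(env=None):
    """Describe whether JAVA_HOME is set and points at a plausible Java install."""
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return "JAVA_HOME is not set"
    # A typical Java installation keeps its executables under <JAVA_HOME>/bin.
    if not os.path.isdir(os.path.join(java_home, "bin")):
        return f"JAVA_HOME points to {java_home}, but no bin directory was found there"
    return f"JAVA_HOME looks valid: {java_home}"


if __name__ == "__main__":
    print(java_home_status())
```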
o Download the latest version of SOLR from
o Extract and install SOLR on the system. (By default it comes with Jetty
as the application server.)
o Launch SOLR with the below command from Windows, run from the example
directory:
o java -jar start.jar
o Run the below commands from a Linux terminal.
o Extract with the below command, then change into the extracted directory:
tar -zxvf apache-solr-3.4.0.tgz
cd apache-solr-3.4.0
o Start Apache SOLR as below:
o cd example
o java -jar start.jar
Congratulations! You have installed Apache SOLR and it is now running.
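
One quick way to confirm that the server started is to request its base URL. The Python sketch below assumes the default Jetty port 8983 and the /solr/ context path; the function name is_solr_up is illustrative, and a True result only means that something answered with HTTP 200 at that address.

```python
import urllib.request


def is_solr_up(base_url="http://localhost:8983/solr/", timeout=3):
    """Return True if an HTTP 200 response comes back from base_url."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers refused connections, timeouts, and HTTP error responses.
        return False


if __name__ == "__main__":
    print("Solr reachable:", is_solr_up())
```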
Indexing Your Data Using SOLR
Apache SOLR offers real-time indexing, automated index replication,
logging, automated failover and recovery, multiple search indexes, server
statistics logging, full-text searching, load-balanced querying, scalability,
flexibility and extensibility, and faceted search. Apache SOLR is
considered the server-ization of Lucene.
Before proceeding further towards indexing data using Apache SOLR, let's take
a look at the layout of SOLR.
Layout of Apache SOLR
The example home directory of SOLR defaults to example/solr. It
contains the following:
o bin - Files for more advanced setups are placed here.
o conf - Contains the files which set the Solr configuration.
o conf/schema.xml - This is the schema for the index, including field
type definitions for the given dataset.
o conf/solrconfig.xml - This is the primary Solr configuration file.
o data - Contains the actual Lucene index data in binary format.
o lib - Additional Solr plug-in jar files can be placed here.
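
The layout above can be checked with a short script. This Python sketch reports which of the listed entries are missing from a given Solr home directory. It assumes the entries use lowercase names on disk, and the helper name missing_entries is hypothetical. Some entries, such as the index data directory, may only appear after Solr has been started once.

```python
from pathlib import Path

# Entries taken from the layout list above (lowercase names are an assumption).
EXPECTED = ["bin", "conf", "conf/schema.xml", "conf/solrconfig.xml", "data", "lib"]


def missing_entries(solr_home):
    """Return the expected entries that do not exist under solr_home."""
    home = Path(solr_home)
    return [name for name in EXPECTED if not (home / name).exists()]


if __name__ == "__main__":
    # "example/solr" is the default home directory mentioned above.
    print(missing_entries("example/solr"))
```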
Apache Solr can index data using four mechanisms, namely:
o Solr’s native XML
o CSV (Character Separated Values), a format in which values are separated
by a character (often a comma)
o Rich documents like PDF, XLS, DOC and PPT
o Solr-Binary is analogous to Solr-XML – it contains the same data in
Your Apache SOLR server is up and running but won't contain any data yet.
We can work with the SOLR index by posting commands to SOLR to add, update,
select, and delete documents. By default, the SOLR package comes with sample
data, located in the exampledocs directory.
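
To show what such commands look like, the Python sketch below builds two of Solr's XML update messages: an add message containing one document and a delete message for one id. The field names and values are hypothetical, and the helper names build_add and build_delete are illustrative only.

```python
import xml.etree.ElementTree as ET


def build_add(fields):
    """Build an <add> message containing one document with the given fields."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def build_delete(doc_id):
    """Build a <delete> message that removes the document with this id."""
    delete = ET.Element("delete")
    ET.SubElement(delete, "id").text = str(doc_id)
    return ET.tostring(delete, encoding="unicode")


if __name__ == "__main__":
    print(build_add({"id": "doc1", "title": "Hello"}))
    print(build_delete("doc1"))
```

Posting an add message for an id that already exists replaces the stored document, which is how a full update is expressed in this format.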
The data to be searched is stored in the example/exampledocs folder.
First of all, ensure that the SOLR server is still running from the previous
step, then type the following:
o exampledocs$ java -jar post.jar *.xml
o The response should be something like:
o SimplePostTool: version 1.2
o SimplePostTool: WARNING: Make sure your XML documents are
encoded in UTF-8, other encodings are not currently supported
o SimplePostTool: POSTing files to http://localhost:8983/solr/update
o SimplePostTool: POSTing file xmllisa1.xml
o SimplePostTool: COMMITting Solr index changes.
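
The same kind of request can be prepared by hand. This Python sketch constructs, without sending, an HTTP POST to the update URL shown in the output above. The commit=true parameter and the text/xml content type are assumptions about a typical setup, and build_update_request is a hypothetical helper name.

```python
import urllib.request

# Update URL as printed by SimplePostTool above.
UPDATE_URL = "http://localhost:8983/solr/update"


def build_update_request(xml_body, commit=True):
    """Prepare (but do not send) a POST carrying an XML update message."""
    url = UPDATE_URL + ("?commit=true" if commit else "")
    return urllib.request.Request(
        url,
        data=xml_body.encode("utf-8"),  # UTF-8, as the tool's warning requires
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_update_request("<add><doc><field name='id'>doc1</field></doc></add>")
    print(req.get_method(), req.full_url)
    # To actually send it to a running server:
    # urllib.request.urlopen(req)
```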
Now you can search the indexed data through the query tab in the
Apache SOLR admin panel.
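
Search results can also be read programmatically. The Python sketch below parses a response in Solr's JSON format and extracts the hit count and document ids. The SAMPLE text is a made-up response used only for illustration, and it assumes each document carries an "id" field.

```python
import json

# Hypothetical response body, shaped like Solr's JSON output (wt=json).
SAMPLE = (
    '{"responseHeader": {"status": 0, "QTime": 1},'
    ' "response": {"numFound": 2, "start": 0, "docs": ['
    '{"id": "doc1", "name": "First"}, {"id": "doc2", "name": "Second"}]}}'
)


def summarize(raw_json):
    """Return (total hits, list of document ids) from a JSON search response."""
    body = json.loads(raw_json)["response"]
    return body["numFound"], [doc["id"] for doc in body["docs"]]


if __name__ == "__main__":
    print(summarize(SAMPLE))
```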
Experimenting with Text Analysis
Let's get comfortable with Solr's analysis page, an experimentation and
troubleshooting tool that is absolutely indispensable. With this facility,
you can try different combinations of configuration to verify whether you
get the desired result. It is very helpful when you are trying to find out
why certain queries are not matching text that you feel they should. Have a
look at Solr's admin page. You'll see a link named [ANALYSIS]. Enter the
text shown in the below snapshot.
Now, click on the Analyze button. You'll see the below output in the Index
Analyzer.
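
To build intuition for what an analysis chain does to text, here is a toy Python sketch. It is not Solr's implementation: it only imitates a whitespace tokenizer followed by a lowercase filter and a stop-word filter, and the stop-word list is a made-up example.

```python
def analyze(text, stopwords=frozenset({"a", "an", "the"})):
    """Toy analysis chain: split on whitespace, lowercase, drop stop words."""
    tokens = text.split()                   # tokenizer step
    tokens = [t.lower() for t in tokens]    # lowercase filter step
    return [t for t in tokens if t not in stopwords]  # stop-word filter step


if __name__ == "__main__":
    print(analyze("The Quick Brown Fox"))
```

A query only matches indexed text when both pass through compatible chains and produce the same tokens, which is what the analysis page lets you inspect step by step.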