Big data - Solr Integration

BIG DATA
Kulandaivel Ramalingam
ESB Architect

Solr
Create a directory, e.g. velsearch.
Create a directory solr_vel under velsearch.
Copy solrconfig.xml and schema.xml to solr_vel/conf (solrctl uploads the configuration from the conf subdirectory).
Create an instance directory, e.g. vel-search:
solrctl instancedir --create vel-search solr_vel/
Output similar to the following appears:
Uploading configs from solr_vel//conf to <hostname>:<port>/solr. This may take up to a minute.
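The directory preparation for this step can be sketched in Python. This is only an illustration: the helper name prepare_instance_dir is hypothetical, and the conf subdirectory is an assumption taken from the "solr_vel//conf" path in the upload message.

```python
import shutil
from pathlib import Path
from typing import Iterable


def prepare_instance_dir(base: Path, config_files: Iterable[Path]) -> Path:
    """Create <base>/solr_vel/conf and copy the given config files into it.

    The conf subdirectory is an assumption based on the upload message;
    this helper is illustrative and not part of solrctl.
    """
    conf_dir = base / "solr_vel" / "conf"
    conf_dir.mkdir(parents=True, exist_ok=True)
    for src in config_files:
        # Keep the original file names (solrconfig.xml, schema.xml).
        shutil.copy(src, conf_dir / src.name)
    return conf_dir
```

Calling prepare_instance_dir(Path("velsearch"), [Path("solrconfig.xml"), Path("schema.xml")]) would leave a layout that the solrctl instancedir command can then be pointed at.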

Solr
Create collection:
solrctl collection --create vel-search -s 1
The following instance, data, and index directories are now created (data and index in HDFS):
Instance:
/var/lib/solr/vel-search_shard1_replica1
Data:
hdfs://<hostname>:<port>/solr/vel-search/core_node1/data
Index:
hdfs://<hostname>:<port>/solr/vel-search/core_node1/data/index

Solr
Generate the data in CSV format from the Hive database.
Make sure every column value is enclosed in double quotes, as in the example below:
id,name,location
"1","Vel","India"
"2","Ram","US"
"3","Kul","UK"
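If the Hive export does not already quote every value, a small post-processing step can produce this layout. The sketch below is one possible way to do it with Python's standard csv module; the function name rows_to_quoted_csv is hypothetical, and the rows are assumed to have been fetched from Hive by some other means.

```python
import csv
import io


def rows_to_quoted_csv(header, rows):
    """Return CSV text with an unquoted header line and fully quoted values."""
    buf = io.StringIO()
    # The header is written as-is, matching the example (id,name,location).
    buf.write(",".join(header) + "\n")
    # QUOTE_ALL wraps every field, including numeric ones, in double quotes.
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    sample = [(1, "Vel", "India"), (2, "Ram", "US"), (3, "Kul", "UK")]
    print(rows_to_quoted_csv(["id", "name", "location"], sample), end="")
```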

Solr
Upload the CSV file to HDFS.
Ensure that reviews.conf contains the following readCSV settings:
commands : [
  {
    readCSV {
      separator : ","
      columns : [id,name,location]
      ignoreFirstLine : true
      quoteChar : "\""
      trim : true
      charset : UTF-8
    }
  }
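To make the effect of these settings concrete, the following Python sketch mimics them on the sample file: comma separator, double-quote quote character, skipping the first line, trimming values, and naming the columns. It is only an analogy written with the standard csv module, not the morphline implementation, and read_records is a hypothetical name.

```python
import csv
import io

COLUMNS = ["id", "name", "location"]  # mirrors "columns" in reviews.conf


def read_records(text, columns=COLUMNS):
    """Parse CSV text roughly the way the readCSV settings describe."""
    reader = csv.reader(io.StringIO(text), delimiter=",", quotechar='"')
    next(reader, None)  # ignoreFirstLine : true (skip the header row)
    records = []
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        # trim : true (strip surrounding whitespace from each value)
        records.append({c: v.strip() for c, v in zip(columns, fields)})
    return records
```

Each resulting dictionary corresponds to one record whose fields are named id, name, and location.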

Solr
Run the below indexing command:
hadoop jar /usr/lib/solr/contrib/mr/search-mr-*-job.jar \
  org.apache.solr.hadoop.MapReduceIndexerTool \
  -D 'mapred.child.java.opts=-Xmx500m' \
  --log4j /usr/share/doc/search*/examples/solr-nrt/log4j.properties \
  --morphline-file reviews.conf \
  --output-dir hdfs://<hostname>:<port>/tmp/load \
  --verbose --go-live \
  --zk-host <hostname>:<port>/solr \
  --collection vel-search \
  hdfs://<hostname>:<port>/user/cloudera/query_result_copy.csv
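When the indexing job is launched from a script, the argument list can be assembled programmatically. The sketch below only builds the argument vector, using the flags shown in the command above; the function name build_indexer_argv and its parameters are hypothetical, and the jar path is assumed to be already resolved, because the * wildcard is expanded by the shell and not by Python.

```python
def build_indexer_argv(jar, morphline_file, output_dir, zk_host,
                       collection, input_path, log4j=None):
    """Build the argv list for the MapReduceIndexerTool invocation."""
    argv = [
        "hadoop", "jar", jar,
        "org.apache.solr.hadoop.MapReduceIndexerTool",
        "-D", "mapred.child.java.opts=-Xmx500m",
    ]
    if log4j:
        argv += ["--log4j", log4j]
    argv += [
        "--morphline-file", morphline_file,
        "--output-dir", output_dir,
        "--verbose",
        "--go-live",
        "--zk-host", zk_host,
        "--collection", collection,
        input_path,  # the CSV file previously uploaded to HDFS
    ]
    return argv
```

The resulting list could be passed to subprocess.run(argv, check=True) on a host where the hadoop command is available.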

Solr
Messages like the following appear at the end of the execution of the above command:
37097 [main] INFO org.apache.solr.client.solrj.impl.HttpClientUtil - Creating new http client, config:
37119 [main] INFO org.apache.solr.common.cloud.ConnectionManager - Waiting for client to connect to ZooKeeper
37121 [main-EventThread] INFO org.apache.solr.common.cloud.ConnectionManager - Watcher org.apache.solr.common.cloud.ConnectionManager@dd606a name:ZooKeeperConnection Watcher:localhost:2181/solr got event WatchedEvent state:SyncConnected type:None path:null path:null type:None
37121 [main] INFO org.apache.solr.common.cloud.ConnectionManager - Client is connected to ZooKeeper
37122 [main] INFO org.apache.solr.common.cloud.ZkStateReader - Updating cluster state from ZooKeeper...
37957 [main] INFO org.apache.solr.hadoop.GoLive - Done committing live merge
37958 [main] INFO org.apache.solr.hadoop.GoLive - Live merging of index shards into Solr cluster took 2.032 secs
37958 [main] INFO org.apache.solr.hadoop.GoLive - Live merging completed successfully
37958 [main] INFO org.apache.solr.hadoop.MapReduceIndexerTool - Succeeded with job: jobName: org.apache.solr.hadoop.MapReduceIndexerTool/MorphlineMapper, jobId: job_1409686197310_0004
37958 [main] INFO org.apache.solr.hadoop.MapReduceIndexerTool - Success. Done. Program took 38.025 secs.
Goodbye.
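In an automated run, the captured log can be checked for the success message instead of being read by eye. The following is a minimal sketch that assumes the log text has the same wording as the sample above; summarize_indexer_log is a hypothetical helper, not part of Solr.

```python
import re

# Both strings are taken from the sample log output above.
SUCCESS_MARKER = "Live merging completed successfully"
JOB_ID_PATTERN = re.compile(r"jobId:\s*(job_\d+_\d+)")


def summarize_indexer_log(log_text):
    """Report whether go-live succeeded and which job id was logged."""
    match = JOB_ID_PATTERN.search(log_text)
    return {
        "go_live_ok": SUCCESS_MARKER in log_text,
        "job_id": match.group(1) if match else None,
    }
```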

Solr
Go to Solr admin for testing:
http://192.168.137.134:8983/solr/#/vel-search_shard1_replica1/query
Search using the primary key as the query (for example, id:1 if id is the unique key).
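The same lookup can also be issued over HTTP rather than through the admin page. The sketch below only builds the request URL; it assumes the standard /select request handler is enabled in solrconfig.xml and that id is the unique key, neither of which is confirmed by the slides, and build_select_url is a hypothetical name.

```python
from urllib.parse import urlencode


def build_select_url(base_url, collection, key_field, key_value):
    """Build a Solr select URL that queries a single key value as JSON."""
    # urlencode percent-encodes the colon in "field:value".
    params = urlencode({"q": f"{key_field}:{key_value}", "wt": "json"})
    return f"{base_url.rstrip('/')}/{collection}/select?{params}"


if __name__ == "__main__":
    print(build_select_url("http://192.168.137.134:8983/solr", "vel-search", "id", "1"))
```

The printed URL can be opened in a browser or fetched with any HTTP client on a machine that can reach the Solr host.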
