BIG DATA 
Kulandaivel Ramalingam 
ESB Architect
Solr 
Create a directory, e.g. velsearch 
Create a directory solr_vel under velsearch 
Copy solrconfig.xml and schema.xml to solr_vel 
Create the instance directory, e.g. vel-search: 
solrctl instancedir --create vel-search solr_vel/ 
Output similar to the following appears: 
Uploading configs from solr_vel//conf to 
<hostname>:<port>/solr. This may take up to a minute.
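The "solr_vel//conf" path in the output suggests that solrctl reads the configuration from a conf subdirectory of the path it is given. A minimal sketch of preparing that layout, assuming the two files belong in solr_vel/conf. The helper name prepare_instance_dir and the source directory are hypothetical and not part of solrctl:

```shell
# Sketch only: copy the Solr config files into <target>/conf.
# prepare_instance_dir is an illustrative helper, not a solrctl command.
prepare_instance_dir() {
  src_dir=$1      # directory holding solrconfig.xml and schema.xml
  target_dir=$2   # e.g. velsearch/solr_vel
  mkdir -p "$target_dir/conf" || return 1
  for f in solrconfig.xml schema.xml; do
    if [ ! -f "$src_dir/$f" ]; then
      echo "missing $src_dir/$f" >&2
      return 1
    fi
    cp "$src_dir/$f" "$target_dir/conf/" || return 1
  done
}
```

Usage: prepare_instance_dir /path/to/config/files velsearch/solr_vel, then run the solrctl instancedir command from inside velsearch.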
Solr 
Create collection: 
solrctl collection --create vel-search -s 1 
The following directories are now created (the data and index directories are in HDFS): 
Instance: 
/var/lib/solr/vel-search_shard1_replica1 
Data: 
hdfs://<hostname>:<port>/solr/vel-search/core_node1/data 
Index: 
hdfs://<hostname>:<port>/solr/vel-search/core_node1/data/index
Solr 
Generate data in CSV format from the Hive database. 
Make sure every column value is enclosed in double quotes (""), 
as in the example below: 
id,name,location 
"1","Vel","India" 
"2","Ram","US" 
"3","Kul","UK"
Solr 
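One way to produce the double-quoted CSV rows before uploading, sketched under the assumption that the Hive query result was first exported as tab-separated text and that no value contains a tab or a double quote. quote_tsv_as_csv is a hypothetical helper name:

```shell
# Sketch only: wrap every tab-separated field in double quotes and join with commas.
# Assumes fields contain no tabs and no embedded double quotes.
quote_tsv_as_csv() {
  awk -F'\t' 'BEGIN { OFS = "," }
  {
    for (i = 1; i <= NF; i++) $i = "\"" $i "\""
    print
  }'
}
```

Usage: { echo 'id,name,location'; quote_tsv_as_csv < result.tsv; } > query_result_copy.csv, where result.tsv is an assumed name for the exported query result.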
Upload it to HDFS 
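A minimal sketch of the upload step using the standard hdfs dfs -mkdir and hdfs dfs -put commands. The helper name upload_csv and the DRY_RUN switch are hypothetical; the target directory /user/cloudera is taken from the input path used by the indexing command:

```shell
# Sketch only: upload a local CSV file to an HDFS directory.
# Set DRY_RUN=1 to print the commands instead of running them.
upload_csv() {
  local_file=$1            # e.g. query_result_copy.csv
  hdfs_dir=$2              # e.g. /user/cloudera
  run=${DRY_RUN:+echo}     # expands to "echo" only when DRY_RUN is set
  $run hdfs dfs -mkdir -p "$hdfs_dir" &&
  $run hdfs dfs -put -f "$local_file" "$hdfs_dir/"
}
```

Usage: upload_csv query_result_copy.csv /user/cloudera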
Ensure that reviews.conf contains the following values: 
commands : [ 
  { 
    readCSV { 
      separator : "," 
      columns : [id,name,location] 
      ignoreFirstLine : true    # skip the header row 
      quoteChar : "\""          # escaped double quote 
      trim : true 
      charset : UTF-8 
    } 
  } 
] 
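The readCSV settings above are only one command of a morphline file. As a rough sketch, a complete reviews.conf might look like the file written below. This follows the general shape of Kite morphline configurations; the SOLR_LOCATOR block, the morphline id, the importCommands packages and the two trailing Solr commands are assumptions not shown in the slides and may differ by release. write_morphline is a hypothetical helper name:

```shell
# Sketch only: write a complete morphline file around the readCSV command.
# Everything outside the readCSV block is an assumption; adjust to your release.
write_morphline() {
  out_file=${1:-reviews.conf}
  cat > "$out_file" <<'EOF'
SOLR_LOCATOR : {
  collection : vel-search
  zkHost : "<hostname>:<port>/solr"   # replace with the real ZooKeeper address
}

morphlines : [
  {
    id : morphline1
    importCommands : ["org.kitesdk.**", "org.apache.solr.**"]
    commands : [
      {
        readCSV {
          separator : ","
          columns : [id,name,location]
          ignoreFirstLine : true
          quoteChar : "\""
          trim : true
          charset : UTF-8
        }
      }
      { sanitizeUnknownSolrFields { solrLocator : ${SOLR_LOCATOR} } }
      { loadSolr { solrLocator : ${SOLR_LOCATOR} } }
    ]
  }
]
EOF
}
```

Usage: write_morphline reviews.conf, then replace the placeholder ZooKeeper address before running the indexer.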
Solr 
Run the below indexing command: 
hadoop jar /usr/lib/solr/contrib/mr/search-mr-*-job.jar \
  org.apache.solr.hadoop.MapReduceIndexerTool \
  -D 'mapred.child.java.opts=-Xmx500m' \
  --log4j /usr/share/doc/search*/examples/solr-nrt/log4j.properties \
  --morphline-file reviews.conf \
  --output-dir hdfs://<hostname>:<port>/tmp/load \
  --verbose --go-live \
  --zk-host <hostname>:<port>/solr \
  --collection vel-search \
  hdfs://<hostname>:<port>/user/cloudera/query_result_copy.csv
Solr 
The following messages appear at the end of the execution of the above command: 
37097 [main] INFO org.apache.solr.client.solrj.impl.HttpClientUtil - Creating new http client, config: 
37119 [main] INFO org.apache.solr.common.cloud.ConnectionManager - Waiting for client to connect to ZooKeeper 
37121 [main-EventThread] INFO org.apache.solr.common.cloud.ConnectionManager - Watcher 
org.apache.solr.common.cloud.ConnectionManager@dd606a name:ZooKeeperConnection 
Watcher:localhost:2181/solr got event WatchedEvent state:SyncConnected type:None path:null path:null type:None 
37121 [main] INFO org.apache.solr.common.cloud.ConnectionManager - Client is connected to ZooKeeper 
37122 [main] INFO org.apache.solr.common.cloud.ZkStateReader - Updating cluster state from ZooKeeper... 
37957 [main] INFO org.apache.solr.hadoop.GoLive - Done committing live merge 
37958 [main] INFO org.apache.solr.hadoop.GoLive - Live merging of index shards into Solr cluster took 2.032 secs 
37958 [main] INFO org.apache.solr.hadoop.GoLive - Live merging completed successfully 
37958 [main] INFO org.apache.solr.hadoop.MapReduceIndexerTool - Succeeded with job: jobName: 
org.apache.solr.hadoop.MapReduceIndexerTool/MorphlineMapper, jobId: job_1409686197310_0004 
37958 [main] INFO org.apache.solr.hadoop.MapReduceIndexerTool - Success. Done. Program took 38.025 secs. 
Goodbye.
Solr 
Go to the Solr admin UI for testing: 
http://192.168.137.134:8983/solr/#/vel-search_shard1_replica1/query 
Enter a primary key value as the search query. 
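The same check can be sketched from the command line with curl, assuming the collection exposes Solr's default /select request handler and that id is the primary key field. solr_select_url and solr_query are hypothetical helper names:

```shell
# Sketch only: query a Solr collection over HTTP.
solr_select_url() {
  host_port=$1     # e.g. 192.168.137.134:8983
  collection=$2    # e.g. vel-search
  printf 'http://%s/solr/%s/select\n' "$host_port" "$collection"
}

solr_query() {
  # -G sends the data as URL query parameters; --data-urlencode escapes the query text.
  curl -sG "$(solr_select_url "$1" "$2")" \
    --data-urlencode "q=$3" \
    --data-urlencode "wt=json"
}
```

Usage: solr_query 192.168.137.134:8983 vel-search 'id:1'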
QUESTIONS & THANKS

Big data - Solr Integration

  • 1. BIG DATA Kulandaivel Ramalingam ESB Architect
  • 2. Solr
       Create a directory, e.g. velsearch.
       Create a directory solr_vel under velsearch.
       Copy solrconfig.xml and schema.xml into solr_vel (the upload message below shows them being read from its conf subdirectory).
       Create the instance directory, e.g. vel-search:
           solrctl instancedir --create vel-search solr_vel/
       The following output appears:
           Uploading configs from solr_vel//conf to <hostname>:<port>/solr. This may take up to a minute.
  • 3. Solr
       Create the collection:
           solrctl collection --create vel-search -s 1
       The data directory is now created in HDFS:
       Instance:
           /var/lib/solr/vel-search_shard1_replica1
       Data:
           hdfs://<hostname>:<port>/solr/vel-search/core_node1/data
       Index:
           hdfs://<hostname>:<port>/solr/vel-search/core_node1/data/index
  • 4. Solr
       Generate the data in CSV format from the Hive database.
       Make sure every column value is enclosed in double quotes, as in the example below:
           id,name,location
           "1","Vel","India"
           "2","Ram","US"
           "3","Kul","UK"
  • 5. Solr
       Upload the CSV file to HDFS.
       Ensure that reviews.conf has the values below:
           commands : [
             {
               readCSV {
                 separator : ","
                 columns : [id,name,location]
                 ignoreFirstLine : true
                 quoteChar : "\""
                 trim : true
                 charset : UTF-8
               }
             }
  • 6. Solr
       Run the indexing command below:
           hadoop jar /usr/lib/solr/contrib/mr/search-mr-*-job.jar \
             org.apache.solr.hadoop.MapReduceIndexerTool \
             -D 'mapred.child.java.opts=-Xmx500m' \
             --log4j /usr/share/doc/search*/examples/solr-nrt/log4j.properties \
             --morphline-file reviews.conf \
             --output-dir hdfs://<hostname>:<port>/tmp/load \
             --verbose --go-live \
             --zk-host <hostname>:<port>/solr \
             --collection vel-search \
             hdfs://<hostname>:<port>/user/cloudera/query_result_copy.csv
  • 7. Solr
       The messages below appear at the end of the execution of the above command:
           37097 [main] INFO org.apache.solr.client.solrj.impl.HttpClientUtil - Creating new http client, config:
           37119 [main] INFO org.apache.solr.common.cloud.ConnectionManager - Waiting for client to connect to ZooKeeper
           37121 [main-EventThread] INFO org.apache.solr.common.cloud.ConnectionManager - Watcher org.apache.solr.common.cloud.ConnectionManager@dd606a name:ZooKeeperConnection Watcher:localhost:2181/solr got event WatchedEvent state:SyncConnected type:None path:null path:null type:None
           37121 [main] INFO org.apache.solr.common.cloud.ConnectionManager - Client is connected to ZooKeeper
           37122 [main] INFO org.apache.solr.common.cloud.ZkStateReader - Updating cluster state from ZooKeeper...
           37957 [main] INFO org.apache.solr.hadoop.GoLive - Done committing live merge
           37958 [main] INFO org.apache.solr.hadoop.GoLive - Live merging of index shards into Solr cluster took 2.032 secs
           37958 [main] INFO org.apache.solr.hadoop.GoLive - Live merging completed successfully
           37958 [main] INFO org.apache.solr.hadoop.MapReduceIndexerTool - Succeeded with job: jobName: org.apache.solr.hadoop.MapReduceIndexerTool/MorphlineMapper, jobId: job_1409686197310_0004
           37958 [main] INFO org.apache.solr.hadoop.MapReduceIndexerTool - Success. Done. Program took 38.025 secs. Goodbye.
  • 8. Solr
       Go to the Solr admin UI for testing:
           http://192.168.137.134:8983/solr/#/vel-search_shard1_replica1/query
       Enter the primary key as the search query.
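
Two steps in the slides above can be scripted: writing the fully quoted CSV from slide 4 and looking up a document by its key as on slide 8. The Python sketch below is only an illustration, not part of the original deck. The function names `to_quoted_csv` and `select_url` are hypothetical, and it assumes that `id` is the collection's unique key and that the collection can be queried through Solr's standard `/select` handler under the collection name.

```python
import csv
import io
from urllib.parse import urlencode


def to_quoted_csv(header, rows):
    """Return CSV text with a plain header line and every data value quoted."""
    buf = io.StringIO()
    # The header stays unquoted, matching the id,name,location line on slide 4.
    buf.write(",".join(header) + "\n")
    # QUOTE_ALL wraps every field in double quotes, including numeric ones.
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


def select_url(base_url, collection, key_field, key_value):
    """Build a /select URL that searches for one document by its key.

    Assumption: key_field is the unique key defined in schema.xml.
    """
    params = urlencode({"q": f"{key_field}:{key_value}", "wt": "json"})
    return f"{base_url}/{collection}/select?{params}"


if __name__ == "__main__":
    sample = [(1, "Vel", "India"), (2, "Ram", "US"), (3, "Kul", "UK")]
    print(to_quoted_csv(["id", "name", "location"], sample), end="")
    # Host and port are taken from slide 8; adjust them for your cluster.
    print(select_url("http://192.168.137.134:8983/solr", "vel-search", "id", "1"))
```

The generated CSV text can be saved to a file and uploaded to HDFS as described on slide 5, and the printed URL can be opened in a browser or fetched with any HTTP client once indexing has finished.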
