andrew.janowczyk@searchbox.com
Solr is
◦ Blazing fast open source enterprise search platform
◦ Lucene-based Search Server
◦ Written in Java
◦ Has REST-like HTTP/XML and JSON APIs
◦ Extensive plugin architecture
http://lucene.apache.org/solr/
• Allows for the development of plugins which provide advanced operations
• Types of plugins:
◦ RequestHandlers: use URL parameters and return their own response
◦ SearchComponents: responses are embedded in other responses (such as /select)
◦ Update processor factories (UpdateRequestProcessorFactory): the result is stored in a field along with the document at index time
• A quick tutorial on how to program a RequestHandler to
◦ Be initialized
◦ Parse configuration file arguments
◦ Do something useful (count some words in the query)
◦ Format and return a response
• We’ll name our plugin “DemoPlugin” and show how to register it in solrconfig.xml for loading
• In the next slide, we’ll specify a list called “words” in which each entry is a string named “word”
• We want to load these specific words and then count them on all subsequent queries.
• Ex: config file has “body”, “fish”, “dog”
• Query is: dog body body body fish fish fish fish orange
• Result should be:
◦ body=3.0
◦ fish=4.0
◦ dog=1.0
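The expected result above can be reproduced outside Solr with a few lines of plain Java. This is a minimal sketch of the counting rule only; the names `WordCountExample` and `countWords` are made up for illustration and are not part of the plugin, and it assumes that configured words which never appear in the query are simply left out.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-alone illustration of the counting rule; not Solr code.
class WordCountExample {

    // Counts how often each configured word occurs in a space-separated query.
    static Map<String, Double> countWords(List<String> words, String query) {
        Map<String, Double> counts = new LinkedHashMap<>();
        for (String token : query.split(" ")) {
            if (words.contains(token)) {
                counts.merge(token, 1.0, Double::sum);
            }
        }
        // Report in configuration order, skipping words that never appeared.
        Map<String, Double> ordered = new LinkedHashMap<>();
        for (String word : words) {
            if (counts.containsKey(word)) {
                ordered.put(word, counts.get(word));
            }
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("body", "fish", "dog");
        String query = "dog body body body fish fish fish fish orange";
        System.out.println(countWords(words, query)); // {body=3.0, fish=4.0, dog=1.0}
    }
}
```

“orange” is ignored because it is not in the configured list, which is the behaviour the plugin is meant to have.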
<requestHandler name="/newendpoint"
                class="com.searchbox.DemoPlugin">
  <lst name="words">
    <str name="word">body</str>
    <str name="word">fish</str>
    <str name="word">dog</str>
  </lst>
</requestHandler>
Variables will be loaded from this section
during the init method discussed later
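To make the shape of that configuration concrete, the sketch below reads the same fragment with the JDK's DOM parser and collects the “word” values. Solr does its own parsing and hands the handler a NamedList, so this is only an illustration of the data layout; the class `ConfigWordsExample` and its `readWords` method are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical illustration: Solr itself parses solrconfig.xml, not this class.
class ConfigWordsExample {

    static final String CONFIG =
          "<requestHandler name=\"/newendpoint\" class=\"com.searchbox.DemoPlugin\">"
        + "<lst name=\"words\">"
        + "<str name=\"word\">body</str>"
        + "<str name=\"word\">fish</str>"
        + "<str name=\"word\">dog</str>"
        + "</lst>"
        + "</requestHandler>";

    // Returns the text of every <str name="word"> element, in document order.
    static List<String> readWords(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<String> words = new ArrayList<>();
            NodeList strs = doc.getElementsByTagName("str");
            for (int i = 0; i < strs.getLength(); i++) {
                Element str = (Element) strs.item(i);
                if ("word".equals(str.getAttribute("name"))) {
                    words.add(str.getTextContent());
                }
            }
            return words;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse config fragment", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readWords(CONFIG)); // [body, fish, dog]
    }
}
```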
• The configuration asks Solr to load com.searchbox.DemoPlugin. This class will be the output of our project, packaged as a .jar file
• Copy the .jar file to the lib directory in the Solr installation so that Solr can find it.
• That’s it!
package com.searchbox;

import java.util.HashMap;
import java.util.List;
import org.apache.solr.common.SolrException;
import org.apache.solr.common.params.CommonParams;
import org.apache.solr.common.params.SolrParams;
import org.apache.solr.common.util.NamedList;
import org.apache.solr.common.util.SimpleOrderedMap;
import org.apache.solr.handler.RequestHandlerBase;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.SolrQueryResponse;
import org.apache.solr.search.*;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class DemoPlugin extends RequestHandlerBase {
    private static final Logger LOGGER = LoggerFactory.getLogger(DemoPlugin.class);
    // request statistics, reported later by getStatistics()
    volatile long numRequests;
    volatile long totalTime;
    volatile long numErrors;
    // words to count, loaded from solrconfig.xml in init()
    List<String> words;
• Initialization is called when the plugin is first loaded
• This most commonly occurs when Solr is started up
• At this point we can load things from file (models, serialized objects, etc.)
• We have access to the variables set in solrconfig.xml
• We have chosen to pass a list called “words”, containing the words we’d like to count: “body”, “fish”, “dog”
• During initialization we need to load this list from solrconfig.xml and store it locally
@Override
public void init(NamedList params) {
    // read the <lst name="words"> block and collect every <str name="word"> value
    words = ((NamedList<String>) params.get("words")).getAll("word");
    if (words.isEmpty()) {
        throw new SolrException(SolrException.ErrorCode.SERVER_ERROR,
            "Need to specify at least one word in requestHandler config!");
    }
    super.init(params); // pass the rest of the init up
}
Notice that we’ve loaded the list “words”, collected all of its entries named “word”, and stored them in the class-level variable words.
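A NamedList differs from a map in that the same name may appear several times, which is why getAll("word") returns a list rather than a single value. The sketch below imitates just that behaviour with a tiny stand-in class; `MiniNamedList` is hypothetical and mirrors only the `add`, `get` and `getAll` calls used in this tutorial, not Solr's real implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the few NamedList calls used in this tutorial.
class MiniNamedList<T> {
    private final List<String> names = new ArrayList<>();
    private final List<T> values = new ArrayList<>();

    // Appends a name/value pair; duplicate names are allowed.
    void add(String name, T value) {
        names.add(name);
        values.add(value);
    }

    // Returns the first value stored under the name, or null if there is none.
    T get(String name) {
        int i = names.indexOf(name);
        return i < 0 ? null : values.get(i);
    }

    // Returns every value stored under the name, in insertion order.
    List<T> getAll(String name) {
        List<T> result = new ArrayList<>();
        for (int i = 0; i < names.size(); i++) {
            if (names.get(i).equals(name)) {
                result.add(values.get(i));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        MiniNamedList<String> wordsBlock = new MiniNamedList<>();
        wordsBlock.add("word", "body");
        wordsBlock.add("word", "fish");
        wordsBlock.add("word", "dog");

        MiniNamedList<Object> params = new MiniNamedList<>();
        params.add("words", wordsBlock);

        @SuppressWarnings("unchecked")
        MiniNamedList<String> loaded = (MiniNamedList<String>) params.get("words");
        System.out.println(loaded.getAll("word")); // [body, fish, dog]
    }
}
```

The nested structure matches the configuration: the outer list holds one entry named “words”, and that entry is itself a list with three entries named “word”.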
@Override
public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throws Exception {
    numRequests++;
    long startTime = System.currentTimeMillis();
    try {
        // running count for each configured word seen in the query
        HashMap<String, Double> counts = new HashMap<String, Double>();
        SolrParams params = req.getParams();
        String q = params.get(CommonParams.Q); // get the q param from the URL
        for (String string : q.split(" ")) {
            if (words.contains(string)) {
                Double oldcount = counts.containsKey(string) ? counts.get(string) : 0.0;
                counts.put(string, oldcount + 1);
            }
        }
• We start by incrementing a volatile counter of the requests we’ve seen (for use later in statistics), and we note the start time so we know how long the process takes.
• Next we initialize the local variable which will hold our word counts
• Next we get the “q” parameter from the URL which was sent to us
• We do a very naive split on spaces to break it into words, and iterate through them. If a word is in our “words” variable, we keep a running total of the number of times it appears
        NamedList<Double> results = new NamedList<Double>();
        for (String word : words) {
            results.add(word, counts.get(word)); // null if the word never appeared in the query
        }
        rsp.add("results", results);
    } catch (Exception e) {
        numErrors++;
        LOGGER.error(e.getMessage(), e); // log the stack trace as well as the message
    } finally {
        totalTime += System.currentTimeMillis() - startTime;
    }
}
• Now that we’ve looked at all of the strings and our processing is done, we need to return the results.
• We create a NamedList of type Double to hold the counts, and then iterate through our words, adding each count to it
• Next, we add our result list to the Solr response variable rsp
• We also see the catch block, which is used to count errors and write the error to the Solr log
• Finally, we add the time it took to the total time
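One caveat on the bookkeeping: ++ and += on a volatile long are read-modify-write operations and are not atomic, so concurrent requests can lose updates. A minimal sketch of the same try/catch/finally pattern using java.util.concurrent.atomic.AtomicLong follows; the class `RequestStats` and its `record` method are invented for illustration and are not part of the original plugin.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper showing thread-safe request bookkeeping; not Solr code.
class RequestStats {
    final AtomicLong numRequests = new AtomicLong();
    final AtomicLong numErrors = new AtomicLong();
    final AtomicLong totalTimeMs = new AtomicLong();

    // Runs the work, counting the request, any failure, and the elapsed time.
    void record(Runnable work) {
        numRequests.incrementAndGet();
        long start = System.currentTimeMillis();
        try {
            work.run();
        } catch (RuntimeException e) {
            numErrors.incrementAndGet();
        } finally {
            totalTimeMs.addAndGet(System.currentTimeMillis() - start);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        RequestStats stats = new RequestStats();
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) {
                    stats.record(() -> { });
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            thread.join();
        }
        // One deliberately failing request to exercise the error counter.
        stats.record(() -> { throw new IllegalStateException("boom"); });
        System.out.println(stats.numRequests.get() + " requests, "
            + stats.numErrors.get() + " errors");
    }
}
```

With atomic counters the request total is exact regardless of how the threads interleave, which a plain volatile counter does not guarantee.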
@Override
public String getDescription() {
    return "Searchbox DemoPlugin";
}

@Override
public String getVersion() {
    return "1.0";
}

@Override
public String getSource() {
    return "http://www.searchbox.com";
}

@Override
public NamedList<Object> getStatistics() {
    // typed list instead of a raw NamedList, to avoid unchecked warnings
    NamedList<Object> all = new SimpleOrderedMap<Object>();
    all.add("requests", "" + numRequests);
    all.add("errors", "" + numErrors);
    all.add("totalTime(ms)", "" + totalTime);
    return all;
}
• For a production-grade plugin, users expect to see certain pieces of information in their Solr admin panel
• Description, version and source are just Strings
• getStatistics() uses the volatile variables we were keeping track of before, puts them into another named list and returns them. These appear under the statistics panel in Solr.
• That’s it!
http://192.168.56.101:8983/solr/core_name/newendpoint?q=dog%20body%20body%20body%20fish%20fish%20fish%20fish%20orange
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">0</int>
</lst>
<lst name="results">
<double name="body">3.0</double>
<double name="fish">4.0</double>
<double name="dog">1.0</double>
</lst>
</response>
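A client can build that request URL and read the counts back out of the XML with standard JDK classes. The sketch below makes no network call: it only encodes the query and parses a response string shaped like the one above. The class `EndpointClientSketch` and its methods are hypothetical, and the host and core name are simply the ones from the slide.

```java
import java.io.ByteArrayInputStream;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical client-side helper; assumes the XML response format shown above.
class EndpointClientSketch {

    // Appends the query as an encoded q parameter, using %20 for spaces.
    static String buildUrl(String endpoint, String query) {
        String encoded = URLEncoder.encode(query, StandardCharsets.UTF_8).replace("+", "%20");
        return endpoint + "?q=" + encoded;
    }

    // Extracts the <double> entries found directly under <lst name="results">.
    static Map<String, Double> parseResults(String xml) {
        Map<String, Double> results = new LinkedHashMap<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList doubles = doc.getElementsByTagName("double");
            for (int i = 0; i < doubles.getLength(); i++) {
                Element value = (Element) doubles.item(i);
                Node parent = value.getParentNode();
                if (parent instanceof Element
                        && "results".equals(((Element) parent).getAttribute("name"))) {
                    results.put(value.getAttribute("name"),
                        Double.valueOf(value.getTextContent().trim()));
                }
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse response XML", e);
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(buildUrl(
            "http://192.168.56.101:8983/solr/core_name/newendpoint",
            "dog body body body fish fish fish fish orange"));
        String response = "<response><lst name=\"results\">"
            + "<double name=\"body\">3.0</double>"
            + "<double name=\"fish\">4.0</double>"
            + "<double name=\"dog\">1.0</double>"
            + "</lst></response>";
        System.out.println(parseResults(response)); // {body=3.0, fish=4.0, dog=1.0}
    }
}
```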
• Because we’ve overridden the getStatistics() method, we can get real-time stats from the admin panel!
Happy Developing!
Full Source Code available at:
http://www.searchbox.com/developing-a-request-handler-for-solr

Develop a Solr request handler plugin
