Lifecycle of a
Solr Search
Request
Chris "Hoss" Hostetter - 2017-09-14
https://home.apache.org/~hossman/rev2017/
https://twitter.com/_hossman
https://www.lucidworks.com/
Abstract:
This intermediate session for existing Solr users will provide a
Deep Dive look into the lifecycle of a Solr Search Request. We
will drill down through each layer of code, discussing what
happens at each stage -- including when & how inter-node
communication takes place in a multi-node SolrCloud cluster.
Along the way, we will also review the various places where
users can configure existing (or custom written) plugins to
override or amend the default behavior.
Lifecycle of a Solr Search Request https://people.apache.org/~hossman/rev2017/
1 of 24 10/4/17, 4:32 PM
Agenda
Deep Dive look into the lifecycle of 4 Solr Search Requests...
Single Node: Single SolrCore
1. Simple Query
2. Facet Query
SolrCloud: 2 Shards + 2 Replicas
3. Simple Query
4. Facet Query
...and where various types of Plugins can be used.
Simple Query
Single Node: Single SolrCore
bin/solr -e techproducts
http://localhost:8983/solr/techproducts/select
? q = ipod
& sort = inStock desc, score desc
& fl = id, name
& rows = 10
This sample paginated query is based on the techproducts
example configs & data that have been included in every release of Solr
since it was first open sourced.
I have a nostalgic affection for this silly little dataset.
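To try this request outside a browser, the parameters above can be assembled programmatically. The following is a minimal sketch in Python that only builds the URL; it assumes a local Solr started with bin/solr -e techproducts is listening on port 8983, and the helper name build_select_url is made up for illustration.

```python
from urllib.parse import urlencode


def build_select_url(base, params):
    """Return a /select URL with the given query parameters URL-encoded."""
    return f"{base}/select?{urlencode(params)}"


# The same parameters shown on the slide.
url = build_select_url(
    "http://localhost:8983/solr/techproducts",
    {
        "q": "ipod",
        "sort": "inStock desc, score desc",
        "fl": "id, name",
        "rows": 10,
    },
)
print(url)
```

Spaces and commas are percent-encoded by urlencode, so the printed URL looks noisier than the slide but carries the same parameters.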
HTTP (Jetty)
SolrDispatchFilter
Solr Webapp/solr ➔
CoreContainer
/techproducts ➔ SolrCore
/select? ➔ RequestHandler
SolrCore
foo
SolrCore
etc...
wt=json ➔ ResponseWriter
...:8983/solr/techproducts/select?...
UI:HTML,Javascript,
Images,CSS
SolrCore
techproducts
Purple: The HTTP layer, currently implemented by Jetty
Blue: Solr runs as a "webapp" inside the Jetty Servlet container (but
that's just an implementation detail)
Black: The key pieces of the Solr webapp: misc "flat files" that power
the Solr UI, and the SolrDispatchFilter which is responsible
for mapping all HTTP request/responses into their internal Solr
representations and executing them
Red: CoreContainer is a singleton responsible for managing the
lifecycle of SolrCores
Green: each SolrCore encapsulates the configs & data for a single
"index" (which in a SolrCloud configuration would be a replica of
some shard of some collection)
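The path-based mapping in the diagram (/solr ➔ webapp, /techproducts ➔ SolrCore, /select ➔ RequestHandler) can be sketched as a small lookup. This is an illustrative Python model only, not Solr's Java code; the function resolve and the cores dictionary are hypothetical stand-ins for what SolrDispatchFilter and CoreContainer do.

```python
def resolve(path, cores):
    """Split '/solr/<core>/<handler>' and check that the named core exists."""
    webapp, core_name, handler = path.strip("/").split("/", 2)
    if webapp != "solr":
        raise ValueError(f"not a Solr path: {path}")
    if core_name not in cores:
        raise KeyError(f"no such core: {core_name}")
    # The handler name keeps its leading slash, e.g. '/select'.
    return core_name, "/" + handler


# Stand-in for the CoreContainer: core name -> core object.
cores = {"techproducts": object(), "foo": object()}
resolved = resolve("/solr/techproducts/select", cores)
print(resolved)
```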
SolrCore: techproducts
SolrRequestHandlers SearchComponents
QueryComponent: query
- prepare()
- df=text&q=ipod ➔ Query
- etc...
- process()
- etc...
SearchHandler: /select
- initParams
- df = text (default)
- components (implicit)
- query
- etc...
SearchHandler: /etc...
UpdateRequestHandler : /etc...
FacetComponent: facet
etc...
Green: The SolrCore used for this (HTTP) request
Black: Named instances of (pluggable) SolrRequestHandlers.
SearchHandler is the most common, and it uses a configurable
list of SearchComponents
Red: Named instances of (pluggable) SearchComponents;
QueryComponent is the only one used in this simple request
All SearchComponents implement prepare() & process()
methods, which are called by SearchHandler
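The way SearchHandler drives its components can be modelled in a few lines. This is a simplified Python sketch, assuming (as in a single node request) that every component's prepare() runs before any component's process(); TraceComponent and handle are hypothetical names, and the real classes are Java.

```python
class TraceComponent:
    """Toy SearchComponent that records when its methods are called."""

    def __init__(self, name, trace):
        self.name = name
        self.trace = trace

    def prepare(self, request, response):
        self.trace.append(f"{self.name}.prepare")

    def process(self, request, response):
        self.trace.append(f"{self.name}.process")


def handle(components, request):
    """Toy SearchHandler: prepare every component, then process every component."""
    response = {}
    for component in components:
        component.prepare(request, response)
    for component in components:
        component.process(request, response)
    return response


trace = []
handle([TraceComponent("query", trace), TraceComponent("facet", trace)], {"q": "ipod"})
print(trace)
```

Running all prepare() calls first is what lets one component leave a hint for another before any searching happens.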
SolrIndexSearcher
query
IndexSchema
- SchemaFields ➔ FieldTypes
QueryComponent.prepare()
+ rows=10 ➔ ok?
fl=id,name ➔ ok?
/ q ➔ LuceneQParser
LuceneQParser + (df=text ➔ text) + "ipod" ➔ TermQuery
( "inStock desc" ➔ bool ➔ BoolField.getSortField(inStock,desc)
+ "score desc" ➔ SortField.SCORE ) ➔ Sort
TextField: text
- Analyzer
- Similarity
- etc...
TextField: etc..
- Analyzer
- Similarity
- etc...
BoolField: bool
- Analyzer
- Similarity
- getSortField
- etc...
LuceneQParser
DismaxQParser
etc...
Red: QueryComponent.prepare() and its basic logic for
validating & parsing the basic request params
Green: Named instances of (pluggable) QParserPlugins for
parsing query strings (q & fq params). Here the (implicit) default
LuceneQParser is used
Orange: The IndexSchema which contains...
Named SchemaFields (or dynamicFields) which map
to...
Purple: Named instances of (pluggable) FieldTypes which
dictate how the field names mapped to them are parsed,
indexed, sorted, queried, etc...
Blue: The SolrIndexSearcher is ultimately what will be
queried with these parsed queries & sort objects
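As a rough illustration of the sort parsing step, the sort param can be split into (field, direction) clauses before each field's type is asked for a sort object. This is a deliberately naive Python sketch: parse_sort is a hypothetical helper, and splitting on commas would not survive function sorts such as div(popularity,price).

```python
def parse_sort(spec):
    """Turn 'inStock desc, score desc' into [(field, descending), ...]."""
    clauses = []
    for clause in spec.split(","):
        field, direction = clause.split()
        if direction not in ("asc", "desc"):
            raise ValueError(f"bad sort direction: {direction}")
        clauses.append((field, direction == "desc"))
    return clauses


parsed = parse_sort("inStock desc, score desc")
print(parsed)
```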
SolrIndexSearcher.search(...)
window(start, rows, windowSize)
(queryResultCache? | Index) ➔ DocList
query
QueryComponent.process()
search(Query,filters[],start,rows,Sort,...) ➔ DocList
JsonResponseWriter
DocList {
+ searcher.doc(#)
➔ Stored Fields
}
➔ Bytes ➔ HTTP...
documentCache
queryResultCache
filterCache
IndexReader
- InvertedIndex
- Stored Fields
XmlResponseWriter
etc...
Red: QueryComponent.process() which uses the
SolrIndexSearcher to execute the Query created by its
prepare() method
Blue: the SolrIndexSearcher includes several caches in
addition to the InvertedIndex, and when executing a query, first
evaluates the start/rows requested to fit a configured "window size"
so that "page #2" type requests can result in a cache hit & re-use the
results computed for "page #1"
Orange: The low level InvertedIndex & the
queryResultCache that can be used in its place when
executing basic searches & the DocList containing a sorted
list of (internal) doc#s and their scores for the requested
start+rows of this query
Purple: The Stored Fields of the documents in the index & the
documentCache used by SolrIndexSearcher to
reduce disk reads when popular documents are frequently
matched by searches
Green: Named instances of (pluggable)
QueryResponseWriters which dictate how the data structures
produced once a request is processed get serialized into bytes (for
the HTTP response returned to the original client by Jetty)
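The "window size" idea described under Blue can be shown with a toy cache. This Python sketch is a simplified model under stated assumptions, not the SolrIndexSearcher implementation: window, search, and run_query are hypothetical names, and the cache is keyed by a plain string instead of the real query, filters, and sort.

```python
def window(start, rows, window_size):
    """Round start+rows up to the next multiple of window_size."""
    last = max(start + rows, 1)
    return ((last - 1) // window_size + 1) * window_size


def search(cache, key, start, rows, window_size, run_query):
    """Serve a page from the cache when a large enough window was already collected."""
    need = window(start, rows, window_size)
    cached = cache.get(key)
    if cached is None or cached[0] < need:
        cached = (need, run_query(need))  # collect the top `need` doc ids
        cache[key] = cached
    return cached[1][start:start + rows]


calls = []

def run_query(n):
    """Stand-in for the index: records each call and returns doc ids 0..n-1."""
    calls.append(n)
    return list(range(n))


cache = {}
page1 = search(cache, "ipod", 0, 10, 20, run_query)
page2 = search(cache, "ipod", 10, 10, 20, run_query)
print(page1, page2, calls)
```

With a window of 20, the request for page 1 collects 20 doc ids, so the request for page 2 is answered without touching the index again.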
More Complex Query
Single Node: Single SolrCore
http://localhost:8983/solr/techproducts/select
? q = ipod
& fq = price:[* TO 1000]
& sort = div(popularity,price) asc,
score desc
& fl = id, name, why:[explain style=nl]
& facet = true
& facet.field = cat
This slightly more interesting query builds off the previous example by:
Adding a "filter query" on the (numeric) price field
Changing the primary sort criteria to be a mathematical function
against 2 fields
Requesting an additional pseudo-field explaining the score of each
document
Faceting on the "cat" (aka: category) field
HTTP (Jetty)
SolrDispatchFilter
Solr Webapp/solr ➔
CoreContainer
/techproducts ➔ SolrCore
/select? ➔ RequestHandler
SolrCore
foo
SolrCore
etc...
wt=json ➔ ResponseWriter
...:8983/solr/techproducts/select?...
UI:HTML,Javascript,
Images,CSS
SolrCore
techproducts
The HTTP, Webapp, DispatchFilter, CoreContainer, SolrCore, and
RequestHandler layers all function exactly as in our previous (simpler)
example. It's only once the SearchHandler starts looping over the
components that things get more interesting....
query
IndexSchema
- SchemaFields ➔ FieldTypes
QueryComponent.prepare()
etc...
"price:[* TO 1000]" ➔ float
➔ PointRangeQuery(...) ➔ filters[]
div(popularity,price)
➔ ValueSource(IntFieldSource,...)
FloatPointField: float
- ValueSource
- getRangeQuery()
- etc...
IntPointField: int
- ValueSource
- etc...
FacetComponent.prepare()
facet=true ✔
facet.field=cat ➔ ok?
needDocSet = true
SolrIndexSearcher
div()
sum()
etc...
Most items identical to those shown in the "simple" query are omitted for
brevity. Of the new items shown here...
Red: In addition to some extra logic in the
QueryComponent.prepare() method (to parse the filter
query and more complex sort) we now also see the
FacetComponent.prepare() method, which does its own
validation & sets a flag indicating that it needs extra info (the
DocSet) once SolrIndexSearcher is asked to execute the
Query
Green: Named instances of (pluggable) ValueSourceParsers
for parsing function strings -- used here in our sort, but could also be
used in queries
Orange: As before the IndexSchema, now showing that
FieldTypes are also responsible for providing the range query
(filter) and ValueSources (used by the functions)
SolrIndexSearcher
query
QueryComponent.process()
search(...) ➔〈DocList,DocSet〉
etc...
JsonResponseWriter
DocList {
+ searcher.doc(#)
➔ Stored Fields
+ [explain ...]
}
+ Facet Counts
➔ Bytes ➔ HTTP...
ExplainAugmenter
ChildDocTransformer
queryFacetComponent.process()
For Each "cat" Index Terms:
➔ Intersect with DocSet
SubQueryAugmenter
etc...
searcher.explain(#)
documentCache
queryResultCache
filterCache
IndexReader
- InvertedIndex
- Stored Fields
Most items identical to those shown in the "simple" query are omitted for
brevity. Of the new items shown here...
Red: Now when QueryComponent.process() executes the
search, the "needDocSet" flag set by
FacetComponent.prepare() is also used.
FacetComponent.process() can then use the resulting
DocSet (an unordered set of all matching doc#s -- regardless of sort)
to compute the facet counts.
Olive: Named instances of (pluggable) DocTransformers (or
Augmenters) which can be used to annotate individual documents
returned in the results. For this query in particular we see the
ExplainAugmenter which uses the SolrIndexSearcher to
get a (debugging) data structure "explaining" how the score of each
document was computed.
Green: the JsonResponseWriter not only returns the Stored
Fields of each document, but also the results of any
DocTransformers. It also serializes the Facet Counts.
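The "intersect each index term with the DocSet" step shown for FacetComponent.process() can be sketched with plain sets. This is an illustrative Python model with made-up postings; facet_counts and its mincount parameter are hypothetical names, and the real implementation chooses among several counting strategies.

```python
def facet_counts(term_postings, doc_set, mincount=0):
    """Count, for each term, how many matching docs contain it."""
    counts = {
        term: len(postings & doc_set)
        for term, postings in term_postings.items()
    }
    # Highest count first, ties broken alphabetically.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(term, count) for term, count in ordered if count >= mincount]


# Made-up postings for the "cat" field: term -> set of internal doc#s.
cat_postings = {
    "electronics": {1, 2, 3, 5},
    "music": {2, 3},
    "camera": {9},
}
doc_set = {2, 3, 5}  # unordered set of every doc# matching q + fq
counts = facet_counts(cat_postings, doc_set)
print(counts)
```

Because the DocSet ignores sort order and pagination, the counts cover every match, not just the 10 rows returned in the DocList.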
Simple Query
SolrCloud: 4 Nodes, 2 Shards, 2 Replicas
bin/solr -e cloud
...
http://localhost:8983/solr/techproducts/select
? q = ipod
& sort = inStock desc, score desc
& fl = id, name
& rows = 10
This is the same as our original simple query, still using the
techproducts sample configs & data, but from here on we'll assume
we're using a 4 node SolrCloud cluster, with the techproducts
collection configured to have 2 shards, with a replication factor of 2.
SolrDispatchFilter
/techproducts ➔ tech_s1_r2
Jetty: http://host1:8983
SolrDispatchFilter
/techproducts ?➔ host4
Jetty: http://host3:8983
SolrDispatchFilter
/techproducts ?➔ tech_s2_r2
Jetty: http://host2:8983
SolrDispatchFilter
/techproducts ➔ tech_s2_r1
Jetty: http://host4:8983
techproducts
tech_s1_r2
foo
foo_s1_r1
foo
foo_s2_r1
techproducts
tech_s1_r1
techproducts
tech_s2_r1
foo
foo_s1_r2
techproducts
tech_s2_r2
foo
foo_s2_r2
Purple: 4 Jetty instances, running on (the same port 8983 of) 4
different hosts
Black: The 4 SolrDispatchFilters running inside each of
these 4 Jetty instances, and how each of them resolves requests for
the techproducts collection.
Green: the individual SolrCores (which are each a replica of some
shard of a collection) running in each Solr node. Note that for the
purposes of illustrating the different possible ways a Solr request may be
routed, host3 does not contain any SolrCores that are part of the
techproducts collection.
(Other Layers such as the Solr webapp and the CoreContainer have
been omitted to save space)
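The routing choices in this diagram (use a local core when one exists, otherwise forward to a node that has one) can be sketched as follows. This Python model is an illustration only: the placement of cores on hosts in LAYOUT is an assumption, since the notes only state that host3 holds no techproducts cores, and route is a hypothetical function.

```python
import random

# ASSUMED placement of (collection, core) pairs on each host.
LAYOUT = {
    "host1": [("techproducts", "tech_s1_r2"), ("foo", "foo_s1_r1")],
    "host2": [("techproducts", "tech_s1_r1"), ("techproducts", "tech_s2_r2")],
    "host3": [("foo", "foo_s2_r1"), ("foo", "foo_s1_r2")],
    "host4": [("techproducts", "tech_s2_r1"), ("foo", "foo_s2_r2")],
}


def route(host, collection, layout, rng=random):
    """Return ('local', core) or ('forward', other_host) for a collection request."""
    local = [core for coll, core in layout[host] if coll == collection]
    if local:
        return ("local", rng.choice(local))
    others = [
        h for h, cores in layout.items()
        if any(coll == collection for coll, _ in cores)
    ]
    if not others:
        raise KeyError(f"unknown collection: {collection}")
    return ("forward", rng.choice(others))


print(route("host1", "techproducts", LAYOUT))
print(route("host3", "techproducts", LAYOUT))
```

The random choice stands in for the "?" arrows on the slide, where more than one target would be valid.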
coordinator shard1
QueryComponent:
prepare() + process()
α: q=ipod&fl=id&fsv=true
➔ top ids + sort values
β1: ids=X,Y,Z&fl=name ➔ ...
shard2
QueryComponent:
prepare() + process()
α: q=ipod&fl=id&fsv=true
➔ top ids + sort values
β2: ids=A,..,G&fl=name ➔ ...
SearchHandler: /select
Repeat until done:
query.distributedProcess
➔ ShardRequests (α,β)
Loop: ShardRequests
query.handleResponse
QueryComponent:
distributedProcess()
α: shard top10 + sort values
β: full fl for final top10 ids
FacetComponent
Purple: The HTTP Layer showing 3 hosts: an arbitrary 'coordinator'
node, and 2 nodes each hosting a replica of the 2 shards for the
collection
Black: SearchHandler. On the coordinator node,
SearchHandler executes new logic to send the sub-requests
created by its SearchComponents to arbitrarily selected replicas
of each shard. On the replicas handling these sub-requests, the
SearchHandler processes these requests just as if they were
simple (single node) queries.
Red: SearchComponent methods. On the coordinator node
SearchHandler loops over every component calling
SearchComponent.distributedProcess() to
create/modify sub-requests for the individual shards, and then calls
SearchComponent.handleResponse() to merge the
results from each shard and decide if/when/what additional
information may be needed. This process repeats until all calls to
distributedProcess() on all SearchComponents
indicate that they are finished.
Green & Blue: The 2 stages (α & β) of shard sub-requests needed to
process this simple query. Note that the α-requests are identical for
both shards, but the β-requests are slightly different to request the
fl fields for the matches specific to that shard.
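The "repeat until done" loop on the coordinator can be sketched in a few lines. This Python model is heavily simplified and its names are hypothetical: in this sketch distributed_process() returns one sub-request (or None when finished) and the same sub-request is sent to every shard, whereas the real β-requests differ per shard as noted above.

```python
DONE = "DONE"


class ToyQueryComponent:
    """Two-stage component: alpha collects ids, beta fetches the fl fields."""

    def __init__(self):
        self.stage = "alpha"
        self.log = []

    def distributed_process(self):
        if self.stage == DONE:
            return None                 # nothing more to ask the shards
        return {"stage": self.stage}    # sub-request for the current stage

    def handle_response(self, request, shard_responses):
        # Record what came back, then decide whether another stage is needed.
        self.log.append((request["stage"], sorted(shard_responses)))
        self.stage = "beta" if request["stage"] == "alpha" else DONE


def coordinate(components, shards, send):
    """Loop until every component reports that it has no further sub-requests."""
    while True:
        pending = []
        for component in components:
            request = component.distributed_process()
            if request is not None:
                pending.append((component, request))
        if not pending:
            return
        for component, request in pending:
            responses = {shard: send(shard, request) for shard in shards}
            component.handle_response(request, responses)


query = ToyQueryComponent()
coordinate([query], ["shard1", "shard2"], lambda shard, req: f"{shard}:{req['stage']}")
print(query.log)
```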
Shard Request α
q=ipod&fl=id&fsv=true&rows=10
sort=inStock desc, score desc
Shard 1
numFound=42
F〈true,6〉
B〈true,6〉
D〈true,5〉
C〈true,3〉
G〈true,2〉
A〈true,1〉
E〈false,5〉
Shard 2
numFound=314
Z〈true,6〉
X〈true,3〉
Y〈false,9〉
Shard Request β
q=ipod&ids=...&fl=name
Shard 1
A, Apple
B, Boat
C, Car
D, Deer
E, Ear
F, Frog
G, Gong
Shard 2
X, X-Ray
Y, Yo-Yo
Z, Zebra
Merged
numFound=42+314=356
Z, Zebra
F, Frog
B, Boat
D, Deer
C, Car
X, X-Ray
G, Gong
A, Apple
Y, Yo-Yo
E, Ear
Here we see hypothetical α request+responses, hypothetical β
requests+responses, & the final Merged results from both -- showing how
the IDs and sort values from the α request are used to determine which
documents will be in the final results, and in which order. For these specific
documents, the β requests+responses fill in the fl fields for the final
client.
Red & Blue: The responses from shard1 & shard2 for the α request
Green & Purple: The responses from shard1 & shard2 for the β
request
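The merge described above can be replayed with the hypothetical values from this slide. The Python sketch below is an illustration, not Solr's merge code: merge and beta_ids are made-up names, and ties on the sort values are broken by input order here, so the three documents tied at 〈true,6〉 may come out in a different order than on the slide.

```python
# Alpha responses from the slide: (id, inStock, score) per shard.
alpha = {
    "shard1": [("F", True, 6), ("B", True, 6), ("D", True, 5), ("C", True, 3),
               ("G", True, 2), ("A", True, 1), ("E", False, 5)],
    "shard2": [("Z", True, 6), ("X", True, 3), ("Y", False, 9)],
}


def merge(alpha, rows):
    """Merge shard results by 'inStock desc, score desc' and keep the top rows."""
    tagged = [
        (doc_id, in_stock, score, shard)
        for shard, docs in alpha.items()
        for doc_id, in_stock, score in docs
    ]
    # inStock=True sorts first (False < True after negation), then higher score.
    tagged.sort(key=lambda d: (not d[1], -d[2]))
    return tagged[:rows]


def beta_ids(merged):
    """Group the winning ids by the shard that must supply their fl fields."""
    by_shard = {}
    for doc_id, _, _, shard in merged:
        by_shard.setdefault(shard, []).append(doc_id)
    return {shard: sorted(ids) for shard, ids in by_shard.items()}


merged = merge(alpha, rows=10)
merged_ids = [doc_id for doc_id, _, _, _ in merged]
print(merged_ids)
print(beta_ids(merged))
```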
Complex Query*
SolrCloud: 4 Nodes, 2 Shards, 2 Replicas
http://localhost:8983/solr/techproducts/select
? q = ipod
& sort = inStock desc, score desc
& fl = id, name
& facet = true
& facet.field = cat
In the interest of time, this query is not as "Complex" as the "Complex"
Single Core query we looked at before. I've omitted things like fq params,
sorting on functions, and the use of DocTransformers in the fl
because nothing about how those are handled in a Single Core query
changes when they are requested by a coordinator node in a SolrCloud
query.
coordinator shard1
QueryComponent:
prepare() + process()
α: q=ipod&fl=id&fsv=true
➔ top ids + sort values
β1: ids=X,Y,Z&fl=name ➔...
FacetComponent:
prepare() + process()
α: facet.limit=N + extra
➔ top terms w/counts
β1: ..._terms=aa,qq,... ➔...
QueryComponent:
distributedProcess()
α: shard top10 + sort values
β: full fl for final top10 ids
shard2
FacetComponent:
distributedProcess()
α: facet.field=cat
w/facet.limit overrequest
β: request missing counts
for final top terms
SearchHandler: /select
➔ ShardRequests (α, β)
Purple: The HTTP Layer showing 3 hosts: an arbitrary 'coordinator'
node, and 2 nodes each hosting a replica of the 2 shards for the
collection. To save space, the (largely redundant) details of the
requests to shard2 are not shown.
Black: SearchHandler. To save space, the details (shown in
previous diagrams) regarding how SearchHandler processes
requests when acting as a coordinator have been omitted -- the key
thing to note is that even with the added complexity of the
FacetComponent, there are still only 2 stages of sub-requests to
each shard (α & β)
Red: SearchComponent methods:
QueryComponent behaves exactly as before
Now that FacetComponent is in use, it can modify the sub-
requests created by QueryComponent to "piggy back" on
them and request additional information from each shard.
Green & Blue: The 2 stages (α & β) of shard sub-requests needed to
process this query. Although the details of the requests to shard2 are
omitted for brevity, the α-requests are identical for both shards, and
(as before) the β-requests are slightly different to request both
the fl fields for the document matches specific to that shard, as well
as the facet counts for any "candidate" terms that were not included
in the α response from that shard.
Shard Request α
facet.field=cat
facet.limit=N+OVERREQUEST
Shard 1
games: 40
...
lawn: 20
books: 10
DVD: 5
...
beach: 4
toys: 3
Shard 2
auto: 250
lawn: 170
...
food: 100
DVD: 97
...
books: 90
clothing: 90
Merge α
N
auto: 250-253 (? + 250)
lawn: 190 (20 + 170)
...
games: 40-130 (40 + ?)
food: 100-103 (? + 100)
DVD: 102 (5 + 97)
...
Shard Request β
facet.field={!_terms=...}cat
Shard 1
auto: 3
food: 0
Shard 2
games: 45
Final (Merge α+β)
auto: 253 (3 + 250)
lawn: 190 (20 + 170)
...
DVD: 102 (5 + 97)
Here we see the additional information involved in α & β
requests+responses+merging for our more complex queries compared to
what we looked at before. The information requested & merged by
QueryComponent is omitted for brevity, and we focus solely on how
FacetComponent modifies those requests to "overrequest" the
original facet.limit and what it does with the results.
In the α request, FacetComponent over-requests additional terms from each shard beyond
what the user asked for; in the β request, it asks each shard for the details
about any terms that are "candidates" for the final results but were NOT
already returned by that shard in the α response.
Each term that is a candidate for the final response is shown in a unique
color. Black/Grey is used to indicate terms where incomplete information
is available to the coordinator, but enough is known to be confident that
they can't possibly be candidates for the final results. Faded terms (in
italics) show at what stage the coordinating FacetComponent knows
that particular term can be eliminated from consideration.
(While the "..." ellipses are used to denote the possibility of many
additional terms depending on the value of facet.limit=N (which
defaults to 100), viewers may find the easiest way to understand how
these results are merged & refined is to assume N=3 and imagine the
ellipses do not exist in the diagram)
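Following that suggestion (N=3, ellipses ignored), the merge and refinement arithmetic from the diagram can be replayed in code. This Python sketch is a simplified model built only from the numbers on the slide, not the actual FacetComponent logic; merge_alpha, refinements, and final_top are hypothetical names, and the bound used for a missing term (the smallest count a shard returned) is an assumption that matches the ranges shown.

```python
from collections import defaultdict

N = 3  # facet.limit, as the notes suggest assuming

# Alpha and beta responses copied from the slide, "..." rows ignored.
alpha = {
    "shard1": {"games": 40, "lawn": 20, "books": 10, "DVD": 5, "beach": 4, "toys": 3},
    "shard2": {"auto": 250, "lawn": 170, "food": 100, "DVD": 97, "books": 90, "clothing": 90},
}
beta = {"shard1": {"auto": 3, "food": 0}, "shard2": {"games": 45}}


def merge_alpha(alpha):
    """Return {term: (low, high, missing_shards)} from the alpha responses."""
    # A shard that omitted a term holds at most its smallest returned count.
    ceiling = {shard: min(counts.values()) for shard, counts in alpha.items()}
    terms = {term for counts in alpha.values() for term in counts}
    merged = {}
    for term in terms:
        low = sum(counts.get(term, 0) for counts in alpha.values())
        missing = [s for s, counts in alpha.items() if term not in counts]
        merged[term] = (low, low + sum(ceiling[s] for s in missing), missing)
    return merged


def refinements(merged, n):
    """Per shard, list candidate terms whose counts on that shard are unknown."""
    threshold = sorted((low for low, _, _ in merged.values()), reverse=True)[n - 1]
    wanted = defaultdict(list)
    for term, (_, high, missing) in sorted(merged.items()):
        if high >= threshold:  # could still reach the top n
            for shard in missing:
                wanted[shard].append(term)
    return dict(wanted)


def final_top(alpha, beta, n):
    """Sum alpha and beta counts, then keep the n largest totals."""
    totals = defaultdict(int)
    for responses in (alpha, beta):
        for counts in responses.values():
            for term, count in counts.items():
                totals[term] += count
    return sorted(totals.items(), key=lambda kv: -kv[1])[:n]


merged = merge_alpha(alpha)
print(refinements(merged, N))
print(final_top(alpha, beta, N))
```

Terms such as beach, toys, and clothing never reach the third-best known count even at their upper bound, which is why no β-request is spent on them.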
Q & A
Me
https://twitter.com/_hossman
My Company
https://www.lucidworks.com/
These Slides
https://home.apache.org/~hossman/rev2017/
Solr Docs & Mailing List
https://lucene.apache.org/solr/resources.html

More Related Content

What's hot

Integrating the Solr search engine
Integrating the Solr search engineIntegrating the Solr search engine
Integrating the Solr search engine
th0masr
 
Solr as a Spark SQL Datasource
Solr as a Spark SQL DatasourceSolr as a Spark SQL Datasource
Solr as a Spark SQL Datasource
Chitturi Kiran
 
Lucene's Latest (for Libraries)
Lucene's Latest (for Libraries)Lucene's Latest (for Libraries)
Lucene's Latest (for Libraries)
Erik Hatcher
 
Apache Solr Workshop
Apache Solr WorkshopApache Solr Workshop
Apache Solr Workshop
Saumitra Srivastav
 
Solr Black Belt Pre-conference
Solr Black Belt Pre-conferenceSolr Black Belt Pre-conference
Solr Black Belt Pre-conferenceErik Hatcher
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr Developers
Erik Hatcher
 
Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)
Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)
Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)
Alexandre Rafalovitch
 
Solr Application Development Tutorial
Solr Application Development TutorialSolr Application Development Tutorial
Solr Application Development TutorialErik Hatcher
 
Apache Solr
Apache SolrApache Solr
Apache Solr
Minh Tran
 
Faster Data Analytics with Apache Spark using Apache Solr
Faster Data Analytics with Apache Spark using Apache SolrFaster Data Analytics with Apache Spark using Apache Solr
Faster Data Analytics with Apache Spark using Apache Solr
Chitturi Kiran
 
Building your own search engine with Apache Solr
Building your own search engine with Apache SolrBuilding your own search engine with Apache Solr
Building your own search engine with Apache Solr
Biogeeks
 
Solr Metrics - Andrzej Białecki, Lucidworks
Solr Metrics - Andrzej Białecki, LucidworksSolr Metrics - Andrzej Białecki, Lucidworks
Solr Metrics - Andrzej Białecki, Lucidworks
Lucidworks
 
Introduction to Apache Lucene/Solr
Introduction to Apache Lucene/SolrIntroduction to Apache Lucene/Solr
Introduction to Apache Lucene/SolrRahul Jain
 
Webinar: Solr & Spark for Real Time Big Data Analytics
Webinar: Solr & Spark for Real Time Big Data AnalyticsWebinar: Solr & Spark for Real Time Big Data Analytics
Webinar: Solr & Spark for Real Time Big Data Analytics
Lucidworks
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to Solr
Erik Hatcher
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to Solr
Erik Hatcher
 
Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with SolrErik Hatcher
 
Integrating Spark and Solr-(Timothy Potter, Lucidworks)
Integrating Spark and Solr-(Timothy Potter, Lucidworks)Integrating Spark and Solr-(Timothy Potter, Lucidworks)
Integrating Spark and Solr-(Timothy Potter, Lucidworks)
Spark Summit
 
Introduction to Lucene and Solr - 1
Introduction to Lucene and Solr - 1Introduction to Lucene and Solr - 1
Introduction to Lucene and Solr - 1
YI-CHING WU
 
Solr 4
Solr 4Solr 4
Solr 4
Erik Hatcher
 

What's hot (20)

Integrating the Solr search engine
Integrating the Solr search engineIntegrating the Solr search engine
Integrating the Solr search engine
 
Solr as a Spark SQL Datasource
Solr as a Spark SQL DatasourceSolr as a Spark SQL Datasource
Solr as a Spark SQL Datasource
 
Lucene's Latest (for Libraries)
Lucene's Latest (for Libraries)Lucene's Latest (for Libraries)
Lucene's Latest (for Libraries)
 
Apache Solr Workshop
Apache Solr WorkshopApache Solr Workshop
Apache Solr Workshop
 
Solr Black Belt Pre-conference
Solr Black Belt Pre-conferenceSolr Black Belt Pre-conference
Solr Black Belt Pre-conference
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr Developers
 
Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)
Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)
Rebuilding Solr 6 examples - layer by layer (LuceneSolrRevolution 2016)
 
Solr Application Development Tutorial
Solr Application Development TutorialSolr Application Development Tutorial
Solr Application Development Tutorial
 
Apache Solr
Apache SolrApache Solr
Apache Solr
 
Faster Data Analytics with Apache Spark using Apache Solr
Faster Data Analytics with Apache Spark using Apache SolrFaster Data Analytics with Apache Spark using Apache Solr
Faster Data Analytics with Apache Spark using Apache Solr
 
Building your own search engine with Apache Solr
Building your own search engine with Apache SolrBuilding your own search engine with Apache Solr
Building your own search engine with Apache Solr
 
Solr Metrics - Andrzej Białecki, Lucidworks
Solr Metrics - Andrzej Białecki, LucidworksSolr Metrics - Andrzej Białecki, Lucidworks
Solr Metrics - Andrzej Białecki, Lucidworks
 
Introduction to Apache Lucene/Solr
Introduction to Apache Lucene/SolrIntroduction to Apache Lucene/Solr
Introduction to Apache Lucene/Solr
 
Webinar: Solr & Spark for Real Time Big Data Analytics
Webinar: Solr & Spark for Real Time Big Data AnalyticsWebinar: Solr & Spark for Real Time Big Data Analytics
Webinar: Solr & Spark for Real Time Big Data Analytics
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to Solr
 
Introduction to Solr
Introduction to SolrIntroduction to Solr
Introduction to Solr
 
Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with Solr
 
Integrating Spark and Solr-(Timothy Potter, Lucidworks)
Integrating Spark and Solr-(Timothy Potter, Lucidworks)Integrating Spark and Solr-(Timothy Potter, Lucidworks)
Integrating Spark and Solr-(Timothy Potter, Lucidworks)
 
Introduction to Lucene and Solr - 1
Introduction to Lucene and Solr - 1Introduction to Lucene and Solr - 1
Introduction to Lucene and Solr - 1
 
Solr 4
Solr 4Solr 4
Solr 4
 

Similar to Lifecycle of a Solr Search Request - Chris "Hoss" Hostetter, Lucidworks

Search Engine Building with Lucene and Solr (So Code Camp San Diego 2014)
Search Engine Building with Lucene and Solr (So Code Camp San Diego 2014)Search Engine Building with Lucene and Solr (So Code Camp San Diego 2014)
Search Engine Building with Lucene and Solr (So Code Camp San Diego 2014)
Kai Chan
 
IT talk SPb "Full text search for lazy guys"
IT talk SPb "Full text search for lazy guys" IT talk SPb "Full text search for lazy guys"
IT talk SPb "Full text search for lazy guys"
DataArt
 
Coffee at DBG- Solr introduction
Coffee at DBG- Solr introduction Coffee at DBG- Solr introduction
Coffee at DBG- Solr introduction
Sajindbg Dbg
 
Hands On Spring Data
Hands On Spring DataHands On Spring Data
Hands On Spring Data
Eric Bottard
 
code4lib 2011 preconference: What's New in Solr (since 1.4.1)
code4lib 2011 preconference: What's New in Solr (since 1.4.1)code4lib 2011 preconference: What's New in Solr (since 1.4.1)
code4lib 2011 preconference: What's New in Solr (since 1.4.1)
Erik Hatcher
 
Apache Solr Workshop
Apache Solr WorkshopApache Solr Workshop
Apache Solr WorkshopJSGB
 
Solr introduction
Solr introductionSolr introduction
Solr introduction
Lap Tran
 
Dev8d Apache Solr Tutorial
Dev8d Apache Solr TutorialDev8d Apache Solr Tutorial
Dev8d Apache Solr TutorialSourcesense
 
Berlin Buzz Words - Apache Drill by Ted Dunning & Michael Hausenblas
Berlin Buzz Words - Apache Drill by Ted Dunning & Michael HausenblasBerlin Buzz Words - Apache Drill by Ted Dunning & Michael Hausenblas
Berlin Buzz Words - Apache Drill by Ted Dunning & Michael Hausenblas
MapR Technologies
 
Apache Solr crash course
Apache Solr crash courseApache Solr crash course
Apache Solr crash courseTommaso Teofili
 
Beyond full-text searches with Lucene and Solr
Beyond full-text searches with Lucene and SolrBeyond full-text searches with Lucene and Solr
Beyond full-text searches with Lucene and Solr
Bertrand Delacretaz
 
Solr and Spark for Real-Time Big Data Analytics: Presented by Tim Potter, Luc...
Solr and Spark for Real-Time Big Data Analytics: Presented by Tim Potter, Luc...Solr and Spark for Real-Time Big Data Analytics: Presented by Tim Potter, Luc...
Solr and Spark for Real-Time Big Data Analytics: Presented by Tim Potter, Luc...
Lucidworks
 
Solr 101
Solr 101Solr 101
Solr 101
Findwise
 
20150210 solr introdution
20150210 solr introdution20150210 solr introdution
20150210 solr introdution
Xuan-Chao Huang
 
Solr at zvents 6 years later & still going strong
Solr at zvents   6 years later & still going strongSolr at zvents   6 years later & still going strong
Solr at zvents 6 years later & still going strong
lucenerevolution
 
Querying federations 
of Triple Pattern Fragments
Querying federations 
of Triple Pattern FragmentsQuerying federations 
of Triple Pattern Fragments
Querying federations 
of Triple Pattern Fragments
Ruben Verborgh
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr DevelopersErik Hatcher
 
Information Retrieval - Data Science Bootcamp
Information Retrieval - Data Science BootcampInformation Retrieval - Data Science Bootcamp
Information Retrieval - Data Science Bootcamp
Kais Hassan, PhD
 
Apache Solr + ajax solr
Apache Solr + ajax solrApache Solr + ajax solr
Apache Solr + ajax solr
Net7
 
Solr Masterclass Bangkok, June 2014
Solr Masterclass Bangkok, June 2014Solr Masterclass Bangkok, June 2014
Solr Masterclass Bangkok, June 2014
Alexandre Rafalovitch
 

Similar to Lifecycle of a Solr Search Request - Chris "Hoss" Hostetter, Lucidworks (20)

Search Engine Building with Lucene and Solr (So Code Camp San Diego 2014)
Lifecycle of a Solr Search Request - Chris "Hoss" Hostetter, Lucidworks

  • 1. Lifecycle of a Solr Search Request
    Chris "Hoss" Hostetter - 2017-09-14
    https://home.apache.org/~hossman/rev2017/
    https://twitter.com/_hossman
    https://www.lucidworks.com/
    Abstract: This intermediate session for existing Solr users will provide a Deep Dive look into the lifecycle of a Solr Search Request. We will drill down through each layer of code, discussing what happens at each stage -- including when & how inter-node communication takes place in a multi-node SolrCloud cluster. Along the way, we will also review the various places where users can configure existing (or custom written) plugins to override or amend the default behavior.
  • 2. Agenda
    Deep Dive look into the lifecycle of 4 Solr Search Requests...
    Single Node: Single SolrCore
      1. Simple Query
      2. Facet Query
    SolrCloud: 2 Shards + 2 Replicas
      3. Simple Query
      4. Facet Query
    ...and where various types of Plugins can be used.
  • 3. Simple Query
    Single Node: Single SolrCore
    bin/solr -e techproducts
    http://localhost:8983/solr/techproducts/select
      ? q = ipod
      & sort = inStock desc, score desc
      & fl = id, name
      & rows = 10
    This sample paginated query is based on the techproducts example configs & data that have been included in every release of Solr since it was first open sourced. I have a nostalgic affection for this silly little dataset.
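As a minimal sketch, the request on this slide can be assembled with Python's standard library. The parameter names and values come from the slide; the host and port assume the techproducts example is running locally, and nothing here actually contacts Solr.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Base URL from the slide (assumes a local techproducts example).
BASE_URL = "http://localhost:8983/solr/techproducts/select"

# Request parameters exactly as listed on the slide.
params = {
    "q": "ipod",
    "sort": "inStock desc, score desc",
    "fl": "id, name",
    "rows": 10,
}

# urlencode percent-encodes the values so the URL is safe to send.
request_url = BASE_URL + "?" + urlencode(params)
print(request_url)

# Round-trip check: decoding the query string recovers the parameters.
decoded = parse_qs(urlsplit(request_url).query)
print(decoded)
```

The spaces shown around `=` and `&` on the slide are only for readability; the encoded URL carries them only inside parameter values.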
  • 4. [Diagram: the request ...:8983/solr/techproducts/select?... passing through HTTP (Jetty), the Solr Webapp (/solr ➔), the SolrDispatchFilter, the CoreContainer (/techproducts ➔ SolrCore), and the techproducts SolrCore (/select? ➔ RequestHandler, wt=json ➔ ResponseWriter). Also shown: UI (HTML, Javascript, Images, CSS), SolrCore foo, SolrCore etc...]
    Purple: The HTTP layer, currently implemented by Jetty
    Blue: Solr runs as a "webapp" inside the Jetty Servlet container (but that's just an implementation detail)
    Black: The key pieces of the Solr webapp: misc "flat files" that power the Solr UI, and the SolrDispatchFilter, which is responsible for mapping all HTTP requests/responses into their internal Solr representations and executing them
    Red: CoreContainer is a singleton responsible for managing the lifecycle of SolrCores
    Green: Each SolrCore encapsulates the configs & data for a single "index" (which in a SolrCloud configuration would be a replica of some shard of some collection)
  • 5. [Diagram: SolrCore: techproducts. SolrRequestHandlers: SearchHandler: /select (initParams: df = text (default); components (implicit): query, etc...), SearchHandler: /etc..., UpdateRequestHandler: /etc... SearchComponents: QueryComponent: query (prepare(): df=text&q=ipod ➔ Query, etc...; process(): etc...), FacetComponent: facet, etc...]
    Green: The SolrCore used for this (HTTP) request
    Black: Named instances of (pluggable) SolrRequestHandlers. SearchHandler is the most common, and it uses a configurable list of SearchComponents
    Red: Named instances of (pluggable) SearchComponents. QueryComponent is the only one used in this simple request
    All SearchComponents implement prepare() & process() methods, which are called by SearchHandler
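The prepare()/process() contract described on this slide can be illustrated with a toy model. This is a sketch, not Solr's Java code: the classes, the dictionary standing in for the request state (`rb`), and the two-document index are all hypothetical. The one behavior it is meant to show is that the handler calls prepare() on every component before it calls process() on any of them.

```python
class QueryComponent:
    """Hypothetical stand-in for the 'query' SearchComponent."""

    def prepare(self, rb):
        rb["trace"].append("query.prepare")
        # Apply the default field (df) and keep a trivially "parsed" query.
        df = rb["params"].get("df", "text")
        rb["query"] = (df, rb["params"]["q"])

    def process(self, rb):
        rb["trace"].append("query.process")
        field, term = rb["query"]
        rb["response"]["docs"] = [
            doc["id"] for doc in rb["index"] if term in doc[field].split()
        ]


class FacetComponent:
    """Hypothetical stand-in for the 'facet' SearchComponent."""

    def prepare(self, rb):
        rb["trace"].append("facet.prepare")
        rb["facet"] = rb["params"].get("facet") == "true"

    def process(self, rb):
        rb["trace"].append("facet.process")
        if rb["facet"]:
            rb["response"]["facet_counts"] = {}


def handle_request(components, params, index):
    """Toy SearchHandler: prepare every component, then process every component."""
    rb = {"params": params, "index": index, "response": {}, "trace": []}
    for component in components:
        component.prepare(rb)
    for component in components:
        component.process(rb)
    return rb


index = [
    {"id": "1", "text": "apple ipod video"},
    {"id": "2", "text": "hard drive"},
]
rb = handle_request([QueryComponent(), FacetComponent()], {"q": "ipod"}, index)
print(rb["trace"])
print(rb["response"])
```

Because faceting was not requested, the facet component runs both phases but contributes nothing to the response, which matches the slide's remark that only QueryComponent matters for this simple request.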
  • 6. [Diagram: QueryComponent.prepare(): rows=10 ➔ ok?; fl=id,name ➔ ok?; q ➔ LuceneQParser; LuceneQParser + (df=text ➔ text) + "ipod" ➔ TermQuery; ("inStock desc" ➔ bool ➔ BoolField.getSortField(inStock,desc) + "score desc" ➔ SortField.SCORE) ➔ Sort. QParsers: LuceneQParser, DismaxQParser, etc... IndexSchema: SchemaFields ➔ FieldTypes. FieldTypes: TextField: text (Analyzer, Similarity, etc...), TextField: etc.. (Analyzer, Similarity, etc...), BoolField: bool (Analyzer, Similarity, getSortField, etc...). SolrIndexSearcher]
    Red: QueryComponent.prepare() and its basic logic for validating & parsing the basic request params
    Green: Named instances of (pluggable) QParserPlugins for parsing query strings (q & fq params). Here the (implicit) default LuceneQParser
    Orange: The IndexSchema, which contains named SchemaFields (or dynamicFields), which map to...
    Purple: Named instances of (pluggable) FieldTypes, which dictate how the field names mapped to them are parsed, indexed, sorted, queried, etc...
    Blue: The SolrIndexSearcher is ultimately what will be queried with these parsed queries & sort objects
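To make the sort-parsing step concrete, here is a small sketch that turns the `sort` string from the request into an ordered list of clauses and applies it to a few in-memory documents. The function names and sample documents are hypothetical, and the parser only handles plain `field direction` clauses; it does not attempt function sorts, and it bypasses the FieldType lookup that the slide describes.

```python
def parse_sort(spec):
    """Parse 'inStock desc, score desc' into [(field, descending), ...]."""
    clauses = []
    for clause in spec.split(","):
        field, direction = clause.split()
        if direction not in ("asc", "desc"):
            raise ValueError(f"unknown sort direction: {direction}")
        clauses.append((field, direction == "desc"))
    return clauses


def apply_sort(docs, clauses):
    """Order docs by the clauses, first clause most significant."""
    ordered = list(docs)
    # Python's sort is stable, so sorting by the least significant clause
    # first and the most significant clause last gives a multi-key ordering.
    for field, descending in reversed(clauses):
        ordered.sort(key=lambda doc: doc[field], reverse=descending)
    return ordered


# Hypothetical documents with a boolean field and a score.
docs = [
    {"id": "A", "inStock": False, "score": 2.0},
    {"id": "B", "inStock": True, "score": 1.0},
    {"id": "C", "inStock": True, "score": 3.0},
]

clauses = parse_sort("inStock desc, score desc")
sorted_ids = [doc["id"] for doc in apply_sort(docs, clauses)]
print(clauses)
print(sorted_ids)
```

In-stock documents come first, ordered by score among themselves, followed by out-of-stock documents.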
  • 7. [Diagram: QueryComponent.process(): search(Query,filters[],start,rows,Sort,...) ➔ DocList. SolrIndexSearcher.search(...): window(start, rows, windowSize); (queryResultCache? | Index) ➔ DocList. SolrIndexSearcher contents: documentCache, queryResultCache, filterCache, IndexReader (InvertedIndex, Stored Fields). JsonResponseWriter: DocList { + searcher.doc(#) ➔ Stored Fields } ➔ Bytes ➔ HTTP... Other ResponseWriters: XmlResponseWriter, etc...]
  • 8. Red: QueryComponent.process(), which uses the SolrIndexSearcher to execute the Query created by its prepare() method
    Blue: The SolrIndexSearcher includes several caches in addition to the InvertedIndex, and when executing a query, it first expands the requested start/rows to fit a configured "window size" so that "page #2" type requests can result in a cache hit & re-use the results computed for "page #1"
    Orange: The low level InvertedIndex & the queryResultCache that can be used in its place when executing basic searches & the DocList containing a sorted list of (internal) doc#s and their scores for the requested start+rows of this query
    Purple: The Stored Fields of the documents in the index & the documentCache used by SolrIndexSearcher to reduce disk reads when popular documents are frequently matched by searches
    Green: Named instances of (pluggable) QueryResponseWriters, which dictate how the data structures produced once a request is processed get serialized into bytes (for the HTTP response returned to the original client by Jetty)
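The "window size" idea can be sketched as follows. This is an illustration under stated assumptions, not SolrIndexSearcher's code: the class, the cache key, and the `search_fn` callback are hypothetical, and the rounding rule (round start+rows up to the next multiple of the window size) is a simplification of the behavior described above.

```python
def superset_size(start, rows, window_size):
    """Round start+rows up to the next multiple of window_size (at least one window)."""
    needed = start + rows
    windows = max(1, -(-needed // window_size))  # ceiling division
    return windows * window_size


class ToyQueryResultCache:
    """Caches the top-N ids per query so later pages can be sliced from it."""

    def __init__(self, window_size, search_fn):
        self.window_size = window_size
        self.search_fn = search_fn  # hypothetical: (query, n) -> top n ids
        self.cache = {}             # query -> (n_collected, ids)
        self.misses = 0

    def page(self, query, start, rows):
        cached = self.cache.get(query)
        if cached is None or cached[0] < start + rows:
            # Miss: collect a whole window's worth, not just this page.
            self.misses += 1
            n = superset_size(start, rows, self.window_size)
            cached = (n, self.search_fn(query, n))
            self.cache[query] = cached
        return cached[1][start:start + rows]


# Hypothetical index: 100 doc ids already in ranked order for any query.
RANKED = list(range(100))
cache = ToyQueryResultCache(window_size=20, search_fn=lambda q, n: RANKED[:n])

page1 = cache.page("ipod", 0, 10)
misses_after_page1 = cache.misses
page2 = cache.page("ipod", 10, 10)
misses_after_page2 = cache.misses
page3 = cache.page("ipod", 20, 10)
misses_after_page3 = cache.misses
print(misses_after_page1, misses_after_page2, misses_after_page3)
```

With a window of 20, page 1 and page 2 share a single search, and only page 3 forces the searcher back to the index.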
  • 9. More Complex Query
    Single Node: Single SolrCore
    http://localhost:8983/solr/techproducts/select
      ? q = ipod
      & fq = price:[* TO 1000]
      & sort = div(popularity,price) asc, score desc
      & fl = id, name, why:[explain style=nl]
      & facet = true
      & facet.field = cat
    This slightly more interesting query builds off the previous example by:
      - Adding a "filter query" on the (numeric) price field
      - Changing the primary sort criteria to be a mathematical function against 2 fields
      - Requesting an additional pseudo-field explaining the score of each document
      - Faceting on the "cat" (aka: category) field
  • 10. [Diagram: the request ...:8983/solr/techproducts/select?... passing through HTTP (Jetty), the Solr Webapp (/solr ➔), the SolrDispatchFilter, the CoreContainer (/techproducts ➔ SolrCore), and the techproducts SolrCore (/select? ➔ RequestHandler, wt=json ➔ ResponseWriter). Also shown: UI (HTML, Javascript, Images, CSS), SolrCore foo, SolrCore etc...]
    The HTTP, Webapp, DispatchFilter, CoreContainer, SolrCore, and RequestHandler layers all function exactly as in our previous (simpler) example. It's only once the SearchHandler starts looping over the components that things get more interesting....
  • 11. [Diagram: QueryComponent.prepare(): etc...; "price:[* TO 1000]" ➔ float ➔ PointRangeQuery(...) ➔ filters[]; div(popularity,price) ➔ ValueSource(IntFieldSource,...). FacetComponent.prepare(): facet=true ✔; facet.field=cat ➔ ok?; needDocSet = true. ValueSourceParsers: div(), sum(), etc... IndexSchema: SchemaFields ➔ FieldTypes. FieldTypes: FloatPointField: float (ValueSource, getRangeQuery(), etc...), IntPointField: int (ValueSource, etc...). SolrIndexSearcher]
    Most items identical to those shown in the "simple" query are omitted for brevity. Of the new items shown here...
    Red: In addition to some additional logic in the QueryComponent.prepare() method (to parse the filter query and more complex sort), we now also see the FacetComponent.prepare() method, which does its own validation & sets a flag indicating that it needs extra info (the DocSet) once SolrIndexSearcher is asked to execute the Query
    Green: Named instances of (pluggable) ValueSourceParsers for parsing function strings -- used here in our sort, but could also be used in queries
    Orange: As before, the IndexSchema, now showing that FieldTypes are also responsible for providing the range query (filter) and ValueSources (used by the functions)
  • 12. [Diagram: QueryComponent.process(): search(...) ➔〈DocList,DocSet〉, etc... FacetComponent.process(): For Each "cat" Index Terms: ➔ Intersect with DocSet. DocTransformers: ExplainAugmenter (searcher.explain(#)), ChildDocTransformer, SubQueryAugmenter, etc... JsonResponseWriter: DocList { + searcher.doc(#) ➔ Stored Fields + [explain ...] } + Facet Counts ➔ Bytes ➔ HTTP... SolrIndexSearcher contents: documentCache, queryResultCache, filterCache, IndexReader (InvertedIndex, Stored Fields)]
    Most items identical to those shown in the "simple" query are omitted for brevity. Of the new items shown here...
    Red: Now when QueryComponent.process() executes the search, the "needDocSet" flag set by FacetComponent.prepare() is also used. FacetComponent.process() can then use the resulting DocSet (an unordered set of all matching doc#s -- regardless of sort) to compute the facet counts.
    Olive: Named instances of (pluggable) DocTransformers (or Augmenters), which can be used to annotate individual documents returned in the results. For this query in particular we see the ExplainAugmenter, which uses the SolrIndexSearcher to get a (debugging) data structure "explaining" how the score of each document was computed.
    Green: The JsonResponseWriter not only returns the Stored Fields of each document, but also the results of any DocTransformers. It also serializes the Facet Counts.
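The "for each term, intersect with the DocSet" step can be sketched with plain Python sets. This only illustrates the counting idea named on the slide: the postings data is invented, the function is hypothetical, and the `min_count` cutoff is a choice made for this example rather than a statement about Solr's defaults or its actual faceting algorithms.

```python
def facet_counts(term_postings, doc_set, min_count=1):
    """Count, for each indexed term, how many matching docs contain it."""
    counts = []
    for term, postings in term_postings.items():
        n = len(postings & doc_set)  # intersect term's docs with the DocSet
        if n >= min_count:
            counts.append((term, n))
    # Highest count first; ties broken alphabetically for a stable order.
    counts.sort(key=lambda pair: (-pair[1], pair[0]))
    return counts


# Hypothetical postings for the "cat" field: term -> internal doc numbers.
cat_postings = {
    "electronics": {1, 2, 3, 4},
    "music": {2, 3},
    "memory": {5},
}

# Hypothetical DocSet: every doc matching q + fq, regardless of sort or paging.
doc_set = {2, 3, 4}

counts = facet_counts(cat_postings, doc_set)
print(counts)
```

The counts depend on the full DocSet rather than the paginated DocList, which is why FacetComponent.prepare() has to ask for the DocSet up front.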
  • 13. Simple Query SolrCloud: 4 Nodes, 2 Shards, 2 Replicas bin/solr -e cloud ... http://localhost:8983/solr/techproducts/select ? q = ipod & sort = inStock desc, score desc & fl = id, name & rows = 10 This is the same as our original simple query, still using the techproducts sample configs & data, but from here on we'll assume we're using a 4 node SolrCloud cluster, with the techproducts collection configured to have 2 shards, with a replication factor of 2.
  • 14. SolrDispatchFilter /techproducts ➔ tech_s1_r2 Jetty: http://host1:8983 SolrDispatchFilter /techproducts ?➔ host4 Jetty: http://host3:8983 SolrDispatchFilter /techproducts ?➔ tech_s2_r2 Jetty: http://host2:8983 SolrDispatchFilter /techproducts ➔ tech_s2_r1 Jetty: http://host4:8983 techproducts tech_s1_r2 foo foo_s1_r1 foo foo_s2_r1 techproducts tech_s1_r1 techproducts tech_s2_r1 foo foo_s1_r2 techproducts tech_s2_r2 foo foo_s2_r2 Purple: 4 Jetty instances, running on (the same port 8983 of) 4 different hosts. Black: The 4 SolrDispatchFilters running inside each of these 4 Jetty instances, and how each of them resolves requests for the techproducts collection. Green: The individual SolrCores (which are each a replica of some shard of a collection) running in each Solr node. Note that for the purposes of illustrating the different possible ways a Solr request may be routed, host3 does not contain any SolrCores that are part of the techproducts collection. (Other layers such as the Solr webapp and the CoreContainer have been omitted to save space.)
  • 15. coordinator shard1 QueryComponent: prepare() + process() α: q=ipod&fl=id&fsv=true ➔ top ids + sort values β1: ids=X,Y,Z&fl=name ➔ ... shard2 QueryComponent: prepare() + process() α: q=ipod&fl=id&fsv=true ➔ top ids + sort values β2: ids=A,..,G&fl=name ➔ ... SearchHandler: /select Repeat until done: query.distributedProcess ➔ ShardRequests (α,β) Loop: ShardRequests query.handleResponse QueryComponent: distributedProcess() α: shard top10 + sort values β: full fl for final top10 ids FacetComponent
  • 16. Purple: The HTTP Layer showing 3 hosts: an arbitrary 'coordinator' node, and 2 nodes each hosting a replica of one of the 2 shards for the collection. Black: SearchHandler. On the coordinator node, SearchHandler runs additional logic to send the sub-requests created by its SearchComponents to arbitrarily selected replicas of each shard. On the replicas handling these sub-requests, the SearchHandler processes them just as if they were simple (single node) queries. Red: SearchComponent methods. On the coordinator node, SearchHandler loops over every component calling SearchComponent.distributedProcess() to create/modify sub-requests for the individual shards, and then calls SearchComponent.handleResponse() to merge the results from each shard and decide if/when/what additional information may be needed. This process repeats until all calls to distributedProcess() on all SearchComponents indicate that they are finished. Green & Blue: The 2 stages (α & β) of shard sub-requests needed to process this simple query. Note that the α-requests are identical for both shards, but the β-requests are slightly different, since each requests the fl fields for the matches specific to that shard.
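The "repeat until done" loop on the coordinator can be modelled with a toy example. This is a sketch under stated assumptions, not Solr's Java code: the class names, the string-valued stages, the boolean "finished" return value, and the `send` callback are all hypothetical simplifications of the distributedProcess() / handleResponse() interplay described above.

```python
from typing import Callable, Dict, List, Tuple


class QueryLikeComponent:
    """Toy component needing two rounds: alpha (ids + sort values), then beta (fl fields)."""

    def __init__(self) -> None:
        self.stage = "alpha"
        self.merged: List[Tuple[str, Dict[str, str]]] = []

    def distributed_process(self, outgoing: List[str]) -> bool:
        """Add this round's sub-request, if any; return True once finished."""
        if self.stage == "done":
            return True
        outgoing.append(self.stage)
        return False

    def handle_response(self, request: str, responses: Dict[str, str]) -> None:
        """Merge the shard responses and decide which stage comes next."""
        self.merged.append((request, responses))
        self.stage = "beta" if request == "alpha" else "done"


class IdleComponent:
    """Toy component with nothing to do for this request (e.g. facet=false)."""

    def distributed_process(self, outgoing: List[str]) -> bool:
        return True

    def handle_response(self, request: str, responses: Dict[str, str]) -> None:
        pass


def coordinate(components, shards: List[str], send: Callable[[str, str], str]) -> int:
    """Loop until every component reports that it is finished; return the round count."""
    rounds = 0
    while True:
        outgoing: List[str] = []
        finished = [c.distributed_process(outgoing) for c in components]
        if all(finished):
            return rounds
        for request in outgoing:
            # One replica per shard answers each sub-request.
            responses = {shard: send(shard, request) for shard in shards}
            for c in components:
                c.handle_response(request, responses)
        rounds += 1


query = QueryLikeComponent()
rounds = coordinate([query, IdleComponent()], ["shard1", "shard2"],
                    send=lambda shard, request: f"{shard}:{request}")
```

In this model the loop ends after two rounds (α, then β), matching the two stages shown for the simple query.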
  • 17. Shard Request α q=ipod&fl=id&fsv=true&rows=10 sort=inStock desc, score desc numFound=42+314=356 Z, Zebra F, Frog B, Boat D, Deer C, Car X, X-Ray G, Gong A, Apple Y, Yo-Yo E, Ear Merged Shard 1 numFound=42 F〈true,6〉 B〈true,6〉 D〈true,5〉 C〈true,3〉 G〈true,2〉 A〈true,1〉 E〈false,5〉 Shard 2 numFound=314 Z〈true,6〉 X〈true,3〉 Y〈false,9〉 Shard Request β q=ipod&ids=...&fl=name Shard 1 A, Apple B, Boat C, Car D, Deer E, Ear F, Frog G, Gong Shard 2 X, X-Ray Y, Yo-Yo Z, Zebra Here we see hypothetical α request+responses, hypothetical β requests+responses, & the final Merged results from both -- showing how the IDs and sort values from the α request are used to determine which documents will be in the final results, and in which order. For these specific documents, the β requests+responses fill in the fl fields for the final client. Red & Blue: The responses from shard1 & shard2 for the α request Green & Purple: The responses from shard1 & shard2 for the β request
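The α merge on this slide can be reproduced with a short Python sketch that uses the slide's hypothetical ids and 〈inStock,score〉 sort values. The data structures and variable names are assumptions made for illustration. Ties between equal sort values keep their input order here, which is a property of this sketch only, so tied documents (Z, F, B and C, X) may come out in a different relative order than on the slide.

```python
from typing import Dict, List, NamedTuple


class Hit(NamedTuple):
    doc_id: str
    in_stock: bool
    score: int
    shard: str


# Alpha responses from the slide: each shard's own top hits with sort values.
alpha = {
    "shard1": {"numFound": 42,
               "hits": [("F", True, 6), ("B", True, 6), ("D", True, 5), ("C", True, 3),
                        ("G", True, 2), ("A", True, 1), ("E", False, 5)]},
    "shard2": {"numFound": 314,
               "hits": [("Z", True, 6), ("X", True, 3), ("Y", False, 9)]},
}

rows = 10
num_found = sum(resp["numFound"] for resp in alpha.values())
all_hits = [Hit(doc_id, in_stock, score, shard)
            for shard, resp in alpha.items()
            for doc_id, in_stock, score in resp["hits"]]

# sort = inStock desc, score desc, then keep only the requested rows.
merged = sorted(all_hits, key=lambda h: (h.in_stock, h.score), reverse=True)[:rows]
merged_ids = [h.doc_id for h in merged]

# Beta requests: ask each shard only for the winning ids that it owns.
beta_ids: Dict[str, List[str]] = {}
for hit in merged:
    beta_ids.setdefault(hit.shard, []).append(hit.doc_id)
```

Only the ids and sort values cross the network in the α stage; the name field is fetched in the β stage for the documents that made the final cut.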
  • 18. Complex Query* SolrCloud: 4 Nodes, 2 Shards, 2 Replicas http://localhost:8983/solr/techproducts/select ? q = ipod & sort = inStock desc, score desc & fl = id, name & facet = true & facet.field = cat In the interest of time, this query is not as "Complex" as the "Complex" Single Core query we looked at before. I've omitted things like fq params, sorting on functions, and the use of DocTransformers in the fl because nothing about how those are handled in a Single Core query changes when they are requested by a coordinator node in a SolrCloud query.
  • 19. coordinator shard1 QueryComponent: prepare() + process() α: q=ipod&fl=id&fsv=true ➔ top ids + sort values β1: ids=X,Y,Z&fl=name ➔... FacetComponent: prepare() + process() α: facet.limit=N + extra ➔ top terms w/counts β1: ..._terms=aa,qq,... ➔... QueryComponent: distributedProcess() α: shard top10 + sort values β: full fl for final top10 ids shard2 FacetComponent: distributedProcess() α: facet.field=cat w/facet.limit overrequest β: request missing counts for final top terms SearchHandler: /select ➔ ShardRequests (α, β)
  • 20. Purple: The HTTP Layer showing 3 hosts: an arbitrary 'coordinator' node, and 2 nodes each hosting a replica of one of the 2 shards for the collection. To save space, the (largely redundant) details of the requests to shard2 are not shown. Black: SearchHandler. To save space, the details (shown in previous diagrams) regarding how SearchHandler processes requests when acting as a coordinator have been omitted -- the key thing to note is that even with the added complexity of the FacetComponent, there are still only 2 stages of sub-requests to each shard (α & β). Red: SearchComponent methods. QueryComponent behaves exactly as before. Now that FacetComponent is in use, it can modify the sub-requests created by QueryComponent to "piggy back" on them and request additional information from each shard. Green & Blue: The 2 stages (α & β) of shard sub-requests needed to process this query. Although the details of the requests to shard2 are omitted for brevity, the α-requests are identical for both shards, and (as before) the β-requests are slightly different, since each requests both the fl fields for the document matches specific to that shard and the facet counts for any "candidate" terms that were not included in the α response from that shard.
  • 21. Shard Request α facet.field=cat facet.limit=N+OVERREQUEST Shard Request β facet.field={!_terms=...}cat auto: 253 (3 + 250) lawn: 190 (20 + 170) ... DVD: 102 (5 + 97) Final (Merge α+β)Shard 1 games: 40 ... lawn: 20 books: 10 DVD: 5 ... beach: 4 toys: 3 Shard 2 auto: 250 lawn: 170 ... food: 100 DVD: 97 ... books: 90 clothing: 90 Shard 1 auto: 3 food: 0 Shard 2 games: 45 N auto: 250-253 (? + 250) lawn: 190 (20 + 170) ... games: 40-130 (40 + ?) food: 100-103 (? + 100) DVD: 102 (5 + 97) ... Merge α
  • 22. Here we see the additional information involved in α & β requests+responses+merging for our more complex queries compared to what we looked at before. The information requested & merged by QueryComponent is omitted for brevity, and we focus solely on how FacetComponent modifies those requests to "overrequest" the original facet.limit and what it does with the results. In the α request, FacetComponent over-requests additional terms from each shard beyond what the user asked for. In the β request, it asks each shard for the details about any terms that are "candidates" for the final results but were NOT already returned by that shard in the α response. Each term that is a candidate for the final response is shown in a unique color. Black/Grey is used to indicate terms where incomplete information is available to the coordinator, but enough is known to be confident that they can't possibly be candidates for the final results. Faded terms (in italics) show at what stage the coordinating FacetComponent knows that a particular term can be eliminated from consideration. (While the "..." ellipses are used to denote the possibility of many additional terms depending on the value of facet.limit=N (which defaults to 100), viewers may find the easiest way to understand how these results are merged & refined is to assume N=3 and imagine the ellipses do not exist in the diagram.)
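The candidate/elimination reasoning can be worked through in Python using the counts from the previous slide, with the "..." rows dropped and N=3 as suggested above. This is a simplified bound calculation that happens to agree with the slide's numbers; it is not Solr's actual refinement code, and it assumes each shard returned its terms sorted by count, so any term a shard did not return can have at most that shard's smallest returned count.

```python
from typing import Dict, List

# Alpha responses from the slide, ellipses removed, assuming N=3.
alpha: Dict[str, Dict[str, int]] = {
    "shard1": {"games": 40, "lawn": 20, "books": 10, "DVD": 5, "beach": 4, "toys": 3},
    "shard2": {"auto": 250, "lawn": 170, "food": 100, "DVD": 97, "books": 90, "clothing": 90},
}
N = 3

terms = {term for counts in alpha.values() for term in counts}
# Upper limit for any term a shard did NOT return (assumption of this sketch).
ceiling = {shard: min(counts.values()) for shard, counts in alpha.items()}

# Known (lower bound) and best-case (upper bound) totals per term.
known = {t: sum(c.get(t, 0) for c in alpha.values()) for t in terms}
upper = {t: sum(c.get(t, ceiling[s]) for s, c in alpha.items()) for t in terms}

# A term stays a candidate if its best case could reach the Nth-best known count.
threshold = sorted(known.values(), reverse=True)[N - 1]
candidates = {t for t in terms if upper[t] >= threshold}

# Beta: ask each shard only about candidates it has not already reported.
refine: Dict[str, List[str]] = {
    shard: sorted(t for t in candidates if t not in counts)
    for shard, counts in alpha.items()
}

# Beta responses from the slide, merged into the known counts.
beta = {"shard1": {"auto": 3, "food": 0}, "shard2": {"games": 45}}
final = dict(known)
for counts in beta.values():
    for term, count in counts.items():
        final[term] += count
top = sorted(candidates, key=lambda t: final[t], reverse=True)[:N]
```

With these inputs, books, clothing, beach, and toys are eliminated after the α merge because even their best case falls short of the third-best known count, which is why they never appear in a β request.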
  • 23. Q & A
  • 24. Me https://twitter.com/_hossman My Company https://www.lucidworks.com/ These Slides https://home.apache.org/~hossman/rev2017/ Solr Docs & Mailing List https://lucene.apache.org/solr/resources.html