Sunspot is a popular Ruby library providing access to Apache Solr, the renowned text search engine. These slides show how you can use this gem in your app.
3. Apache Solr
Solr is an open source enterprise search platform, written in Java, from the Apache Lucene
project. Its major features include full-text search, hit highlighting, faceted search, real-time
indexing, dynamic clustering, database integration, NoSQL features and rich document (e.g., Word,
PDF) handling.
Providing distributed search and index replication, Solr is designed for scalability and fault
tolerance.
Solr is the second-most popular enterprise search engine after Elasticsearch.
4. Major search features in Solr:
● Full text.
● Phrases.
● Boosting.
● Scoping.
● Disjunctions and conjunctions.
● Pagination.
● Faceting.
○ Field facets.
○ Query facets.
○ Range facets.
● Ordering.
○ Order by function.
● Grouping.
● Geospatial.
● Highlighting.
● Stats.
● Dynamic fields.
Agenda
5. ➢ Text fields will be full-text searchable. Other fields (e.g., integer and string) can be used
to scope queries.
Setting Up Objects
8. ➢ Phrase searches are represented as a double-quoted group of words.
➢ query_phrase_slop sets the number of words that may appear between the words in a phrase.
Search In Depth - cont.
● Phrases
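In Sunspot, the slop is set by calling `query_phrase_slop 1` inside the block passed to `fulltext '"great pizza"'`. To make the definition above concrete, here is a simplified plain-Ruby model of what slop permits. It is only a sketch: `phrase_match?` is a hypothetical helper, it follows the slide's wording (extra words allowed between the phrase words, in order), and it is not how Solr scores phrases internally.

```ruby
# Simplified model: do the phrase words occur in order, with at most
# `slop` extra words in total between them? (Hypothetical helper.)
def phrase_match?(text, phrase, slop = 0)
  words  = text.downcase.scan(/\w+/)
  target = phrase.downcase.scan(/\w+/)
  return false if target.empty?

  # Try every position where the first phrase word occurs.
  words.each_index.any? do |start|
    next false unless words[start] == target.first

    pos  = start
    gaps = 0
    target.drop(1).all? do |word|
      # Earliest later occurrence of the next phrase word.
      nxt = ((pos + 1)...words.size).find { |i| words[i] == word }
      next false if nxt.nil?

      gaps += nxt - pos - 1
      pos = nxt
      gaps <= slop
    end
  end
end

puts phrase_match?("great cheap pizza", "great pizza")     # slop 0
puts phrase_match?("great cheap pizza", "great pizza", 1)  # slop 1
```

With a slop of 0 the extra word "cheap" breaks the phrase; with a slop of 1 it is tolerated.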
10. ➢ Fields not defined as text (e.g., integer, boolean, or time) can be used to scope (restrict)
queries before full-text matching is performed.
Search In Depth - cont.
● Scoping (Scalar Fields)
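In a Sunspot search block, a scope is written as `with(:blog_id, 1)` next to the `fulltext` call. The plain-Ruby sketch below only models the order of operations described above (restrict on scalar fields, then match keywords). The `Post` struct, its fields, and `scoped_search` are hypothetical and stand in for indexed documents.

```ruby
# Hypothetical in-memory documents; title plays the role of a text field,
# blog_id and published play the role of scalar (scoping) fields.
Post = Struct.new(:title, :blog_id, :published)

POSTS = [
  Post.new("Best pizza in town", 1, true),
  Post.new("Pizza dough basics", 2, true),
  Post.new("Draft: pizza ovens", 1, false),
  Post.new("Pasta night", 1, true)
]

# Restrict by scalar fields first, then apply the keyword match.
def scoped_search(posts, keyword, scope = {})
  in_scope = posts.select do |post|
    scope.all? { |field, value| post[field] == value }
  end
  in_scope.select { |post| post.title.downcase.include?(keyword.downcase) }
end

hits = scoped_search(POSTS, "pizza", blog_id: 1, published: true)
puts hits.map(&:title).inspect
```

Only "Best pizza in town" survives both the scope and the keyword match.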
12. ➢ The results array that is returned has methods mixed in that allow it to operate seamlessly
with common pagination libraries like will_paginate and kaminari.
➢ By default, Sunspot requests the first 30 results from Solr.
Search In Depth - cont.
● Pagination
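In a Sunspot search block, the page is chosen with `paginate :page => 2, :per_page => 15`. On the Solr side, a page is just a window described by the `start` and `rows` parameters. The sketch below shows that arithmetic; `solr_window` and `total_pages` are hypothetical helper names, not part of Sunspot.

```ruby
DEFAULT_PER_PAGE = 30 # the default page size mentioned above

# Translate a 1-based page number into Solr's start/rows window.
def solr_window(page: 1, per_page: DEFAULT_PER_PAGE)
  raise ArgumentError, "page must be >= 1" if page < 1
  raise ArgumentError, "per_page must be >= 1" if per_page < 1

  { start: (page - 1) * per_page, rows: per_page }
end

# Number of pages needed to show `total` results.
def total_pages(total, per_page = DEFAULT_PER_PAGE)
  (total.to_f / per_page).ceil
end

p solr_window                         # first page with the default size
p solr_window(page: 2, per_page: 15)  # second page of 15
p total_pages(61)                     # pages needed for 61 hits
```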
13. Faceting
➢ Faceting is a feature of Solr that determines the number of documents that match a given
search and an additional criterion. This allows you to build powerful drill-down interfaces for
search.
➢ Each facet returns zero or more rows, each of which represents a particular criterion
conjoined with the actual query being performed. For field facets, each row represents a
particular value for a given field. For query facets, each row represents an arbitrary scope;
the facet itself is just a means of logically grouping the scopes.
➢ By default Sunspot will only return the first 100 facet values. You can increase this limit, or
force it to return all facets by setting limit to -1.
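In Sunspot, a field facet is requested with `facet :category_ids` and read back through `search.facet(:category_ids).rows`, where each row has a value and a count. To illustrate what those rows contain, here is a plain-Ruby model of field facets and query facets over an in-memory result set. The `Doc` struct, the sample data, and both helper methods are hypothetical; Solr computes these counts inside the index.

```ruby
Doc = Struct.new(:title, :category, :rating)

DOCS = [
  Doc.new("Pizza guide", "food", 4.5),
  Doc.new("Pizza ovens", "gear", 3.0),
  Doc.new("Pizza dough", "food", 5.0),
  Doc.new("Pasta night", "food", 2.0)
]

# Field facet: one row per distinct field value among the matching docs,
# most frequent first. A limit of -1 means "return every row".
def field_facet(matches, field, limit: 100)
  rows = matches.group_by { |doc| doc[field] }
                .map { |value, docs| [value, docs.size] }
                .sort_by { |value, count| [-count, value.to_s] }
  limit == -1 ? rows : rows.first(limit)
end

# Query facet: one row per named scope, each an arbitrary predicate.
def query_facet(matches, scopes)
  scopes.map { |label, predicate| [label, matches.count(&predicate)] }
end

matches = DOCS.select { |doc| doc.title.downcase.include?("pizza") }
p field_facet(matches, :category)
p query_facet(matches, "4.0 and up" => ->(d) { d.rating >= 4.0 },
                       "below 4.0"  => ->(d) { d.rating < 4.0 })
```

Each row pairs a criterion with the number of matching documents that also satisfy it, which is the information a drill-down interface needs.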
17. ➢ By default Sunspot orders results by "score": the Solr-determined relevancy metric. Sorting
can be customized with the order_by method.
Ordering
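In a Sunspot search block, a custom order looks like `order_by :published_at, :desc`. The sketch below models the two behaviors on an in-memory list: relevance order when nothing is specified, and ordering by a chosen field otherwise. The `Hit` struct, the sample scores, the integer years used for `published_at`, and `order_hits` are all hypothetical.

```ruby
Hit = Struct.new(:title, :score, :published_at)

HITS = [
  Hit.new("Pizza ovens", 0.4, 2019),
  Hit.new("Pizza guide", 0.9, 2017),
  Hit.new("Pizza dough", 0.7, 2021)
]

# With no arguments this mimics the default: highest score first.
def order_hits(hits, field = :score, direction = :desc)
  sorted = hits.sort_by { |hit| hit[field] }
  direction == :desc ? sorted.reverse : sorted
end

puts order_hits(HITS).map(&:title).inspect
puts order_hits(HITS, :published_at, :asc).map(&:title).inspect
```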
18. ➢ Solr supports sorting on multiple fields using custom functions (Solr 3.1 and above).
Ordering by function
19. ➢ Solr supports grouping documents, similar to an SQL GROUP BY.
➢ Grouping is only supported on string fields that are not multivalued. To group on a field of a
different type (e.g., integer), add a denormalized string type.
Grouping
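In Sunspot, grouping is requested with `group :blog_id_str` inside the search block. The denormalization step is the part that tends to cause confusion, so here is a plain-Ruby sketch of it: an integer `blog_id` gets a string twin, and hits are grouped on the string. `BlogPost`, `blog_id_str`, and `group_hits` are hypothetical names chosen for illustration.

```ruby
BlogPost = Struct.new(:title, :blog_id) do
  # Denormalized string copy of the integer blog_id, kept only for grouping.
  def blog_id_str
    blog_id.to_s
  end
end

BLOG_POSTS = [
  BlogPost.new("Pizza guide", 1),
  BlogPost.new("Pizza dough", 2),
  BlogPost.new("Pizza ovens", 1)
]

# Group matching posts by the string field and keep at most `limit` per group.
def group_hits(posts, limit: 1)
  posts.group_by(&:blog_id_str).transform_values { |docs| docs.first(limit) }
end

group_hits(BLOG_POSTS, limit: 2).each do |blog, docs|
  puts "blog #{blog}: #{docs.map(&:title).join(', ')}"
end
```

As with SQL GROUP BY, each distinct key appears once, and the per-group limit decides how many documents are shown under it.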
24. ● Solr can return some statistics on indexed numeric fields, for example statistics for an
average_rating field.
○ Stats on multiple fields.
○ Faceting on stats.
Stats
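In Sunspot, statistics are requested with `stats :average_rating` and read back with calls such as `search.stats(:average_rating).min`. The sketch below computes the same kind of summary in plain Ruby so the returned numbers are easier to interpret. `field_stats` and the sample ratings are hypothetical, and only a subset of the values Solr can report is shown.

```ruby
# Summarize a numeric field. A nil entry stands for a document that has
# no value for the field and is counted as missing.
def field_stats(values)
  present = values.compact
  stats = { count: present.size, missing: values.size - present.size }
  return stats if present.empty?

  sum = present.sum
  stats.merge(min: present.min, max: present.max, sum: sum,
              mean: sum / present.size.to_f)
end

ratings = [4.0, 2.5, nil, 5.0] # hypothetical average_rating values
stats = field_stats(ratings)
puts "min=#{stats[:min]} max=#{stats[:max]} mean=#{stats[:mean].round(2)}"
```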
25. Dynamic Fields
Dynamic fields allow Solr to index fields that you did not explicitly define in
your schema. This is useful if you discover you have forgotten to define one
or more fields. Dynamic fields can make your application less brittle by
providing some flexibility in the documents you can add to Solr.
Note: you can’t define a dynamic_text field. Hence, it is not possible to do a
fulltext search on dynamic fields.
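26.
class MyClass
  searchable do
    # The block returns a hash of { category name => values }; each key
    # becomes its own dynamic field under the :custom_category_ids base name.
    dynamic_integer :custom_category_ids, :multiple => true do
      custom_categories.inject(Hash.new { |h, k| h[k] = [] }) do |map, custom_category|
        map[custom_category.name] << custom_category_values_for(custom_category)
        map
      end
    end
  end
end

search = MyClass.search do
  # Use the same base name and the same key (the category name) as in the definition.
  dynamic(:custom_category_ids) do
    facet(some_custom_category.name)
  end
end
facet = search.facet(:custom_category_ids, some_custom_category.name)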
Dynamic Fields