This document discusses strategies for structuring a decoupled full-text search system in three parts. First, it introduces full-text search engines and their purpose of optimizing text searches and relevance ranking. Second, it warns of the "convenient trap" of directly coupling search logic to models, which can lead to complex and brittle code. Third, it proposes designing search features as decoupled modules to isolate search logic and enable independent testing of filters, facets and other components.
11. SEARCH ENGINE
Built to run searches and generate
statistics on that data
• Optimized for handling text
• Granular index structure
• Relevance ranking
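To make "granular index structure" and "relevance ranking" a little more concrete, here is a toy inverted index in plain Ruby. It is only a sketch: the class name `ToyIndex` is made up for this example, and scoring by raw term frequency is a deliberate simplification, not how Elasticsearch actually ranks results.

```ruby
# Toy inverted index: maps each term to the documents that contain it.
# Illustration only; real engines use far more elaborate analysis and scoring.
class ToyIndex
  def initialize
    # term => { doc_id => number of occurrences }
    @postings = Hash.new { |hash, term| hash[term] = Hash.new(0) }
  end

  def add(doc_id, text)
    tokenize(text).each { |term| @postings[term][doc_id] += 1 }
  end

  # Returns doc ids ordered by how often the query terms occur in them.
  def search(query)
    scores = Hash.new(0)
    tokenize(query).each do |term|
      next unless @postings.key?(term)

      @postings[term].each { |doc_id, count| scores[doc_id] += count }
    end
    scores.sort_by { |doc_id, score| [-score, doc_id] }.map(&:first)
  end

  private

  def tokenize(text)
    text.downcase.scan(/[[:alnum:]]+/)
  end
end

index = ToyIndex.new
index.add(1, 'Ruby search basics')
index.add(2, 'Search, search and more search')
index.add(3, 'Cooking for beginners')
ranked = index.search('search')
```

Document 2 mentions the term more often than document 1, so it is ranked first; document 3 does not match at all.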
17. SO CONVENIENT
TO COUPLE IT TO THE MODEL
These people like to
over-engineer.
KISS – Keep It Stupidly Simple
Book (model)
BookSearchable (module)
.fulltext_search (method)
Filter Filter
Facet Facet
Query
Indexing callbacks
. . .
21. THE TRAP UP CLOSE
• Class method with 87 lines
• 7 filters
• 4 facets
• Filters implemented in private class methods: 58 lines
• Facets implemented inline
Real example
22. THE TRAP UP CLOSE
• Nested filters
• Filters manipulated so they can be applied to the corresponding facets
• Pagination and ordering logic
• Use of a single index
Real example
23. THE TRAP UP CLOSE
• Filters implemented in private class methods
module BookElasticsearch
  # (...)
    # (...)
    filter: {
      and: [
        BookSearchable.inactives?(false),
        BookSearchable.country(options[:country]),
        BookSearchable.category_id(options[:category_id]),
        BookSearchable.publishers(options[:publishers]),
        BookSearchable.price(options[:price]),
      ]
    },
    # (...)
  # (...)
  def inactives?(status)
    { term: { inactive: status } }
  end
  def country(country)
    { term: { country: country } }
  end
  def category_id(category_id)
    if category_id.present?
      { term: { categories_ids: category_id } }
    else
      {}
    end
  end
  def publishers(publishers)
    if publishers.present?
      { terms: { publisher: publishers } }
    else
      {}
    end
  end
  # (...)
https://gist.github.com/julianalucena/5aee5bbb8fb4fe4acdd4
25. THE TRAP UP CLOSE
• Tests cannot be isolated
• Any filter can change what the search returns and so influence the test of another filter
require 'spec_helper'
describe Offer do
  describe '#search_str', 'should accept an options hash with these options', elasticsearch: true do
    before(:each) do
      reset_index_for Book
      Rails.configuration.results_count = 10
    end
    describe 'publishers_names' do
      let(:query) { Faker::Lorem.word }
      let(:publisher1) { FactoryGirl.create(:active_publisher) }
      let(:publisher2) { FactoryGirl.create(:active_publisher) }
      before do
        FactoryGirl.create(:book, name: query)
        FactoryGirl.create(:book, name: "my #{query}", publisher: publisher1)
        FactoryGirl.create(:book, name: "his #{query}", publisher: publisher2)
        Book.index.refresh
      end
      it 'filters by multiple publishers' do
        expect(Book.search_str(query).total).to eq 3
        expect(Book.search_str(query,
          {publishers_names: [publisher1.name, publisher2.name]}).total).to eq 2
      end
    end
    it 'page' do
      Rails.configuration.results_count = 1
      FactoryGirl.create(:book, name: 'nonsolid one')
      FactoryGirl.create(:book, name: 'nonsolid two')
      Book.index.refresh
      Book.search_str('nonsolid').total.should eq 2
      Book.search_str('nonsolid').count.should eq 1
      Book.search_str('nonsolid', {page: 1}).count.should eq 1
      Book.search_str('nonsolid', {page: 2}).count.should eq 1
    end
    it 'size' do
      FactoryGirl.create(:book, name: 'incinerator')
      FactoryGirl.create(:book, name: 'incinerator clayton')
      Book.index.refresh
      Book.search_str('incinerator').total_pages.should eq 1
      Book.search_str('incinerator', {size: 1}).total_pages.should eq 2
    end
  end
  describe '#search_str', elasticsearch: true do
    before(:each) do
      reset_index_for Book
      Rails.configuration.results_count = 10
      inactive_publisher = FactoryGirl.create(:publisher, inactive: true)
      FactoryGirl.create(:book, name: 'Ventoinha pública')
      FactoryGirl.create(:book, name: 'Ventoinha do fornecedor inativo', publisher: inactive_publisher)
      Book.index.refresh
    end
    describe 'filters' do
      describe "price filter" do
        let!(:book) { FactoryGirl.create(:book, price: 20) }
        it "returns books that price belongs to the searched range" do
          FactoryGirl.create(:book, name: book.name, price: 10)
          FactoryGirl.create(:book, name: book.name, price: 40)
          Book.index.refresh
          results = Book.search_str(book.name, price: { min: 20, max: 30 })
          expect(results.total).to eq(1)
          expect(results.first.id).to eq(book.id.to_s)
        end
      end
      describe "category filter" do
        let(:category) { FactoryGirl.create(:category) }
        let(:child_category) { FactoryGirl.create(:category, parent: category) }
        let!(:book) { FactoryGirl.create(:book, category: child_category) }
        before do
          FactoryGirl.create(:book, name: book.name)
          Book.index.refresh
        end
        it "returns books that belongs to specified category" do
          results = described_class.search_str(book.name, {
            category_id: child_category.id
          })
          expect(results.first.id).to eq(book.id.to_s)
        end
      end
    end
    describe 'facets' do
      shared_examples_for 'facet with price range' do |facet, info|
        let(:search_attrs) { {} }
        let(:book) { FactoryGirl.create(:book, price: 10) }
        it 'count only books that price belongs to price range' do
          FactoryGirl.create(:book, name: book.name, price: 40)
          Book.index.refresh
          conditions = { price: { min: 10, max: 20 } }
          results = Book.search_str(book.name, conditions.merge(search_attrs))
          expect(results.facets[facet.to_s][info.to_s]).to eq 1
        end
      end
      shared_examples_for 'facet with category filter' do |facet, info|
        let(:search_attrs) { {} }
        let(:book) { FactoryGirl.create(:book, category: child_category) }
        let(:category) { FactoryGirl.create(:category) }
        let(:child_category) { FactoryGirl.create(:category, parent: category) }
        it 'count only books that belongs to category down tree' do
          FactoryGirl.create(:book, name: book.name)
          Book.index.refresh
          conditions = { category_id: category.id }
          results = Book.search_str(book.name, conditions.merge(search_attrs))
          expect(results.facets[facet.to_s][info.to_s]).to eq 1
        end
      end
      shared_examples_for 'facet with publisher filter' do |facet, info|
        let(:search_attrs) { {} }
        let(:publisher) { FactoryGirl.create(:valid_publisher) }
        let(:book) { FactoryGirl.create(:book, publisher: publisher) }
        before do
          FactoryGirl.create(:book, name: book.name)
          Book.index.refresh
        end
        it 'counts only books that belongs to publisher' do
          conditions = { publisher_name: [publisher.name] }
          results = Book.search_str(book.name, conditions.merge(search_attrs))
          expect(results.facets[facet.to_s][info.to_s]).to eq 1
        end
      end
      it_should_behave_like 'facet with price range', :publisher_name, :total
      it_should_behave_like 'facet with category filter', :publisher_name, :total
      it_should_behave_like 'facet with price range', :category_id, :total
      describe "facet price_statistics" do
        let(:facets) { Book.search_str('keyboard').facets }
        before do
          Rails.configuration.results_count = 10
        end
        it "has price_statistics facet" do
          expect(facets).to have_key('price_statistics')
        end
        it_should_behave_like 'facet with category filter',
          :price_statistics, :count
        context do
          let(:facet) { facets['price_statistics'] }
          it "has min statistics" do
            expect(facet).to have_key('min')
          end
          it "has max statistics" do
            expect(facet).to have_key('max')
          end
        end
      end
      describe 'facet category_id' do
        let(:facet) do
          described_class.search_str(book.name).facets['category_id']
        end
        let!(:book) { FactoryGirl.create(:book, category: child_category) }
        let(:child_category) do
          FactoryGirl.create(:category, parent: category)
        end
        let(:category) { FactoryGirl.create(:category) }
        it "has qty of books per category from hierarchy" do
          Book.index.refresh
          expect(facet['terms']).to have(2).items
          categories_ids = facet['terms'].map { |f| f['term'] }
          expect(categories_ids).
            to match_array([category.id, child_category.id])
          quantities = facet['terms'].map { |f| f['count'] }
          expect(quantities).to match_array([1, 1])
        end
      end
    end
    describe "ordering" do
      let(:results) { described_class.search_str('Aa', order: order_params) }
      describe "by any attribute" do
        before do
          FactoryGirl.create(:book, name: 'Aaz')
          FactoryGirl.create(:book, name: 'Aaa')
          Book.index.refresh
        end
        context do
          let(:order_params) { { name: 'asc' } }
          it "returns in ascending order" do
            expect(results.first.name).to eq('Aaa')
          end
        end
        context do
          let(:order_params) { { name: 'desc' } }
          it "returns in descending order" do
            expect(results.first.name).to eq('Aaz')
          end
        end
      end
    end
  end
end
Tests Filter A
Tests pagination
Setup shared by several tests
Tests Filter B
Tests Filter C
Tests facets
Tests Facet A
Tests Facet B
Tests ordering
Tests facets
27. QUICK LIST
• Search is hard to understand
• Low readability
• Tests cannot be isolated
• A single class knows how to build every filter and facet
• Code gets duplicated when distinct searches need the same filters and facets
29. WHAT THE SEARCH
SYSTEM NEEDS
• Define the search query
• Define filters
• Define facets
• Apply default filters
• Apply optional filters
• Apply facets
• Apply pagination and ordering
30. STRUCTURE OF THE
SEARCH SYSTEM
KISS – Keep It Simple, Stupid
The search only needs to know:
• How to define the query
• Which filters and facets to apply
BookSearch
CategoryFilter
Query
PublisherFilter
CategoryFacet PriceStatisticsFacet
Just Plain Old Ruby Objects
32. AND WHO DEFINES THE FILTERS AND
FACETS?
They do, themselves.
33. RESPONSIBILITIES
BaseSearch
• Defines an interface similar to Tire's
• Applies pagination and ordering
BookSearch, HqSearch
• Define the query
• Apply default and optional filters
• Apply facets
CountryFilter, PriceFilter
• Define a reusable filter
PriceStatisticsFacet, PublishersFacet
• Define a reusable facet
34. SUGGESTED ORGANIZATION
OF THE SEARCH SYSTEM
bookstore (master) > tree app/services/text_search/
app/services/text_search/
  base_search.rb
  book_search.rb
  hq_search.rb
  facets
    category_facet.rb
    price_statistics_facet.rb
    publisher_facet.rb
  filters
    active_filter.rb
    country_filter.rb
    category_filter.rb
    price_filter.rb
    publisher_filter.rb
35. New interface for searching books
TextSearch::BookSearch.search('Eu sou Malala').
  filter(
    country: 'BR',
    price: { max: 50 }
  ).with_facets.
  order('price ASC').
  per_page(20).page(2)
• A search can be run without applying the optional filters
• A search can be run without computing the facets
• Ordering and pagination are handled in a way
similar to kaminari
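The kaminari-style chaining of `order`, `per_page` and `page` could be sketched in plain Ruby as below. This is a minimal sketch under assumptions: `PaginatedSearch`, `to_options` and the default page size are hypothetical names and values, not taken from the original `BaseSearch`. The only fact relied on is that Elasticsearch paginates with an offset (`from`) and a page size (`size`).

```ruby
# Hypothetical sketch of chainable ordering and pagination.
class PaginatedSearch
  DEFAULT_PER_PAGE = 10 # assumed default, not from the slides

  def initialize
    @per_page = DEFAULT_PER_PAGE
    @page = 1
    @sort = nil
  end

  # Accepts strings such as 'price ASC'; direction defaults to ascending.
  def order(expression)
    field, direction = expression.split
    @sort = { field => (direction || 'asc').downcase }
    self
  end

  def per_page(size)
    @per_page = Integer(size)
    self
  end

  # Page numbers start at 1; anything lower is clamped to the first page.
  def page(number)
    @page = [Integer(number), 1].max
    self
  end

  # Translates page/per_page into the from/size pair sent to Elasticsearch.
  def to_options
    options = { from: (@page - 1) * @per_page, size: @per_page }
    options[:sort] = [@sort] if @sort
    options
  end
end

options = PaginatedSearch.new.order('price ASC').per_page(20).page(2).to_options
```

Because every method returns `self`, the calls can be chained in any order, and the offset is only computed when the options are finally built.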
36. NEW STRUCTURE – FILTER
• Knows how to build
the filter by Category
class CategoryFilter
  # (...)
  def apply!
    return if category_id.blank?
    filters[:categories_ids] = {
      term: { categories_ids: category_id }
    }
  end
  # (...)
end
https://gist.github.com/julianalucena/34246b0c837fd163cc0f
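The slide elides how the instance-level `apply!` gets its `filters` and `category_id`, while the search classes call `CategoryFilter.apply!(filters, conditions)` at class level. One way to bridge the two is sketched below. The constructor, the class-level delegator and the private readers are assumptions made for illustration; the linked gist is the authoritative version. Plain Ruby checks replace `blank?` so the sketch runs without ActiveSupport.

```ruby
# Hypothetical completion of the elided parts of the filter, in plain Ruby.
class CategoryFilter
  # Class-level entry point matching calls like
  # CategoryFilter.apply!(filters, conditions).
  def self.apply!(filters, conditions)
    new(filters, conditions).apply!
  end

  def initialize(filters, conditions)
    @filters = filters
    @category_id = conditions[:category_id]
  end

  # Mutates the shared filters Hash; does nothing when no category was given.
  def apply!
    return if category_id.nil? || category_id.to_s.empty?

    filters[:categories_ids] = { term: { categories_ids: category_id } }
  end

  private

  attr_reader :filters, :category_id
end

filters = {}
CategoryFilter.apply!(filters, category_id: 42)

untouched = {}
CategoryFilter.apply!(untouched, {})
```

Keying the entry by field name (`:categories_ids`) is what later lets a facet drop exactly this filter from the shared Hash.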
37. NEW STRUCTURE – FACET
• Knows how to define
the Categories facet
• Knows which filter must
be ignored in the
Categories facet
class CategoryFacet
  # (...)
  def apply!
    facet_filters = filters.except(:categories_ids)
    search.facet :category_id do
      terms :categories_ids, size: 100, all_terms: false
      unless facet_filters.empty?
        facet_filter :and, facet_filters.values
      end
    end
  end
  # (...)
end
https://gist.github.com/julianalucena/34246b0c837fd163cc0f
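The reason the facet drops its own filter is easier to see when the request is built as a plain Hash. The sketch below is an approximation: `category_facet` is a hypothetical helper, and the Hash only roughly mirrors the legacy Elasticsearch facets request format that Tire generates. It uses `reject` rather than `Hash#except` so it runs without Rails.

```ruby
# Builds a terms facet that ignores the filter on its own field, so the
# category counts do not collapse to the currently selected category.
def category_facet(filters)
  # Plain-Ruby stand-in for filters.except(:categories_ids).
  facet_filters = filters.reject { |name, _| name == :categories_ids }

  facet = { terms: { field: :categories_ids, size: 100 } }
  facet[:facet_filter] = { and: facet_filters.values } unless facet_filters.empty?
  { category_id: facet }
end

filters = {
  categories_ids: { term: { categories_ids: 7 } },
  country: { term: { country: 'BR' } }
}
facet = category_facet(filters)
only_own_filter = category_facet(categories_ids: { term: { categories_ids: 7 } })
```

With both filters active, only the country filter reaches the facet; when the category filter is the only one, no `facet_filter` is emitted at all.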
38. NEW STRUCTURE – SEARCH
• Knows how to build the
query
• Knows which filters
must be applied
class BookSearch < BaseSearch
  # (...)
  def search(term, country: 'BR', **options)
    @search = Tire.search(search_indexes) do |s|
      s.query do
        boolean do
          should do
            match :description, term, operator: "AND", boost: 5
          end
          # (...)
        end
      end
    end
    Filters::ActiveFilter.apply!(filters, true)
    Filters::CountryFilter.apply!(filters, country)
    self
  end
  def filter(conditions)
    Filters::PriceFilter.apply!(filters, conditions)
    Filters::CategoryFilter.apply!(filters, conditions)
    Filters::PublisherFilter.apply!(filters, conditions)
    self
  end
  # (...)
end
https://gist.github.com/julianalucena/34246b0c837fd163cc0f
39. NEW STRUCTURE – SEARCH
• Knows which facets
must be applied
• Knows which index to
use
class BookSearch < BaseSearch
  # (...)
  def with_facets
    Facets::PriceStatisticsFacet.apply!(@search, filters)
    Facets::CategoryFacet.apply!(@search, filters)
    Facets::PublisherFacet.apply!(@search, filters)
    self
  end
  private
  def search_indexes
    [Book.index_name]
  end
end
https://gist.github.com/julianalucena/34246b0c837fd163cc0f
41. WHAT DOES THE
TEST NEED?
• Allow connections to
Elasticsearch
• Populate the index with documents
• Refresh the index
• Test 😱
• Reset the index
42. ISOLATED TEST – GENERIC
SEARCH FACTORY
• Generic search that
returns every
document in the
index
• Applies filters and
facets
GenericSearch
Query
all documents
index
CategoryFilter
Filter under test
Isolated
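The idea of a generic search that matches everything, so that only the filter under test affects the results, can be illustrated without a running Elasticsearch. The sketch below is an in-memory stand-in, not the helper used in the talk: the `GenericSearch` constructor, the Hash documents and the lambda filter are all assumptions, and only `term` filters are understood.

```ruby
# In-memory stand-in for the generic search used in isolated tests.
# Hypothetical: the real helper talks to Elasticsearch; this one only
# understands `term` filters over an array of Hash documents.
class GenericSearch
  def initialize(documents)
    @documents = documents
    @filters = {}
    @filter_hook = nil
  end

  # Registers the block that applies the filter under test.
  def add_filters(&block)
    @filter_hook = block
    self
  end

  def filter(conditions)
    @filter_hook&.call(@filters, conditions)
    self
  end

  # "Match all" query: every document is returned unless a filter rejects it.
  def results
    @documents.select do |doc|
      @filters.values.all? do |filter|
        filter.fetch(:term).all? { |field, value| Array(doc[field]).include?(value) }
      end
    end
  end
end

# Filter under test, reduced to a lambda for the sake of the example.
category_filter = lambda do |filters, conditions|
  id = conditions[:category_id]
  filters[:categories_ids] = { term: { categories_ids: id } } unless id.nil?
end

documents = [
  { id: 1, categories_ids: [10] },
  { id: 2, categories_ids: [20] }
]
search = GenericSearch.new(documents).add_filters(&category_filter)
matching = search.filter(category_id: 10).results
```

Since the query itself never excludes anything, any difference between the full document list and `matching` can only come from the filter being tested.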
43. ISOLATED TEST – FILTER
• Generic search on the
Book index
• Only the filter
influences the items
returned
describe TextSearch::Filters::CategoryFilter do
  include TextSearchHelpers
  subject do
    text_search_for(Book).add_filters do |filters, conditions|
      described_class.apply!(filters, conditions)
    end
  end
  after { reset_index_for Book }
  let(:category) { FactoryGirl.create(:category) }
  let!(:book) { FactoryGirl.create(:book, category: category) }
  before do
    FactoryGirl.create(:book)
    refresh_index_for Book
  end
  it "returns books that belongs to specified category" do
    results = subject.filter(category_id: category.id).results
    expect(results.count).to eq(1)
    expect(results.first.id).to eq(book.id.to_s)
  end
end
https://gist.github.com/julianalucena/34246b0c837fd163cc0f
44. ISOLATED TEST – FACET
• Generic search on the
Book index
• Only the facet
influences the
aggregated
results
describe TextSearch::Facets::CategoryFacet do
  include TextSearchHelpers
  subject do
    text_search_for(Book).add_facets do |search, filters|
      described_class.apply!(search, filters)
    end
  end
  after { reset_index_for Book }
  let(:facets) { subject.with_facets.results.facets }
  let(:facet) { facets['category_id'] }
  let!(:book) { FactoryGirl.create(:book, category: category) }
  let(:category) { FactoryGirl.create(:category) }
  before { refresh_index_for Book }
  it "has category_id facet" do
    expect(facets).to have_key('category_id')
  end
  it "has qty of books per category" do
    expect(facet['terms']).to have(1).items
    categories_ids = facet['terms'].map { |f| f['term'] }
    expect(categories_ids).to match_array([category.id])
    quantities = facet['terms'].map { |f| f['count'] }
    expect(quantities).to match_array([1])
  end
  # (...)
end
https://gist.github.com/julianalucena/34246b0c837fd163cc0f
45. ISOLATED TEST – SEARCH
• Checks that the query
returns the correct
items
describe TextSearch::BookSearch do
  describe "#search" do
    it "returns self" do
      expect(subject.search('term')).to eq(subject)
    end
    context do
      let!(:book) { FactoryGirl.create(:book) }
      before { refresh_index_for Book }
      after { reset_index_for Book }
      it "matches with book's name" do
        results = subject.search(book.name).results
        expect(results).to have(1).item
        expect(results.first.name).to eq(book.name)
      end
      # (...)
    end
    # (...)
  end
  # (...)
https://gist.github.com/julianalucena/34246b0c837fd163cc0f
46. ISOLATED TEST – SEARCH
• Checks that the filters
and facets are
applied
describe TextSearch::BookSearch do
  describe "#search" do
    describe "default filters" do
      it "applies ActiveFilter with flag: true" do
        expect(TextSearch::Filters::ActiveFilter).
          to receive(:apply!).
          with(an_instance_of(Hash), true)
        subject.search('term')
      end
      # (...)
    end
  end
  describe "#filter" do
    let(:conditions) { double(Hash, :[] => nil) }
    it "applies PriceFilter with passed conditions" do
      expect(TextSearch::Filters::PriceFilter).
        to receive(:apply!).
        with(an_instance_of(Hash), conditions)
      subject.search('term').filter(conditions)
    end
    # (...)
  end
  describe '#with_facets', elasticsearch: true do
    describe "publisher facet" do
      it "applies PublisherFacet" do
        expect(TextSearch::Facets::PublisherFacet).
          to receive(:apply!)
        subject.search('term').with_facets
      end
    end
  end
  # (...)
https://gist.github.com/julianalucena/34246b0c837fd163cc0f
47. WHAT IMPROVED?
• Low complexity
• Better readability
• Reusable filters and facets
• Targeted, isolated tests
• More than one index can be used without things getting confusing
• Search 99% decoupled from the model
49. STOP HERE?
• Remove mentions of the models from tests and searches (use
the index name)
• Insert directly into Elasticsearch instead of using
FactoryGirl + indexing done by the model callback
• 💡 FactoryDocument
50. STOP HERE?
• Decouple indexing from the model
• 💡
• Structure with support for several search backends
• Indexing logic decoupled from the model
51. WHAT DO YOU SAY?
Look icon created by Sebastian Langer
from the Noun Project