Faceting with Lucene Block Join Query
Agenda 
1. Why do we need special faceting for Block Join queries?
2. Proposed Block Join facet component.
Introducing myself 
Oleg Savrasov, PhD 
A programmer 
Working for Grid Dynamics (griddynamics.com)
Living and working in Saint Petersburg, Russia
Online shopping
Jerrica is looking for a dress
Huge number of dresses
Facet filters help
[Screenshot callouts: "Facet filters", "Reduced amount"]
Tasks to be solved 
● Performant Search 
● Facet calculation/filtering
FacetComponent ?
A product has many SKUs
Aggregated facet counts
Facets should count products, not SKUs.
Expected facets:
COLOR 
Blue : 1 
Red : 1 
SIZE 
S : 1 
M : 1
Flat documents don’t help 
False positive match for +COLOR:Blue +SIZE:M
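The false positive can be reproduced with a minimal Python sketch. It assumes the single product has three SKUs (Blue/S, Red/S, Red/M); that SKU set is inferred from the facet counts in this deck, and the dict-based documents and helper names are illustrative only, not Solr's data model.

```python
# Hypothetical SKU set, inferred from the facet counts on the slides.
skus = [
    {"COLOR": "Blue", "SIZE": "S"},
    {"COLOR": "Red", "SIZE": "S"},
    {"COLOR": "Red", "SIZE": "M"},
]

# Flattening merges all SKU values into multi-valued product fields,
# which loses the link between a color and the sizes it comes in.
flat_product = {
    "COLOR": {sku["COLOR"] for sku in skus},
    "SIZE": {sku["SIZE"] for sku in skus},
}

def flat_matches(doc, color, size):
    """Emulates +COLOR:<color> +SIZE:<size> against the flattened document."""
    return color in doc["COLOR"] and size in doc["SIZE"]

def sku_exists(sku_docs, color, size):
    """True only if one single SKU carries both values."""
    return any(s["COLOR"] == color and s["SIZE"] == size for s in sku_docs)

print(flat_matches(flat_product, "Blue", "M"))  # True: the false positive
print(sku_exists(skus, "Blue", "M"))            # False: no such SKU exists
```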
Separate SKU documents 
q = *:* 
facet.field = COLOR 
facet.field = SIZE 
COLOR 
Blue : 1 
Red : 2 
SIZE 
S : 2 
M : 1 
Wrong numbers! There is only one product.
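The gap between the two ways of counting can be sketched in Python. The data is an assumption (one product with SKUs Blue/S, Red/S, Red/M, chosen to match the numbers above), and the `product_id` field and function names are hypothetical.

```python
from collections import Counter

# Hypothetical index: one product with three SKU documents (assumed data).
skus = [
    {"product_id": "dress-1", "COLOR": "Blue", "SIZE": "S"},
    {"product_id": "dress-1", "COLOR": "Red", "SIZE": "S"},
    {"product_id": "dress-1", "COLOR": "Red", "SIZE": "M"},
]

def sku_level_counts(docs, field):
    """Plain field faceting over SKU documents: one count per SKU."""
    return Counter(doc[field] for doc in docs)

def product_level_counts(docs, field):
    """What the shopper expects: each value counted once per product."""
    seen = {(doc["product_id"], doc[field]) for doc in docs}
    return Counter(value for _, value in seen)

for field in ("COLOR", "SIZE"):
    print(field, dict(sku_level_counts(skus, field)),
          dict(product_level_counts(skus, field)))
```

The second function deduplicates (product, value) pairs before counting, which is the behavior the expected facets call for.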
Search products only 
q = *:* 
fq = scope:product 
facet.field = COLOR 
facet.field = SIZE 
COLOR : 0 
SIZE : 0 
No such fields in product documents.
Aggregated facet counts 
Facets should count products, not SKUs.
Expected facets:
COLOR 
Blue : 1 
Red : 1 
SIZE 
S : 1 
M : 1
Solr Block Join Support (since Lucene 3.4.0) 
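[Diagram: the index is laid out in blocks along the docId axis; each block holds child docs (SKUs with colors such as Green, Blue, Yellow) in consecutive docIds, followed by their parent doc (Product).]
Query: {!parent which="scope:product"}COLOR:Blue
[Diagram: bitsets over docIds for scope:product and COLOR:Blue; ToParentQuery marks the parents of blocks that contain a matching child.]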
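The to-parent step can be illustrated with a small Python sketch. It is not Lucene's implementation; the eight-document index, the `to_parent` helper and its arguments are hypothetical, and the only property taken from the slide is that children are stored immediately before their parent.

```python
# Hypothetical index in block order: child (SKU) docs first, parent doc last.
index = [
    {"COLOR": "Green"},    # docId 0
    {"COLOR": "Blue"},     # docId 1
    {"scope": "product"},  # docId 2, parent of 0-1
    {"COLOR": "Yellow"},   # docId 3
    {"scope": "product"},  # docId 4, parent of 3
    {"COLOR": "Blue"},     # docId 5
    {"COLOR": "Blue"},     # docId 6
    {"scope": "product"},  # docId 7, parent of 5-6
]

def to_parent(docs, is_parent, child_matches):
    """Returns docIds of parents with at least one matching child in their block."""
    parents = []
    block_has_match = False
    for doc_id, doc in enumerate(docs):
        if is_parent(doc):
            if block_has_match:
                parents.append(doc_id)
            block_has_match = False  # a parent doc closes its block
        elif child_matches(doc):
            block_has_match = True
    return parents

hits = to_parent(
    index,
    is_parent=lambda d: d.get("scope") == "product",
    child_matches=lambda d: d.get("COLOR") == "Blue",
)
print(hits)  # [2, 7]
```

A single forward pass is enough because the block layout guarantees that every child precedes its parent.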
SOLR-5743 Faceting with Block Join support 
● Create BlockJoinFacetComponent 
● ToParentQuery is expected 
● Facet counts should correspond to the number of parent documents
● Only DocValues fields are supported
Faceting over DocSet slices 
[Diagram: the same blocks of child docs (Green, Blue, Yellow) and parent docs (Product) along the docId axis, with the DocSet slice of matching children marked for one block: 0 1 0 0 1 0 1]
DocSet slice counts: COLOR Blue : 2
Aggregated counts: COLOR Blue : +1
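The aggregation idea can be sketched in Python: however many matching children in one block carry a value, the parent contributes +1 for it. This is an illustration under assumed data, not the code of the proposed component; `block_join_facets` and the sample index are hypothetical.

```python
from collections import Counter

def block_join_facets(docs, is_parent, child_matches, field):
    """Counts parents per facet value, looking only at matching children."""
    aggregated = Counter()
    slice_values = set()  # distinct values seen in the current block's slice
    for doc in docs:
        if is_parent(doc):
            # Each distinct value in the slice adds exactly +1 for this parent.
            aggregated.update(slice_values)
            slice_values = set()
        elif child_matches(doc) and field in doc:
            slice_values.add(doc[field])
    return aggregated

# Hypothetical index in block order: children first, then their parent.
index = [
    {"COLOR": "Blue", "SIZE": "S"},
    {"COLOR": "Blue", "SIZE": "M"},
    {"COLOR": "Red", "SIZE": "S"},
    {"scope": "product"},
    {"COLOR": "Blue", "SIZE": "S"},
    {"scope": "product"},
    {"COLOR": "Red", "SIZE": "L"},
    {"scope": "product"},
]

def is_product(doc):
    return doc.get("scope") == "product"

def is_blue(doc):
    return doc.get("COLOR") == "Blue"

print(dict(block_join_facets(index, is_product, is_blue, "COLOR")))  # {'Blue': 2}
```

The first block holds two Blue children but still adds only one to the Blue count, which mirrors the "slice count 2, aggregated +1" step on the slide.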
Block Join Facet Component
BlockJoinFacetCollector
Facets counting
It works! 
q = {!parent which="scope:product"}COLOR:Blue
child.facet.field = SIZE
<response>
  ...
  <lst name="facet_counts">
    <lst name="facet_fields">
      <lst name="SIZE">
        <int name="S">14</int>
        <int name="L">22</int>
        <int name="XL">17</int>
      </lst>
    </lst>
  </lst>
</response>
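A client could read these counts with a few lines of Python. The sketch below assumes the response has exactly the shape shown above (with the elided part left out), and `facet_counts` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Sample response copied from the slide, without the elided "..." part.
response_xml = """
<response>
  <lst name="facet_counts">
    <lst name="facet_fields">
      <lst name="SIZE">
        <int name="S">14</int>
        <int name="L">22</int>
        <int name="XL">17</int>
      </lst>
    </lst>
  </lst>
</response>
"""

def facet_counts(xml_text, field):
    """Extracts {value: parent count} for one facet field from the XML response."""
    root = ET.fromstring(xml_text)
    path = f".//lst[@name='facet_fields']/lst[@name='{field}']/int"
    return {node.get("name"): int(node.text) for node in root.findall(path)}

print(facet_counts(response_xml, "SIZE"))  # {'S': 14, 'L': 22, 'XL': 17}
```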
The dress is found
Further improvements 
● Thorough profiling 
● Performance improvements 
● Algorithmic improvements
References 
http://www.slideshare.net/MarkHarwood/proposal-for-nested-document-support-in-lucene
http://blog.mikemccandless.com/2012/01/searching-relational-content-with.html 
http://blog.griddynamics.com/2013/09/solr-block-join-support.html
Big thanks! 
Please vote for SOLR-5743. 
Do you have any questions?

Faceting with Lucene Block Join Query - Lucene/Solr Revolution 2014
